Response Bias: a Special Opportunity

Clark Leavitt, The Ohio State University
ABSTRACT - The importance of response bias is an unresolved issue within psychology at the present time. However, evidence seems to point to two conclusions: (1) several kinds of bias can have an important effect under a variety of conditions both in the form of spurious relations and lack of sensitivity; (2) there are definite strategies for dealing effectively with the three most common types of bias. The paper discusses the social desirability bias, the acquiescence bias and the evaluative bias. These three biases are treated as constructs and the evidence for each is looked at from the point of view of construct validity. The paper recommends a strategy for dealing with each during the construction phase of questionnaires and scales.
[ to cite ]:
Clark Leavitt (1977) ,"Response Bias: a Special Opportunity", in NA - Advances in Consumer Research Volume 04, eds. William D. Perreault, Jr., Atlanta, GA : Association for Consumer Research, Pages: 401-404.

Advances in Consumer Research Volume 4, 1977   Pages 401-404

INTRODUCTION

Consumer behavior researchers address several major problems that require accurate prediction of consumer response, for example, estimating the demand for particular items among different segments of the public. The special opportunity to improve the prediction of demand arises because we currently do little to remove from our predictors the artifactual components of response, namely, biased response.

Response bias is generally associated with systematic error of measurement as opposed to unsystematic or chance error. The latter is covered by the idea of reliability. Granted that response bias is the systematic type, there are two somewhat different ways in which it has been partitioned within psychometric theory. Traditionally it has been regarded as "constant error" - as an annoying distortion in measurement that must be guarded against in various ways. An example would be order effects, which are minimized by using counterbalanced orders.

A second way to approach response bias is to look at it as a result of a construct. Thus the social desirability bias is conceived as a manifestation of the respondent's need for approval in the test situation.

Either approach will lead to an improvement in predictive accuracy but the second approach has the advantage of greater generalizability because it entails a greater understanding of the underlying processes in order to achieve purification of prediction.

Essentially this purification is accomplished by improving the constructs used in understanding consumer behavior. Construct validity is a key concept in the methodology of applied behavior research because in the long run greater progress will be made by focusing on construct validity than by concern for immediate predictive validity which otherwise changes from one time to another. Unless supported by theory, what is working pragmatically today will lose its effectiveness tomorrow. What works with one group will fail with different people or different products.

I shall take the second approach here and look at response bias as an aspect of construct validity. I shall begin with a brief review of construct validity. Next I will define response bias more fully. The remainder of the paper will be devoted to an explication of three of the most common forms of response bias that can be controlled by psychometric procedures. They also mediate other demand characteristics that are sources of bias, such as experimenter expectancy, but these will not be discussed. The focus will be on ways of constructing measuring instruments that minimize all determinants of bias no matter what their cause by attempting to understand and deal directly with the mediating constructs.

CONSTRUCT VALIDITY

To demonstrate the meaning and validity of a construct requires that it be related to other constructs. The power of constructs rests in the existence of an interrelated network of constructs; no single construct is worth much by itself in behavioral science. By themselves our constructs are too weak to be disconfirmed. In relating constructs to each other, measurement bias is an especially important threat because both constructs are vulnerable to the same kind of distortion by the same kind of response bias.

A study by Wilkie, McCann and Reibstein (1974) nicely illustrates both the construct approach to response bias and one of the pitfalls in analyzing its effects. They demonstrated that instructions designed to emphasize evaluation produced a more highly correlated set of multi-attribute ratings than instructions designed to elicit a more descriptive orientation on the part of raters. They also demonstrated that the more evaluative rating correlated higher with another evaluative rating (brand preference ranking) in another part of the interview. Because responses of all kinds from the same respondent tend to correlate, especially within the same questionnaire, this attribute-preference correlation is a clear demonstration of the ubiquity of the halo effect. Yet the authors interpreted the finding as indicating improved predictive ability of the evaluative response. This is a common fallacy in consumer behavior research.

Construct validity refers to the consistency among measures of different constructs. Consistency or association is a relative idea - association among some constructs implies lack of association with others. If A is associated with B there must be a C with which A is not associated. Such is required to make our measurements of association meaningful. We may validate the construct of social desirability by showing that it correlates with projective measures of need for approval and does not correlate with projective measures of need for power.

Note then that construct validity is not a characteristic of tests - that is, of items or questions. Rather it is inferred from response associations among measures over different items, people and situations. Alternative explanations of measurement consistencies can stem from any of these sources or their interactions. Construct validation, then, involves the elimination of these alternative explanations or threats to validity. The general strategy is to propose counter-explanations for observed consistency and independence and to disconfirm these alternative explanations in favor of the construct explanation. Response biases may be alternatives that need to be disconfirmed by appropriate psychometric procedures.

Figure 1 shows a more or less typical finding in which two groups who differ on a psychological trait are compared on their reaction to an independent variable - physical attractiveness of a message source in this particular case. There is no interpretative problem. The open subjects are influenced by the attractiveness of the source and the cautious subjects are not. This occurs under a condition of low expertness. Now observe what happens under a condition of high expertness.

FIGURE 1. LOW EXPERTNESS

FIGURE 2. HIGH EXPERTNESS

Figure 2 shows that the open subjects are uniformly more influenced by all sources. (The lines are not significantly different across attractiveness.) If the psychological variable of open versus cautious processing were not based on a balanced scale the results in Figure 2 would be very hard to interpret because yea-saying provides a reasonable alternative explanation: open subjects say yes to the attitude scale and to the measure of open-processing. However, because it is a balanced scale, it is much more likely that open subjects are more affected by the message from the high expert source. [Thanks to Benoy Joseph for these data.]

KINDS OF RESPONSE BIAS

A response bias is a person's tendency to respond in a particular situation to aspects of measures other than their content, that is, other than content in the special sense intended by the researcher's definition of the construct being measured. Response biases include the tendency to respond desirably, acquiescently, agreeably, cooperatively, defensively, deviantly, reactively, evaluatively, feelingly, extremely, guessingly, critically, variably and positively.

A bias is a construct that can account for response consistency in an alternative manner to the construct being investigated. A bias is a trait, set, style, or motive like any other construct measured by questionnaires. It differs only in the fact that the researcher ordinarily wants to avoid measuring it and may even be unaware of it. I shall focus on three of these that seem to be especially potent threats to validity interpretations based on tests of the kind frequently used in consumer research. These three biases are the tendencies to respond desirably, acquiescently (2 kinds) and evaluatively.

Social Desirability

In an early effort to control the desirability bias, Edwards (1953) constructed a scale in which respondents were required to choose among alternatives judged equal in social desirability, thus eliminating desirability as a factor determining the choice. A different approach is that of Crowne and Marlowe (1964) who developed a Social Desirability Scale (SDS) composed of items defined as being "socially desirable, nonreflective of pathology and improbable of occurrence." Items correlating highly with the SDS can be eliminated from other construct measures as a way of eliminating the biasing effects of the need for approval. Thus inter-item correlations that provide evidence for construct validity will not be inflated by spurious correlations based on the common social desirability of the items. Jackson (1971) has proposed a specific correction term for item selection, the differential reliability index, in which the social desirability component is subtracted from the item-test correlation.
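
The item-screening idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the response data are invented, the names `pearson`, `differential_reliability` and the cutoff of 0.5 are hypothetical, and the index is computed in one commonly cited form (the square root of the squared item-scale correlation minus the squared item-desirability correlation), which may differ in detail from Jackson's own procedure.

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)

def differential_reliability(item, content_total, desirability):
    """Assumed form of the index: sqrt(r_item_scale^2 - r_item_desirability^2).

    Returns 0.0 when the desirability correlation is the larger of the two.
    """
    r_content = pearson(item, content_total)
    r_desirability = pearson(item, desirability)
    diff = r_content ** 2 - r_desirability ** 2
    return math.sqrt(diff) if diff > 0 else 0.0

# Invented scores for six respondents (illustration only).
content_total = [1, 2, 3, 4, 5, 6]   # total score on the content scale
desirability = [3, 1, 2, 2, 1, 3]    # score on a social desirability scale
items = {
    "item_content": [1, 2, 3, 4, 5, 6],    # tracks the content scale
    "item_desirable": [3, 1, 2, 2, 1, 3],  # tracks desirability only
}

dri = {name: differential_reliability(scores, content_total, desirability)
       for name, scores in items.items()}

# Keep items whose index clears a hypothetical cutoff of 0.5.
retained = [name for name, value in dri.items() if value >= 0.5]
print(dri, retained)
```

In this toy data set the item that tracks the content scale is retained and the item that tracks only desirability is dropped; with real data the cutoff would have to be chosen by the researcher.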

How important is social desirability in a practical sense? There seems to be general agreement that it exerts at least a mild spurious effect on prediction. Klassen et al. (1975), for example, report a .31 correlation with ratings of life satisfaction. This suggests that satisfaction ratings might be particularly sensitive to the desirability artifact.

Acquiescence

Bentler, Jackson and Messick (1971) have distinguished two forms of this response tendency: agreement bias and acceptance bias. Agreement bias refers to the tendency to say yes or no to any statement. It was first publicized by Couch and Keniston (1960) who felt that "yea saying" was a significant personality trait. Acceptance bias is rooted in the tendency to see any item as applicable to the object being judged. Thus if respondents are asked if they are happy, they may say "yes" for any of three reasons. (1) They may feel that they are happy rather than sad. (2) If they have a strong tendency to accept all items as self-descriptive, they would endorse "sad" in addition to "happy." (3) If they are yea-sayers, they would likewise endorse both "happy" and "sad" in accordance with the agreement bias. The critical distinction between the two biased groups comes from the additional use of negative reversal items like "not happy." Those with an agreement set would endorse that item but those with an acceptance set would not.

The following table adapted from Bentler et al. (1971) illustrates the kind of effect to be expected from the presence or absence of the target construct and the two response biases.

TABLE

The main strategy for dealing with acquiescence is to use a balanced scale in which the negative response is subtracted from the positive. In the above example, those who legitimately check yes for "happy" would check no for both "not happy" and "sad" since these are presumably true polar opposites. On the other hand, the acceptance biased people (who illegitimately checked "happy") would end up with a neutral score on happy when their agreement with "sad" was subtracted. The agreement bias would be controlled by subtracting agreement with either "not happy" or "sad." Thus a balanced scale with sensible item reversals will eliminate inflationary effects of either bias. Note that this does not occur when the only reversed items used are those formed by adding "not."
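
The arithmetic of the balanced scale can be made concrete with a small Python sketch. The three response patterns below are idealized cases constructed from the happy / sad / not happy example, not data, and the function name `balanced_score` is hypothetical.

```python
# Idealized yes (1) / no (0) answers for three kinds of respondent.
respondents = {
    "truly_happy":       {"happy": 1, "sad": 0, "not happy": 0},
    "acceptance_biased": {"happy": 1, "sad": 1, "not happy": 0},
    "agreement_biased":  {"happy": 1, "sad": 1, "not happy": 1},
}

def balanced_score(answers, positive, reversal):
    """Positive item minus its reversal: +1, 0 (neutral) or -1."""
    return answers[positive] - answers[reversal]

# Reversal by a true polar opposite ("sad").
polar = {name: balanced_score(a, "happy", "sad")
         for name, a in respondents.items()}

# Reversal formed only by adding "not" ("not happy").
negated = {name: balanced_score(a, "happy", "not happy")
           for name, a in respondents.items()}

print(polar)    # both biased patterns come out neutral
print(negated)  # the acceptance-biased pattern still looks happy
```

Subtracting the polar opposite neutralizes both biased patterns, whereas subtracting only the negated item leaves the acceptance-biased respondent with the same score as the truly happy one.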

TABLE 1

SEMANTIC ADJECTIVE DIFFERENTIAL CORRELATIONS

Table 1 illustrates the possible magnitude of effects of acquiescence. In this particular case the pattern of correlations among the three Osgood dimensions is radically changed and would lead to an entirely different interpretation of the construct networks.

There are many unresolved psychometric issues here but the importance of scale balance seems evident. The importance of this issue has been pointed out by many contributions to the consumer behavior literature beginning with the work of Wells (1960) and continuing through the present (Becker and Myers, 1970).

Evaluative Bias

This is perhaps the oldest and best known of the three biases but, like the two preceding ones, it has been redefined recently as a construct. The "halo effect" has been around for a long time in the traditional role of an error to be gotten rid of. In this role it might be defined as a lack of discrimination among attributes when rating people, objects, or self. The rater adopts an overall set about how good the object is and rates each attribute in these terms. The halo is often determined by liking but can be the result of many demand characteristics of the situation - for example, the need to make a choice.

Evaluation is particularly important to consumer researchers because it is identified with the affective component of attitude and is essentially what is measured by most choice measures such as intention-to-buy and ranked preferences. This leads some consumer researchers to use one method variation in an attempt to validate a different method version of an evaluative scale.

The shift to regarding halo as a construct rather than a simple error came with Osgood's study of semantic dimensions of meaning and his factor analytic isolation of an evaluative factor.

Evaluation (E) demonstrates the value of the construct approach taken in this paper versus the more traditional error approach. Osgood regarded E as a dimension of semantic meaning on the basis of a factor analytic study which also isolated several other dimensions. There has been some accumulation of evidence that E is so strong or ambiguous that it sometimes obscures other such dimensions of processing (see Mehrabian and Russell, 1974). Such an error effect can be dealt with by using scales that have high factor loadings on the intended dimension and low loadings on the other dimensions, especially E. It is extremely important that an adequate sampling of items be used in developing the factors in order to clarify the true nature of the processing dimension. A large number of trial items is also needed in order to find enough items with high unique loadings. There is evidence of a reciprocal relation between content loadings and various kinds of error loadings in the case of acquiescence: items with unique high loadings tend to show lower acquiescence. As the specific factor loading increases, the item probably becomes less vulnerable to error bias of all kinds, including E. A high loading on activity, for example, together with a low loading on E will minimize evaluative bias.
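
The selection rule described above (high loading on the intended dimension, low loadings elsewhere, especially on E) might be applied as in the following Python sketch. The loadings are invented for illustration, and the function name `pure_items` and the cutoffs of .60 and .30 are assumptions rather than recommended values.

```python
# Invented factor loadings for four trial items (illustration only).
LOADINGS = {
    "fast-slow":   {"evaluation": 0.10, "potency": 0.15, "activity": 0.80},
    "lively-dull": {"evaluation": 0.55, "potency": 0.10, "activity": 0.65},
    "strong-weak": {"evaluation": 0.20, "potency": 0.75, "activity": 0.25},
    "good-bad":    {"evaluation": 0.90, "potency": 0.05, "activity": 0.05},
}

def pure_items(loadings, target, min_target=0.60, max_other=0.30):
    """Return items loading at least min_target on the target dimension
    and no more than max_other (in absolute value) on every other one."""
    selected = []
    for item, row in loadings.items():
        others = [abs(value) for dim, value in row.items() if dim != target]
        if abs(row[target]) >= min_target and max(others) <= max_other:
            selected.append(item)
    return selected

activity_items = pure_items(LOADINGS, "activity")
potency_items = pure_items(LOADINGS, "potency")
print(activity_items, potency_items)
```

Here "lively-dull" loads acceptably on activity but is rejected because of its substantial loading on E, which is the kind of item the text suggests screening out.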

Evaluative response is a bias when used uncritically but it may be the best predictor of a criterion. On the other hand, it may be a poor predictor, especially of behavioral criteria. One reason is that the evaluative response is situation specific. A test situation is complex and burdensome and a simple decision strategy regarding how to answer questions can save the respondent time and effort. Thus any difficulty with or ambiguity in the questionnaire may facilitate evaluative responding. The consumer decides that brand X is generally good or bad and responds to every item according to its good or bad implications in order to save the time and effort required to estimate its standing on each attribute or in order to respond to demand artifacts. In a later choice situation more realistic considerations may dominate response. E responding is likely to be affected by a wide array of pressures that distort response, ranging from experimenter expectation to ambiguity. To take an extreme position, we might regard the E scale as a deception scale or lie scale such as those on the MMPI where positive responses indicate invalidity of responses to other items. Thus a unique profile on several more substantive dimensions may be more predictive of actual behavior than scores on the E dimension.

SUMMARY

Response biases have been viewed here as constructs requiring the same network of association for validation as any other idea dealt with at the level of construct validity. The advantage of this approach is that it facilitates a more general, less situation-specific understanding of the causes of distortion in tests and questionnaires. This in turn leads to more scope in the strategies used to counteract these effects.

Three such strategies were discussed. (1) Elimination of items correlating highly with a biasing trait. This strategy was illustrated by use of a scale to measure social desirability which is then correlated with potential items, eliminating those with high correlations. (2) Reversal of items to produce a balanced scale. Balancing a scale with reversed scoring of true reversed items such as "sad" for "happy" will eliminate the effects of acquiescence of both kinds - agreement and acceptance. (3) Factor analysis of component dimensions of response and elimination of items with low "purity," especially those that load on the E dimension.

By examining the operation of desirability, acquiescence, and evaluation, most of the crude response bias can be eliminated from our instruments. Since at this stage of development of the art of predicting choice, scales are frequently not even assessed for reliability, let alone for alternative explanations of validity, a great deal of room for improvement exists. The payoff is increased accuracy in prediction of consumer demand. Even though the cost of more extensive pretesting is not negligible, there will be a favorable cost-benefit ratio.

REFERENCES

B. W. Becker and J. E. Myers, "Yeasaying Response Style," Journal of Advertising Research, 6(1970), 31-37.

P. M. Bentler, Douglas N. Jackson and Samuel Messick, "Identification of Content and Style: A Two-dimensional Interpretation of Acquiescence," Psychological Bulletin, 3(1971), 186-204.

Arthur Couch and Kenneth Keniston, "Yeasayers and Naysayers: Agreeing Response Set as a Personality Variable," Journal of Abnormal and Social Psychology, 60(1960), 151-174.

Douglas P. Crowne and David Marlowe, The Approval Motive. (Wiley, 1964).

Allen L. Edwards, Manual for the Edwards Personal Preference Schedule. (New York: Psychological Corp., 1953).

Douglas N. Jackson, "Multimethod Factor Analysis in the Evaluation of Convergent and Discriminant Validity," Psychological Bulletin, 1 (July 1969), 30-49.

Douglas N. Jackson, "The Dynamics of Structural Personality Tests," Psychological Review, 3(1971), 229-248.

Diedre Klassen, Robijn K. Hornstra and Peter B. Anderson, "Influence of Social Desirability on Symptom and Mood Reporting in a Community Survey," Journal of Consulting and Clinical Psychology, 4(1975), 448-452.

Albert Mehrabian and James A. Russell, An Approach to Environmental Psychology. (Cambridge: MIT Press, 1974).

W. D. Wells, "The Influence of Yeasaying Response Style," Journal of Advertising Research, 2(1960), 1-12.

William L. Wilkie, John M. McCann and David J. Reibstein, "Halo Effects in Brand Belief Measurement: Implications for Attitude Model Development," in Scott Ward and Peter Wright (Eds.), Advances in Consumer Research, Association for Consumer Research, 1(1974), 280-90.
