Effects of Multiple Measurement Operations on Consumer Judgment: Measurement Reliability Or Reactivity?

Frank R. Kardes, University of Cincinnati
Chris T. Allen, University of Cincinnati
Manuel J. Pontes, University of Florida
ABSTRACT - An experiment was conducted to investigate the effects of multiple measurement operations on judgment. Although multiple measures are needed to assess measurement reliability, the results indicate that the use of multiple measures can also change the nature of the construct undergoing assessment. Implications of the results for managing the reliability/reactivity tradeoff are discussed.
Frank R. Kardes, Chris T. Allen, and Manuel J. Pontes (1993), "Effects of Multiple Measurement Operations on Consumer Judgment: Measurement Reliability Or Reactivity?", in NA - Advances in Consumer Research Volume 20, eds. Leigh McAlister and Michael L. Rothschild, Provo, UT: Association for Consumer Research, Pages: 280-283.

Advances in Consumer Research Volume 20, 1993      Pages 280-283



Few measurement procedures have achieved such widespread acceptance as has the use of multiple measurement operations. The use of multiple scales has become a standard practice that is rarely, if ever, questioned. The reason for this is clear: multiple scales must be used to assess measurement reliability (Churchill 1979; Cook and Campbell 1979; Peter 1979). However, recent work suggests that exposure to measurement scales can induce respondents to form judgments that would not have been generated had no measurement instruments been administered (Feldman and Lynch 1988).

How can measurement influence judgment? Consider the processes involved in responding to measurement instruments. When a consumer is asked to express a judgment about a product or service, this person is likely to (a) search the environment and memory for relevant information, (b) construe the judgmental implications of this information, (c) integrate the information to arrive at a single overall value, and (d) communicate this judgment in a manner assumed to be meaningful to the researcher (Feldman and Lynch 1988; Wyer and Srull 1989). These search, construal, integration, and communication processes increase the consumer's focus on a specific judgment.

We suggest that this increased focus can influence both the strength and the content of a judgment. The content of a judgment refers to its position along a subjective continuum. This position is mapped onto a set of ordered categories presented by the researcher (e.g., a seven-point scale with end-points labeled "Extremely bad" and "Extremely good"). The strength of a judgment refers to its accessibility from memory (Fazio 1989). Although several studies have examined measurement effects on judgment content (for reviews, see Feldman and Lynch 1988; Fischhoff 1991), few studies have examined measurement effects on judgment strength.

The distinction between content and strength is critical because, although prior research on evaluative judgments (i.e., attitudes) centers primarily on attitude valence (for reviews, see Cohen and Chakravarti 1990; Tesser and Shaffer 1990), recent empirical evidence suggests that "strength is more important than simple favorability in determining whether attitudes are successful predictor variables" (Raden 1985). Weak attitudes are poor predictors of overt behavior, whereas strong attitudes are good predictors (Abelson 1988; Berger and Mitchell 1989; Fazio 1989; Fazio, Powell, and Williams 1989; Raden 1985). Thus, the strength of an attitude is an important moderator of the relationship between attitudes and behavior.

Related to the finding that attitudes vary in strength is the distinction between attitudes and nonattitudes (Converse 1970). This distinction stems from the observation that an individual will often respond to an item embedded in a survey even when the individual has no prior attitude toward this topic (Converse 1970; Feldman and Lynch 1988; Schuman and Presser 1981). Moreover, survey respondents are willing to express opinions toward fictitious issues, for which no prior attitudes can exist (Bishop, Tuchfarber, and Oldendick 1986). Thus, individuals may appear to have attitudes that do not actually exist in any a priori fashion.

How are responses generated when prior attitudes are unavailable from memory? Converse (1970) maintains that these responses reflect random error. However, Feldman and Lynch (1988) review evidence implying that these responses are generated systematically. The accessibility-diagnosticity model (Feldman and Lynch 1988; Lynch, Marmorstein, and Weigold 1988) suggests that if an answer cannot be retrieved directly from memory, the answer is computed on the basis of other information available from memory. Feldman and Lynch (1988) use the term "self-generated validity" to describe this process because if an answer is unavailable from memory, the answer can be constructed - on the spot - following exposure to the survey question. Thus, measurement instruments can prompt individuals to form judgments that would not have been formed otherwise.

This problem is likely to be compounded when individuals are exposed to multiple measurement instruments. Exposure to a single scale can induce respondents to generate a judgment that would not have been generated otherwise. However, repeated exposure to a measurement instrument can prompt respondents to form a new judgment and activate it repeatedly. Prior research has shown that repeated activation (retrieval) increases the strength of a judgment (Fazio, Sanbonmatsu, Powell, and Kardes 1986; Powell and Fazio 1984).

In these studies, attitude strength was operationalized in terms of attitude accessibility, or the speed with which an attitude can be retrieved from memory in response to an attitudinal inquiry. Justification for this procedure is provided by extensive empirical evidence indicating that manipulations of the strength of the association between an object and an evaluation result in a corresponding change in response latency: as attitude strength increases, faster response latencies are obtained (for a review, see Fazio 1989). Similarly, as the number of attitude scales provided increases, the number of times the focal attitude is expressed increases. To express an attitude one must access the attitude from memory, and repeated expression requires repeated attitude activation. Repeated activation results in faster response latencies to subsequent attitudinal inquiries (Fazio et al. 1986; Powell and Fazio 1984). Hence, exposure to repeated measures induces respondents to activate their attitudes repeatedly, and repeated activation increases attitude accessibility.

The purpose of the present study was to examine repeated measurement effects involving product-related stimuli (i.e., brand names such as Coke, Pepsi, Crest, Colgate, etc.) as opposed to everyday objects (e.g., gift, music, cake, guns, crime, Republicans, Democrats; see Fazio et al. 1986) or political issues (e.g., abortion, equal rights, gun control, nuclear power; see Powell and Fazio 1984). More importantly, the present study was designed to test the hypothesis that repeated measurement effects on attitude strength are moderated by attitude crystallization. Crystallized attitudes are well-defined attitudes that exist prior to measurement (Schuman and Presser 1981).



Uncrystallized attitudes are poorly articulated and poorly defined (Chaiken and Baldwin 1981), less stable over time (Davidson and Jaccard 1979; Schwartz 1978), and less resistant to persuasion (Krosnick 1988; Wu and Shaffer 1987). Because uncrystallized (versus crystallized) attitudes are relatively malleable, they should also be more susceptible to repeated measurement effects.



Subjects performed a computer-administered brand evaluation task with respect to a list of 56 different brands, each belonging to one of seven different product categories (e.g., soft drinks, candy bars, shampoos). Evaluation latencies toward each of these products were assessed in each of three consecutive experimental sessions. Following these sessions, subjects were asked to estimate the frequency with which they purchased each of the 56 products. Because attitudes based on direct behavioral experience (i.e., trial) are less ambiguous than attitudes based on indirect experience (e.g., advertising, word-of-mouth communications; see Fazio and Zanna 1981; Smith and Swinyard 1983), purchase frequency ratings provide a useful measure of prior attitude crystallization. On the basis of these ratings, brands to which subjects were loyal and brands to which subjects were nonloyal were identified separately for each subject (Jacoby and Kyner 1973). Hence, a 2 (Loyalty) X 3 (Measurement exposure level) repeated measures design was employed (both factors were within-subjects factors). It was predicted that mean evaluation latencies should decrease as exposure level increases, and that this repeated measurement effect should be more pronounced in nonloyal than in loyal conditions.

Subjects and Stimuli

Subjects were 31 undergraduates who received $5 for participating. Subjects were asked to judge 56 different target products. These products were members of seven different product classes (soft drinks, beer, soaps, shampoos, deodorants, candy bars, and toothpastes). Each product class comprised 7 to 13 different brands. For each product category, a wide variety of brands was selected to maximize the likelihood of including brands to which subjects were likely to be loyal (e.g., Coke, Pepsi, Michelob, Miller, Snickers, 3 Musketeers, Crest, Colgate, Ivory, Dial) and nonloyal (e.g., Tab, Mountain Dew, Samuel Adams Lager, Bass Ale, Heath Bar, Mr. Goodbar, Aquafresh, Aim, Safeguard, Shield).


The Response Latency Task. Upon arrival, subjects were told that they would be asked to participate in a computer-administered survey regarding their personal opinions toward several different products. Subjects were told that a single brand name would appear on the monitor and their task was to press a key labeled "good" or a key labeled "bad" as quickly as possible to indicate their judgment of the product (key order was counterbalanced across subjects). Subjects were instructed to maximize both the speed and the accuracy of their responses. The presentation was controlled by an IBM personal computer. The order in which the products were presented was randomized for each subject. Each brand name remained visible on the screen until the subject responded, and a 3-second interval separated each trial. The subject's response and latency of response (from brand name onset to response) were recorded automatically to the nearest centisecond.

This task was preceded by a set of practice trials designed to familiarize subjects with the procedure. The brand names employed in the practice trials pertained to product classes other than those used in the experiment. In the experiment, subjects judged 56 different brand names (Session 1). Each subject received a different random order of presentation of the 56 brand names; most subjects completed this task in 15 to 20 minutes. Immediately after completing this task, subjects judged the same 56 brand names, presented in a different random order (Session 2). This task was repeated in Session 3. Thus, the order of presentation of brand names was randomized across subjects and across sessions. Most subjects completed all three sessions in less than 45 minutes.
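The presentation scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the software used in the study: the function name build_schedule, the "left"/"right" key labels, and the odd/even rule for counterbalancing key order are all hypothetical, since the text states only that key order was counterbalanced across subjects and that each session used a different random order.

```python
import random

def build_schedule(brands, subject_id, n_sessions=3, seed=None):
    """Return (key_map, sessions) for one subject.

    key_map  : which response key is labeled "good" vs. "bad"
    sessions : one randomized brand order per session
    """
    rng = random.Random(seed)

    # Assumed counterbalancing rule: alternate key labels by subject number.
    if subject_id % 2 == 0:
        key_map = {"left": "good", "right": "bad"}
    else:
        key_map = {"left": "bad", "right": "good"}

    sessions = []
    for _ in range(n_sessions):
        order = list(brands)
        rng.shuffle(order)
        # Guard against an identical order in consecutive sessions
        # (very unlikely with 56 brands, but possible with short lists).
        while sessions and len(set(order)) > 1 and order == sessions[-1]:
            rng.shuffle(order)
        sessions.append(order)
    return key_map, sessions
```

Calling build_schedule once per subject, with the full list of 56 brand names, yields three orders that contain the same brands but differ from one session to the next.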

Purchase Frequency Estimates. Following the response latency task, a paper-and-pencil questionnaire containing purchase frequency scales was administered. Subjects were asked to indicate how frequently or infrequently they purchased each brand during the past year. To ensure that these estimates would be comparable across product classes, they were asked to indicate how frequently they purchased each brand, when they purchased an item from a given product category. An eleven-point scale from "0" (never - 0%) to "10" (always - 100%) was provided for each brand. A key with a verbal label for each of the eleven scale points was provided (adapted from Juster 1966).

On the basis of these purchase frequency estimates, brands to which subjects were loyal versus nonloyal were identified. Specifically, for each subject and for each product class, the brand with the highest purchase frequency rating was assigned to the loyal category (provided that the rating was equal to or greater than the scale mid-point). Brands that were never purchased (i.e., brands that received a purchase frequency rating of 0) and brands that were rarely purchased (i.e., brands that received a purchase frequency rating of 1) were assigned to the nonloyal category. A one-way analysis of variance performed on these ratings indicated that purchase frequency ratings were greater in loyal than in nonloyal conditions (Ms=7.92 vs. 0.79), F(1, 60)=2194.06, p<.0001. Hence, the procedure used to assign brands to loyal versus nonloyal conditions was very effective.
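The assignment rule just described is simple enough to state as code. The following is a minimal sketch for a single subject, assuming the ratings are stored as a nested dictionary; the function name classify_brands and the data layout are hypothetical. The text does not say how ties for the highest rating within a product class were resolved, so the sketch simply takes the first such brand, which is an assumption.

```python
SCALE_MIDPOINT = 5   # midpoint of the 0-10 purchase frequency scale
NONLOYAL_MAX = 1     # ratings of 0 (never) or 1 (rarely) count as nonloyal

def classify_brands(ratings_by_class):
    """Split one subject's brands into loyal and nonloyal lists.

    ratings_by_class: {product_class: {brand: rating on the 0-10 scale}}
    Brands that meet neither criterion are left unassigned.
    """
    loyal, nonloyal = [], []
    for ratings in ratings_by_class.values():
        # Loyal: the top-rated brand in the class, if rated at or above the midpoint.
        # Ties go to the first brand listed (assumption; not stated in the text).
        top_brand = max(ratings, key=ratings.get)
        if ratings[top_brand] >= SCALE_MIDPOINT:
            loyal.append(top_brand)
        # Nonloyal: every brand in the class rated 0 or 1.
        nonloyal.extend(b for b, r in ratings.items() if r <= NONLOYAL_MAX)
    return loyal, nonloyal
```

Note that under this rule a product class contributes no loyal brand when its highest rating falls below the midpoint, and brands rated between 2 and the class maximum are excluded from both categories.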


Evaluation latencies (measured in centiseconds, reported in milliseconds) as a function of brand loyalty and repeated measurement are presented in Table 1. An analysis of variance performed on evaluation latencies yielded significant main effects for Loyalty, F(1, 30)=32.50, p< .001, and for Exposure level, F(2, 60)=16.72, p<.001. Loyal consumers possess more accessible attitudes than nonloyal consumers. Attitude accessibility also increases with repeated measurement. Most importantly, these effects were qualified by a significant Loyalty X Exposure level interaction, F(2, 60)=88.08, p<.001.

Follow-up tests of the Loyalty X Repeated measurement interaction revealed that the simple main effect for Repeated measurement was more pronounced in nonloyal, F(2, 60)=46.94, p<.001, than in loyal conditions, F(2, 60)=16.33, p<.001. In loyal conditions, evaluation latencies were faster after two (versus one) exposures to measurement, F(1, 60)=20.67, p<.001, but not after three (versus two) exposures (F< 1). In contrast, in nonloyal conditions, evaluation latencies decreased following two (versus one) exposures, F(1, 60)=53.52, p<.001, and following three (versus two) exposures, F(1, 60)=4.67, p<.05. Hence, evaluation latencies decreased with repeated measurement, and this effect was more pronounced for nonloyal consumers.
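To make the structure of the 2 (Loyalty) X 3 (Exposure level) analysis concrete, the sketch below shows one way the cell means for a single subject might be formed before an analysis of variance. It assumes that latencies are averaged over the brands in each loyalty category within each session, which the text does not state explicitly, and it converts centiseconds to milliseconds as in the reporting convention above. The function name cell_means and the data layout are hypothetical.

```python
from statistics import mean

CS_TO_MS = 10  # 1 centisecond = 10 milliseconds

def cell_means(latencies_cs, loyal, nonloyal):
    """Mean evaluation latency (ms) per Loyalty x Session cell for one subject.

    latencies_cs: {session_number: {brand: latency in centiseconds}}
    loyal, nonloyal: lists of brand names for this subject
    Returns {(loyalty_label, session_number): mean latency in ms or None}.
    """
    cells = {}
    for session, by_brand in latencies_cs.items():
        for label, brands in (("loyal", loyal), ("nonloyal", nonloyal)):
            values = [by_brand[b] * CS_TO_MS for b in brands if b in by_brand]
            # A cell is empty if the subject has no brands in that category.
            cells[(label, session)] = mean(values) if values else None
    return cells
```

Each subject then contributes six values, one per cell, and both factors are treated as within-subjects factors.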


When a consumer is exposed to a response scale, the relevant judgment is activated from memory. When multiple responses are called for, the judgment is activated repeatedly. Repeated activation increases the subsequent accessibility of the judgment from memory. However, this effect is more pronounced for nonloyal consumers because their initial judgments tend to be uncrystallized. Consequently, their initial judgments are particularly susceptible to repeated measurement effects. The judgments of loyal consumers, however, tend to be better-defined, more stable, and less malleable; consequently, their judgments are less susceptible to repeated measurement effects.

Prior research on measurement effects has focused on effects due to social desirability concerns, evaluation apprehension, sensitization to experimental manipulations (Cook and Campbell 1979), self-generated validity (Feldman and Lynch 1988), and question wording (Krosnick and Schuman 1988; Schuman and Presser 1981). Essentially, these studies centered on the effects of measurement on judgment valence. We focused on the effects of measurement on judgment strength or accessibility. Exposing subjects to multiple measures artificially enhances the readiness with which the judgment can subsequently be accessed from memory. This enhancement is artificial because it would not have occurred in the absence of repeated measurement.

The present set of results is especially impressive given that a "minimal" manipulation (Prentice and Miller 1992) of repeated measurement was employed. Subjects were merely asked to categorize each brand as "good" or "bad." Relatively little cognitive effort, on the part of the respondent, is required to perform this simple cognitive task. Nevertheless, strong effects of repeated measurement were observed. It could be argued that even stronger effects may be observed if respondents are asked to perform a more complex cognitive task (Prentice and Miller 1992). For example, completing the same seven-point scales repeatedly should require greater cognitive effort than completing the same dichotomous scales repeatedly. Moreover, completing different but converging seven-point scales repeatedly should require greater cognitive effort than completing the same seven-point scales repeatedly. Because accessibility increases with cognitive effort (Tyler, Hertel, McCallum, and Ellis 1979), the effects of repeated measurement on accessibility should increase with the complexity of the measurement instrument.

One implication of this research is that if multi-item scales are required to operationalize attitudes adequately (see, e.g., Bagozzi, Tybout, Craig, and Sternthal 1979), and if exposure to multi-item scales enhances the strength of attitudes, then it is difficult to separate the effects of improved measurement versus inflated strength on attitude-behavior correspondence. We suggest that both factors are important. That is, the ability to predict overt behavior from attitude increases when multi-item (versus single-item) scales are employed because (a) the attitude is better conceptualized and operationalized, and (b) attitude accessibility is enhanced artificially due to the effects of repeated measurement.

We suggest that marketing researchers should use enough scale items to operationalize a construct adequately, but not so many scales that the construct undergoing assessment is likely to be altered dramatically. The optimal number of scales is likely to vary from construct to construct (depending on the complexity of the construct) and from sample to sample (depending on the prior knowledge and experience of the respondent). Complex multidimensional constructs require multidimensional measures. However, exposure to multidimensional measures may induce respondents to form judgments that would not have been formed otherwise or may lead to the strengthening of pre-existing judgments. Respondents with relatively uncrystallized judgments are particularly susceptible to construction and repeated measurement effects. Finally, we suggest that a "more is better" philosophy of measurement may seriously compromise the reliability and validity of a measurement instrument by inducing respondents to construct new judgments or by increasing the strength of pre-existing judgments.


Abelson, Robert P. (1988), "Conviction," American Psychologist, 43 (April), 267-275.

Bagozzi, Richard P., Alice M. Tybout, C. Samuel Craig, and Brian Sternthal (1979), "The Construct Validity of the Tripartite Classification of Attitudes," Journal of Marketing Research, 16 (February), 88-95.

Berger, Ida E. and Andrew A. Mitchell (1989), "The Effect of Advertising on Attitude Accessibility, Attitude Confidence, and the Attitude-Behavior Relationship," Journal of Consumer Research, 16 (December), 269-279.

Bishop, George F., Alfred J. Tuchfarber, and Robert W. Oldendick (1986), "Opinions on Fictitious Issues: The Pressure to Answer Survey Questions," Public Opinion Quarterly, 50, 240-250.

Chaiken, Shelly and Mark W. Baldwin (1981), "Affective-Cognitive Consistency and the Effect of Salient Behavioral Information on the Self-Perception of Attitudes," Journal of Personality and Social Psychology, 41 (July), 1-12.

Churchill, Gilbert A. (1979), "A Paradigm for Developing Better Measures of Marketing Constructs," Journal of Marketing Research, 16 (February), 64-73.

Cohen, Joel B. and Dipankar Chakravarti (1990), "Consumer Psychology," Annual Review of Psychology, 41, 243-288.

Converse, Philip E. (1970), "Attitudes and Non-Attitudes: Continuation of a Dialogue," in The Quantitative Analysis of Social Problems, ed. Edward R. Tufte, Reading, MA: Addison-Wesley, 168-189.

Cook, Thomas D. and Donald T. Campbell (1979), Quasi-Experimentation: Design and Analysis Issues for Field Settings, Boston: Houghton Mifflin.

Davidson, Andrew R. and James J. Jaccard (1979), "Variables that Moderate the Attitude-Behavior Relation: Results of a Longitudinal Survey," Journal of Personality and Social Psychology, 37, 1364-1376.

Fazio, Russell H. (1989), "On the Power and Functionality of Attitudes: The Role of Attitude Accessibility," in Attitude Structure and Function, eds. Anthony R. Pratkanis, Steven J. Breckler, and Anthony G. Greenwald, Hillsdale, NJ: Lawrence Erlbaum Associates, 153-179.

Fazio, Russell H., Martha C. Powell, and Carol J. Williams (1989), "The Role of Attitude Accessibility in the Attitude-to-Behavior Process," Journal of Consumer Research, 16 (December), 280-288.

Fazio, Russell H., David M. Sanbonmatsu, Martha C. Powell, and Frank R. Kardes (1986), "On the Automatic Activation of Attitudes," Journal of Personality and Social Psychology, 50 (February), 229-238.

Fazio, Russell H. and Mark P. Zanna (1981), "Direct Experience and Attitude-Behavior Consistency," in Advances in Experimental Social Psychology, Vol. 14, ed. Leonard Berkowitz, New York: Academic Press, 161-202.

Feldman, Jack M. and John G. Lynch (1988), "Self-Generated Validity and Other Effects of Measurement on Belief, Attitude, Intention, and Behavior," Journal of Applied Psychology, 73 (August), 421-435.

Fischhoff, Baruch (1991), "Value Elicitation: Is There Anything in There?" American Psychologist, 46 (August), 835-847.

Jacoby, Jacob and David B. Kyner (1973), "Brand Loyalty Vs. Repeat Purchasing Behavior," Journal of Marketing Research, 10 (February), 1-9.

Juster, F. T. (1966), "Consumer Buying Intentions and Purchase Probability: An Experiment in Survey Design," Journal of the American Statistical Association, 61, 658-696.

Krosnick, Jon A. (1988), "Attitude Importance and Attitude Change," Journal of Experimental Social Psychology, 24 (May), 240-255.

Krosnick, Jon A. and Howard Schuman (1988), "Attitude Intensity, Importance, and Certainty and Susceptibility to Response Effects," Journal of Personality and Social Psychology, 54 (June), 940-952.

Lynch, John G., Howard Marmorstein, and Michael F. Weigold (1988), "Choices from Sets Including Remembered Brands: Use of Recalled Attributes and Prior Overall Evaluations," Journal of Consumer Research, 15 (September), 169-184.

Peter, J. Paul (1979), "Reliability: A Review of Psychometric Basics and Recent Marketing Practices," Journal of Marketing Research, 16 (February), 6-17.

Powell, Martha C. and Russell H. Fazio (1984), "Attitude Accessibility as a Function of Repeated Attitudinal Expression," Personality and Social Psychology Bulletin, 10 (March), 139-148.

Prentice, Deborah A. and Dale T. Miller (1992), "When Small Effects Are Impressive," Psychological Bulletin, 112 (July), 160-164.

Raden, David (1985), "Strength-Related Attitude Dimensions," Social Psychology Quarterly, 48 (December), 312-330.

Schuman, Howard and Stanley Presser (1981), Questions and Answers in Attitude Surveys: Experiments on Question Form, Wording, and Context, New York: Academic Press.

Schwartz, Shalom H. (1978), "Temporal Instability as a Moderator of the Attitude-Behavior Relationship," Journal of Personality and Social Psychology, 36 (July), 715-724.

Smith, Robert E. and William R. Swinyard (1983), "Attitude-Behavior Consistency: The Impact of Product Trial Versus Advertising," Journal of Marketing Research, 20 (August), 257-267.

Sudman, Seymour and Norman M. Bradburn (1974), Response Effects in Surveys: A Review and Synthesis, Chicago: Aldine.

Tesser, Abraham and David R. Shaffer (1990), "Attitudes and Attitude Change," Annual Review of Psychology, 41, 479-523.

Tyler, Sherman W., Paula T. Hertel, Marvin C. McCallum, and Henry C. Ellis (1979), "Cognitive Effort and Memory," Journal of Experimental Psychology: Human Learning and Memory, 5 (November), 607-617.

Wu, Chenghuan and David R. Shaffer (1987), "Susceptibility to Persuasive Appeals as a Function of Source Credibility and Prior Experience With the Attitude Object," Journal of Personality and Social Psychology, 52 (April), 677-688.

Wyer, Robert S. and Thomas K. Srull (1989), Memory and Cognition in its Social Context, Hillsdale, NJ: Lawrence Erlbaum Associates.