Retrospective Reports on Consumer Decision Processes: "I Can Remember If I Want To, But Why Should I Bother Trying?"

Pete Wright, Stanford
Peter Rip, University of Chicago
[ to cite ]:
Pete Wright and Peter Rip (1980), "Retrospective Reports on Consumer Decision Processes: 'I Can Remember If I Want To, But Why Should I Bother Trying?'", in NA - Advances in Consumer Research Volume 07, ed. Jerry C. Olson, Ann Arbor, MI: Association for Consumer Research, Pages: 146-147.

Advances in Consumer Research Volume 7, 1980     Pages 146-147



In 1977, Nisbett and Wilson (N & W) published a provocative paper in which, after reviewing many studies where subjects gave retrospective reports on what stimuli had caused their decision or judgment, they drew the following conclusion: "The evidence reviewed is then consistent with the most pessimistic view concerning people's ability to report accurately about their cognitive processes" (p. 247). N & W's premise is that people cannot access valid memories about the strength and/or direction of their affective reactions to different aspects of complex options when judging preferences. Perhaps because N & W's paper is well-written and their review seemingly far-reaching, and perhaps because discrediting retrospective verbal reports suits the purposes of researchers wedded to a methodology that excludes such data, N & W's paper threatens to become the standard "knee-jerk" reference for disparaging such reports. We see such offhanded references increasingly in the consumer behavior literature as well as the basic psychology literature. Our intent in this paper is to counter that trend.

The view expressed here is that N & W's paper is a very good and a very bad paper. It is very good because it has spawned some much needed thinking about how retrospective reports are produced and about how to test the accuracy of such reports. It is a very bad paper because the evidence N & W reviewed is almost entirely irrelevant to the question of people's ability to accurately retrieve and report memories for causal stimuli, and hence could not logically lead to any conclusion, pessimistic or optimistic, on that question. We will discuss why this is the case. Our concern here is primarily with elaborating the conditions under which an informative test of someone's accuracy in recalling past events is possible. We will also summarize the results from two studies which seem more pertinent to the self-insight issue than most of those N & W reviewed.

First, we note two logical arguments that undermine the relevance of the studies N & W reviewed. Others have noted some additional reasons for questioning N & W's arguments (Smith and Miller, 1978; Ericsson and Simon, 1978; Weitz and Wright, 1979), but we focus here on the two most damaging ones.

1. Almost none of the studies which N & W relied on undertook subject-specific tests of the accuracy of the self-reports. The tests relied on comparisons, informally or directly, of whether the statistical significance tests obtained in between-groups contrasts of subjects' actual judgments were replicated in similar contrasts of their post-hoc reports on what caused their judgments. In such tests, each subject responds to only one unvaried stimulus situation. But in order to test an individual's memory about his or her own reactions to the aspects of different situations, that person's reports must be compared to some other "objective" measure of that same person's reactions. This implies that the subject must have provided multiple judgments in response to varied situations, i.e., a within-subject design. An individual can have memories about differences in his or her own reactions to different stimuli, but cannot have memories about how those compared to the reactions of other people. Between-group contrasts or cross-sectional correlations (e.g., Nisbett and Bellows, 1977; Smith and Wilson, 1978) are therefore inappropriate tests of self-insight. Interestingly, N & W did note that the one group of studies giving some evidence of accurate recall was that in which judges reported cue weights following participation in a within-subject judgment experiment. However, N & W failed to note that the distinguishing feature of these studies was the within-subject design and subject-specific testing; instead, they highlighted the nature of the judges (professionally trained).

2. Poor accuracy in a post-hoc report would reflect inability to retrieve and/or report valid memories only if the subject accepts accuracy as the goal in issuing the report. In the studies N & W reviewed, there was no attempt to make accuracy the paramount goal. It is quite likely that subjects usually had some other goal, like impression-management, self-esteem maintenance, or "use the simplest method available to produce a report". But our ignorance of their intent alone would make it impossible to interpret the observed accuracy as evidence of retrieval ability or the state of their memories, even if the accuracy tests had been subject-specific.

So, meaningful data on the accuracy of self-reports demands a subject-specific test, and interpreting that data as evidence on retrieval ability demands a subject motivated to try producing accurate reports. There is a dearth of such studies, so the basic question N & W raise is in no way answerable right now, and it is certainly not a closed issue, as N & W imply.

When is someone maximally motivated to try carefully retrieving what exists in memory and reporting it publicly? We propose that four conditions must hold.

1. Subjects do not believe an accurate report will cause embarrassment or loss of face in the immediate situation.

2. Subjects do not believe that anyone will use the report in ways detrimental to the subjects or their friends, e.g., by evaluating the wisdom of the reported judgment policy, by copying it, or by using it as a basis for "gaming" them.

3. Subjects do believe that the accuracy of the report will be tested, and that greater accuracy will bring reward, e.g., by social approval or enhanced self-esteem.

4. Subjects do believe that their true mean mental reactions had been measured somehow, so that a self-insight test would be meaningful. In particular, they know they provided multiple observations for the researcher.

Conditions 1 & 2 imply that the subject has no clear goal for the final report other than self-revelation. Beyond this, Conditions 3 and 4 imply that the subject is willing to try memory retrieval as the process for producing the report, rather than just making up a semi-plausible report in the easiest way possible. It is doubtful if these conditions held in any study N & W reviewed.

N & W also proposed that one easy way for subjects to create a report, without accessing memories for the events, is to rely on a "theory" about "what causes people to make decisions as I did?", or about the way people in general or of a certain type react in such situations. They argue that, since observers might have similar theories to draw on, we should not infer that a subject retrieved memories of the events unless the subject's accuracy exceeds that of an observer's guesses about the subject's reactions. To test this, Nisbett and colleagues (Nisbett and Bellows, 1977; Wilson and Nisbett, 1978) ran several actor-vs.-observer comparisons. These suffered from the problems cited earlier, and are not interpretable on that score. These were also "weak" actor-vs.-observer comparisons, because the actors were strangers to the observers and the observers knew little about the stimulus situations the actors had faced. This was intentional, because N & W sought to show that even such ill-informed observers would match the accuracy of the actors' self-reports. But had the actors won, that would have told nothing about whether they might have been drawing on private memories, since they could have had better theories than the observers--especially, a "theory of myself, based only on observations of how I behave"--or better data on situational parameters, since they had experienced the stimulus situation firsthand.

To stage a strong actor-vs.-observer comparison, the observers should have had a chance to observe the actor over time, in order to build a theory about him or her based on observable behaviors, and should have equally detailed information about the task environment. For example, a strong comparison would involve the guesses made by intimates of the actor, like family members or close friends, who had themselves made the same judgments in the same task environment, and knew this.

In two recent studies, we sought to begin building a meaningful data base concerning the validity of retrospective reports on consumer decision processes. In both, subjects reported affective reactions to the attributes of profiled colleges after a personal preference judgment task in which colleges were varied in a within-subject design. These reports were compared against statistical estimates of the subject's actual reactions, obtained from a conjoint analysis. For each subject the correlation between the reported reactions and the estimated ones was computed.
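The subject-specific accuracy test described above can be illustrated with a minimal sketch. It is not the analysis actually used in the studies: it assumes two-level attributes in a full-factorial design, estimates each attribute's effect as a simple difference in mean ratings (a stand-in for the conjoint estimates), and uses a Pearson correlation as the accuracy index. The number of attributes, the weights, the ratings, and the reported values are all invented for illustration.

```python
from itertools import product
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def estimated_effects(profiles, ratings):
    """Main effect of each two-level attribute for ONE subject.

    Effect = mean rating at level 1 minus mean rating at level 0.
    This is a fair estimate only when the design is balanced
    (e.g. a full factorial), which is assumed here.
    """
    n_attrs = len(profiles[0])
    effects = []
    for j in range(n_attrs):
        high = [r for p, r in zip(profiles, ratings) if p[j] == 1]
        low = [r for p, r in zip(profiles, ratings) if p[j] == 0]
        effects.append(sum(high) / len(high) - sum(low) / len(low))
    return effects


# Hypothetical subject: four two-level college attributes, 16 profiles.
profiles = list(product([0, 1], repeat=4))
true_weights = [3.0, 1.0, 0.5, 2.0]  # invented, not data from the studies
ratings = [sum(w * x for w, x in zip(true_weights, p)) for p in profiles]

# Step 1: estimate this subject's reactions from his or her own judgments.
effects = estimated_effects(profiles, ratings)

# Step 2: compare with the same subject's retrospective report.
reported = [5, 2, 1, 4]  # hypothetical self-reported reactions
accuracy = pearson(reported, effects)
print(round(accuracy, 3))
```

The point of the sketch is that both the estimated and the reported reactions come from the same person, so the resulting correlation is a within-subject index of accuracy rather than a between-groups contrast.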

In the first study, 140 high school juniors and both parents each judged personal preferences between colleges, then reported their own reactions, then guessed about the reactions of the other family members. If the actors' self-reports beat the guesses of these observers, who knew the actors and the task intimately, that could imply the actors were drawing on some private information not available to the observers...perhaps valid memories.

To summarize the results, the proportion of actors who achieved statistically significant reported-to-actual correlations was significantly higher than the proportion of observers whose guesses correlated significantly with the target actor's estimated reactions. And the average actor-to-actual correlation was significantly higher than the average observer-to-actual correlation. This favors the hypothesis that some of the subjects did draw on accurate private memories of their reactions. Since only half the actors achieved significant reported-to-actual correlations, and the mean was just .38, the data also suggest that a number of subjects did not have valid memories to draw on or did not bother to carefully retrieve these.
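As a rough illustration of the two summary measures just described, the sketch below tabulates mean correlations and the proportion reaching a criterion for actors and for observers. The correlation values and the criterion `CRITICAL_R` are hypothetical placeholders, not figures from the study, and the sketch only computes the descriptive quantities; it does not carry out the significance tests on the difference between groups.

```python
def summarize(correlations, critical_r):
    """Return (mean correlation, proportion at or above the criterion)."""
    n = len(correlations)
    mean_r = sum(correlations) / n
    prop_sig = sum(1 for r in correlations if r >= critical_r) / n
    return mean_r, prop_sig


# Hypothetical data: each pair is (actor's reported-to-actual r,
# observer's guessed-to-actual r) for the same target actor.
pairs = [(0.70, 0.30), (0.20, 0.25), (0.65, 0.10), (0.40, 0.45)]

# Placeholder criterion; in practice it depends on the number of
# attributes entering each correlation and the chosen alpha level.
CRITICAL_R = 0.60

actor_mean, actor_prop = summarize([a for a, _ in pairs], CRITICAL_R)
observer_mean, observer_prop = summarize([o for _, o in pairs], CRITICAL_R)

# Share of targets whose own report was more accurate than the guess.
actor_wins = sum(1 for a, o in pairs if a > o) / len(pairs)

print(actor_mean, actor_prop, observer_mean, observer_prop, actor_wins)
```

Keeping each actor paired with the observer who guessed about that actor preserves the strong-comparison logic: the observer's guess is scored against the same estimated reactions as the actor's own report.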

In the second study, we tested a motivational explanation for this. We picked two factors which students in the first study de-emphasized, A & B, and designated those as target factors. One group of students judged profiled colleges, without exposure to any information that might induce them to give more weight to A & B. Then half were given instructions before the retrospective reporting that led them to think their self-reports gave the only evidence on their true reactions, and that authorities who would learn of their reported reactions felt it was unwise to emphasize A & B. The others were given instructions to maximally motivate candid reporting. They were told their accuracy would be tested, that it could be tested meaningfully, and that authorities who valued good self-insight highly would learn of their accuracy. It was expected that both groups would achieve high accuracy, one by coincidence in trying to manage others' impressions of them, and the second by careful retrieval. Other subjects first read information intended to induce them to adopt a different policy, by giving somewhat more emphasis to A & B. Then, half received the "Impression Management" instruction and half the "Show Insight" instruction. Assuming the prior exposure did induce more emphasis on A & B, it was expected that those trying for accurate retrieval would still give accurate reports, but those motivated to report de-emphasizing A & B to impress others would not. The data conformed to this expectation, and the mean reported-to-actual correlation by those aiming for candid self-disclosure was .63. These results suggest (i) that under some conditions, subjects can give accurate reports about such aspects of recent decision processes if motivated to bother trying careful retrieval, and (ii) that when subjects have impression-management goals in mind for their reports, accuracy is achieved only by coincidence.

In summary, these data at least should cause researchers on consumer decision processes to resist any tendency to reject post-hoc verbal reports out-of-hand as data, as suggested by N & W. The careful analysis of how people produce such reports is healthy and important, and it is likely that valid reports are obtained only under limited conditions. It is important to define those conditions, since valid retrospective reports can provide useful evidence in situations where other measures are impractical, highly obtrusive, or reactive, and can serve as useful complementary evidence when other measures are used. Consumer researchers should not treat Nisbett and Wilson's thesis as anything more than a provocative hypothesis at this stage, and should begin empirically investigating how and when verbal report data of different types provides meaningful evidence on consumer mental activities.


Ericsson, K. Anders and Simon, Herbert (1978), "Retrospective reports as data". Unpublished working paper. Carnegie-Mellon University.

Nisbett, Richard and Bellows, Nancy (1977), "Verbal reports about causal influences on social judgments: Private access versus public theories", Journal of Personality and Social Psychology, 35, 613-624.

Nisbett, Richard and Wilson, Timothy D. (1977), "Telling more than we know: Verbal reports on mental processes", Psychological Review, 84, 231-259.

Smith, E. R. and Miller, F. D. (1978), "Limits on the perception of cognitive processes: Reply to Nisbett and Wilson", Psychological Review, 85, 355-362.

Weitz, Barton and Wright, Peter (1979), "Retrospective self-insight on factors considered in product evaluation", Journal of Consumer Research (in press).

Wilson, Timothy and Nisbett, Richard (1978), "The accuracy of verbal reports about the effects of stimuli on evaluations and behavior", Social Psychology, 41, 118-131.