What You See Is Not Necessarily What You Get: the Effects of Pictures and Words on Consumers' Inferences

Ruth Ann Smith, Virginia Polytechnic Institute and State University
For some time, consumer researchers have demonstrated a strong interest in the inferences people form when processing advertisements. Inferences are defined here as conclusions derived from a set of premises using a rule relating the premises to a conclusion in a subjectively logical fashion (Hastie 1983). Consumers' inferences tend to follow everyday, rather than formal, rules of logic, and they therefore have the potential to result in incorrect beliefs about product attributes (Harris 1977; Harris et al. 1980).

Moreover, inferences may occur automatically when an advertisement is processed and they are integrated with explicit claims in a single representation (Bransford and Franks 1971; Hayes-Roth 1977). Thus, consumers are unlikely to be able to discriminate inferences from explicit product claims with the result that the (possibly) misleading inferences will exert a material effect on purchase decisions. Concern about this problem is exacerbated by advertisers' growing reliance on abstract product claims that do not require substantiation, but which may be subject to several interpretations (Shimp 1979).


As I have noted elsewhere (Smith 1991), the large majority of research on consumers' inferences focuses on the influence of the verbal claims in advertisements (cf. Ford and Smith 1987; Huber and McCann 1982; Johnson and Levin 1985; Meyer 1981). Minimal effort has been directed toward examining the effects of nonverbal ad components, or the combined effects of verbal and nonverbal elements, despite the fact that advertising relies heavily on pictures, music, nonverbal sounds, color, and even odors (scratch-and-sniff magazine inserts, for example) in addition to copy.

In a study designed to begin to remedy this gap in the literature, I examined the effects of both pictures and words on consumers' inferences. The results indicated that consumers do form inferences based on pictorial product claims, and that these visually-based inferences tend to follow rules of covariation in a manner similar to verbally-based inferences. Moreover, I found that when pictures and words in an ad convey information about different product attributes, the picture claim tends to dominate inference formation (Smith 1991).

There are at least three important reasons to further investigate picture-based inferences. The first of these reflects a public policy concern. That is, even if the verbal claims in an advertisement are absolutely correct, the pictures may still be misleading. This possibility became very real to me several years ago when I saw a television commercial for a sweepstakes associated with some product (which I can't recall). The grand prize was $1,000 per month for life and the voiceover was completely unambiguous in making this claim. The visuals, however, depicted a fabulously wealthy couple living it up on the Riviera. I found myself fantasizing about the great life I would have if I were the winner, until I suddenly realized that $1,000 per month is actually a very modest amount.

Second, when pictures and words are combined in a single stimulus, the pictures tend to be relatively more salient, and therefore attract a disproportionate amount of processing capacity (Taylor and Thompson 1982). This would imply that pictures would also be disproportionately influential on the meaning (including inferences) extracted from an ad. My finding that pictures drive inferences when they convey an attribute different from the copy is consistent with this proposition (Smith 1991).

Third, inasmuch as all media except radio have the potential to combine pictures and words in a single stimulus, it is artificial to restrict attention to the effect on inferences of verbal claims alone. (And even radio can incorporate some nonverbal elements such as music and nonverbal sounds.) It seems intuitively plausible that consumers' inferences are influenced by the gestalt of all the ad components, not just the copy. For all these reasons, ignoring pictures when studying consumers' inferential processes will result in an incomplete understanding of how consumers derive meaning from advertisements.

Research investigating the impact of pictures alone, however, is likely to suffer from the same limitations as that considering only copy-based inferences. As noted above, it seems plausible to suppose that an advertisement is processed as a unit composed of a variety of verbal and nonverbal components. Thus, the appropriate direction for future research on consumers' inferences would be to examine the total effect of pictures and words on the meaning consumers extract from advertisements. Some thorny measurement issues must be resolved, however, in order to pursue investigations of this type.


Much past research on inferences, both visually- and verbally-based, has employed dependent measures that are potentially reactive. That is, consumers' inferences about product characteristics may be formed spontaneously when an ad is initially processed or they may be prompted later by the dependent measures themselves. Inferences of the latter type are derived deductively from knowledge of the product category and do not necessarily indicate the influence of a particular advertisement on brand beliefs. It is clear that consumers draw this type of inference in natural settings, and therefore research on the processes leading to their formation is valuable. Because product category knowledge includes information from various sources, however, such research would not permit one to evaluate the specific brand inferences a consumer forms on the basis of a particular advertisement. If this is the researcher's interest, it is critical to use dependent measures that do not prompt inferences after exposure to the ad.

Kardes (1988) has developed such a procedure for evaluating verbally-based inferences. His approach involves presenting subjects with pairs of product claims of the form "A implies B," and "B implies C." A conclusion of the form "A implies C" is also included with some of the claim pairs, but excluded for others. A recognition task is used to determine whether or not the conclusion is inferred when not explicitly stated in the stimulus.

That is, a "hit," or a correct recognition of a previously presented conclusion can be attributed to a simple retrieval process. "False alarms," however, or incorrect recognitions of conclusions that were not presented, would be indicative of inferences derived from the explicit claims. Kardes (1988) reported that inferences about missing conclusions (measured as subjects' recognition confidence for false alarms) were strongest when consumers were knowledgeable about the advertised product, and when the conclusion was logically related to the explicit claims.
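The scoring logic behind this recognition task can be sketched in a few lines of Python. This is only an illustration of the hit/false-alarm distinction described above, not Kardes' actual scoring procedure: the `Response` record, its field names, and the 1-7 confidence scale are assumptions introduced here for the example.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Response:
    """One recognition judgment about a conclusion of the form "A implies C"."""
    conclusion_presented: bool  # was the conclusion explicitly stated in the stimulus?
    judged_old: bool            # did the subject report having seen it?
    confidence: int             # assumed 1-7 recognition confidence rating


def classify(response):
    """Label a response using the standard recognition-test categories."""
    if response.judged_old:
        # Recognizing a stated conclusion reflects retrieval (a hit);
        # "recognizing" an unstated one suggests it was inferred (a false alarm).
        return "hit" if response.conclusion_presented else "false_alarm"
    return "miss" if response.conclusion_presented else "correct_rejection"


def mean_false_alarm_confidence(responses):
    """Average confidence over false alarms, or None if there are none."""
    ratings = [r.confidence for r in responses if classify(r) == "false_alarm"]
    return mean(ratings) if ratings else None
```

Under these assumptions, a higher mean confidence for false alarms would be read as a stronger inference about the omitted conclusion, while confidence for hits says nothing about inference because the conclusion was actually presented.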

Kardes' (1988) approach employs verbal dependent measures to assess the inferences that consumers derive from verbal claims. Using this procedure to evaluate inferences derived from pictures, or from a combination of pictures and words, raises questions of both feasibility and appropriateness. With respect to feasibility, the first problem is that developing pictures that unambiguously convey claims of the type "A implies B" may be quite difficult. Even very simple pictures (black and white line drawings, for example) are subject to multiple interpretations. Any researcher who has suffered through the pretesting of visual stimuli designed to communicate a discrete concept (e.g., fast pizza delivery) can attest to the varied meanings that viewers extract from pictures. My own attempts to develop pictures that would clearly communicate the concepts in Kardes' belief statement sets without benefit of verbal labels have been noteworthy only for their lack of success.

Second, some concepts are simply not amenable to pictorial representation. For example, the statement, "Intestinal disorders often cause a loss of appetite," which was included in one of Kardes' belief statement sets, seems to defy pictorial representation. While I can easily imagine my negative reaction to food when my stomach is upset, I cannot imagine a picture that would convey that feeling.

Third, people have a remarkable ability to recognize pictures. Shepard's (1967) findings clearly demonstrate that accurate discriminations between new and old pictures can be made after lengthy intervals of time and for very large stimulus sets. Because their surface structure is so enduring in memory, the incidence of false alarms for pictorial claims would probably be very small compared to that for verbal claims. It would be impossible to determine, however, if this were due to the absence of inferences or superior memory for the specific pictures that constituted the claim set.

Assuming these difficulties could be overcome, the issue of appropriateness must be addressed. Specifically, the question arises as to whether or not verbal recognition items should be used to assess picture-based inferences. As noted above, the persistence of the surface structure of pictures in memory would seem to argue against using visual recognition test items to assess visually-based inferences. In addition, there is no compelling reason to expect that either the form of processing (imagery or discursive) or the nature of the mental representation of information (verbal or visual) is dependent upon or dictated by the form of the original stimulus. MacInnis and Price (1987) argue that "The well-substantiated ability to move from words to pictures and pictures to words suggests that there is a representation in memory that encompasses both" (p. 474).

This perspective seems consistent with the earlier suggestion that the components of advertisements are processed as a unit, rather than in isolation from one another, and that the overall meaning is stored in a unitized representation. If so, these meanings should be equally accessible regardless of the form of the recognition items. Moreover, given that the proper focus of inference research is the total impact of all ad components on the meanings consumers derive, the particular form of the dependent measure (visual or verbal) should be of little concern as long as the measures tap the stored meaning of the ad. In view of the problems posed by the persistence in memory of the surface structure of pictures, however, verbal measures would seem to be the preferable alternative.


I have argued that future research on consumers' inferences should properly focus on the combined impact of visual and verbal ad claims, rather than assessing the effect of either in isolation. Moreover, I have advocated adapting nonreactive measures, modeled after that developed by Kardes (1988), in such investigations. How specifically might these prescriptions be put into practice?

First, suppose that subjects were presented with the following verbal propositions (Kardes 1988):

(1) Stresstabs contain B vitamins. (A implies B)

(2) B vitamins give energy. (B implies C)

The conclusion, which could either be explicitly stated or omitted from this stimulus, is that:

(3) Stresstabs give you energy. (A implies C)

Given only this verbal information, subjects are likely to incorrectly recognize (3) when it is not explicitly stated. The recognition confidence for the conclusion when only verbal information is presented could serve as a benchmark for evaluating the effect of combining these verbal claims with pictures.

For example, now suppose that claims (1) and (2) were presented together with a background photograph of people engaged in an active sport, such as a volleyball game on a beach. By itself, such a picture would be likely to elicit a variety of interpretations about the product's characteristics, one of which might be that it gives you energy. When combined with the verbal claims, however, it seems plausible to hypothesize that the recognition confidence of the omitted conclusion, (3), would exceed that of consumers exposed to the verbal claims without the picture inasmuch as the picture reinforces the conclusion. That is, the combination of the pictures and the copy is likely to result in stronger brand beliefs than either by itself. This hypothesis could easily be investigated using Kardes' (1988) measurement procedure with the recognition test items presented in verbal form.

Even more interesting would be to examine the effect of verbal claims and unrelated pictures on inferences. For example, assume that claims (1) and (2) were presented along with a picture of a muscular man and a slender, fit woman. The verbal claims logically imply the conclusion (3), but the picture suggests a different product benefit--physical fitness--that is not logically implied by the verbal claims, and in fact may be untrue. Given the relatively greater salience of pictures compared to words when the two are combined in a single stimulus, it would be hypothesized that recognition confidence for conclusion (3) would be weaker than for the incorrect conclusion and also weaker compared to the other two conditions described. Once again, verbal recognition test items would seem to be appropriate to evaluate this prediction.
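The predictions for the three conditions (verbal claims alone, verbal claims with a reinforcing picture, and verbal claims with an unrelated picture) can be summarized as a simple descriptive check on condition means. The sketch below is illustrative only: the condition labels, the function name, and the use of raw mean comparisons are assumptions made for this example, and an actual study would of course rely on an appropriate inferential test rather than an ordering of means.

```python
from statistics import mean


def predicted_pattern_holds(conf_conclusion, conf_incorrect_unrelated):
    """Check whether mean recognition confidence follows the hypothesized pattern.

    conf_conclusion: dict mapping each (hypothetical) condition label to a list
        of confidence ratings for the omitted conclusion (3).
    conf_incorrect_unrelated: confidence ratings for the incorrect,
        picture-suggested conclusion in the unrelated-picture condition.
    """
    means = {cond: mean(ratings) for cond, ratings in conf_conclusion.items()}

    # Hypothesis 1: a reinforcing picture strengthens conclusion (3) relative to
    # copy alone, and an unrelated picture weakens it relative to both.
    ordering = (
        means["reinforcing_picture"]
        > means["verbal_only"]
        > means["unrelated_picture"]
    )

    # Hypothesis 2: with an unrelated picture, the picture-suggested (incorrect)
    # conclusion is recognized more confidently than conclusion (3).
    picture_dominates = mean(conf_incorrect_unrelated) > means["unrelated_picture"]

    return ordering and picture_dominates
```

The function returns True only when both hypothesized relationships hold in the supplied ratings, which makes explicit that the unrelated-picture prediction involves two comparisons: one across conditions and one between conclusions within a condition.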


Despite some formidable measurement problems, research on the effects of verbal and visual advertising claims on consumers' inferences has progressed. To date, however, this research has been limited by its focus on evaluating the isolated effects of either copy or pictures, while failing to consider the overall impact of the two on consumers' inferential processes. It is argued that this shortcoming can be remedied by adapting Kardes' (1988) nonreactive measure of verbally-based inferences to assess the inferences derived from the combination of pictures and words.


