Visual Imagery: Applications to Advertising

John R. Rossiter, Columbia University
ABSTRACT - This paper presents 13 broad applications of visual imagery theory to advertising. It covers guidelines for the effective use of visual content in: general advertising, print advertising, and TV advertising. The applications are well supported by psychological experiments and offer challenging extensions to advertising practice.
Advertisers have long realized the importance of advertising that creates mental images in the buyer's mind. In psychology, on the other hand, mental imagery (notably, but not only, visual imagery) has had an uneven history. Interest in mental imagery waned during the behavioral learning theory stronghold of the 40's, 50's and 60's when, coincidentally, Madison Avenue was reaching its peak (Mayer, 1958). Mental imagery is now back at the forefront of psychology. As Bugelski (1979) observes: the "image builders" of Madison Avenue probably were right--imagery may prove to be the primary principle for the psychology of learning.

Advertisers' edicts about how to create effective advertising make many implicit references to mental imagery, especially visual imagery. Thus Claude Hopkins in the pre-TV era (1923) referred to "the power of pictures;" Leo Burnett (1947) advised copywriters to use "picture words;" and David Ogilvy (1963) became famous for the "brand image" school of advertising. However, specific recommendations by advertisers regarding the advertising stimuli that will presumably create imagery are surprisingly limited.

The purpose of this paper is to approach the topic of effective advertising through imagery from an academic perspective. Instead of turning to advertisers' experience, academic research in the area of visual imagery is used to adduce practical guidelines for advertising. The research findings hold provocative implications for advertising practice.

The scope of this paper covers visual imagery created by visual input (i.e., pictures and video). In the two following papers, Larry Percy discusses the visual imagery properties of words and copy; and Morris Holbrook emphasizes the importance of mental imagery in other modalities. To the possible disappointment of some, this paper does not discuss "right brain-left brain" theory. Hemisphere phenomena are a physiological issue, not a psychological issue, and a contentious one at that (Rossiter, 1980; Rossiter and Percy, in press; Sergent and Bindra, 1981). Hemisphere phenomena are not necessary to visual imagery theory nor to applications of visual imagery in advertising.

The applications in this paper should be regarded as hypotheses. While they appear sound from a psychological standpoint, most have not yet been tested in an advertising context. The applications are presented under three headings: general (applicable to all types of) advertising; print advertising; and TV advertising.


G-1 Visual content warrants relatively more advertiser attention than verbal content.

The superiority of visual content does not mean that verbal content is unimportant--far from it, as the next paper demonstrates. The advertiser has to pay attention to the planning and execution of both types of content except, obviously, in radio advertising. However, the evidence justifies relatively more emphasis on visual content than on verbal content.

Pictures have a well-known superiority over words when it comes to learning (important for brand awareness and brand beliefs). The evidence is reviewed by Eysenck (1977) and the superiority holds regardless of whether the learning task pertains to short-term or long-term memory, or to recognition responses or recall responses (McKelvie and Demers, 1979). A leading explanation of the picture superiority effect is Paivio's (1971; 1978) dual-coding theory, which holds that pictures generally result in a visual representation as well as a verbal one, whereas words are less likely to result in the former. Long-term visual memory, unlike long-term verbal memory, appears to have virtually unlimited capacity, deteriorates very slowly, if at all, and shows no primacy or recency effects (Avons and Phillips, 1980).

Evaluative (brand attitude) responses have been less studied by psychologists in relation to visual and verbal content. However, advertisers' "empirical" favoritism of TV advertising seems to support the picture superiority effect for evaluative responses too. TV is the ultimate pictorial medium in that all TV commercials (but not all print ads) contain pictures; also, TV commercials present not just one but multiple pictures to the viewer. A small-scale laboratory study (Grass and Wallace, 1974) and the several large-scale field studies that have been conducted (The Media Book, 1979) clearly support TV over magazines, newspapers and radio in effecting attitude change, purchase intentions, and purchase behavior. The "bottom line" is reflected in the higher cost-per-thousand advertisers are willing to pay for TV advertising.

Another reason why pictures may be superior to words in inducing evaluative responses is the visual channel's superiority in accurately communicating emotions (Mehrabian, 1980). Emotions drive the basic motivations that energize behavior (Rossiter and Percy, in preparation) and this may be one reason why TV, which captures the full (gestural) emotional range, is so effective in influencing purchase behavior.

G-2 Use high imagery (more concrete) visuals rather than abstract visuals.

Just as high imagery words should be used in advertising, as the next paper will discuss, so should high imagery visuals. High imagery visuals are those that themselves arouse other mental images (i.e., a mental picture, a sound, or a sensory experience) quickly and easily. Imagery value is in turn related to stimulus concreteness (both definitions here are taken from Toglia and Battig, 1978). Concrete pictures, like concrete words, refer to objects, persons, places or things that can be seen, heard, felt, smelt, or tasted; as contrasted with abstract referents that cannot be experienced by the senses.

Imagery value and concreteness are highly but not perfectly related (Richardson, 1980). Some abstract words, such as "fantasy," have high imagery values. Similarly, surrealistic visuals, such as for Levi's Jeans, seem to be somewhat abstract yet to virtually encourage imagery. Imagery value is the relevant factor and this tends to be higher when the visual items are immediately interpretable in the context of the presentation.

It should be noted, also, that animation can be highly realistic. Animation, through its simplified, "line-drawing" technique, can make entities very concrete by "stripping" them to their essential denotative characteristics. Thus it is not surprising that animated TV commercials and cartoon-like print ads are highly recalled and recognized.

"Realistic" visuals are probably superior for learning for two reasons. First, people can "relate" to realistic depictions better than to abstract ones, which is in turn probably a function of their imagery value regardless of their specific content. Second, following dual-coding theory, people can more easily attach a verbal label to realistic visual material. Older children and adults automatically assign verbal labels to all but the most complex and novel pictorial stimuli (Pezdek and Evans, 1979) and thus double-code these stimuli.

Advertisers tend to use realistic visuals although there are sometimes faddish trends toward highly abstract (not merely surreal or animated) visual content. Recent campaigns for J & B Scotch and Grand Marnier Liqueur, for example, employ visuals that seem to bear a very abstract relationship to the advertised product. Corporate campaigns, too, have a tendency to favor abstract visuals, perhaps because they do not advertise a specific product. The learning research suggests that these fads are ill-advised. The audience is less likely to get any message with abstract visuals. Abstract visuals also may be tempting as a means of achieving uniqueness in a cluttered advertising environment. However, with the perhaps infinite range of visual stimuli available to the advertiser, it is quite easy to be unique as well as realistic.

G-3 Use color in visuals for emotional motivation but black & white is sufficient for "information" provision.

Some years ago, two psychologists in the research laboratories at Xerox Corporation (Dooley and Harkins, 1970) demonstrated that color's principal effect is motivational, whereas black & white is as effective as color as an information transmitter. Some data that Larry Percy and I have collected suggest that color visuals enhance brand attitude but have less effect on informational types of beliefs about the brand.

Advertisers, even in television, have the option of using color or black & white. For newspapers (the largest advertising medium) this is a serious and costly decision. A split-run field experiment by Sparkman and Austin (1980) showed that one-color versions of newspaper ads generated 41% more sales volume than matched black & white versions. One-color ads cost about 35% more than black & white, so color is justified if the additional revenue from the sales volume is greater than the higher cost. Magazine advertisers (and also outdoor and transit advertisers) sometimes use black & white ads to provide contrast in what have virtually become color media. The preceding research suggests that attention-getting through contrast should not be the only consideration, especially if emotional values are to be communicated.
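
The cost comparison above can be put as a simple break-even check. The following is a minimal Python sketch, not part of the Sparkman and Austin study: the 41% sales lift and 35% cost premium come from the text, while the function names, the `margin` parameter, and the dollar figures are hypothetical and chosen only for illustration.

```python
# Figures reported in the text (Sparkman and Austin, 1980).
SALES_LIFT = 0.41    # one-color ads: 41% more sales volume than black & white
COST_PREMIUM = 0.35  # one-color ads: about 35% more expensive than black & white


def color_is_justified(bw_ad_cost, bw_sales_revenue, margin=1.0):
    """Return True if the extra sales from color outweigh its extra cost.

    bw_ad_cost       -- cost of the black & white ad (hypothetical input)
    bw_sales_revenue -- sales revenue attributed to the black & white ad
    margin           -- share of extra revenue counted as gain (assumption);
                        1.0 compares raw revenue with cost, as the text does
    """
    extra_cost = COST_PREMIUM * bw_ad_cost
    extra_gain = SALES_LIFT * bw_sales_revenue * margin
    return extra_gain > extra_cost


def breakeven_revenue(bw_ad_cost, margin=1.0):
    """Black & white sales revenue at which color exactly pays for itself."""
    return COST_PREMIUM * bw_ad_cost / (SALES_LIFT * margin)


if __name__ == "__main__":
    # Hypothetical ad costing 1,000 in black & white.
    print(round(breakeven_revenue(1000), 2))
    print(color_is_justified(1000, 900))
    print(color_is_justified(1000, 800))
```

Under these assumptions, color pays for itself once the black & white version generates sales of more than about 85% of its own cost (0.35 / 0.41). A lower `margin` raises that threshold proportionally.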

G-4 "Interact" or juxtapose the product with the user or usage context in visuals.

Interactive pictures (Bower, 1980) that show two items juxtaposed in obvious interaction promote better associative learning than pictures that show each item separately. (Sentences, of course, are the analogous means of achieving interaction in the verbal mode.) Lutz and Lutz (1978), working with advertising symbols or "logos," showed that interactive symbols combining the company or brand name with a pictorial stimulus facilitated subsequent brand name recall. McKoon (1981) has shown that interactive picture sequences, in which two or more items are shown interacting in a script (like a comic book or TV photoboard), facilitate subsequent item recognition when any of the items is used as a cue for another.

The importance of interactive visuals for advertising is seen in relating products to users or to usage contexts. Consumers often buy "status" products because they suggest a particular type of user image (e.g., Polo shirts). And they usually buy "functional" products because they are perceived as suitable for a particular need or usage context (e.g., various types of detergents). The interactive imagery results suggest that users should be shown actually interacting with the product, on the one hand; and, on the other, that products should be shown in action in the usage context. Although these types of interaction are quite common in TV advertising, one still sees many "solo" sequences in commercials. Interaction is less common in print advertising. A picture of the user (frequently an endorser) often is shown next to, but not interacting with, the product. Or a usage context is mentioned in the copy but not shown visually with the product. The psychological research indicates that associative learning is better facilitated by interactive visuals than by visuals that leave the audience to infer an interaction.

G-5 High imagery visuals work far better than "instructions to imagine."

There are three basic ways to take advantage of mental imagery: (a) use high-imagery, most often highly concrete, stimuli; (b) instruct the viewer, reader or listener to form images, i.e., to "imagine"; and (c) target individuals in the audience who have differentially greater imagery ability. Of the three methods, use of high imagery stimuli is by far the most powerful. In an ANOVA comparison of the three methods applied to visual imagery, Slee (1978) found an F-value for stimulus imagery of 134.9, versus an F-value of only 6.4 for imagery instructions; also, the individual difference variable was not itself significant but there was a significant interaction between imagery instructions and individual imagery ability (F = 7.0). Judging by these F-values, use of high-imagery stimuli seems to offer roughly a 20:1 advantage over the other two methods.
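
The "20:1" figure can be traced to the ratios of the F-values quoted above. The short Python sketch below simply divides them; the dictionary keys and the function name are hypothetical labels. F-values depend on degrees of freedom and sample size, so these ratios are only a rough guide to relative strength and are not formal effect sizes.

```python
# F-values reported by Slee (1978), as quoted in the text.
F_VALUES = {
    "stimulus imagery": 134.9,
    "imagery instructions": 6.4,
    "instructions x imagery ability": 7.0,
}


def f_ratio(numerator, denominator):
    """Ratio of two reported F-values (a rough heuristic, not an effect size)."""
    return F_VALUES[numerator] / F_VALUES[denominator]


if __name__ == "__main__":
    # Compare the stimulus effect with each of the two weaker effects.
    for other in ("imagery instructions", "instructions x imagery ability"):
        print(other, round(f_ratio("stimulus imagery", other), 1))
```

The two ratios come out at about 21 and 19, which bracket the 20:1 figure.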

Stimulus imagery and visual imagery ability were examined in our beer study, reported in the original write-up (Rossiter and Percy, 1978). Consistent with Slee's results, we found a large stimulus effect for both visual and verbal advertising content, and a small but significant individual imagery ability effect. Instructions to image were tested in an advertising context by Mowen (1980). A group exposed to an advertisement that instructed them to imagine using the product, a fictitious brand of shampoo, exhibited no stronger intention to try it than a control group exposed to a non-instructional but otherwise identical ad. However, supplementary data on reactions to the ads themselves suggest that the instructions may have been "hyped" too unrealistically; there were five sentences containing exhortations to imagine in a seven-sentence ad. Subtler instructions, combined with high imagery stimuli, may be more successful.

Reliance on high imagery stimuli seems advisable for other reasons pertaining to the two alternative ways of stimulating imagery. The technique of instructing the audience to "imagine" may involve a complex process analogous to explicit versus implicit conclusion-drawing in the verbal persuasion mode (e.g., McGuire, 1969; Percy and Rossiter, 1980). Conclusion-drawing poses a risky manipulation for advertisers without careful pre-testing with particular target audiences. The other alternative of targeting individual differences in visual imagery ability poses a practical problem, too, because the target audience selected for marketing or advertising reasons may not be distinctive on the visual imagery variable. In short, then, use of high imagery visual (and verbal) stimuli is the most reliable method for advertisers.


P-1 The larger the illustration, the better--except for direct-response ads of the informational variety.

Larger pictures (or in print ads, illustrations) produce larger reported visual images and these, in turn, produce better learning (Kosslyn and Alper, 1977; Kosslyn, 1980). This has long been known in advertising in the form of the "square root law," which states that recognition of print ads increases with the square root of illustration size; i.e., roughly twice as much recognition, as measured for example by the Starch Noted score, with four times the picture size.
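
The rule of thumb just described (four times the picture size, roughly twice the recognition) can be written as a one-line function. This is a minimal Python sketch that assumes recognition is exactly proportional to the square root of illustration area, with no ceiling. The function name is hypothetical, and the rule is a practitioner approximation rather than a fitted model.

```python
import math


def relative_recognition(size_ratio):
    """Recognition multiple predicted by the square root law.

    size_ratio -- new illustration area divided by the original area
    """
    if size_ratio <= 0:
        raise ValueError("size_ratio must be positive")
    return math.sqrt(size_ratio)


if __name__ == "__main__":
    # Area multiples of 1x, 2x, 4x and 9x the original illustration.
    for ratio in (1, 2, 4, 9):
        print(ratio, round(relative_recognition(ratio), 2))
```

On this rule, doubling the illustration area would raise recognition by only about 41%, which illustrates the diminishing return on added space.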

Rossiter and Percy (1978; 1980) and Mitchell and Olson (1977; 1981) have demonstrated, additionally, that illustration size also has a positive impact on evaluative responses (brand attitude) and not just memory responses. The relationship between illustration size and brand attitude has since been replicated in another, as yet unreported, experiment. Our first experiment, on beer, seems to have found some application. Although the evidence is anecdotal, we have noticed a tendency toward more product close-ups in beer advertising. Pabst, for example, has been using a two-page spread that consists of a larger-than-life, brimming glass of beer.

There is only one print advertising situation where illustration size is not important: direct-response ads of the informative "long copy" variety. Direct mail and direct-response print ads fall into this category. First of all, memory is not a factor since the consumer responds either for or against, immediately. Thus the memorial advantages of illustration size would not accrue. In the second place, the typical technique in direct-response ads is to provide the reader with as much information as possible in order to achieve a "stimulus-sufficient" decision. In informative ads, space limitations force a trade-off between long, detailed copy and the area that could be devoted to large illustrations, with the former more important in this case.

P-2 Seek attention-holding illustrations (2 seconds or more) not just attention-getting illustrations.

A number of experiments (reviewed in Rossiter and Percy, in press) have shown that recognition and recall of pictorial stimuli reach an asymptote or peak when the stimulus is attended to for at least 2 seconds; more recently, Avons and Phillips (1980) place the peak at 2.6 seconds. Going beyond memory responses, Graefe and Watkins (1980) demonstrated that pictures can be mentally rehearsed just like words (cf. "cognitive responses"). This suggests that attention-holding is important for evaluative responses as well. For evaluative responses, the longer the stimulus is attended to beyond 2 seconds, the better.

Advertisers frequently seek attention-getting illustrations but rarely consider attention-holding as an additional consideration. Two seconds is quite a long time for a reader to pause and look at a print ad illustration, especially if the advertiser also wants people to read the headline and body copy. Many illustrations would not seem to have good attention-holding capacity. Berlyne's early work (1960) indicates that novelty gets attention. But holding attention may require stimuli that are familiar to or "relevant" to the reader. Crane (1972) has aptly called this the "first dilemma of message construction"--the necessity for the advertiser to combine the familiar and the novel. Often, as Crane points out, the solution lies in combining familiar stimuli in a novel way.

P-3 Place the illustration where it will be seen before the headline and copy are read.

A very carefully controlled experiment by Brainerd, Desrochers and Howe (1981) has demonstrated that learning is facilitated if the order is picture-word rather than word-picture. Their experiment is particularly important because they employed realistic, line drawing (high imagery) pictures and concrete nouns (the highest imagery parts of speech). Thus, subjects in the experiment could presumably "label the picture," if the picture came first, or "picture (image) the label," if the word came first. This is very similar to alternative print ad formats. The picture-then-word superiority held regardless of whether the target response was to recognize the picture (as in package recognition) or to recall the word (as in brand name recall).

The picture-then-word superiority might imply putting print ad headlines toward the bottom of the page, that is, partially or fully below the illustration so that the illustration is attended to first. However, people may be drawn to an effective illustration first (especially a large one, see P-1) regardless of where the headline is placed. Perhaps the best solution is to pre-test print ad executions to make sure the headline or other verbal copy doesn't draw attention ahead of the illustration. This recommendation applies to single-page print ads. The picture-then-word recommendation would be even more relevant for multi-page "gatefold" print ads where the advertiser can lead or "tease" with either a dominant illustration or a dominant headline.

P-4 Attitudinal "wearout" should not be a problem with illustrations but they may lose attention, suggesting use of variations on a theme for print advertising.

The phenomenon of attitudinal "wearout" has primarily been demonstrated for TV commercials, not for print ads. Attitudinal wearout seems especially likely for humorous commercials and "slice of life" commercials. The intrusive television medium draws attention to the commercial but people get tired of the scenario and begin to counter-argue with it. Print ads, on the other hand, are less intrusive because the reader can turn the page. People rarely counter-argue with print ads on later exposures after a favorable initial exposure. Rather, print ads lose attention.

This hypothesis suggests greater use of variations on a theme ("pool-outs" in TV advertising parlance) for print advertising, not just for TV advertising. Varied but related illustrations are also consistent with the novel-but-familiar principle espoused by Crane. There seems to be a recent trend toward pool-outs in print advertising, as in the campaigns by large-budget whiskey advertisers (e.g., Chivas Regal, Johnnie Walker) and cigarette advertisers (e.g., Barclay, Kent III). The TV strategy applied to print would appear to be useful in renewing attention.


T-1 Hold key scenes for at least 2 seconds and alternate key and redundant scenes.

TV video is composed of separate frames which are, of course, perceived as a sequence. However, evidence is emerging that people remember single pictures better than dynamic movement patterns (Hall and Buckolz, 1981). Elsewhere we have suggested that, from a visual imagery standpoint, people encode TV commercials more as a series of "still shots" than as an entire sequence (Rossiter and Percy, in press). The resulting memory sequence is similar to Abelson's (1976) notion of "scripts." In concrete terms, the proposition is that people interpret and remember TV commercials much as they started out, as storyboards or photoboards.

Even without accepting the extreme "still shot" version of this hypothesis, it is still clear that certain scenes in TV commercials are more important than others. A fair degree of redundancy, in fact, makes for more successful communication. English text, for example, is highly redundant (Miller, 1951) and one well-known test of text comprehension is the Cloze procedure, a sort of sentence completion measure in which redundancy or predictability makes for a high score. The analogy to TV commercials is that there are key scenes and redundant scenes.

It follows that certain of the guidelines for print advertising would also hold for TV advertising. In particular, following P-2 above, key scenes should be held for at least 2 seconds (not necessarily "frozen," but without an essential change in visual content).

Rossiter and Percy (in press) also suggested that for elaborative, evaluative, visual imagery to occur, it is actually better if the viewer looks away from the screen, so that this self-generated visual imagery will not be interfered with by the literal images on the screen. This suggests what we might call the TV corollary to P-2; namely, to alternate key scenes with redundant or predictable scenes. These "pauses" would give the viewer time to develop visual imagery, much like cognitive responses to verbal material, where the reader can pause "to think."

T-2 Put key scenes before related audio with the audio in the "pauses."

The TV version of P-3 suggests a picture-then-word sequence to be superior. Since pictures in TV commercials are continuous, what this means operationally is to place the "labeling" audio in the redundant scene following the key scene. Abrams (1981) was perhaps the first to suggest the notion of pauses in TV commercials. However, his idea was somewhat the reverse and more limited; namely, to put the key audio in the visual pauses, and not in any particular order. The picture-word sequence superiority suggests, alternatively, that the key is in the video and that the video-audio order be followed.

The key scene followed by the labeling audio in the redundant scene has important implications for the pre-testing of TV commercials via "animatic" devices. Animatics have a continuous audio track but still-shot slides for the video. Coordination is usually attempted on a fairly rough basis. Better attention to the timing would allow testing of the preceding hypothesis and, if the hypothesis is correct, would result in more effective finished commercials.

T-3 Use atypical variations on a typical script.

The TV version of P-4 is to seek visuals that are novel yet familiar. In a way, the basic scripts or themes for many products advertised on TV are very familiar, perhaps even more so than in print advertising. Thus we have the "problem-solution" (negative reinforcement) script, the "happy theme" (positive reinforcement) script, and the "testimonial" (endorsement) script. TV commercial executions vary somewhat in their adherence to these basic scripts, though detergent commercials (problem-solution) and cola commercials (happy theme) seem to have trouble in finding uniqueness.

A recent experiment by Graesser, Woll, Kowalski and Smith (1980) indicates that TV advertisers should continue the search for unique executions of these basic themes. Specifically, these investigators found that atypical versions of scripts are better recalled and recognized than typical or "stereotyped" versions--but only for short intervals (half-hour delay in their experiment). At longer intervals (1 week in their experiment) recall and recognition became increasingly "regressed" toward the typical versions; that is, the generic script was recalled or recognized rather than the atypical variations.

These results can be taken to imply that TV advertisers really have to fight hard to find and maintain unique executions, especially if the purchase decision is likely to be substantially delayed following commercial exposure. Otherwise, "everyone's" advertising in the category tends to get merged in the consumer's mind with time (somewhat like the episodic versus semantic memory distinction in verbal research). Intelligently varied and competitively unique pool-outs, notably for Miller Lite beer, therefore seem worth the expense if the purchase decision is delayed.

T-4 For visual-word "supers" use high imagery words in positive sentences except, perhaps, for disclaimers.

This final hypothesis merges visual input with verbal input in the form of "seen words" in TV commercials, i.e., superimposed written messages known as "supers." Words, like pictures, have visual imagery capacity (see the two following papers and also Rossiter and Percy, 1978; 1980; in press). It follows that supers that are meant to be attended to and reacted to (such as statements of product claims) should employ high imagery words and use positive sentences. Conversely, and this raises ethical questions, disclaimers that the advertiser may wish to hide should employ low imagery words or else high imagery words in negative sentences. Supporting evidence is given in Smith (1981).


This paper proposes 13 hypotheses for more effective visual input in advertising. They were developed from psychological research on visual imagery. The most immediate future direction is therefore to test them in applied advertising settings. My colleague Larry Percy and I have some of these tests under way. Some astute advertisers seem to have picked up on others, perhaps intuitively or by also translating the burgeoning visual imagery literature.

The expanding scope of visual imagery theory and research promises many future insights for advertisers. Visual imagery theory and research (for visual input) and psycholinguistic theory and research (for verbal input) may be the "breakthrough" perspectives for creating more effective advertising.


