Modality Effects in Television Advertising: A Methodology for Isolating Message Structure from Message Content Effects

ABSTRACT - Research on modality effects has rarely examined how processing of visual and verbal message components may be affected by the amount of meaning congruence between the pictures and the words used in an advertisement. Only by controlling for or manipulating message content can treatment effects be attributed to differences in modality. This paper describes a methodology for developing content redundant television ads. Findings from an experiment comparing a picture-only message with its verbal-only analog provide support for the "picture-superiority effect" on memory within the television medium. The findings suggest the need for further research into the nature of modality synergies inherent in television advertising.
Increased interest in modality research comes at a time when communication researchers, advertising practitioners and public policy makers are calling for more research on the relative importance and role of visual versus verbal message components in advertising (Alwitt & Mitchell, 1985; Liu, 1986; Schmalensee, 1983). Recent studies in this area have focused almost exclusively on the persuasive effects of written words and pictures as combined in print media and have produced inconclusive and controversial findings (Edell & Staelin, 1983; Kisielius & Sternthal, 1984; Mitchell & Olson, 1981). One problem that may have contributed to the inconsistent findings in previous research on modality effects is the failure to control for content differences between pictorial and verbal messages, i.e., the failure to separate message structure from message content effects.

The objective of this paper is to describe a methodology developed to construct meaning congruent or content redundant pictorial and verbal television ads. Using these commercial stimuli, findings from an experiment designed to examine the effects of modality on memory are then described.


To isolate the effects of modality on message learning and persuasion, development of messages with meaning congruent content in each modality - audio and visual - was an important goal [This paper does not examine the visual as opposed to aural presentation of verbal information. The terms audio and visual are used to refer to spoken words and pictures, respectively.].

Only by controlling for message content can processing differences be attributed to differences in message modality. Content redundancy was operationalized based on subject ratings of the similarity of the information content in audio and visual messages, which required writing new audio copy and modifying the existing visuals. The methodology developed for accomplishing this was adapted from Edell & Staelin (1983) and Baggett & Ehrenfeucht (1982). A six step procedure was used:

Step 1: Selection of Products/Commercials

To ensure, insofar as possible, that results were not due to the nature of the product category or brand, three brands from different product categories were used. The television commercials for these brands were obtained from existing film footage. Selection of the products/commercials was guided by the following criteria: (1) brands in categories for which the subject population (college students) would be potential members of the target audience and for which they would normally use information presented in the mass media in brand selection decisions; (2) brands unfamiliar to the subjects so that learning and persuasion measures would not be affected by prior brand knowledge/beliefs or by associative interference from prior media exposure; (3) commercials that would allow product information to be depicted both verbally and visually (e.g., no use of computer graphics or other "high tech" video devices, verbally reproducible visual scenes); and (4) commercials that used voice-over audio and few people, so that source credibility effects could be minimized.

Requests for commercials meeting these criteria were made to various packaged goods marketers and advertising agencies. In addition, commercials for products not available in the U.S. were obtained from Canadian television. From this pool, those commercials that met the technical criteria (voice-over audio, few people, etc.) were selected. Subsequently, primary data were collected on the remaining set to determine which commercials met the other selection criteria. Data were also collected on the importance of various product attributes in selecting a brand within each category. It was desirable to use commercials that mentioned attributes/claims considered somewhat important by the target audience. Analysis of the data yielded four commercials that met all of the aforementioned criteria: Sauce'n Savour baking/barbeque sauce, Neo Citran cold medicine, Black Magic boxed candy and Catelli spaghetti sauce.

Step 2: Editing of Visual Messages

The four television commercials selected for further testing were initially edited to insert a package shot at both the beginning and end of each ad, to remove any supers (superimposed writing), and to eliminate the audio tracks. Storyboards, consisting of still photos/slides taken at approximately two second intervals throughout each thirty second commercial, were then developed. These storyboards illustrated the story line for each ad, capturing every cut or dissolve to a noticeably different frame.

Each commercial (video only) was then shown to a different subset of the prospective subject population, followed by the slide presentation representing sequential "snapshots" of the visual scenes depicted in the actual commercial. Approximately 50 subjects participated in each of the four presentations. Subjects were allowed to view each slide for three minutes, during which time they were asked to write down everything that was happening in each slide, including their interpretation of what the manufacturer was trying to tell them about the product.

It was felt that since later analyses would require collapsing responses across products/ads for each condition, any significant differences found among the four ads on such things as comprehensibility might obscure the effects of the modality manipulations. Several 5-point scales were therefore administered to measure subjects' subjective responses to the overall commercial (boring-interesting, attention getting-not attention getting, sad-funny, likeable-not likeable, confusing-not confusing). However, no attempt was made to control for differences on these variables since the focus of the research was on learning and persuasion differences across modality conditions (audio, visual), within each commercial.

A check on the validity of proceeding in this fashion was made using one-way analyses of variance and a posteriori contrasts (Tukey) to compare the four commercials on each of the attributes mentioned above.

Although all of the F values and some of the contrasts were found to be significant at the .05 level or beyond, the means indicate that these differences were not extreme, i.e., all of the means fell somewhere between 2.4 and 3.9 on a 5-point scale. In general, all four ads were perceived as somewhat interesting, attention-getting, likeable, and not very confusing.
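The validity check described above can be illustrated with a minimal sketch of the one-way analysis of variance F statistic. The ratings below are hypothetical placeholders, not data from the study, and the function name `one_way_anova_f` is introduced only for illustration. The sketch computes the omnibus F and its degrees of freedom; the a posteriori Tukey contrasts and the significance lookup are not shown.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of lists of ratings."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual ratings around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# HYPOTHETICAL 5-point "boring-interesting" ratings for four ads
# (illustration only; the study's raw ratings are not reported).
ratings_by_ad = {
    "ad_a": [3, 4, 3, 2, 4],
    "ad_b": [2, 3, 3, 2, 3],
    "ad_c": [4, 4, 3, 5, 4],
    "ad_d": [3, 3, 2, 3, 4],
}
f_stat, df_b, df_w = one_way_anova_f(list(ratings_by_ad.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.3f}")
```

The resulting F would then be compared with the critical value for the stated degrees of freedom at the chosen significance level.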

The open-ended responses for each slide were coded and counted by a graduate student, resulting in a summary of the different messages conveyed per slide and the number of subjects who perceived each message. Those visual scenes from each commercial that conveyed too many messages or conveyed strikingly different messages to different individuals were then eliminated (if they did not interfere with message comprehension) or modified (if they illustrated important copy claims).

Step 3: Second Version of Visual Messages

The revised commercials and storyboards were then given to four additional groups of subjects of approximately twenty-five each. The instructions and procedure were identical to step 2 above, except that subjects were given the additional information/incentive that the manufacturer was interested in converting each television commercial to a radio commercial, while still communicating the same message to the audience.

Only one commercial (Black Magic) had to be altered at this stage (to eliminate a white rose dissolve that elicited multiple interpretations). It was also decided to eliminate the Catelli commercial at this point, because of its overall visual complexity and because the research called for only three commercials.

An analysis of the second set of responses to the semantic differential items indicated that there were now no important differences among ads on "attention-getting" and "likeability" (largest F = 2.821, p < .05). The remaining attributes showed very little change from step 2 above.

Step 4: Development of Redundant Copy

The verbal information obtained in step 2 and step 3 was then used to write a 30 second audio text for each commercial, on a slide by slide basis. The text attempted to faithfully incorporate not only the ideas expressed by subjects about each slide, but the language style and vocabulary as well. Unfortunately, the copy developed ran to 140-150 words per ad, while a typical 30 second commercial may have 90-110 words. At this point, time compression of the audio was considered. However, a professional copywriter was able to take the audio copy that had been initially developed and not only reduce the number of words required to faithfully express the visual scenes, but add a professional and exciting touch to the copy as well.
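The word counts above imply the following speaking rates. This is a simple arithmetic sketch using only the figures given in the text; the helper name `words_per_second` is introduced here for illustration.

```python
AD_LENGTH_SECONDS = 30  # length of each commercial, as stated in the text


def words_per_second(word_count, seconds=AD_LENGTH_SECONDS):
    """Speaking rate implied by a given word count over the ad length."""
    return word_count / seconds


# Draft copy: 140-150 words. Typical 30 second commercial: 90-110 words.
draft_rates = [words_per_second(w) for w in (140, 150)]
typical_rates = [words_per_second(w) for w in (90, 110)]

# Smallest cut that brings the shortest draft within the typical range.
min_words_to_cut = 140 - 110

print([round(r, 2) for r in draft_rates])    # draft copy rates
print([round(r, 2) for r in typical_rates])  # typical copy rates
print(min_words_to_cut)
```

On these figures the draft copy would have required roughly 4.7 to 5 words per second, against 3 to about 3.7 for typical copy, which is why either time compression or shorter copy was needed.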

The copy thus created for each ad was then recorded by professional radio announcers, a different voice being used for each of the three ads. Professional creative assistance was obtained to determine the type of voice required to match the mood conveyed by the video track of each ad, and to determine inflection, tonality, and voice speed. Five different takes were recorded for each ad. Selection of the "best" take for each ad was subjective, but made by a group of professional creatives.

Another group of subjects was then exposed to the audio and visual messages. Some subjects (40) received the 30 second audio version of each ad, and some (29) received the 30 second video version. All subjects were asked to indicate the major copy points being conveyed by the 30 second audio or visual messages. In addition, they were asked to rate each message on three 5-point scales for ease of understanding, believability, and importance of product claims for making brand selection decisions. The latter scale consisted of four items, different for each commercial and representing those claims or product attributes that had been mentioned most frequently in responses to step 2 and step 3 above.

A Chi-squared test of homogeneity (Kruskal-Wallis One-Way ANOVA) was used to test for significant differences in the number of messages conveyed by each modality. No significant differences were found between modalities on this variable (largest Chi-squared = 1.09, p = .30). T-tests were used to test for differences in believability, ease of understanding and relative importance of attributes between audio and visual messages, within each ad. Again, no significant differences were found for ease of understanding (largest t = 1.47, p < .10), believability (largest t = 1.84, p < .07), or for any of the product attributes.
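A minimal sketch of the Kruskal-Wallis statistic for the number of messages reported in each modality follows. The counts are hypothetical, not the study's data, and the function name `kruskal_wallis_h` is an illustrative choice. The H statistic is approximately Chi-squared distributed with (number of groups - 1) degrees of freedom, which is why it is reported as a Chi-squared value.

```python
from collections import Counter


def kruskal_wallis_h(groups):
    """Return (H, df), with the standard correction for tied values.

    Raises ZeroDivisionError if every pooled value is identical.
    """
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    counts = Counter(pooled)
    # Give each distinct value the average of the ranks it occupies.
    rank_of = {}
    ranks_used = 0
    for value in sorted(counts):
        rank_of[value] = ranks_used + (counts[value] + 1) / 2
        ranks_used += counts[value]
    rank_term = sum(
        sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups
    )
    h = 12 / (n * (n + 1)) * rank_term - 3 * (n + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n).
    ties = sum(c ** 3 - c for c in counts.values())
    h /= 1 - ties / (n ** 3 - n)
    return h, len(groups) - 1


# HYPOTHETICAL numbers of copy points listed per subject (illustration only).
audio_counts = [3, 4, 2, 4, 3, 5]
video_counts = [4, 3, 3, 5, 4, 2]
h_stat, df = kruskal_wallis_h([audio_counts, video_counts])
print(f"H = {h_stat:.3f}, df = {df}")
```

A small H, as reported in the text, indicates that the two modalities did not differ reliably in the number of messages conveyed.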

The number and content of different copy claims communicated by each modality, for each ad, was also examined. From 70-90% of all responses pertained to the top four claims. Very few differences in the number/content of copy claims mentioned were found between audio and visual messages for any of the ads [with the exception of the expected differences in messages conveyed by each modality: manufacturer's name and package/product size/color/shape, which were only mentioned in the audio or video tracks, respectively]. It was decided, therefore, to consider the top four claims made for each ad as the ones of primary importance, and these were used in subsequent analyses.

Step 5: Further Tests for Content Redundancy

As a final check on content redundancy, another group of subjects (28) received both the audio and visual tracks of the three commercials and were asked to rate the similarity of copy points conveyed by each modality. The four copy points/claims determined in step 4 above to be the most commonly mentioned in both modality conditions, for each ad, were tested for audiovisual similarity (1=very similar; 5=not at all similar). In addition, two claims that were made in only the audio (manufacturer's name) or video (package/product size/color) component of each ad were included to permit a comparison between similar and dissimilar claims for each ad, and to check on subjects' understanding of the task.

A correlated t-test was performed to test for differences in similarity between the four attributes for each ad hypothesized to be perceived as similar across modalities and the two attributes hypothesized to be perceived as dissimilar across modalities. A correlated t-test was used because all data were within subject data. Two new summary variables were created for each ad, by computing the average of the means for the four "similar" variables and the average of the means for the two "dissimilar" variables. A one-tailed test was used because the alternative hypothesis (H1) states that the average of the means for the four similar variables should be less than that for the two dissimilar variables, given that 1=very similar and 5=not at all similar.
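The correlated (paired) t-test on the two summary variables can be sketched as follows. The ratings are hypothetical, and the sketch assumes that each summary variable is formed per subject by averaging that subject's ratings; the text does not spell out this detail. The names `summary_scores` and `paired_t` are illustrative.

```python
from math import sqrt
from statistics import mean, stdev


def summary_scores(ratings):
    """Split one subject's six ratings into (similar, dissimilar) averages.

    Assumes the first four ratings are the "similar" claims and the last
    two are the "dissimilar" claims (1=very similar, 5=not at all similar).
    """
    return mean(ratings[:4]), mean(ratings[4:])


def paired_t(similar, dissimilar):
    """Return (t, df) for the within-subject differences.

    Differences are taken as dissimilar - similar, so a positive t is in
    the direction predicted by H1 (similar claims rated closer to 1).
    """
    diffs = [d - s for s, d in zip(similar, dissimilar)]
    n = len(diffs)
    t_stat = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t_stat, n - 1


# HYPOTHETICAL ratings for one ad, one row per subject (illustration only).
subjects = [
    [1, 2, 1, 2, 4, 5],
    [2, 2, 1, 3, 4, 4],
    [1, 1, 2, 2, 5, 3],
    [2, 3, 2, 1, 5, 5],
]
pairs = [summary_scores(row) for row in subjects]
t_stat, df = paired_t([p[0] for p in pairs], [p[1] for p in pairs])
print(f"t({df}) = {t_stat:.3f}")
```

Because the test is one-tailed, the obtained t would be compared with the upper-tail critical value only.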

The H0 of no difference was rejected in each of the three tests for the three ads (smallest t = 3.855, p < .001; see Table 1). Thus, one can conclude that those claims that were supposed to be perceived as highly similar across audio and visual messages were so perceived, and those that should have been perceived as highly dissimilar were also so perceived.



At this point, the Neo Citran ad remained somewhat problematic. It appeared, upon questioning some of the subjects, that the claim "available in two formulas" did not come across clearly in the visual message, while the claim "effective against all cold symptoms" was not clearly communicated in the audio message. Further editing of the audio and visual components of this ad was done in order to correct these problems. Additional data on perceived similarity of claims across modalities were then collected and analyzed. These results indicated that the audio and visual messages for the Neo Citran commercial were now perceived as highly redundant.

Step 6: Final Version of Audio and Visual Messages

Lastly, it was necessary to further edit the audio and visual messages so that in the combined audio-visual condition (included in a second study not reported here) the timing of the audio statements and visual scenes coincided. This required manipulating the visual message primarily, shortening and lengthening some scenes so that the words being said corresponded to the visual scenes being illustrated (e.g., when "Black Magic" was heard, a package shot with the written name was seen).

The general tone of the finished commercials was favorable to the products, simply stated/depicted, and presented factual information to provide claims for subjects to recall/recognize. Certain message elements remained the same in all treatment conditions, across all commercials: (1) the manufacturer's name was mentioned in the audio component of the ad, once in the first five seconds and again in the last five seconds; (2) the package shot, with the brand name, was shown in the visual component of the ad, once each at the beginning and closing of the ad, and several times in the middle; and (3) temporal position of the brand name mentions coincided with that of the visual package/brand name shots.

The audio and video tracks for each of the three commercials were taped separately, so that subjects could be exposed to each modality alone. Each message was thirty seconds in length. Rate of presentation was similar to that of typical television commercials: 3-4 words/second and 30 frames/second for each commercial. Appropriate background music was selected for each ad, and was included in both audio and visual modality conditions. This was necessary so that in those experimental conditions conveying only visual information there would be some accompanying sound. The music selected was instrumental, and included three versions of the same melody, identical except for the instruments used and the beat (slow/moderate).


The content redundant audio and visual messages created in this manner were used to examine modality effects on memory. A consistent finding in the literature is the "picture superiority effect", the ability of pictures to be remembered more easily and for a greater length of time than their verbal counterparts (Childers & Houston, 1984; Shepard, 1967; Paivio, 1971). However, few studies have applied these findings to advertising messages; fewer still have examined picture-word differences on memory within the television medium.

Experimental Design and Methodology

The experiment reported here attempted to confirm the picture superiority effect with television commercial stimuli that contain complex, dynamic pictures and words. It compared single channel visual-only messages with content redundant audio-only messages on measures of learning and ad/brand evaluation. Subjects (college students at Washington State University) in each modality treatment condition were exposed to three experimental ads for different products (at three levels of repetition). The ads were embedded in twenty minutes of program material and a cover story was used to induce a "low involvement" message processing strategy.


The recognition measure used was the sum of six two-alternative forced choice recognition questions. The brand name recognition question was identical in both audio and visual conditions. The other five questions were either administered aurally on tape (audio condition) or visually as a slide presentation (visual condition). The recall measure consisted of an aided, sequential recall task for brand name and copy claims, given the product category name. Those in the visual condition were also asked to draw the packages and indicate package colors, while those in the audio condition were asked to recall the manufacturer's name for each brand. Recall was scored as the total number of message items correctly recalled.
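The two memory scores described above can be sketched as simple counts. The answer key, responses, and item labels below are hypothetical, and the function names are illustrative; in the study itself, judging whether a free recall response matched a message item would have required human coding, which this sketch reduces to exact label matching.

```python
def recognition_score(responses, answer_key):
    """Number of two-alternative forced choice items answered correctly."""
    return sum(1 for given, correct in zip(responses, answer_key)
               if given == correct)


def recall_score(recalled_items, target_items):
    """Total number of distinct target message items correctly recalled."""
    return len(set(recalled_items) & set(target_items))


# HYPOTHETICAL answer key and one subject's answers for the six questions.
answer_key = ["a", "b", "b", "a", "b", "a"]
responses = ["a", "b", "a", "a", "b", "b"]

# HYPOTHETICAL coded recall protocol for one ad.
targets = {"brand_name", "claim_1", "claim_2", "claim_3", "claim_4"}
recalled = ["brand_name", "claim_2", "claim_2", "unrelated_idea"]

# With two alternatives per item, pure guessing yields 0.5 correct per item.
chance_level = 0.5 * len(answer_key)

print(recognition_score(responses, answer_key), chance_level)
print(recall_score(recalled, targets))
```

The chance level is worth keeping in view when interpreting the recognition means: a score of 3 out of 6 is what guessing alone would produce.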


To test the hypothesis, analyses of variance for each product were conducted with Message Modality as the independent factor [While other independent factors were included in this experiment (e.g., exposure-test delay, repetition), findings regarding each of these factors and their interaction with modality are not discussed in this paper.].

The results of the analyses of variance are presented in Table 2, along with the means for each Modality condition by commercial seen or heard. Recognition and recall scores were significantly higher for the visual messages than for their verbal counterparts. These findings provide support for the hypothesis tested.




Previous research on modality effects has rarely examined how processing of visual and verbal message components may be affected by the amount of meaning congruence or content redundancy between the pictures and words, although a cursory content analysis of current television executions indicates that there is considerable variation in the amount of content overlap present in audio and visual message components. This suggests that there may be synergies created when meaning overlap is present across modes. However, modality research has not systematically investigated these synergies but has been characterized by either a failure to control for content differences between pictures and words, or superficial attempts to ensure some degree of picture-word meaning congruence.

A more fruitful approach to this issue may be to consider audio-visual content redundancy as a theoretically important mediator of modality effects in its own right. This might be approached by (1) controlling for content differences by creating content redundant audio and visual messages, or (2) manipulating content redundancy as an independent factor in studies with dual channel, audio-visual messages. This paper described a methodology developed to achieve (1) above, and reported findings confirming the existence of a "picture superiority effect" on memory with television commercials. Modality researchers should also consider manipulating content redundancy so as to determine how learning and persuasion are affected by the amount of congruence between audio and visual message components.


