
THE ADVERTISING PRETEST AS PART OF A MULTIMEASURE, MULTIMETHOD, MULTISITUATION VALIDATION AND APPLICATION RESEARCH SYSTEM

Michael L. Ray, Stanford University

This invited paper was developed as part of a workshop on pretesting at the ACR meetings. The author's contention is that a collaboration of managerial, behavioral and quantitative people in advertising can be established with a research system that attacks the important "problems of the middle range" that occur repeatedly in advertising. The research system would include, in order: (1) examination and development of the problem structure, (2) analysis of available literature, (3) laboratory pretesting, (4) computer simulation application of pretesting results, (5) field experimental validation of lab and simulation predictions, and (6) campaign monitoring. The system includes managerial aspects with multimeasures representing managerial goals, behavioral aspects with multimethods representing the main contributions of the behavioral sciences, and quantitative concerns with multisituations bringing in the quantitative sciences. This approach is supported by an extensive research program with one dozen lab pretests which balance natural conditions and research needs, a media modeling study, two extensive field studies, and several monitoring projects.

The advertising pretest, when it is used correctly, is part of a managerial process. It can be used to deal with general, recurring questions that arise again and again in specific advertising decision situations. The pretest is quite inefficient if it is used in isolation or if a new form of pretest is developed for each individual situation. So the goal of our research at Stanford has been to develop a standard pretesting format for general managerial problems in advertising.

Thus far we have done about a dozen developmental studies involving about 3000 respondents. The general technique we have developed has been tentatively validated by two large-scale field experiments, one on print advertising and one on television advertising. Our data have been used in media model runs with positive results.

The purpose of this brief statement is to provide an overview of our research: our assumptions, main findings, and implications for advertising pretesting. For those interested, three papers give greater detail on the research (Ray, 1972; Ray et al., 1973; Ward and Ray, 1974).

TYPES OF PRETESTING

There are essentially three types of pretests which come at different stages of the advertising decision-making process. The first is what might be called developmental pretesting. This is done as the advertising is being created. It should be the province of the creative people within an advertising agency, because developmental pretesting is used to determine whether certain components of advertising work. Typically, the materials are unfinished advertising or advertising in a rough form. The samples are usually accidental. The measures tend to be either gross physiological ones or, on the other hand, consumer "expert" opinions about the content of advertising.

The second form of pretesting is what might be called selection-scheduling pretesting. This is done with finished advertising, and its goal is to determine which of several alternatives should be run and how those alternatives should be run or scheduled. Here the alternatives are in as finished form as possible, the sampling is more representative of the audience, and measurement is made to determine realistic communication response rather than physiological indicators or consumer opinions of advertising effectiveness. The name of the game here is to predict to the response that is in fact the objective of the campaign. Because of this goal, it is necessary to make the exposure conditions and measurement as natural as possible, considering the usual "laboratory" type of setting for the test.

The third type of pretesting might be called limited posttesting. The best example of this sort of approach is the day-after-recall test. This is usually used at a later stage in the decision-making process, when the alternatives have been narrowed down to a very few, oftentimes one. In a sense, then, it is used in many cases as a final check to assure decision makers that the commercial they are about to run is an excellent one. While some companies such as Procter & Gamble use limited posttesting on a regular basis for selection and scheduling, this tends not to be the norm. The advantages of limited posttesting are natural exposure and quick measurement. The disadvantages are a limitation in both measurement and the possible variation of alternatives. Also, sometimes, the cost can be too high.

The Stanford research has opted for the middle ground in pretesting; that is, we have worked with just the selection-scheduling type of pretest. Our feeling has been that it is desirable to have finished advertising to determine actual potential of ad response. The laboratory setting of the selection-scheduling pretest offers the potential for control of those variables which are critical in advertising effectiveness. Such pretesting can be done quickly with less cost than the limited posttesting type. It is possible to make the conditions of exposure quite natural and measure on a number of levels so the full texture of communication response can be gauged. Most important for us, the selection-scheduling type of pretest has the potential of being used in a research system involving not only pretests but also behavioral analysis, media models, field experimentation, and campaign monitoring. Such a system promises the potential of validation of both the pretest and the models. The end result of such an approach should be not only better pretesting but also more extensive, efficient utilization of both behavioral science hints and media models in advertising decision-making.

THE RESEARCH SYSTEM CONTEXT FOR PRETESTING

The Stanford research started with the general goal of developing a pretesting technique that could be used to determine the repetition response function for advertising in specific situations. As this work began, we came to realize that the repetition function or wearout problem in advertising was really only one of a number of general recurring problems that needed to be attacked in a more systematic way than had previously been done.

The situation as we saw it for these general recurring problems could be depicted as shown in the chart on the following page. In essence, there were three general types of attacks on advertising problems, none of which was totally satisfactory. The first was the advertising media modelers' or management scientists' approach. These people started with a well-defined problem, such as advertising scheduling, and then proceeded to develop a clear conceptualization of the problem area by working intensively with managers who understood it. On the basis of that discussion and references to simple-minded behavioral science ideas, media models were created. In some situations, these models are used to plan campaigns, but they are never satisfactorily validated.

The second approach shown on the following page is the one this paper and workshop are most concerned with. This is the typical copy testing approach. Unlike the management science modeling one, the copy testing approach does not specify a problem that can be dealt with in a continuing and ordered way. The usual question given to a copy tester is a very short term and narrow one such as: Is Ad A Better Than Ad B? This kind of question is almost immediately examined in a copy test without: (a) any intervening examination of past experience relative to the general types of ads involved or (b) behavioral science information which might provide hints as to whether either or both of the alternatives should be considered at all. The copy testing procedures that are used are also deficient in several ways. Usually copy tests are done with accidental samples, without the normal surrounding material that is involved when ads are actually exposed, without the variations of exposure that allow a scheduling decision (that is, there tends not to be any repetition variation or variations in program type or variation in media type or variation in competitive setting), without measurement on the several levels that determine advertising response in an effective way, and without delayed measurement or measurement of purchasing action. What is being described here is the "one shot immediate measurement copy test." Not only is such a copy test typically one shot in terms of exposure, but there is also a tendency for copy testing researchers to seek the single ideal measurement. This is a search which could be likened to the search for the Holy Grail. In fact, a single measure can measure only a single response, and no single measure, whether it be physiological response or response time or anything else, can predict directly to a field or campaign response under natural conditions.

In addition to having difficulties in terms of the problem statement, the lack of behavioral science and experience review, and the characteristics of the copy tests themselves--the copy testing approach suffers from a lack of actual application. The dashed line in the chart is meant to indicate that only sometimes are the results of copy tests actually applied to campaign decision making. The managerial situation in which copy tests are done often leads to their purpose being something other than improving the quality of the campaign. It seems that psychological, organizational and political factors are often a greater reason for copy tests than anything else. Since these copy tests do not have the variations that might allow them to be related to the field, it is very difficult for copy testers to say anything more than Ad A seems to be better than Ad B. This provides very little direction in terms of how the ads would operate in an actual schedule in an actual campaign. And because there is very little use of specific copy testing information in campaign development, there is also very little feedback from the campaign to subsequent decisions or to the copy testers themselves.

The third type of approach to advertising decision-making problems is the academic application of behavioral science ideas. The main problem with this approach, as can be seen in the chart, is that the behavioral scientist tends to start with propositions and then attempt to find a problem to which to apply them. Beyond that difficulty, the propositions sometimes get applied in actual campaigns, but seldom is there application in copy tests or field experimentation. And feedback is almost non-existent.

FIGURE

THE STATE OF BEHAVIORAL APPLICATION IN ADVERTISING.

When one attempts to solve a particular problem by using the three approaches shown in the chart, he or she often finds that these approaches give very different answers to general problems. For instance, in the case of the repetition function problem, we found that each approach gave a different answer to the question of whether the repetition function was affected by different advertising situations. The management science model builder tended to use a single function, adapted from the findings of nonsense syllable verbal learning research, to depict all of the possible responses in all possible situations. So the model builder's answer to the question was typically: "No, there are no differential effects on the repetition response function due to the situation." The copy tester, on the other hand, earns his living by finding differences between commercials. So his answer to the repetition function question, even though he or she seldom does any copy testing with repetition, is: "Yes, there are big differences in response to advertising." The academic behavioral science applier tends to see a great deal of complexity in the world. Since, as shown on the chart, there is really no research done in this approach, the behavioral scientist answering the question usually said: "Well, maybe, there might be response function differences."
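To make the contrast concrete, a minimal numerical sketch is given below: the model builder's single saturating curve applied to every situation, set against curves whose ceiling and growth rate vary by situation. The functional form and all parameter values are purely illustrative assumptions, not estimates from any of the studies reported here.

```python
# Purely illustrative contrast (not estimates from these studies): one curve
# assumed to hold everywhere versus situation-specific repetition curves.
import math

def single_function(exposures, k=0.5):
    """The modeler's assumption: one saturating curve for every ad and situation."""
    return 1.0 - math.exp(-k * exposures)

def situation_function(exposures, ceiling, k):
    """Situation-specific curve: ceiling and growth rate vary with the setting."""
    return ceiling * (1.0 - math.exp(-k * exposures))

# Hypothetical situations with different ceilings and growth rates.
situations = {
    "new brand, involving category": (0.9, 0.8),
    "mature brand, low involvement": (0.4, 0.3),
}

for n in range(4):  # 0 through 3 exposures, as in the lab design
    parts = [f"single model = {single_function(n):.2f}"]
    parts += [f"{name} = {situation_function(n, c, k):.2f}"
              for name, (c, k) in situations.items()]
    print(f"{n} exposures: " + "; ".join(parts))
```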

Given the ambivalent state of affairs with regard to the repetition function problem and others which recur in advertising, we felt that there was a need to develop a research system which could be applied to such problems. The basic assumption of this research system is that the problems can be most efficiently handled with a combination of the best aspects of each of the three approaches shown in the chart. One current version of our recommended research system is in Figure 1, which originally appeared in Ray (1972, p. 476). In this case, the management problem of concern was: "What kind of messages should be used in highly competitive situations?" This is representative of the type of general, recurring problem that should be studied by such a research system.

The second stage of the research system is one in which behavioral science and past experience are tapped to develop alternative strategies; in the case of Figure 1, this meant the development of alternative message strategies.

The third stage is the one involving copy testing. In Figure 1, it is called the laboratory experimentation stage. The function of copy testing in such a research system is to take the alternatives developed from the second stage and test the way they have been implemented in the specific situation of concern. Copy testing or laboratory experimentation is emphasized because it provides a quick and low cost method to separate out those alternatives that should be considered further. It also allows the decision maker to see how the ideas from the behavioral area and past experience apply in this specific setting. It is with these types of goals in mind that the pretesting research and techniques of our project have been developed.

The fourth stage of the research system has to do with media model runs. In the few cases in which copy testing results have been given a validation test in the field, there has been a naive attempt to see if the results in the laboratory as they stand apply directly to the results in the field as they stand. There are several problems with this sort of validational approach. The fourth stage of the research system is an attempt to eliminate some of these problems, which come from lack of attention to field situation variables in making the predictions to the field. Media models have been developed which include, as part of their data base, many of the factors which might affect the operation of various commercials in the field. By using response function estimations from the lab within the context of the other media model inputs, it is possible to make very precise and realistic predictions to the field.
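The sketch below shows, in schematic form only, how a lab-estimated response function can enter such a calculation alongside other schedule inputs. It is not the formulation of any particular media model; the exposure-frequency distributions, segment weight, and response parameters are all invented for the example.

```python
# Minimal sketch: a lab-estimated response function is combined with other
# media inputs (here, exposure-frequency distributions and a segment weight)
# to score alternative schedules. All numbers are invented.
from math import exp

def response(exposures, ceiling=0.6, k=0.7):
    """Lab-estimated repetition response function (illustrative parameters)."""
    return ceiling * (1.0 - exp(-k * exposures))

def schedule_value(exposure_distribution, segment_weight=1.0):
    """Expected response for a segment, given the share of people
    reached 0, 1, 2, ... times by a candidate schedule."""
    return segment_weight * sum(share * response(n)
                                for n, share in enumerate(exposure_distribution))

# Two candidate schedules, expressed as shares of the segment reached n times.
concentrated = [0.50, 0.10, 0.15, 0.25]   # fewer people, more repetition
dispersed    = [0.30, 0.45, 0.20, 0.05]   # more people, less repetition

print("concentrated:", round(schedule_value(concentrated), 3))
print("dispersed:   ", round(schedule_value(dispersed), 3))
```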

The fifth stage in the research system is field experimentation. It provides a solution to another of the validational problems of pretests. That is, pretests are often asked to predict to a situation in which the very variables which are manipulated in the lab are not cleanly manipulated. It is unreasonable to expect a variation in copy to show up in a test marketing situation in which these variations in copy have not been experimentally manipulated. The field experimentation recommended by the research system would not have to be done often, but it would have to be done from time to time to accurately validate both the copy tests and the media models. The article from which this figure came indicates how such field experimentation might be developed to test these two components of the research system.

FIGURE 1

OUTLINE OF PROPOSED RESEARCH

There is one more stage of the research system which is not shown in Figure 1. This is the campaign monitoring stage. Once the previous five stages of the research system are done, there would be a great deal of information on the particular alternatives that survive and are actually run in campaigns. Continuous monitoring systems, especially of the repetitive survey type, can provide additional information which might be utilized in the earlier stages of the decision process.

THE LABORATORY EXPERIMENTAL TECHNIQUE

When pretesting is considered as part of a decision and research system of the sort partially depicted in Figure 1, there are certain requirements for it that otherwise would be ignored. There is a need to present materials in a very natural way with all of the key situational variables operating while at the same time there is clear experimental control and definitive measurement of response.

The procedure we have used in the Stanford program is an after-only experimental design with a double-blind cover story. Typically there are repetitive exposures of test commercials embedded in a stream of messages, and there is multiple measurement after test exposures to determine the nature of communication response.

This technique has been used for several problem areas and in a variety of types of locations. These have included mobile units in shopping centers, store fronts, schools, and central location research facilities. Respondents are recruited to participate in studies that are a cover for the actual intent of the research. In most of the research we have done, the "Shopping of the Future" cover story has been used. When they arrive at the facility, respondents are given materials on possible cable television and teletype-telephone home shopping approaches that might be used in the future. They are then told that they will see a demonstration of such an approach, which consists of a stream of messages presented on a futuristic television screen display. In some studies we have departed from the "Shopping of the Future" cover. We have told respondents that the research was being conducted to investigate television violence and humor or television ratings. In the case of these studies, respondents then saw normal program material with our test commercials embedded in the programs.

Following whatever presentation the respondents see, they fill out a self-administered questionnaire that first has questions related to the cover story and then questions on play-back of the messages, cognitive response to the messages, attitude and purchase intention, and cued recall and response to the messages. Also included in the questionnaire are questions on respondent characteristics and past experience with the products and brands involved. Sometimes there are behavioral measures which follow the test itself.

Several aspects of this procedure should be underlined. It combines the advantages of experimental control with an effective cover story and relatively natural exposure conditions. The respondents are not told to concentrate on the commercials themselves. Rather they are concerned with the general presentation, which is closely connected to watching television normally. In addition, they usually view the commercials in a "living room" setting along with two or three other individuals. At the same time that this natural setting is achieved, it is possible to show respondents commercials at 0 through 3 exposures in competition with the normal commercial fare that is offered on television. In fact, it is possible to add print ad versions of commercials and increase exposure up to 10. The studies that have been done with the technique have included competitive messages, multi-media effects, differences in cover story and degree of attention directed to messages, variation in distraction, and in the types of viewing groups.
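The skeleton below sketches the logic of such an after-only exposure design: respondents are randomly assigned to cells that differ only in the number of test exposures (0 through 3) embedded in a larger message stream, with all measurement coming after exposure. The filler pool, stream length, and identifiers are hypothetical, not the actual Stanford materials.

```python
# Hypothetical sketch of an after-only repetition design: random assignment
# to an exposure level, with the test commercial embedded among filler ads.
import random

FILLER_ADS = ["filler_A", "filler_B", "filler_C", "filler_D",
              "filler_E", "filler_F", "filler_G", "filler_H"]

def assign_cell(cells=(0, 1, 2, 3)):
    """After-only design: each respondent sees one exposure level; no pre-measure."""
    return random.choice(cells)

def build_stream(test_ad, n_test_exposures, stream_length=12, seed=None):
    """Embed the test ad n times among filler commercials, in random order."""
    rng = random.Random(seed)
    fillers = rng.choices(FILLER_ADS, k=stream_length - n_test_exposures)
    stream = [test_ad] * n_test_exposures + fillers
    rng.shuffle(stream)
    return stream

if __name__ == "__main__":
    for respondent_id in range(3):
        cell = assign_cell()
        stream = build_stream("test_commercial", cell, seed=respondent_id)
        print(f"respondent {respondent_id}: {cell} exposures -> {stream}")
```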

Review of Results with the Technique

By doing pretesting within the setting of the research system just described, it is possible to develop a more realistic assessment of the communication response that occurs in specific advertising situations. As the repetition project has developed, we have moved from a belief in the straight learning hierarchy-of-effects ideas, with common exponential functions of response, toward one in which there are a variety of hierarchy possibilities with unusual functional relationships. The course of that expanding realization runs through the studies that are outlined below. More detail on these studies is found in Ray (1972), Ray et al. (1973) and Ward and Ray (1974). The paragraphs below simply give a general indication of the development of our thinking.

Studies I through III: Repetition Pretest Technique Development. In the first three studies we were concerned with determining whether we could get repetition response functions in the laboratory setting that were realistic and consistent from study to study. The first study involved 18 different test advertisements: one ad for each of three brands in each of six product categories. Not only were there repetition effects but there were also interesting differences across product categories, ad types, life cycle stage of the product, and across specific ads. The second study was done with a subset of the advertisements from the first study, and the results were consistent with those that were developed in the first one. The third study concentrated on "refutational" versus "supportive" types of advertisements. There were pairs of ads for five different brands in five different product categories. In addition, there was a test of a color versus a black and white version of a campaign (four advertisements) for a grocery product. There were strong refutational versus supportive differences when considered by usage groups. Unlike the results in study one, which indicated that there was no consistent color versus black and white difference, this study found that the color campaign was more effective in generating gross awareness of the advertising, while the black and white campaign, if recalled, was recalled at a greater depth. The conclusion of these three studies was that we had a technique that worked quite well and gave us interesting and, on a face validity basis, reasonable results.

Field Study 1. This study was a replication, with variation, of the mail print advertisement field experiment that was done originally by H. Zielske in 1959. The study ran over a 13-week period, with respondents receiving mailings on a weekly basis. Embedded in the mailings were weekly, bi-weekly and monthly schedules of advertising for six different ads or ad campaigns. All of the advertising used in this print field study had previously been used in studies I through III. In addition to developing data supportive of Zielske's earlier study, Ed Strong, who did this study, was able to develop a scheduling simulator that could be used to evaluate various advertising schedules. His research is reported in an article in the November 1974 issue of the Journal of Marketing Research. More important for present purposes is the fact that his findings were also supportive of the results of studies I-III. One thing that was learned, however, is that strong response measures from the laboratory were likely to be good predictors of weak response measures in the field. For instance, in the color versus black and white test, the black and white ads that did well on depth response in the lab tended to out-perform the color ads in ad and brand awareness in the field. This kind of "translation" from the lab to the field has been necessary in all the validational work we have done.
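A minimal sketch of a Zielske-style scheduling simulation of the general kind Strong developed is given below: recall is assumed to jump with each mailing and to decay between mailings. The gain and decay parameters are invented for illustration and are not Strong's estimates.

```python
# Illustrative Zielske-style scheduling simulation (parameters are invented):
# recall rises with each mailing and decays in the weeks between mailings.

def simulate_recall(exposure_weeks, n_weeks=13, gain=0.30, decay=0.15):
    """Return weekly recall levels for a given mailing schedule."""
    recall, series = 0.0, []
    for week in range(1, n_weeks + 1):
        recall *= (1.0 - decay)                 # forgetting between weeks
        if week in exposure_weeks:
            recall += gain * (1.0 - recall)     # diminishing-returns gain from a mailing
        series.append(round(recall, 3))
    return series

weekly  = set(range(1, 14))      # a mailing every week for 13 weeks
monthly = {1, 5, 9, 13}          # roughly monthly mailings
print("weekly :", simulate_recall(weekly))
print("monthly:", simulate_recall(monthly))
```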

Model Application Study. Since one of the goals of the repetition project was to develop response function information for media models, it seemed reasonable that we should attempt to apply some of the laboratory findings to runs of the well-known media model MEDIAC. The data from study III on refutational versus supportive differences in purchase intention within various usage groups were applied in a number of ways to a series of runs of the model. The general finding (reported in detail in Part II of the December 1971 issue of Management Science) was that the model ran much more efficiently and produced better results when the more textured data from the laboratory pretests were used, as opposed to a single-function run of the model. This study gave some support for the attempt to develop more realistic data from advertising pretests than is usually obtained.

Studies IV through VI: Repetition with Variation. These three studies, which were conducted by Roger M. Heeler, were an extension of the use of the repetition laboratory technique in a new setting and for new purposes. These were the first studies that were not done in a mobile unit parked in a shopping center. Instead, Heeler did these studies in a central research facility. In addition, these were the first studies done with television advertising. In one project Heeler studied repetition with variation in message, in another repetition with variation in media, and in a third the effect of repetition on perceptual maps. Thus this set of studies pushed the ability of the general technique to its boundaries. Heeler found that the effect of television advertising was more dramatic than the effect of magazine advertising. He also learned that there was usually a need for some print in a "campaign" in order to achieve maximum effect. But the mixed media campaigns did not work in all advertising situations. Heeler was able to develop a general simulation of effects, given his data on mixed media. The effects of variation in message were not as dramatic. It is not always true that variation in message can extend the life of a campaign. By varying the amount of competitive advertising, Heeler was able to show that response was quite different depending on the measure. More competitive advertising actually helped the recall of the test advertising. But it had a negative effect on attitude and purchase intention measures.

Study VII: Methods of Continuous Repetition Response Measurement. A small-scale developmental study was done to determine the effects of asking for response during the exposure to repetitive advertising. In the standard design we normally use, measurement is done only after exposure to messages. There would be significant advantages if it were possible to get some measurement during exposure without affecting the exposure itself. Two methods were tried in a small pilot project done at the U.C. Berkeley management science laboratory. One of the measures was galvanic skin response. By attaching respondents to a GSR monitoring system, it was possible to see if resulting gross arousal changes had any pattern during the advertisement exposures. The other during-exposure measure was a teletype-activated scale by which respondents could indicate how interested they were in the material they were seeing. There were essentially four experimental treatments: GSR only, teletype only, both GSR and teletype, and normal after-exposure measurement only. The analysis of this study is not complete, but surprisingly enough, there does not seem to be a great effect caused by the during-exposure measurement. This promises to add dimensionality to future studies.

Study VIII: Repetition of Political Advertising. Michael Rothschild did a study in a shopping center storefront in which the key variables were the level of political contest and the political involvement of respondents. His was the first study in which we clearly observed two different types of hierarchy of response within the same experiment. For the presidential advertising, increased repetitions produced standard learning hierarchy results across the cognitive, affective and conative measures in the study. For the lower involvement state assembly race, there were effects on the cognitive measures and extremely strong effects on the attitudinal measure. This was similar to what would be predicted by Herbert Krugman's low involvement learning hypothesis. These results are reported in Ray et al. (1973) as well as in an article by Rothschild and Ray in the July 1974 issue of Communication Research. Rothschild's study also included products that were similar to the classifications in Studies I and II. Again, there was support for the findings in that initial study, as well as some indication of the comparative response to product advertising as opposed to political advertising. Presidential campaign advertising seems to operate in a more high involvement way than the product advertising, whereas the congressional and state assembly advertising seems to operate in a more low involvement manner than the product advertising.

Political Campaign Monitoring Studies. As part of his dissertation research, Michael Rothschild also collected extensive data on one political campaign for the Senate in a midwestern state and on a number of ballot propositions in California. Although these data were not in a form that allowed a precise test of the ideas that were supported in the laboratory experiment, there was some support as far as the analysis could go.

Media Vehicle Exposure Value Research Proposal. Professor Alvin J. Silk of MIT and I developed a proposal for a Marketing Science Institute project on determining the extra value or qualitative value of medical publications. The research proposed was quite similar to that used in the repetition area, and it gave further testimony to the flexibility of this particular copy testing technique.

Studies IX and X: Anti-drug Abuse Advertising Pretesting Procedure. Jerome B. Reed did two large-scale laboratory projects in which repetition was not a variable, but distraction, attention to commercials, competition, audience, and message type were variables. Unlike previous research in this series, the advertising was placed in the context of program material, and the cover story consisted of the "television violence-humor project." Parents, junior high school students and senior high school students were the three kinds of audiences for the study, and they came to central school and organization locations for the test interviewing. These two large-scale laboratory studies convinced us that the response to advertising is much more complex than would be indicated by even the three-orders model mentioned in Ray et al. (1973). By including cognitive response measures in various conditions of distraction and audience type, we were able to determine that some messages were recognized and affected later behavioral responses, while at the same time generating quite negative cognitive responses and little attitude change. Other messages created a generally positive response but nothing that seemed to be indicative of a behavioral one. These findings, which are presented in more detail in Ward and Ray (1974), indicated a planning procedure in which the three-orders hierarchy model is used as an initial planning step but copy testing with multiple measurement is essential to determine the nature of communication response in each specific advertising situation. This research argued very clearly for the textured sort of copy test and against the single measure, single exposure, unnatural setting type of copy test.
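The fragment below illustrates the kind of multiple-measure profile such a procedure yields when open-ended cognitive responses are coded (here, simply as supportive, counter, or neutral) and read alongside recall and intention rather than collapsed into a single score. The coding scheme and the example data are hypothetical, not the project's own categories or results.

```python
# Hypothetical multiple-measure profile: coded cognitive responses read
# alongside recall and purchase (or action) intention for each message.
from collections import Counter

def profile(coded_responses, recalled, intention):
    """Summarize one message's response pattern across several measures."""
    counts = Counter(code for _, code in coded_responses)
    return {"support": counts["support"],
            "counter": counts["counter"],
            "neutral": counts["neutral"],
            "recalled": recalled,
            "intention": intention}

# Invented example data: Message A is well recognized but provokes negative
# thoughts; Message B is liked but weak on the action-leaning measures.
message_a = [("makes a good point", "support"),
             ("too preachy", "counter"),
             ("scare tactics", "counter")]
message_b = [("pleasant", "support"),
             ("nice music", "neutral")]

print("Message A:", profile(message_a, recalled=True,  intention=0.6))
print("Message B:", profile(message_b, recalled=False, intention=0.2))
```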

Field Study II: Split Cable Experiment with Anti-Drug Abuse Messages. Two of the messages that were found to have quite opposite and interesting effects on the parents audience were run in a month-long split cable field experiment in the AdTel West split cable market. The two messages were run on alternate cables with a 400 GRP weight over a four-week period in May and early June 1973. The primary finding for present purposes was that the laboratory findings with regard to the two commercials (which were described in the previous paragraph) held up in the field experiment. The strong but irritating commercial did best on all field measures, with the exception of those measures having to do with advertisement liking. Another finding relevant to those doing anti-drug abuse advertising was that it was possible to have a marked effect on the belief atmosphere in a local community with a heavy saturation campaign.
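For readers unfamiliar with the weight figure, the short calculation below unpacks the arithmetic: gross rating points equal reach (in percent) times average frequency, so 400 GRPs over four weeks is 100 GRPs per week. The reach levels used are hypothetical, only to show the reach-frequency trade-off implied by a fixed weight.

```python
# Unpacking the 400 GRP weight: GRPs = reach (%) x average frequency.
# Reach levels below are hypothetical.
total_grps, weeks = 400, 4
weekly_grps = total_grps / weeks

for reach_pct in (25, 50, 80):
    avg_frequency = total_grps / reach_pct
    print(f"reach {reach_pct}% -> average frequency {avg_frequency:.1f} "
          f"over the period ({weekly_grps:.0f} GRPs per week)")
```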

Studies XI and XII: Television Clutter Research. Peter Webb, with support from the Marketing Science Institute, has done two large-scale studies in which the key independent variable was the degree and nature of television clutter, i.e., the amount of nonprogram interruptions (primarily advertising) and the way they are scheduled within a program. In this research he used the "Television Violence-Humor Project" and a new "Television Program Rating Evaluation Project" cover stories. This research involved an observational measure of respondent attention while viewing the commercials. Respondents were told that they should watch the programs as they normally would, and coffee and doughnuts were provided in the room to get them moving about as they normally would when they watch television. One preliminary finding was that on many of the measures, the gross amount of clutter was not as important as the position of each advertisement within the clutter stream and the scheduling of the nonprogram material during the program itself.

CURRENT PERSPECTIVES

Developing a pretesting technique is an engineering job. It requires movement back and forth over the stages of the research system outlined here. We believe we have a useful pretesting technique that can be used in a wide variety of situations. Our research thus far has indicated that the response to communication is quite complex. But it can be monitored with a technique such as the one proposed and used here. The data from such a pretesting technique should be usable in the development of media model applications. Our data have been promising thus far and offer hope for improved advertising planning in the future.

REFERENCES

Ray, Michael L. A proposal for validating measures and models in highly competitive decision situations. In Boris W. Becker and Helmut Becker (Eds.), Combined Proceedings. Chicago: American Marketing Association, 1972, 475-478.

Ray, Michael L., in collaboration with Sawyer, A. G., Rothschild, M. L., Heeler, R. M., Strong, E. C., & Reed, J. B. Marketing communication and the hierarchy-of-effects. In Peter Clarke (Ed.), New models for mass communications research. Beverly Hills: Sage Publications, 1973.

Ward, Scott & Ray, Michael L. Cognitive responses to mass communication: Results from laboratory studies and a field experiment. Paper presented at the meeting of the Association for Education in Journalism. San Diego, August, 1974. (Stanford GSB Research Paper No. 232, September, 1974.)
