Copy Research in the Coming Decade
Citation:
Joseph T. Plummer (1981), "Copy Research in the Coming Decade", in SV - Symbolic Consumer Behavior, eds. Elizabeth C. Hirschman and Morris B. Holbrook, New York, NY: Association for Consumer Research, Pages: 86-89.
My objective is to articulate a position paper for copy research in the coming decade. This requires that we take a brief look at what we have been doing in copy research and what we have learned in past decades. It should be clear after a little thought to anyone connected with our industry that what most of us have been doing is built around a single model of how advertising works. The "AIDA" model was first articulated by Strong in 1925 and it is essentially a "learning" model that suggests each consumer must go through stages of (1) awareness, (2) interest, (3) desire, and (4) action. This early hierarchical model was refined somewhat by the ANA in 1961, but the basic model has remained in force since 1925.
Copy research, for all practical purposes, has concentrated on the first step or stage of this simple learning model -- awareness or attention. While the theory behind this model was primarily about how an advertising campaign worked in practice, copy researchers have adopted this model for single tests of an advertisement because they have assumed that unless a viewer or reader first becomes aware of, or evidences some comprehension of, an advertisement, no further response or behavior can take place. In addition, we researchers have made a further leap, which states that awareness (and to a degree comprehension) can only be measured by a memory test of some kind. It goes by many names -- attention, which is a physiological response by an organism; intrusiveness, which is a concept built around the nature of a stimulus; and memorability or retrieval, which is a learning concept borrowed from educational psychology. The primary memory test most of us use is "day-after-recall" of some specific element of the test commercial. Let's look more closely at this specific test that has been in use for decades and is based on an advertising campaign model developed in 1925.
What does Day-After-Recall really measure?
The recall score measures how many people, when aided with a product or brand cue, can consciously remember and verbalize some specific element of the test commercial which ran the "day before" on the air. It does not measure attention, intrusiveness, or impact of the test ad. Attention, intrusiveness, and impact are response phenomena that occur at the time of actual viewing or reading. All delayed measures such as 24 hour recall can only work on the assumption that these specific responses took place and that an appropriate verbal answer 24 hours or 72 hours later is a valid indicator that attention took place. Day after recall does not measure recognition, long-term memory, attitude effects, motivation, buying intentions, strategic soundness or campaign effectiveness.
Because the methodology of DAR is dependent upon verbalization and noun-based or product substantive stimuli, it is a measure most relevant for certain kinds of commercials. Take two commercials: one scored a 4% on DAR and the other a 58%. Therefore it should be very clear that one commercial is not very effective advertising.
We have learned over the years that there are different forms of memory. Memory is really a form of storage and retrieval. Our job as researchers is to find a way to "retrieve" what viewers have "stored" from the commercial they were exposed to. Day-after-recall is one way to retrieve one kind of memory which we have come to know as "left brain" or linear storage:
- Linear
- Verbal learning
- Rational cognitive thought
- Mathematics
- Noun-based facts
- Conscious
There is at least one other form of memory which we have come to know as "right brain" or pattern recognition:
- Pattern
- Visual images
- Music
- Emotions and feelings
- Experience
- Insight
- Pre-conscious or unconscious
This latter memory or stimulus storage is very rich and extremely difficult to verbalize during most kinds of retrieval tests. How often have we heard people say, "The sunset was beautiful beyond words," or "words fail me."
In order to accurately trigger each type of memory recall, the trigger or retrieval hook must match the stimuli and form of storage. Linear storage must be retrieved by a noun-based or verbal cue to elicit verbal playback. Pattern storage requires that we use part or all of the stimulus to obtain recognition or emotional response. The point I am trying to make here is that it makes no sense to test certain kinds of advertising by DAR. It is like trying to measure weight with a ruler or intelligence with a Rorschach test.
DAR is an appropriate measure for highly structured commercials with a strong product substantive claim. By a product substantive claim, we mean a message or claim about a product improvement, a new product benefit, something of issue value, a product demonstration, or some specific utility that is highly relevant to the audience. Advertising which is less appropriate for DAR testing is advertising designed to tap some psychological value, user imagery, or sensory benefits, or to reinforce positive usage experience and emotional feelings. Clearly this is not a dichotomy, but a range where there are many variants in advertising.
Beyond what we have learned about the validity or limited value of DAR within an old model, recent evidence on day-after-recall reliability tells us that we have a fairly blunt measure in the related recall score -- with swings up to 10 points. This means that we are dealing with a range rather than a precise scale. Our recent evidence shows that we have significant variation by program type. In addition, recent test-retest evidence indicates fairly wide swings in scores when retested. In a recent article in the Journal of Advertising Research, Cal Hodock presented evidence that target audience composition can also have a major impact on recall scores. Finally, since most of our decisions are made within product categories, it is instructive to look at this total experience for a brand since 1976.
There are 10 copy tests over this three year period with a good range from 8 to 21 and a normal distribution of scores. It seems quite clear that the commercial which scored an 8 and the commercial which scored a 21 are the most significantly different performers for this brand during this time period. There is just one catch -- they are the same commercial tested twice!
There are a number of reasons why DAR reliability has become rather loose -- much like a "rubber ruler." One is the changing media environment, with increasing clutter, shifting programs and formats, more "special" programming and fragmented audiences. Thus many of the environmental variables that were controlled in the past are less controllable or uncontrollable in the 1980's. There are research reasons caused by working women, unlisted phone numbers, people being out-of-home more, and lack of interviewer quality control. Finally, there is just good old sampling error and traditional analysis error. It all adds up to influence the reliability of DAR scores, unfortunately, in the wrong direction ... to make any given related recall score less reliable than we may have thought it was.
The other major test that we have been using a great deal in the past (although not to the extent of DAR) is built around the old "persuasion model" developed in the late 1940's and given prominence by the Hovland, Janis and Kelley experiments at Yale. The model is essentially a pre-to-post shift model. The basic notion is that we hold certain attitudes and beliefs toward something and then are presented with a persuasive argument or proposition designed to "change" our attitudes or behavior, which can be measured "post the message." This basic persuasion model found its way into copy testing via the theater laboratory developed by Horace Schwerin in the 1950's. A pre-measure toward the brand is taken, the test advertising is shown in some context, and then a post measure toward the brand is taken.
The major problem with this approach is that it assumes change is the relevant measure and that a single exposure of a very short length message can evoke measurable change. Recent evidence suggests that this model has some validity for new products, but established brands continue to create problems for this simple shift model. Here, as in DAR, target audience composition is a major variable which must be controlled. Establishing a true experimental design in order to obtain reliable and sensitive results is costly, and is therefore usually ignored.
In essence what we have done in the past is spend millions and millions of dollars testing commercials with two basic systems linked to two very old models of the advertising process. We have tended to use them in a most reflexive manner, ignoring new evidence and learning as we go. Our companies adopt a system and "by God we stick with it come hell or high water."
What about the coming decade? Is there any hope that we will attempt to learn, build on previous knowledge, and experiment? Do you think we will look any smarter to research colleagues in other professions or to the creative audience we try so hard to help? I believe we will do better in the 1980's if we adopt a copy research philosophy that better acknowledges our state of the art instead of sounding like a religious credo --
There are many reasons why we must adopt this philosophy for the coming decade above and beyond industry pressures. We have seen that the available copy testing tools in use are very blunt measures of advertising communications. Most research has some error, but most copy testing methods -- especially those on-the-air -- have considerable error, so that interpretation and action must deal with extremes and ranges. Effectiveness implies a precise scale and not "a rubber ruler."
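As a rough illustration of why a recall score is better read as a range than as a point on a precise scale, the sketch below computes the sampling error around a single score. It is only a sketch under stated assumptions: the sample size of 150 is hypothetical (no sample sizes are given here), respondents are treated as a simple random sample, and the usual normal approximation for a proportion is used. The function names are illustrative and not part of any copy testing system.

```python
import math


def recall_margin(score_pct: float, sample_size: int, z: float = 1.96) -> float:
    """Approximate margin of error, in percentage points, for a recall score.

    Assumes a simple random sample and the normal approximation to the
    binomial; z = 1.96 corresponds to roughly 95% confidence.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    if not 0.0 <= score_pct <= 100.0:
        raise ValueError("score_pct must be between 0 and 100")
    p = score_pct / 100.0
    standard_error = math.sqrt(p * (1.0 - p) / sample_size)
    return 100.0 * z * standard_error


def recall_range(score_pct: float, sample_size: int, z: float = 1.96):
    """Return (low, high) bounds for a recall score, clipped to 0..100."""
    margin = recall_margin(score_pct, sample_size, z)
    return max(0.0, score_pct - margin), min(100.0, score_pct + margin)


if __name__ == "__main__":
    # Hypothetical figures: a score of 20 from 150 respondents.
    low, high = recall_range(20.0, 150)
    print(f"Score 20 on n=150 reads as roughly {low:.1f} to {high:.1f}")
```

Under these assumed figures, a score of 20 carries a margin of roughly six points either way from sampling error alone. The program, audience and interviewing effects described earlier would widen that range further, and this calculation does not capture them.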
Secondly, the models that are broadly operating today were not developed for researching a single commercial or print ad. They were borrowed from other disciplines and adapted in some manner to copy testing. The traditional learning model (AIDA) was developed to explain how a campaign worked -- which is how advertising does work -- as a campaign over time together with the other marketing variables. But even this model as a campaign model has not held up under field experiments or other analytic procedures (e.g. Ray, Palda, etc.).
Perhaps one reason is that the hierarchy does not always occur in the precise, logical order that Strong suggested. There is a suggestion in the work by Festinger that the hierarchy may happen in reverse order. A number of field experiments conducted in the past decade provide some basis for this model -- especially with single purchase high involvement products such as cars. Festinger's cognitive dissonance theory suggests that attitude change follows favorable action or usage and then selective attention or perception causes the consumer to become more aware of the advertising and its content. This "reverse hierarchy" model still deals primarily with campaign effects and not a single advertisement.
There are some newer models of advertising communication which come closer to individual pieces of advertising communication and take into account how television advertising seems to operate in today's environment with today's consumer. The model which has received the most attention in recent years is Herb Krugman's low involvement model. This model is especially relevant because it is designed to explain the effect of TV advertising on low involvement products. Low involvement products are those which are frequently purchased, have relatively low prices, and have product differences that are difficult to discern or inconsequential. Krugman states that the effect of most TV advertising occurs without much conscious "learning" on the part of the consumer. In other words, many of the responses to low involvement product advertising occur on a subconscious level and often in a right brain pattern storage mode.
The major assumptions of Krugman's model are: (1) TV ads for low involvement products are viewed in a low involvement or passive state. This means that there is little perceptual defense or attempt by the viewer to challenge the message. (2) There will be little conscious learning of content or measurable shift in attitude or change in awareness content until the consumer is in a purchase situation. (3) Constant and repetitive exposure to a brand's advertising may alter the way the consumer perceives the brand -- both attributes and image -- over time. Most of this alteration occurs on a pre-conscious or non-verbal level.
This model seems to make a great deal of intuitive sense to most skilled advertising practitioners, who know that much of the success of advertising is non-verbal in nature. Symbols, audio sounds, music, pictures, emotions, and unique ideas -- those dimensions for which we have poor measures today -- are often the ingredients that separate "average", safe and dull advertising from the great campaigns. Given this low involvement model, the measures that exist today which seem most relevant to this theory are (1) recognition measures, (2) attitude and image scales, and (3) physiological measures such as brain waves.
There is another model or theory of advertising effects which is an outgrowth of the low involvement model and is built upon the work of Jon Gutman of USC and Tom Reynolds of the University of Texas. Their work in the area of cognitive structures suggests that the majority of successful advertisements build some kind of cognitive link between the advertised brand and a basic selling idea. This idea can be (1) an attribute such as all-temperature cleaning and Cheer or softness and Charmin, (2) a consumer benefit such as cleaner breath and Dentyne, or (3) some basic imagery or brand personality such as friendliness and United Airlines.
The methods which seem most appropriate for this cognitive linkage model are slogan identification measures (as the basic selling idea is often a key part of the slogan), association measures, perceptual mapping, structural analysis to determine if the key idea is integrated with the brand, and basic communication checks.
Clearly as we learn more about this whole area of cognitive linkage, we may find that we need more multidimensional measures or that measures of basic brand values give us the insights we need to succeed in the coming decade.
One final model or theory that I wish to discuss today as a contender for copy research in the 1980's draws its theoretic foundation from the work of Elihu Katz in the area of uses and gratification of the mass media and from the work of William Stephenson on the importance of "play" in mass communication.
This new model is built upon the notion of viewer reward in advertising communication. As audiences fragment, clutter increases, media alternatives expand, and skepticism increases about advertising from better educated consumers, this notion of viewer reward should take on more importance in our research. The viewer reward model postulates that successful advertising communication should provide the viewer with some small reward for watching the commercial each time and for buying the product. In our research to date, we have conceptualized and measured viewer reward as more than sheer liking of an ad. It is a multi-dimensional model of response both cognitive and affective in nature. It is primarily an immediate response by the viewer, although I'm convinced that there are lingering and cumulative effects.
We have developed four major dimensions of viewer reward from our research and thinking. These are (I) Entertainment, (II) Empathy, (III) Useful News and (IV) Viewer Respect. Each of the four major dimensions of viewer reward has several variations or subdimensions of response and can be represented by specific commercials. It is clear to me that most successful advertising has many dimensions of viewer reward, but to help clarify these dimensions, let me mention some appeals which reflect a specific viewer reward as their most dominant response by consumers.
Within the broad dimension of entertainment, which must dramatize some consumer benefit beyond sheer entertainment, there seem to be four successful variants. The first of these is humor: advertising which puts a smile on your face while delivering a motivating message. A second form of entertainment is when a commercial is able to combine music, symbols and storyline drama into an enthusiastic style which gives you a lift. A third kind of entertainment is when the advertising proposition is built around charming, likeable characters or spokespersons who communicate many good values. The fourth form of entertainment is when the sheer imagination of the copywriter, art director and viewer interact on a trip into fantasy. It includes the fantasy of Marlboro Country, the Valley of the Green Giant, and even daydreams.
The second broad area of viewer reward is empathy. Here there seem to be three dominant types of response -- (1) like real life as I know it, (2) pure human emotion, and (3) the representation of an ideal.
The third broad dimension of viewer reward is useful news. In some ways this dimension could be thought of as the true objective of all advertising. Yet when examined from the perspective of the viewer and his/her gratification, much of our advertising misses the mark. He/she may have understood our claim, but unless he/she internalizes that news and finds it useful and relevant, we have done nothing more than deliver "a sales pitch." There are two types of useful news in our experience that viewers perceive as self-enhancing product news relevant to their needs. The first and most obvious is the compelling advertisement for a new product which fulfills a need among target viewers. The second type of useful news in viewer reward terms is when viewers are presented with a new consumer benefit from an established product, such as the successful campaign for Arm & Hammer baking soda. That advertising gave millions of homemakers a new and relevant use for a product which had been in homes for decades.
The final major dimension of viewer reward is viewer respect. The first type of viewer respect is quite obvious. It is when a commercial -- through its tone, content and manner -- leaves the viewer with the feeling that the advertiser respects him/her as an intelligent, knowledgeable consumer. The second type of viewer respect is less obvious, but no less meaningful. There seems to be real consumer reward when the viewer feels that the commercial is clear, straightforward, and easy to follow. It is as if the viewer is saying "Thank you for not making me work so hard to understand your point(s)."
We should begin thinking more about advertising in the 80's from this viewer reward perspective even if the model proves ultimately to be invalid for copy testing, because the new consumer in increasing numbers and with increasing intensity is looking for elements of personal and advertiser altruism in advertising as one ethical reassurance that the advertiser and his product are honest, valuable and worthy of consumer support.
We should be trying to develop methods to reliably measure viewer reward. We are currently doing this at Y&R and find it to be most helpful in the creative process.
I believe advertising will diminish in its value and effectiveness if we continue to rely on single number tests and do not expand our models, theories and methods. Our track record in the past of greater and greater reliance on a single model and a single system like DAR has not advanced the ball at all. This does not mean that DAR should not be used for certain commercials where it is appropriate -- but even when it is used, we should use all the data provided -- message content, storyline playback, claimed recall and commercial audience -- not just a single number.
It seems to me that we should realize by now that there is no "holy grail" to copy research. There is no single model or numerical effectiveness scale which can predict the effectiveness of a single piece of copy. We should expand our tool kit beyond the hammer and put the emphasis on attempting to learn as much as possible up front during the creative process to both improve the advertising itself and our decisions to run it. Which techniques we use should depend upon our collective abilities to determine the major role of advertising for the brand and the specific objectives and issues at hand.
We should not only encourage the use of many different approaches, but we should encourage and support experimentation and efforts to validate the measures available to us.
Most importantly for the decade ahead we should concentrate more of our resources and energies toward campaign evaluation -- put the emphasis of our search for the holy grail or some effectiveness scale where it belongs ... on campaigns in the marketplace. Since it is very difficult to test a single commercial -- that is, to generate a reliable, valid single number on a small sample -- perhaps we can test a campaign. For it is campaigns and marketing programs that "work" in the marketplace over time to produce results, not a single commercial or print ad exposure.
It is very difficult to predict in advance via some test of a work of art, a symphony or a piece of theater its ultimate success in the marketplace. Heaven knows we have enough trouble testing and predicting from a single test the ultimate success of a product. Yet we can tell over time and from many different measures, including judgment, how successful a work of art, a symphony, a play or a new product is in the competitive marketplace. The same must be true of advertising.
Therefore we must shift our search for the test from single commercial tests to better ways to evaluate campaign effectiveness in the marketplace. There is emerging technology such as scanners, in-home two-way communication, consumer panel data, etc. which should aid us in our search.
This is not to shy away from our responsibility to research our advertising before final decisions are made. But we must recognize that our models are limited, our few tools blunt, and our resources expensive. The answer lies in working early-on with a wide variety of tools in a partnership with the creative people. We should use every ounce of consumer feedback we can muster to make the most informed and intelligent judgments possible. Ultimately we must take some risks, think harder about the questions and be innovative in our selection of the best tools from an expanding "kit bag."
----------------------------------------
Authors
Joseph T. Plummer, Young & Rubicam
Volume
SV - Symbolic Consumer Behavior | 1981