Measuring Advertising Effect By Perceptual Mapping: a Cautionary Tale

Roger M. Heeler, York University
[ to cite ]:
Roger M. Heeler (1974), "Measuring Advertising Effect By Perceptual Mapping: a Cautionary Tale", in NA - Advances in Consumer Research Volume 01, eds. Scott Ward and Peter Wright, Ann Arbor, MI: Association for Consumer Research, Pages: 192-200.

Advances in Consumer Research Volume 1, 1974    Pages 192-200

[This research was funded by a grant from the educational committee of the American Association of Advertising Agencies.]

Perceptual mapping of advertising effect change is a logical and attractive extension to the use of multidimensional scaling for the identification of static consumer perceptions and preferences. A variety of techniques have been used by recent authors to operationalize the concept. Vavra (1972) used factor analysis to plot individuals' positions in a product attribute space both before and after their exposure to a massive advertising campaign. By this means he was able to demonstrate the attitudinal shift caused by the advertising exposure and remeasurement procedure. Rao (1971 & 1972) used individual differences scaling (Carroll & Chang, 1967) to evaluate the differences in perceptions held by subjects in different experimental conditions. The individual differences scaling generates an attribute space which has common dimensions for all subjects, but differential weighting of these dimensions by individual subjects. The dimension weights for subjects in different experimental conditions were used in regression and analysis of variance evaluations of the effects of the experimental conditions. Heeler (1972, Appendix A) used nonmetric scaling of brand similarities data to compare subjects' perceptions of and preferences for breakfast cereals in two experimental conditions. One group of subjects was exposed to a massive advertising campaign for brand X, and a second group was exposed to a massive campaign for brand Y. MDSCAL (Kruskal, 1964) configurations obtained for the two conditions were compared to evaluate the differential effects of the advertising exposures.

Perceptual mapping of attitude change compares groups or individuals through time or between experimental conditions. Similar analysis techniques can be used for the comparison of subgroups of marketing interest, for example market segments. Assael (1971) compared factor analytic maps for beverage brands for different marketing regions. Green and Carmone (1971a) compared "expert" and "non expert" respondent groups' perceptions of advertisements using non-metric scaling configurations. Moinpour and MacLachlan (1973) monitored the shifts in students' perceptions of presidential candidates over a nine week period.

All the methods referred to above share what Shepard (1972) has described as the human engineering advantage of multidimensional scaling. Information presented in visual form is readily assimilable. This property of communication speed and impact carries with it the inherent disadvantage that it is all too easy to transmit misinformation through visual data. All elementary statistics students are familiarized with the misleading impressions that can be given by such devices as an unobvious origin on a graph or an inappropriate drawing scale for two or three dimensional block diagrams. Similar mistransmissions of information are possible with multidimensional scaling. It is always tempting to generate interpretations of structure from unstructured analysis (Humphreys et al., 1969; Einhorn, 1972). This temptation is especially strong when the information is in the form of an effect which is visually "obvious". The next section of this note provides an illustration of the misinformation capacity of multidimensional scaling.

AN ILLUSTRATION

Data on breakfast cereal perceptions were collected as supplementary information during a study of the effects of variety in advertising (Heeler, 1972). Three hundred housewife subjects were randomly assigned to two experimental conditions. The 150 subjects assigned to condition X were exposed to a campaign consisting of two 30 second television commercials for breakfast cereal brand X and four print advertisements for X, during a program of advertisements which contained no other advertising for breakfast cereals. The 150 subjects assigned to the Y condition were exposed to a similar campaign for brand Y.

After the advertising exposures, both groups completed a series of measures of their brand perceptions. Thus the study was a posttest-only control group design (Campbell & Stanley, 1966, p. 8), with the X condition a control for the Y and vice versa. This design has considerable internal experimental validity thus permitting valid comparison of the effects of condition X versus condition Y. This design was also used by Rao (1971 & 1972). Vavra (1972) used a pretest-posttest design (Campbell & Stanley, 1966, p. 7). This design has many inherent internal validity problems, such as accentuation of differences because the subjects are tested twice. Vavra deliberately used this design for its difference accentuating effect. This is an entirely proper use in a pilot study to demonstrate analysis technique, but would be less satisfactory for measuring true advertising effect.

The measure of prime relevance to this illustration was a series of cereal brand pair similarity rating questions. Each subject was asked to rate each of 14 pairs of breakfast cereal brand names for similarity, using an eleven point rating scale ranging from "totally different" to "identical". The 14 pairs were sampled from the 55 possible pairings of 11 brand names. Although each subject received only 14 pairs to rate, all 55 possible pairs were used by presenting different samples of 14 to different subjects in the two experimental groups. This method of obtaining similarities judgements, by dividing the brand pairs amongst subjects and using rating scales, was very economical on questionnaire time (less than five minutes in this study) and permitted the similarity items to be incorporated in a questionnaire which contained many other items. The method could only be used for aggregate mapping (Taylor, 1969) because too few of the possible brand pairings were completed by each respondent for individual mapping. This indirect (Neidell, 1969) similarities rating method of questioning was advantageous in itself: it avoided the forced salience problem of direct ratings (Greenberg, 1967). The different style of questioning had an additional advantage. It increased the utility of the scaling measures for cross-validating direct ratings. Convergent validation of two measures is more meaningful if the measures are obtained in contrasting ways (Heeler & Ray, 1972). Girard and Cliff (1973) and R.C. Sherman (1972) have found rating of brand pairs to be a superior method of similarities data collection.
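The division of the 55 brand pairs amongst subjects can be sketched as follows. This is a minimal illustration only: the text does not say how the samples of 14 were drawn, so the shuffling, the four questionnaire versions, the topping-up of the last version, and the names `make_pair_blocks` and `brand_1` ... `brand_10` are all assumptions made for the example.

```python
from itertools import combinations
import random

def make_pair_blocks(brands, block_size=14, seed=0):
    """Split all unordered brand pairs into questionnaire versions.

    Hypothetical scheme: shuffle the pairs, cut them into consecutive
    blocks, and top up the short final block from the first block so
    that every version carries exactly block_size pairs.
    """
    pairs = list(combinations(brands, 2))  # 11 brands give 55 pairs
    rng = random.Random(seed)
    rng.shuffle(pairs)
    blocks = [pairs[i:i + block_size] for i in range(0, len(pairs), block_size)]
    shortfall = block_size - len(blocks[-1])
    blocks[-1] = blocks[-1] + blocks[0][:shortfall]
    return pairs, blocks

# Ten placeholder brand names plus the hypothetical "ideal" brand.
brands = [f"brand_{i}" for i in range(1, 11)] + ["ideal"]
pairs, blocks = make_pair_blocks(brands)
print(len(pairs), [len(b) for b in blocks])
```

Under this scheme each subject sees roughly one quarter of the pairs, so only aggregate (not individual) maps can be built from the ratings.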

One of the 11 cereal brands was a hypothetical brand labelled "your ideal breakfast cereal". This hypothetical brand was used to introduce the concept of preference into the context of similarity ratings. Brands judged to be similar to the "ideal" were presumably preferred to brands judged to be less similar to the "ideal". This method of establishing preference within similarities is called the explicit ideal point technique. [Homayounfar (1970, p. 168), in a study of building brick perceptions, found the explicit ideal point to be similarly positioned to the implicit ideal point obtained by joint analysis of both similarities and preference data.]

Scalings of averaged similarity data can be misleading unless the subjects have similar product perceptions (Horan, 1969). Market segments were used in this study to increase the likelihood of subjects having similar market place perceptions. Since the study was principally concerned with nutritional cereals, the segmenting variable reported here is usage of nutritional cereals. The sample contained 208 users of nutritional cereals, with 101 in the X condition and 107 in the Y condition. Since each subject rated one quarter of the similarity pairs, each similarity pair in both experimental conditions was rated on average by 25 subjects in the user segment. The user segment only is described in the remainder of this note. [The results for the non-user segment were similar to the user segment in the placement of cereal brands, but the spurious results later described for the user segment did not occur.]

Average ratings were calculated for each brand pair within the user segment. The averaged similarities data were scaled using Kruskal's (1964) non-metric scaling program MDSCAL. Multiple starting configurations were tried for each scaling, so as to minimize the risk of local optima solutions (Spence, 1972). The solutions with the lowest obtained stress were used in the subsequent analysis. The MDSCAL program positions the stimulus objects (the 11 cereal brands) in a space (Euclidean in this study) of a dimensionality specified by the program user. The positioning is such as to maximize the agreement between the rank order of the interbrand distances in the space with the inverse rank order of the similarities ratings. Thus the program reproduces the similarities ratings in a more concise and viewable form.
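The goodness-of-fit criterion described above can be illustrated with a small sketch of Kruskal's stress (formula 1) for a fixed configuration. It is not the MDSCAL program itself: the function names are invented for the example, ties are broken arbitrarily, and the input is assumed to be dissimilarities (similarity ratings would first be reversed, for instance by subtracting each from the scale maximum).

```python
import math

def pava(values):
    """Pool-adjacent-violators: least-squares non-decreasing fit."""
    blocks = []  # each block holds [sum, count]
    for v in values:
        blocks.append([v, 1])
        # Merge backwards while the block means are out of order.
        while len(blocks) > 1 and (blocks[-2][0] / blocks[-2][1]
                                   > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = []
    for s, c in blocks:
        fitted.extend([s / c] * c)
    return fitted

def kruskal_stress(points, dissim):
    """Stress-1 of a configuration against rank order of dissimilarities.

    points: list of coordinate tuples, one per stimulus.
    dissim: dict mapping (i, j) index pairs to dissimilarities.
    """
    pairs = sorted(dissim, key=lambda ij: dissim[ij])  # ascending order
    dists = [math.dist(points[i], points[j]) for i, j in pairs]
    dhat = pava(dists)  # monotone regression of distances
    num = sum((d - h) ** 2 for d, h in zip(dists, dhat))
    den = sum(d ** 2 for d in dists)
    return math.sqrt(num / den)

# Three stimuli on a line; the first data set agrees in rank order
# with the inter-point distances, the second reverses it.
points = [(0.0,), (1.0,), (3.0,)]
consistent = {(0, 1): 1, (1, 2): 2, (0, 2): 3}
reversed_order = {(0, 1): 3, (1, 2): 2, (0, 2): 1}
print(kruskal_stress(points, consistent))
print(kruskal_stress(points, reversed_order))
```

A scaling program searches over configurations to minimize this quantity, which is why several starting configurations are needed to guard against local optima.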

Clusters of similarly perceived brands were located in the MDSCAL configurations by Johnson's cluster analysis (1967) following a proposal by Shepard (1972) that cluster rather than axes labelling may be most appropriate for many scalings. The clusters were named by reference to open ended questions on reasons for subjects liking their most liked brands and disliking their least liked brands.

The scalings were compared by using the program CONGRU by Olivier (1970). This program rotates and stretches pairs of non-metric scaling configurations into maximal agreement, and provides test statistics for the similarities of the configurations. CONGRU can be used to compare data points derived from different populations and is thus more general than canonical correlation.

Figure 1 shows the CONGRU obtained positions of cereal brands in two dimensions for the segment "nutritional cereal users". Brand position changes between the experimental conditions X and Y are indicated by the connecting lines. The larger stress of the two configurations compared was .167. By contrast, Klahr (1969) found that 95% of two dimensional MDSCAL configurations obtained from eleven variables using random data yielded stress in excess of .182. [Interpolated from Klahr's (1969) results for ten and twelve points.] Thus the obtained configurations appear to differ "significantly" from those obtained from random data.

There are obvious differences between perceptions in the X and Y conditions. For brand X, the effect of the brand X advertisements (X condition versus Y condition) was to dissociate X from the other "crunchie" cereals and move it closer to the ideal. For brand Y, the effect of the brand Y advertisements (Y condition versus X condition) was to dissociate Y from the other "unappetizing" cereal but not to move it anywhere nearer the ideal. Thus it appears that the advertisements for X successfully transmitted their message of the nutritional advantages of X relative to the other "crunchies", but that the Y advertisements transmitted only a differentiation message. The correlation (rsk, Olivier, 1970, p. 12) between the cereal brand positions in the X and Y conditions was .60. [$r_{sk} = \frac{1}{P}\,\mathrm{tr}\,(A' B B' A)^{1/2}$, where A and B are the matrices of P points in N dimensions that are to be compared. With A and B rotated into maximal congruence and regarded as two sets of NP numbers, rsk is the correlation between these two sets.] This level of correlation allows for both some shared positioning and some differences of the type indicated above.
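A congruence correlation of this kind can be sketched with NumPy. This is not the CONGRU program: the function name is invented, and the centering and norm scaling used here are assumptions standing in for whatever normalization Olivier (1970) applies. The sketch relies only on the fact that the trace of $(A'BB'A)^{1/2}$ equals the sum of the singular values of $A'B$.

```python
import numpy as np

def congruence_correlation(A, B):
    """Correlation between two configurations after optimal rotation.

    A, B: arrays of shape (P, N), the same P points in N dimensions.
    Each configuration is centered; the result is divided by the
    product of the Frobenius norms, so it lies between 0 and 1.
    """
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    A = A - A.mean(axis=0)
    B = B - B.mean(axis=0)
    # Singular values of A'B sum to tr((A'BB'A)^(1/2)).
    singular_values = np.linalg.svd(A.T @ B, compute_uv=False)
    return singular_values.sum() / (np.linalg.norm(A) * np.linalg.norm(B))

# Illustrative data only: 11 random points, and a rotated, stretched,
# shifted copy of them, which should be perfectly congruent.
rng = np.random.default_rng(0)
A = rng.normal(size=(11, 2))
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta), np.cos(theta)]])
B = 2.5 * A @ R + 3.0
print(congruence_correlation(A, B))
```

Because the statistic is maximized over rotations and reflections, two unrelated configurations of only eleven points can still yield a sizeable value, which is one reason such a coefficient should be read alongside checks on the input data.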

FIGURE 1

COMPARISON OF TWO DIMENSIONAL MDSCAL MAPS OF CEREAL BRAND PERCEPTIONS FOR USERS OF NUTRITIONAL CEREALS IN TWO EXPERIMENTAL CONDITIONS.

How does this interesting result appear in three or four dimensions? Quite simply, it does not. In three dimensions there are minimal differences between the brand positionings for the X and Y conditions. This is reflected in a correlation of rsk = .91 between the positionings for the two conditions. A similar null result obtains for four dimensions, with rsk = .87.

Which result is true, the interesting one or the null one? First note that the clusters did not differ between the X and Y conditions. Also, a subsidiary analysis showed little difference between the X and Y conditions. In this subsidiary analysis, similarities data were pooled over treatments for the unadvertised brands; X, Y, and "ideal" were represented separately for each treatment group; and a single map was produced. Both these alternative analysis formats in this instance avoided spurious results. A more reliable check can be obtained by examining the original data that were input to the scaling programs. No scaling contains more information than its original data.

The sets of averages for the X and Y conditions were first compared using t tests for the significance of the differences between the means for each of the 55 pairs of similarity scores. [T tests were used (as in Girard & Cliff, 1973) since the data were averaged and the t test is robust. Some correlation of the individual tests is likely, which could decrease the number of "significant" pairs expected under the null hypothesis.] Only four of the 55 pairs were significant at the .1 significance level (two tail). If the means were truly equal (null hypothesis) then the chance expected number of pairs significant at the .1 level would be 5.5. Clearly there is no evidence that the sets of means differ significantly between the two experimental conditions.
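The chance expectation quoted above is simply 55 × .1 = 5.5. A short sketch of the arithmetic follows, together with the binomial probability of observing four or more "significant" pairs. The binomial model treats the 55 tests as independent, which is an assumption made only for illustration; as the footnote observes, some correlation between the tests is likely.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X distributed Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

n_pairs = 55    # similarity pairs tested
alpha = 0.10    # two-tail significance level used in the text
observed = 4    # pairs found significant

expected = n_pairs * alpha  # chance-expected count under the null
p_at_least_observed = 1 - binom_cdf(observed - 1, n_pairs, alpha)

print(f"expected under null: {expected:.1f}")
print(f"P(at least {observed} significant by chance): {p_at_least_observed:.3f}")
```

An observed count below the chance expectation, as here, gives no grounds for rejecting equality of the two sets of means.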

The rank order correlation (Spearman rho) between the averaged similarity ratings for the two conditions was calculated as an additional check. A value of .87 was obtained, indicating considerable and significant (p<.001) agreement between the ratings for the two experimental conditions.
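The rank order check can be sketched in a few lines. The function names and the small data vectors below are invented for illustration and are not the study's ratings; tied values receive average ranks, and rho is computed as the Pearson correlation of the ranks.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman_rho(x, y):
    """Spearman rank correlation between two sets of averaged ratings."""
    return pearson(average_ranks(x), average_ranks(y))

# Made-up averaged similarity ratings for five brand pairs.
condition_x = [2.1, 7.4, 5.0, 8.8, 3.3]
condition_y = [2.5, 6.9, 5.6, 9.1, 3.0]
print(spearman_rho(condition_x, condition_y))
```

In the study the two input vectors would each hold the 55 averaged pair ratings, one vector per experimental condition.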

It is clear that the interesting two dimensional result was spurious, and that three and four dimensional results correctly indicated a null result. This null finding was cross validated by the more traditional measures of brand rating, preference and purchase. These measures indicated non-significant differences between X and Y experimental conditions (Heeler, 1972, p. 200).

Would it have been possible to predict the spurious two dimensional result in advance? The mean stress values for the two, three and four dimensional configurations were .153, .077 and .038 respectively. The corresponding stress values for the Klahr (1969) 95% points were .182, .093, and .049. Thus all the dimensionalities were reasonable in relation to Klahr's values. Three or four dimensions might have been selected as appropriate dimensionalities owing to the reduction in stress achieved in going from two to three, and three to four dimensions. The use of three rather than two dimensions would also be supported by the dimensionality error level interpolation procedure of Wagenaar and Padmos (1971). On the other hand, low dimensionalities are often used in scaling analyses because of the laudable objectives of parsimony and ease of display. Further, both three and four dimensionalities have lower than desirable degrees of freedom ratios (Sherman, C.R., 1972). That is, the ratio of the number of independent items of input information to the number of independent items of output information is low. Hence a dimensionality of two might have been selected.
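The two considerations in this paragraph can be laid side by side in a short sketch. The stress figures are those quoted above. The degrees of freedom ratio is computed here with a deliberately simple count, the 55 input similarities divided by the 11 × k output coordinates; this counting rule is an assumption for illustration and may differ from the exact definition used by C.R. Sherman (1972).

```python
from math import comb

n_brands = 11
observed_stress = {2: 0.153, 3: 0.077, 4: 0.038}  # mean stress, from the text
klahr_95 = {2: 0.182, 3: 0.093, 4: 0.049}         # Klahr (1969) 95% points

n_inputs = comb(n_brands, 2)  # 55 distinct brand pairs

df_ratio = {}
below_random = {}
for k in (2, 3, 4):
    n_outputs = n_brands * k  # coordinates estimated (simple count)
    df_ratio[k] = n_inputs / n_outputs
    below_random[k] = observed_stress[k] < klahr_95[k]
    print(f"{k} dimensions: stress {observed_stress[k]:.3f} "
          f"vs random-data point {klahr_95[k]:.3f}, "
          f"input/output ratio {df_ratio[k]:.2f}")
```

Every dimensionality passes the random-data comparison, while the input to output ratio falls steadily as dimensions are added, which is the tension the paragraph describes.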

IMPLICATIONS

The results obtained above do not indicate that perceptual mapping of attitude change is a valueless technique. To the contrary, other authors have obtained significant, interesting results with such studies. Even the null result obtained in the present study agreed with the null results obtained with more traditional measures and thus indicated the potential value of scaling as a cross validating measure (Heeler & Ray, 1972). Such validation is facilitated when similarities data are collected in a reduced form as in the illustration above. But a need for caution is evident. Reliability and validity testing of static scalings have been conducted by several authors (e.g. Fenker & Brown, 1969; Fry & Claxton, 1971; Green, Maheshwari & Rao, 1969; Turner, 1971). There is even more need for such testing of dynamic scalings.

A six stage check on data used for comparison of scalings is suggested. First the input data should be compared. If there are no significant differences between the data sets to be compared by scalings, then any subsequent differences shown by scaling should be regarded with suspicion. Second, the scaling analyses should be replicated in several dimensionalities, as suggested by C.R. Sherman (1972). Interesting effects that are not consistently reproduced in varied dimensionalities should be regarded with suspicion. Third, alternative analysis methods should be compared. Fourth, the results obtained from scaling should be compared with the results from other measures. Results that show only in the scalings should be regarded with suspicion. Fifth, if the data base is of sufficient size, split-half analysis should be used to test for the stability of effects found within the current data base. Sixth, additional data bases should be used to test for the stability of the effects found across data collection occasions and relevant contexts (Fenker & Brown, 1969; Green & Carmone, 1971b; Stefflre, 1972).

It will not be possible to make these checks on each measurement occasion. The sixth stage of comparisons across measurement occasions will not be possible for exploratory or other one shot studies. But past practices of making comparisons in one dimensionality only, of making significance tests only with data output from configurations, and of neglecting significance tests on input data, may lead to invalid results.

REFERENCES

Assael, H. Perceptual Mapping to Reposition Brands. Journal of Advertising Research, 1971, 11, 39-42.

Campbell, D.T. & Stanley, J.C. Experimental and Quasi-Experimental Designs for Research. Chicago: Rand McNally, 1966.

Carroll, J.D. & Chang, J.J. A New Method for Dealing with Individual Differences in Multidimensional Scaling. Bell Telephone Laboratories, Murray Hill, N.J., 1967, mimeographed.

Einhorn, H.J. Alchemy in the Behavioral Sciences. Public Opinion Quarterly, 1972, 36, 367-378.

Fenker, R.M. & Brown, R.R. Pattern Perception, Conceptual Spaces and Dimensional Limitations on Information Processing. Multivariate Behavioral Research, 1969, 4, 257-71.

Fry, J.N. & Claxton, J.D. Semantic Differential and Nonmetric Multidimensional Scaling Descriptions of Brand Images. Journal of Marketing Research, 1971, 8, 238-240.

Girard, R. & Cliff, N. A Comparison of Methods for Judging the Similarity of Personality Inventory Items. Multivariate Behavioral Research, 1973, 8, 71-87.

Green, P.E. & Carmone, F.J. Advertising Perception and Evaluation: An Application of Multidimensional Scaling. The Thomson Medals and Awards for Advertising Research 1970, London: The Thomson Organization Limited, 1971(a).

Green, P.E. & Carmone, F.J. The Effect of Task on Intra-Individual Differences in Similarities Judgements. Multivariate Behavioral Research, 1971(b), 6, 433-50.

Green, P.E., Maheshwari, A. & Rao, V.R. Dimensional Interpretation and Configuration Invariance in Multidimensional Scaling: An Empirical Study. Multivariate Behavioral Research, 1969, 4, 159-180.

Green, P.E. & Rao, V.R. Multidimensional Scaling and Individual Differences. Journal of Marketing Research, 1971, 8, 71-77.

Greenberg, A. Is Communication Research Really Worthwhile. Journal of Marketing, 1967, 31, 48-50.

Heeler, R.M. The Effects of Mixed Media, Multiple Copy, Repetition, and Competition in Advertising: A Laboratory Investigation. Unpublished doctoral dissertation, Stanford, 1972.

Heeler, R.M. & Ray, M.L. Measure Validation in Marketing. Journal of Marketing Research, 1972, 9, 361-370.

Homayounfar, F. Evaluation of Market Potential for New Industrial Products: An Application of Multidimensional Scaling Techniques. Unpublished Ph.D. dissertation, Stanford University, 1970.

Horan, C.B. Multidimensional Scaling: Combining Observations when Individuals Have Different Perceptual Structures. Psychometrika, 1969, 34, 139-65.

Humphreys, L.G., Ilgen, D., McGrath, D. & Montanelli, R. Capitalization on Chance in Rotation of Factors. Educational and Psychological Measurement, 1969, 29, 259-71.

Johnson, S.C. Hierarchical Clustering Schemes. Psychometrika, 1967, 32, 241-54.

Klahr, D. A Monte Carlo Investigation of the Statistical Significance of Kruskal's Nonmetric Scaling Procedure. Psychometrika, 1969, 34, 319-30.

Kruskal, J.B. Multidimensional Scaling by Optimizing Goodness of Fit to a Nonmetric Hypothesis. Psychometrika, 1964, 29, 1-27.

Moinpour, R. & MacLachlan, D.L. Longitudinal Image Measurement with Multidimensional Scaling. Proceedings of the American Marketing Association, 1973, in press.

Neidell, L.A. The Use of Nonmetric Multidimensional Scaling in Marketing Analysis. Journal of Marketing, 1969, 33, 37-43.

Olivier, D. Metrics for the Comparison of Multidimensional Scalings. Unpublished mimeo, Harvard University, August 1970.

Rao, V.R. Salience of Price in the Perception of Product Quality: A Multidimensional Measurement Approach. Proceedings of the American Marketing Association, 1971, 33, 571-7.

Rao, V.R. Alternative Econometric Models for Sales-Advertising Relationships. Journal of Marketing Research, 1972, 9, 177-81.

Shepard, R.N. A Taxonomy of Some Principal Types of Data and of Multidimensional Methods for their Analysis. In R.N. Shepard, A.K. Romney, and S.B. Nerlove, Multidimensional Scaling, Vol. I, New York: Seminar Press, 1972.

Sherman, C.R. Nonmetric Multidimensional Scaling: A Monte Carlo Study of the Basic Parameters. Psychometrika, 1972, 37, 323-55.

Sherman, R.C. Individual Differences in Perceived Trait Relationships as a Function of Dimensional Salience. Multivariate Behavioral Research, 1972, 7, 109-29.

Spence, I. A Monte Carlo Evaluation of Three Nonmetric Multidimensional Scaling Algorithms. Psychometrika, 1972, 37, 461-86.

Stefflre, V.J. Some Applications of Multidimensional Scaling to Social Science Problems. In A.K. Romney, R.N. Shepard, and S.B. Nerlove, Multidimensional Scaling, Vol. II, New York: Seminar Press, 1972.

Taylor, J.R. Alternative Methods for Collecting Similarities Data. Proceedings of the American Marketing Association, 1969, 30, 150-2.

Turner, R.E. Market Measures from Salesmen: A Multidimensional Scaling Approach. Journal of Marketing Research, 1971, 8, 165-72.

Vavra, T.G. Factor Analysis of Perceptual Change. Journal of Marketing Research, 1972, 9, 193-99.

Wagenaar, W.A. & Padmos, P. Quantitative Interpretation of Stress in Kruskal's Multidimensional Scaling Technique. British Journal of Mathematical and Statistical Psychology, 1971, 24, 101-110.
