# Meta-Analysis: Integrating Results From Consumer Research Studies

To cite:

Michael D. Reilly and Jerry N. Conover (1983), "Meta-Analysis: Integrating Results From Consumer Research Studies", in NA - Advances in Consumer Research Volume 10, eds. Richard P. Bagozzi and Alice M. Tybout, Ann Arbor, MI: Association for Consumer Research, Pages: 509-513.

Direct URL:

http://acrwebsite.org/volumes/6171/volumes/v10/NA-10

[Order of authorship is arbitrary; the authors' contributions were equal.]

Meta-analysis is a set of techniques for combining the results of independent investigations of the same topic to arrive at more molar assessments of the studied effects. The literature on meta-analysis is briefly summarized and a variety of techniques are examined. Estimation of overall effect size is proposed as the most versatile meta-analytic procedure. Techniques of meta-analysis are then applied to studies on the relationship between product familiarity and consumers' external search activity. Finally, conclusions from this analysis are drawn, and several caveats regarding the use of meta-analysis are noted.

INTRODUCTION

Consumer behavior, a relatively recent arrival on the social science scene, is just beginning to face a problem/opportunity that has confronted the more mature social sciences for years -- a diverse body of independent studies addressing the same basic topics. To the extent that a large number of studies focus on similar independent and dependent variables, literature reviews become complex beyond the ability of researchers to synthesize the results of the studies into a meaningful evaluation of the research results. The usual response to a large literature base is to narratively report the results of various studies, and then to attempt to synthesize the existing evidence. Conflicting results are typically explained in terms of differences in experimental design, construct operationalization, or subject population.

However, as the literature base grows, reviewers find it increasingly difficult to manage the cognitive tasks required by such subjective procedures. Social scientists have responded to the burgeoning literature by developing a variety of meta-analytic techniques. Meta-analysis is distinct from the more common statistical analysis of primary data in that the input is the results of a series of independent studies of the same research problem.

Meta-analysis is a valuable tool. Once the studies to be reviewed are systematically examined for overall contribution to statistical significance, the researcher is in a position to estimate two important quantities: the overall significance level from the pooled studies, and the magnitude of the effects attributable to the independent variables.

The foregoing is not meant to suggest that meta-analysis is totally absent from the marketing literature. Clarke (1976) estimated Koyck distributed lag effects to determine the duration of the advertising effects found in 70 econometric studies. More recently, Farley, Lehmann and Ryan (1981) used MANOVA to explore hypotheses about the beta weights for a number of behavioral intention model studies in the social psychology and marketing literatures. However, neither of these works estimated effect sizes for a variety of studies where the data were reported in diverse ways. In both cases the designs of the underlying studies were sufficiently similar that directly comparable statistics were reported or could easily be computed. Synthesis of results across more dissimilar studies requires a different procedure, effect size estimation (for an example, see Sudman and Bradburn's 1974 report on 935 survey response studies).

PRINCIPLES OF META-ANALYSIS

Before demonstrating meta-analysis with consumer behavior literature, a brief review of the various meta-analytic techniques is in order. These can generally be divided into two categories: techniques for estimating the overall significance of the studied effects, and techniques for estimating the relative magnitude of the studied effects.

Beginning with Mosteller and Bush (1954), the initial goal of the meta-analytic literature was to determine the degree to which a set of studies collectively supports the hypothesis that two groups have different means. Several methods have been proposed as appropriate for such inquiries. Rosenthal (1978) reviews nine methods for testing the overall significance level of a body of research; limitations, advantages and appropriate situations for each technique are discussed there in some detail.

A more recent development in this stream is Rosenthal's (1979) approach to determining the "file drawer number", which is the number of studies with null results necessary to swamp the overall significance levels of existing published studies. Rosenthal shows that, in most cases, a large number of null-result studies -- presumably buried in file drawers somewhere because of journal reviewers' biases against studies with nonsignificant results -- must exist in order to drive the overall significance level below acceptable standards.

To see how these procedures could be applied to the consumer behavior literature, the present paper applies meta-analysis to an important and widely-cited finding in consumer information processing research.

Illustration: The Familiarity-Search Relationship

A meta-analysis was conducted with literature relevant to the hypothesis that prior experience or familiarity with a product decreases a consumer's tendency to actively search for more information prior to purchase (Newman, 1977). Several studies relevant to this general proposition were found, of which seven reported results amenable to meta-analysis.

Since the accuracy of a meta-analysis depends on the degree to which the included studies actually represent the body of published and unpublished research on the topic, literature sources are critical. For the present illustration, the reviewed literature was gathered using the following sources: the Marketing Abstracts section of the Journal of Marketing, Newman's (1977) review article, dissertation abstracts, and the bibliographies of all located studies. In more extensive applications a search of computer library sources might be desirable. Additionally, it might be worthwhile to write to identified researchers in the area to attempt to locate unpublished manuscripts on the topic.

In several of the identified studies, more than one result relevant to the familiarity-search hypothesis was reported, leading to a total of 20 separate tests of the hypothesis. The key variables, significance tests, and probabilities associated with those tests are reported for each study in Table 1.

It should be noted that the studies employed various operational definitions of "familiarity" or "experience" and of "information search". Some were experimental studies manipulating preliminary information exposure prior to a decision, and observing how much information about the alternatives subjects subsequently requested before choosing (Biehal 1978; Swan 1969). Others were based on surveys that attempted to determine how much information seeking occurred prior to a recent purchase, with various definitions of actual previous product experience.

Since a relatively large number of hypothesis tests were available, the overall probability was assessed by the procedure known as testing the mean p. In this procedure, one simply averages the probability values for the reported significance tests of the various studies. Applied to the present 20 p-values, the estimate of the mean p is $\bar{p}$ = .206.

This estimate can be transformed into a corresponding standard normal deviate (z) according to the formula:

$$z_{o} = (.50 - \bar{p})\sqrt{12N}$$

where $z_{o}$ = the overall value,

$\bar{p}$ = the mean of the reported p-values, and

N = the number of p's averaged into $\bar{p}$.

In the present analysis, the resulting z = 4.55. By reference to a table of standard normal deviates, we find that such a z-value occurs with a probability of only p = .000001, leading to the conclusion that, across the set of studies, considerable support exists for the hypothesis that unfamiliar consumers seek more information than familiar consumers.
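The mean-p calculation can be checked with a short sketch. It assumes the usual form of the test, z = (.50 - mean p) * sqrt(12N), which follows from the fact that under the null hypothesis each p is uniform on (0, 1) with mean .50 and variance 1/12; this form reproduces the reported z = 4.55. The function name is illustrative only.

```python
import math
from statistics import NormalDist


def mean_p_z(p_bar: float, n: int) -> float:
    """Overall z for the 'testing the mean p' method.

    Assumed form: (0.50 - p_bar) * sqrt(12 * n), since the mean of n
    uniform(0, 1) p-values has mean 0.50 and variance 1 / (12 * n).
    """
    return (0.50 - p_bar) * math.sqrt(12 * n)


# Values reported in the text: mean p = .206 across N = 20 tests.
z_overall = mean_p_z(0.206, 20)

# One-tailed probability of a z this large under the null hypothesis.
p_one_tailed = 1.0 - NormalDist().cdf(z_overall)

print(round(z_overall, 2))  # 4.55
```

The one-tailed probability computed this way is on the order of a few in a million, in line with the very small p-value reported above.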

However, as Rosenthal (1979) and others (Bakan 1967; McNemar 1960; Smart 1964) have observed, there is a possibility that the studies surveyed are only those that achieved significant results due to sampling error. If this were true, then the studies are not representative of all studies that have been carried out. Accordingly, we can estimate how many unpublished studies with null results must exist in order to reduce the overall significance level to an unacceptable value. First, each of the 20 p-values is converted to a corresponding Z-value, using a table of standard normal deviates; the mean of these 20 Z-values, $\bar{Z}_{K}$, equals 1.554 in the present example.

Then, using Rosenthal's (1979) formula for the file-drawer number (X), with α set at .05:

$$X = \frac{20}{2.706}\left[20(1.554)^{2} - 2.706\right] = 336.97.$$

Thus, approximately 337 studies with nonsignificant results would have to exist in order to negate the consistently supportive findings of the studies reviewed in Table 1, reducing the overall significance level for the hypothesis to greater than p=.05. [This estimate may actually be a bit high, since the 20 tests come from only seven independent studies.] Such a state of affairs is very unlikely; hence, we can conclude with considerable confidence that familiar consumers seek less information than unfamiliar consumers.
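The file-drawer arithmetic can be reproduced as follows. The sketch takes the formula in the form X = (K / 2.706)[K(mean Z)^2 - 2.706], where K is the number of tests and 2.706 is the square of the one-tailed .05 critical value 1.645; the function and parameter names are illustrative.

```python
def file_drawer_number(k: int, mean_z: float, z_crit_sq: float = 2.706) -> float:
    """Number of null-result studies needed to raise the overall p above alpha.

    k          -- number of tests actually located
    mean_z     -- mean of their standard normal deviates
    z_crit_sq  -- squared critical z (1.645 ** 2 = 2.706 for one-tailed .05)
    """
    return (k / z_crit_sq) * (k * mean_z ** 2 - z_crit_sq)


# Values reported in the text: 20 tests with a mean Z of 1.554.
x = file_drawer_number(20, 1.554)
print(round(x, 2))  # 336.97
```

Because the 20 tests come from seven studies and are not fully independent, the result is better read as an upper-range indication than as a precise count, as the text already notes.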

Effect Sizes

As valuable as it is to know that there is statistical support for the hypothesis that less familiar consumers seek more information, such a finding is hardly unexpected given the large number of cases on which the conclusion is based. Because statistical power increases with the sample size, even trivial differences are likely to be significant. As a result, the most recent development in meta-analysis has led to a variety of methods for estimating the magnitude of the differences between the hypothesis groups. While there is some controversy about whether magnitude of differences should be estimated by the percent of shared variance or by the D statistic, which is the ratio of the mean difference to the standard deviation (see Cooper 1981; Sohn 1980; Glass, McGaw and Smith 1981), there is general agreement that "how much" is a more interesting question than "whether." Thus, the next step in the present meta-analysis is to estimate effect sizes associated with the studies of the familiarity-search relationship.

Effect size meta-analysis. Since the studies reported their findings in several different ways, various techniques were required to convert those results to a common metric of effect size. One of the most direct approaches is useful for studies comparing two groups (e.g., high- vs. low-familiarity) on degree of external search. The effect size is estimated simply by subtracting the mean search score of the familiar group from that of the unfamiliar group (thus, positive differences will be consistent with the hypothesis), and dividing by an estimate of the within-group standard deviation. In Swan's (1969) experimental study, these data were directly available for the comparison of "continuity" (i.e., previous information about same brands) and "discontinuity" (different brands over trials) conditions. In some other cases, however, group means or within-group standard deviation estimates were not available. Thus, to compare the early and late trials performance of Swan's "continuity" subjects, the reported significance levels of t-tests were converted back to their respective t-ratios, which were then multiplied by EQUATION to produce estimates of effect sizes. Similar conversions of t-ratios to effect sizes were performed for data from Katona and Mueller's (1955) survey of durable goods buyers, and for Biehal's (1978) experimental results (which, for each dependent variable, reported an F-ratio [the square root of which equals t] and group sizes) comparing subjects with preliminary information about alternatives to those who had no such prior information.
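The multiplier applied to the t-ratios is not preserved in the text above. Purely as an illustration, the sketch below assumes one commonly used conversion for two independent groups, D = t * sqrt(1/n1 + 1/n2), and treats a two-group F-ratio through t = sqrt(F). The function names and the input values are hypothetical and are not taken from the reviewed studies.

```python
import math


def d_from_means(mean_unfamiliar: float, mean_familiar: float, sd_within: float) -> float:
    """Direct estimate: mean difference over the within-group standard deviation."""
    return (mean_unfamiliar - mean_familiar) / sd_within


def d_from_t(t: float, n1: int, n2: int) -> float:
    """Assumed conversion of an independent-groups t-ratio to an effect size."""
    return t * math.sqrt(1.0 / n1 + 1.0 / n2)


def d_from_f(f: float, n1: int, n2: int) -> float:
    """Two-group F-ratio converted through t = sqrt(F).

    The square root discards the sign, so the direction of the difference
    has to be taken from the reported group means.
    """
    return d_from_t(math.sqrt(f), n1, n2)


# Hypothetical inputs, for illustration only.
print(round(d_from_means(6.0, 5.0, 2.5), 2))
print(round(d_from_t(2.0, 25, 25), 3))
print(round(d_from_f(4.0, 25, 25), 3))
```

Positive values follow the sign convention of the text: the unfamiliar group searching more than the familiar group.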

Some studies reported no direct comparisons between high- and low-familiarity groups, so the above approach was not applicable. Bennett and Mandell (1969), for example, surveyed new car buyers, and reported on the number of prior car purchases of those buyers, as well as the degree of pre-purchase information seeking. The results were tabulated in contingency tables of number of prior purchases by amount of search effort; chi-square statistics were reported for each table (with separate tables for three different operationalizations of the independent variable). As Glass et al. (1981) note, such chi-square summaries may be conveniently converted to equivalent product-moment correlation coefficients (equaling .172, .330, and .363 for the first three lines of Table 1, respectively). These correlation coefficients may then be converted to equivalent effect size estimates by selecting reasonable values for the percentile values along a familiarity continuum corresponding to "high" and "low" familiarity levels. In the present analysis, these were selected at approximately the 84th and 16th percentiles, respectively, indicating groups that were each a standard deviation above or below the familiarity mean. Given the above correlation ($r_{xy}$), the effect size is estimated by: $D = 2r_{xy} / (1 - r_{xy}^{2})^{1/2}$.
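The correlation-to-effect-size step can be sketched as below, taking the conversion to be D = 2r / (1 - r^2)^(1/2). The reasoning is that groups one standard deviation above and below the familiarity mean differ by 2r standard deviations on the search variable, and the within-group standard deviation is (1 - r^2)^(1/2). The function name is illustrative, and the printed values are computed here rather than copied from Table 2.

```python
import math


def d_from_r(r: float) -> float:
    """Effect size for groups at +1 and -1 SD on the predictor.

    The predicted difference is 2 * r (in SD units of the outcome), scaled by
    the residual (within-group) standard deviation sqrt(1 - r ** 2).
    """
    return 2.0 * r / math.sqrt(1.0 - r ** 2)


# Correlations reported in the text for the Bennett and Mandell (1969) tables.
for r in (0.172, 0.330, 0.363):
    print(r, round(d_from_r(r), 3))
```

Choosing cut points other than the 84th and 16th percentiles would change the factor of 2, so the resulting D values depend on that choice.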

Hempel also used a contingency table (1969, Table 5) for reporting the proportion of home buyers, with varying degrees of prior home-buying experience, who were in the "high search effort" category on a composite index. By subtracting these percentages from 100, the corresponding proportions of "low search" buyers were also obtained, and this search (two levels) -by- experience (three levels) table generated a chi-square value that was converted to an effect size as with the Bennett and Mandell data. A similar approach was employed with Bucklin's (1966) data from a survey of people who had shopped for non-food products valued over $5.00 [$r_{xy}$ = .149].

Claxton, Fry, and Portis (1974) reported in yet another form data from a survey of durable goods purchasers. Pooling together subjects in various sub-categories of "thorough search", and likewise pooling various "non-thorough search" groups, we have two different values for the proportion of buyers who had had prior usage experience with the same product that they had bought. An estimate of effect size was computed from these data by converting the difference between the two proportions to a z-score (using the standard formula); the z-score is, in essence, an approximation of effect size, since it compares the difference between the proportions to their standard error.
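The text does not say which "standard formula" was applied to the two proportions. The sketch below assumes the familiar pooled two-proportion z-test, and the counts are hypothetical, chosen only to show the computation; they are not the Claxton, Fry, and Portis data.

```python
import math


def two_proportion_z(x1: int, n1: int, x2: int, n2: int) -> float:
    """Pooled z for the difference between two independent proportions.

    x1 / n1 -- buyers with prior usage experience among non-thorough searchers
    x2 / n2 -- buyers with prior usage experience among thorough searchers
    A positive z means prior experience is more common among non-thorough searchers.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    return (p1 - p2) / se


# Hypothetical counts: 55 of 100 non-thorough vs. 40 of 100 thorough searchers.
z = two_proportion_z(55, 100, 40, 100)
print(round(z, 2))
```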

RESULTS AND INTERPRETATION

As shown in Table 2, the clear majority (17 of 20) of the effect size estimates were positive, indicating support for the hypothesis that product familiarity leads to reduced information search. Further, the mean effect size across the 20 available comparisons is D = +0.40.

This indicates that the less familiar groups are .40 standard deviation higher in external information search than the more familiar groups, on the average. To put this result in more perspective, assume that search activity is normally distributed in intensity. A z-value of .40 corresponds to a probability of .66. Thus, the average unfamiliar consumer searches more than do two-thirds of the familiar consumers. Graphically, this can be illustrated as follows:
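As a numerical check on the .66 figure, the standard normal cumulative probability at z = .40 can be computed directly. The sketch assumes, as the text does, normally distributed search intensity with equal spread in the two groups.

```python
from statistics import NormalDist

MEAN_EFFECT_SIZE = 0.40  # mean D reported in the text

# Share of the familiar group falling below the average unfamiliar consumer.
share_below = NormalDist().cdf(MEAN_EFFECT_SIZE)
print(round(share_below, 2))  # 0.66
```

Read this way, an effect size of .40 places the average unfamiliar consumer at roughly the 66th percentile of the familiar group's search distribution.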

Further Analysis

Once the overall effect size estimates are available, the meta-analysis can proceed to consider some further questions. For example, in the current literature review, we might be interested in the effect of different definitions of product familiarity on the apparent degree to which consumers are likely to engage in active search. To test the hypothesis that there is a difference in effect size attributable to the operational definition of product familiarity, the 20 reported effects were divided into two groups: those which defined familiarity in terms of past purchases of the product, and those which measured or manipulated familiarity in terms of other variables such as knowledge or exposure to information. (The Claxton et al. [1974] study was not included as it was impossible to separate purchase history from other bases for familiarity. Similarly, the first Katona and Mueller [1955] effect was not included since both types of familiarity were included in that effect.)

The mean effect size for studies defining familiarity in terms of purchase was +0.40 (standard deviation = 0.266), while the mean effect size for the non-purchase-based studies of familiarity was +0.55 (standard deviation = 0.506). The difference between the two groups of studies can be tested with a t-test:

which indicates that the difference between the mean effect sizes is not statistically significant at the .05 level. This lends support to the hypothesis that the familiarity-search relationship does not depend on whether familiarity is defined by past purchase or exposure to information.
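A comparison of this kind can be run from the summary statistics alone. In the sketch below the means and standard deviations are those reported above, but the number of effects in each group is not given in the text, so the two group sizes are hypothetical placeholders (chosen to sum to the 18 effects retained). The pooled-variance form is also an assumption; with standard deviations this unequal, a Welch-type test would be a reasonable alternative.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic with a pooled variance; returns (t, df)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / se, df


# Hypothetical group sizes: the text does not report how the 18 effects split.
N_PURCHASE, N_OTHER = 10, 8

t_stat, df = pooled_t(0.40, 0.266, N_PURCHASE, 0.55, 0.506, N_OTHER)

# Two-tailed .05 critical value of t for 16 degrees of freedom (table value).
T_CRIT_DF16 = 2.120
print(round(t_stat, 2), df, abs(t_stat) > T_CRIT_DF16)
```

Under these placeholder group sizes the statistic falls well short of the critical value, which agrees in direction with the nonsignificant result reported in the text.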

Similarly, it would be possible to test a number of other hypotheses about differences in the resultant effect size estimates. For example, one might investigate the effect of experimental versus survey studies; the influence of type of sample (e.g., student vs. non-student); published vs. non-published results; well-controlled vs. poorly-controlled studies, etc. All such hypotheses could be tested in a similar manner, using statistical techniques to investigate differences in effect size. Further, the effect size estimates could be used as dependent variables in regression analyses to study the impact of several of these factors simultaneously.

POTENTIAL PROBLEMS WITH META-ANALYSIS

Before drawing the conclusion that meta-analysis is an ideal solution to the problem of integrating results from independent studies, it should first be noted that the approach has some inherent limitations which need to be addressed.

One potential problem that meta-analytic reviews share with more traditional review approaches is bias in the selection of studies to be reviewed. To the extent that a given review does not sample representatively from the relevant literature, it cannot be expected to produce an accurate picture of the conclusions to be drawn from that literature. If the reviewer arbitrarily omits from consideration studies published before a certain date, or includes only those published in "prestigious" journals, there is a chance that meta-analytic results will differ from those that would be obtained if he or she included earlier studies or results from dissertations and unpublished papers (although Glass et al., 1981, showed that in several published meta-analyses there were minimal differences among effect sizes reported in journal articles, books, and unpublished papers). There is no substitute for beginning a meta-analysis with an exhaustive search for relevant studies, with attempts to explore less widely read sources as well as mainstream publications. Unfortunately, the reviewer is often faced with limited resources for obtaining unpublished or hard-to-find works, and such works may have to be omitted on occasion. But, if such resource limitation is the case, at least the reviewer can estimate the number of omitted studies by noting citations that the search identifies but that he or she cannot obtain.

Another source of frustration in conducting meta-analyses comes from trying to integrate the findings of studies that do not provide complete statistical data suitable for estimating effect sizes. In the literature search described in the present paper, several papers were uncovered which presented verbal conclusions about the relationship between familiarity or experience and information search, but which did not report data that could be used in estimating the size of the effect. With the recent recognition of the relative importance of reports of effect sizes, rather than (or in addition to) simple tests of significance (Cohen and Hyman 1979), the researcher should strive to include direct estimates of effect sizes in all reports of his or her studies. Such data would greatly aid the cause of statistically integrating findings from diverse studies, and would indeed help place more deserved emphasis not on "whether" a given variable has a postulated effect, but rather "how much" of an effect it has.

SUMMARY AND CONCLUSIONS

This paper has demonstrated the usefulness of the recently developed meta-analytic techniques by applying them to the hypothesis that increased familiarity leads to decreased external information search. Not only were the results of a number of studies examined simultaneously for statistical significance, but it was also demonstrated that a very large number of unpublished studies with null results would have to exist to offset the consistent evidence from published studies. Accordingly, it was concluded that a sampling error explanation for the published results is quite unlikely.

Once the existence of the relationship was established, inquiry turned to the question of the magnitude of the effect. The reviewed studies were individually decomposed into effect size estimates, each representing the difference between the high- and low-familiarity groups divided by the standard deviation. Thus, we were able to estimate the average difference between the information search tendencies of familiar and unfamiliar consumers. Further, we tested whether conclusions about the effect of familiarity depended on the way in which familiarity was operationalized.

Considerable insight into the underlying relationship was gained through the use of meta-analysis. In future consumer research literature reviews, the use of meta-analysis should prove equally useful. To the extent that the number of studies in a given area is large, the use of this more objective method of surveying the results and investigating moderating variables should result in greater understanding of basic propositions in consumer research.

REFERENCES

Bakan, D. (1967), On Method, San Francisco: Jossey-Bass.

Bennett, P.D. and Mandell, R.M. (1969), "Prepurchase Information Seeking Behavior of New Car Purchasers-The Learning Hypothesis," Journal of Marketing Research, 6, 430-33.

Biehal, G.J. (1978), "The Effects of Prior Information and Information Search Costs on External and Internal Information Search Behavior," Technical Report No. 63, Graduate School of Business, Stanford University, Stanford, California.

Bucklin, L.P. (1966), "Testing Propensities to Shop," Journal of Marketing, 30, 22-27.

Clarke, D.G. (1976), "Econometric Measurement of the Duration of Advertising Effect on Sales," Journal of Marketing Research, 13, 345-357.

Claxton, J.D., Fry, J.N., and Portis, B. (1974), "A Taxonomy of Prepurchase Information Gathering Patterns," Journal of Consumer Research, 1, 35-42.

Cohen, A. Alan, and Hyman, Joan S. (1979), "How Come So Many Hypotheses in Educational Research are Supported? (A Modest Proposal)," Educational Researcher, 8, 12-16.

Cooper, H.M. (1981), "On the Significance of Effects and the Effects of Significance," Journal of Personality and Social Psychology, 41 (Number 5), 1013-1018.

Farley, J.U., Lehmann, D.R., and Ryan, M.J. (1981), "Generalizing from Imperfect Replication," Journal of Business, 54, 597-610.

Glass, G.V., McGaw, B., and Smith, M.L. (1981), Meta-analysis in Social Research, Beverly Hills, CA: Sage Publications.

Hempel, D.J. (1969), "Search Behavior and Information Utilization in the Home Buying Process," Proceedings of the Educators Conference of the American Marketing Association, P.R. McDonald (ed.), Chicago: American Marketing Association.

Katona, G. and Mueller, E. (1955), "A Study of Purchase Decisions," in Consumer Behavior: The Dynamics of Consumer Reaction, L.H. Clark (ed.), New York: New York University Press.

McNemar, Q. (1960), "At Random: Sense and Nonsense," American Psychologist, 15, 295-300.

Mosteller, F.M. and Bush, R.R. (1954), "Selected Quantitative Techniques," in Handbook of Social Psychology: Vol. 1 Theory and Methods, G. Lindzey (ed.), Cambridge MA: Addison-Wesley.

Newman, J.W. (1977), "Consumer External Search: Amount and Determinants," in Consumer and Industrial Buying Behavior, A.G. Woodside, J.N. Sheth and P.D. Bennett (eds.), New York: North-Holland.

Rosenthal, R. (1978), "Combining the Results of Independent Studies," Psychological Bulletin, 85, 185-193.

Rosenthal, R. (1979), "The File Drawer Problem and Tolerance for Null Results," Psychological Bulletin, 86, 638-641.

Smart, R.G. (1964), "The Importance of Negative Results in Psychological Research," Canadian Psychologist, 5, 225-232.

Sohn, D. (1980), "Critique of Cooper's Meta-analytic Assessment of the Findings of Sex Differences in Conformity Behavior," Journal of Personality and Social Psychology, 39, 1215-1221.

Sudman, S. and Bradburn, N. (1974), Response Effects in Surveys: A Review and Synthesis, Chicago: Aldine Publishing Company.
