Allen D. Shocker and Gerald Zaltman (1977), "Validity Importance in Consumer Research: Some Pragmatic Issues," in Advances in Consumer Research, Volume 4, ed. William D. Perreault, Jr., Atlanta, GA: Association for Consumer Research, 405-408.

Advances in Consumer Research Volume 4, 1977   Pages 405-408

VALIDITY IMPORTANCE IN CONSUMER RESEARCH: SOME PRAGMATIC ISSUES

Allen D. Shocker, University of Pittsburgh

Gerald Zaltman, University of Pittsburgh

ABSTRACT

This paper poses several reasons for the general lack of explicit concern with validation in research. Theoretical considerations for validation generally are difficult, if not impossible, to meet. Consequently, pragmatic means of coping develop, and these often include no explicit validation effort at all. The paper suggests that validation is sufficiently important that attempts at explicit measurement ought to be more widely encouraged even if they are not highly sophisticated. By a greater willingness to accept validation efforts dictated by the pragmatics of the situation under which the research is conducted, an incremental gain may result in the quality of research reported. The paper concludes with some particular examples of research which serve as prototypes for the type of emphasis advocated.

THE REALITY OF VALIDITY

In the marketing literature on consumer research, one still finds relatively little explicit concern with issues of validity, i.e., with the validation of measures or the broader concerns of internal and external validity attending the research itself (Heeler and Ray, 1972). In fact, there appears to be only one extensive treatment of this issue in the literature (Zaltman, Pinson, and Angelmar, 1973). This situation is changing, as a review of recent literature seems to indicate. Hopefully, this modest trend will continue and perhaps gather more momentum. Issues of validation are very central to the development of theory and to the progression of research from mere ad hoc responses to specific inquiries toward the cohesive bodies of knowledge characteristic of disciplines. As consumer researchers gain more confidence in differentiating their concerns from those of more traditional behavioral science, the fundamental criteria of science and scientific endeavor will increasingly have to be met if their work is to take its place among such sciences.

But talking about the inevitability of change says nothing about how quickly such change is brought about. The purposes of this paper are to pose some thoughts on the reasons for the present lack of concern with issues of validation in consumer research and to offer concrete examples of what might be done to overcome such problems. In so doing we recognize that we are caught in the logical trap of asserting some things whose own validity is open to debate. But as there is no way around the trap given the constraints under which this paper is written, we offer our thoughts openly.

Much interest in consumer behavior comes about because of its relevance to marketing management. Marketing is at once a field of research and a field of practice. The practice aspect strongly supports applied research to provide answers to specific questions and problems, and such research very often tends to be ad hoc and pragmatic. Researchers may naturally find their training and values strongly affected by such considerations. Explicit validation efforts generally add costs and time delays to research, and thus where such pragmatic criteria are important, consideration of the tradeoffs may argue against validation efforts. If there is a reason to be sanguine about the future of validation components in consumer research, it stems from the influx of discipline-oriented researchers, as opposed to policy-oriented researchers, whose values stress the scientific importance of validity measurement. The growing acceptance of their contributions should create an environment in which competitive pressures force those who still straddle discipline and practice to meet new criteria for acceptance of their work.

Before continuing, we would like to indicate the types of validity issues we feel consumer behavior researchers should be concerned with. A much more extensive treatment can be found elsewhere (cf. Zaltman, Pinson and Angelmar, 1973). First is the issue of observational validity, or the degree to which a phenomenon is readily observed. One might readily question, for example, whether concepts elicited through factor analysis are really observed. This, of course, is closely related to the very important issue of construct validity, or the extent to which an operationalization measures the concept it purports to measure. A most serious problem in research generally is that the same label and definition (e.g., brand loyalty) are employed by different researchers who measure them in quite different ways. This is a problem to the extent that there is truth in the argument that different operationalizations necessarily represent different concepts or phenomena. If this argument is true, then we must question the notion of convergent validity as a method. Convergent validity is the degree to which two very different ways of measuring a concept are correlated or otherwise produce the same result. Again, if this argument is true, we automatically have discriminant validity, since each different measure always produces a different concept. Another neglected type of validity is systemic validity, which is important for theory building: it is the degree to which hitherto unrelated concepts or ideas are brought together to form a new set of concepts. We mentioned the issue of how well an operationalization measures a concept. This is not quite the same thing as an operationalization representing the concept it is to represent, which is sometimes referred to as content validity. This brings us back to the argument of whether each operationalization represents a unique phenomenon. If the argument is true, then we may always have content validity: when a grounded approach to theory is taken, using such tools as factor analysis, the phenomenon we label "X" is by definition veridical (in representation and measurement) with its operationalization.
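The notion of convergent validity just defined can be made concrete with a small sketch. The measures, data, and helper function below are hypothetical illustrations, not drawn from any study; convergence is simply read as the correlation between two different operationalizations of the same construct.

```python
# Hedged sketch: convergent validity read as the correlation between
# two different operationalizations of one construct ("brand loyalty").
# The measures and data below are hypothetical illustrations.

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Operationalization 1: proportion of purchases given to the favorite brand.
behavioral = [0.9, 0.4, 0.7, 0.2, 0.8, 0.5, 0.6, 0.3]
# Operationalization 2: stated loyalty on a 1-7 scale, same respondents.
attitudinal = [6, 3, 5, 2, 7, 4, 5, 2]

r = pearson_r(behavioral, attitudinal)
print(f"convergent correlation: {r:.2f}")
```

A high correlation is taken as convergence; under the argument above, however, each operationalization could equally be said to define its own concept.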

The inherent importance of validation to research is almost beyond question. What we are discussing is a matter of degree, a matter of the explicitness of the validation effort. Face validity is a part of any research from its conception to the assessment of its implications. The issue is usually one of whether to go beyond such judgment to more formal measurement. We have already noted that the value system of the audience for whom the research is aimed may determine the degree of concern a researcher experiences with regard to validity issues. However, even when the researcher actively wants to pursue validity issues, there are unavoidable constraints.

In its purest form the validation of behavioral research is virtually impossible (Torgerson, 1958). The concept of validation implies the existence of a true measure or criterion against which the empirical result or measure can be compared. Rarely does such a criterion exist; if it did, it could be used directly and the whole issue of validity would be moot. Perhaps in some cases a true measure may exist but be too expensive or time consuming to implement, so that "valid" surrogates which do not suffer from these handicaps are desired. But we suspect such circumstances are rare. Consequently, it is with operational methods of validation that the researcher must be concerned. Operationally, validation becomes a method of uncertainty reduction rather than its elimination.

However, if the researcher seeks guidance from definitions of validation, he finds them merely specifying desiderata. To paraphrase the comments of some authors in research methodology:

A measure or measuring instrument is valid to the extent that differences in scores among objects reflect true differences of the objects on the characteristic (construct) which the instrument tries to measure (Selltiz, et al, 1959, p. 155).

A research study is internally valid to the extent that the specific treatment in which the researcher is interested has actually brought about the effect indicated by the measuring devices (Banks, 1965, p. 26).

Research results are externally valid to the extent they are generalizable to a population or setting of interest beyond the context in which these results were obtained (adapted from Banks, 1965, p. 26).

The fact that such definitions stress criteria the researcher should try to achieve, rather than ways of achieving them, is frustrating because he does not know what is an acceptable operational measure of the true criterion. How can a researcher tell whether a treatment has brought about a specific effect? How can one actually determine whether results are generalizable (Peter, 1976)? In reality "this question of external validity ...is never completely answerable..." (Campbell and Stanley, 1963, p. 5).

Further, at a conceptual level, internal and external validity "are frequently at odds in that features increasing one may jeopardize the other" (Campbell and Stanley, 1963, p. 5). This observation is illustrated by the laboratory vs. field experimentation dilemma where the experimental controls possible in the laboratory permit a more internally valid measure of some phenomenon but at the possible sacrifice of empirical relevance. Further, research involving a large percentage sample of some relevant population may have greater external validity (with respect to that population), but if there is contagion (elements of the population communicate with each other about the research), there may be lower internal validity.

A further difficulty arises from the differing concepts of validity that have been developed in an attempt to move toward operationalization. Despite attempts to clarify (definitionally) a whole hierarchy of decreasingly demanding criteria for establishing measure validity (Angelmar, Zaltman and Pinson, 1972), the real result may be confusion and frustration for researchers. In this hierarchy, for example, construct validity represents the highest form while face validity is relegated to the lowest. Because construct validation is the ultimate and, for reasons noted previously, largely unobtainable, it may prove frustrating to the perfectionist (though not frustrating enough to those who have claimed construct validation for significantly lesser accomplishments). Placing face validity at the opposite end of the hierarchy has deprecated its use. Few researchers will explicitly discuss their face validation procedures (although face validation remains implicit in the research and publication processes). Consequently, the degree of the researcher's concern with validation issues, if any, may not be disclosed to his audience. Finally, reliability, which is an inherently more operational concept, is often confounded with validity since validity is the higher status term (conceptually, if a measure is valid, it is also reliable; the converse is not necessarily true). Convergent validation, for example, more probably produces measures of reliability (consistency) than of validity. A measure is deemed valid if it is highly correlated with several additional, distinct measures purportedly measuring the same construct, on the logic that these measures would have low prior probability of being correlated unless there really were a common factor underlying each. The fallacy is that such a procedure cannot demonstrate that the common factor is the construct of interest.
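This fallacy can be illustrated with a simulation. All quantities below are invented for the purpose: two instruments share a common method factor (say, a response style such as yea-saying) and contain none of the target construct, yet they converge strongly.

```python
# Hedged sketch of the convergent-validation fallacy: two measures
# correlate because of a shared nuisance factor, not the construct of
# interest. Everything here is simulated; nothing comes from real data.
import random

random.seed(1)

n = 500
# A common "method" factor (e.g. yea-saying) shared by both instruments.
method = [random.gauss(0, 1) for _ in range(n)]
# Neither measure contains the target construct at all: each is just
# the method factor plus independent measurement error.
m1 = [f + random.gauss(0, 0.5) for f in method]
m2 = [f + random.gauss(0, 0.5) for f in method]

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    k = len(x)
    mx, my = sum(x) / k, sum(y) / k
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# The correlation is high, demonstrating consistency (reliability),
# yet neither instrument measures the construct of interest.
print(f"r = {pearson_r(m1, m2):.2f}")
```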

The pragmatic effect of validation effort, either of measures or of research results, then is to reduce uncertainty regarding the meaningfulness of the research. Typically, science has relied upon the legitimacy of certain methodologies and processes of research to provide this confidence. The researcher who employs an experimental design, uses probability sampling methods, uses widely accepted measures, pretests his research instruments, and conducts his analysis by sophisticated multivariate statistical methods will generally be less overtly concerned with the validity of what he is about. Yet it is true that such procedures do not guarantee the validity of research results (even in terms of the purposes for the study).

We would like to see greater explicit consideration of validation issues in research and research reports. One way of bringing this about is for the profession to be more insistent that these issues be discussed. Researchers are obviously constrained to do the feasible, and what is feasible must necessarily be constrained by the circumstances surrounding the conduct of the research. Reports of validation should always be expected, and their legitimacy should be judged in the context of the circumstances of the particular study. If researchers were encouraged to think more often about the validity of what they were about, and overtly to perform the most appropriate, if possibly crude, tests of validity feasible under the circumstances, it is our contention that less but more meaningful research would result. Such criteria would demand validation evidence regardless of the legitimacy of the methodologies used. They would sanction validation efforts appropriate to the researcher's situation so long as their appropriateness were justified. Explicit measures of face validation, for example, could more often be disclosed.

In the following section we would like to exemplify the kinds of work we are advocating. These examples drawn from the consumer research literature serve also to indicate the kinds of approaches to validation that are feasible in consumer research generally. Because Heeler and Ray (1972) have provided a number of examples of measure validation, our examples deal largely with the validation of research results.

EXAMPLES OF VALIDATION IN CONSUMER RESEARCH

1. Midgley (1976) developed a simple mathematical theory of the diffusion process for new products based upon flows in a network where potential adopters can become active adopters, active rejectors, or passives with respect to an innovation. The model consists of a series of differential equations describing the rate of change of these populations of adopters/rejectors with time. Midgley might have been content merely to posit his model. Instead, several validation efforts were undertaken. Consistency between theory and data was assessed by fitting the equations to time series data on market penetration (percentage of population adopting) for a half-dozen products. The magnitude of the squared error between the fitted model and the actual penetration was among the measures used to determine consistency. Various additional tests were discussed for assessing the statistical significance of the consistency between theory and data. Midgley argued, however, that assessment of the consistency between theory and research was a relatively weak test of the theory's value. Calibrating this model using part of the data and using the resulting model to predict the remaining data offered a test of predictive validity. Additionally, Midgley obtained further corroboration for his model by offering explanations for various turning points in terms of changes in the real world environment (e.g., problems with distribution). He offered face validity evidence by linking the model's behavior to management's perceptions of what actually happened and, in addition, examined the plausibility of the relative magnitude of parameter estimates. The Midgley article offers one of the finer examples of research where concern with issues of validity made the result so much more impressive.
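The calibrate-then-predict logic of Midgley's validation can be sketched in miniature. The code below is not Midgley's equation system: it substitutes a simple logistic penetration curve and hypothetical data, fitting on the early periods and judging predictive validity on the held-out later ones.

```python
# Hedged sketch of predictive validation for a diffusion model, NOT
# Midgley's actual differential-equation system: a logistic penetration
# curve stands in, and the data are hypothetical.
import math

def logistic(t, ceiling, rate, midpoint):
    """Cumulative penetration at time t under a logistic diffusion curve."""
    return ceiling / (1 + math.exp(-rate * (t - midpoint)))

# Hypothetical market-penetration series (fraction of population adopting).
observed = [0.02, 0.05, 0.11, 0.22, 0.38, 0.55, 0.68, 0.76, 0.80, 0.82]
train, holdout = observed[:6], observed[6:]

# Calibrate by crude grid search on the training periods only
# (ceiling fixed at an assumed 0.85 for simplicity).
best = None
for rate in [r / 100 for r in range(20, 200, 5)]:
    for mid in [m / 2 for m in range(2, 16)]:
        sse = sum((logistic(t, 0.85, rate, mid) - y) ** 2
                  for t, y in enumerate(train))
        if best is None or sse < best[0]:
            best = (sse, rate, mid)

_, rate, mid = best
# Predictive validity: squared error on the held-out later periods.
holdout_sse = sum((logistic(t + 6, 0.85, rate, mid) - y) ** 2
                  for t, y in enumerate(holdout))
print(f"fitted rate={rate:.2f}, midpoint={mid:.1f}, holdout SSE={holdout_sse:.4f}")
```

As in Midgley's procedure, a small squared error on the holdout periods is a stronger corroboration than goodness of fit on the data used for calibration.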

2. Arguing that the widely studied multiattribute attitude model associated with Fishbein and/or Rosenberg was, in addition to being a model of attitude measurement, a theory of attitude formation and change, Bettman, Capon and Lutz (1975) attempted to demonstrate that the model possessed explanatory power with regard to these processes. They associated such a demonstration with the establishment of "construct validity" for the model viewed as a theory. Specifically, the assumptions of component multiplication (the belief that an object possesses a certain attribute multiplied by a measurement of the judged goodness-badness of that attribute) and subsequent summation (over all attributes) were investigated.

The authors made use of a relatively new research paradigm based upon a factorial analysis of variance design. This paradigm, termed integration theory, had been applied to a wide range of psychological phenomena in which the integration of information was presumed to underlie the process under investigation. Basically, the method consists of presenting certain bits of stimulus information (information regarding the belief and evaluative components of two or more attributes of a brand). The information is varied in systematic fashion using a factorial design where each factor represents one of the theoretical constructs in the model being tested and levels of each factor correspond to different "amounts" of the constructs. By having each subject respond to the entire set of treatment combinations and then performing analysis of variance on the resultant data, the combination rules which the subjects seemingly employ can be identified.

This is accomplished by noting which patterns of factor interactions and main effects are significant. By comparing the subjects' apparent combination rules with those of the model being tested, the validity of the theory underlying the model can potentially be assessed. The results of their empirical research generally supported the validity of the Fishbein formulation.
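The diagnostic logic of this paradigm can be sketched as follows. The factor levels and rules below are hypothetical: an additive combination rule yields parallel curves across the factorial cells (no interaction), while a multiplying rule of the kind assumed by the Fishbein model yields a diverging "fan" (a bilinear interaction).

```python
# Hedged sketch of the integration-theory diagnostic: compare factorial
# cell means under additive vs. multiplicative combination rules.
# The factor levels are hypothetical "amounts" of the two constructs.
belief_levels = [1, 2, 3]   # manipulated belief strength
eval_levels = [1, 2, 3]     # manipulated attribute evaluation

def cell_means(rule):
    """Factorial table of responses implied by a combination rule."""
    return [[rule(b, e) for e in eval_levels] for b in belief_levels]

additive = cell_means(lambda b, e: b + e)
multiplicative = cell_means(lambda b, e: b * e)

def row_spreads(table):
    """Gap between top and bottom belief row at each evaluation level;
    constant gaps mean parallel curves (no interaction)."""
    return [table[-1][j] - table[0][j] for j in range(len(table[0]))]

print("additive spreads:      ", row_spreads(additive))        # parallel
print("multiplicative spreads:", row_spreads(multiplicative))  # fanning
```

In the actual paradigm this comparison is made statistically, by testing which interaction and main-effect patterns in the analysis of variance are significant.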

While the paper focused upon one type of validation (and introduced some relatively exotic methodology), the authors took care to review research of their own and others which dealt with other validation questions. Convergent and predictive validation efforts were particularly discussed. The paper was one in a continuing series of studies by various of the authors in which the usefulness and validity of the Fishbein model and its underlying assumptions was assessed.

3. Scott and Wright (1976) conducted a study which examined several issues surrounding the use of multiple regression for estimating the weights buyers give to different factors in evaluating purchase options. Evaluations were obtained from corporate purchasing agents and consulting engineers regarding the overall appeal of a specific type of product. Descriptions of several alternative electrical resistors were given to them abstractly in terms of profiles on significant attribute dimensions. After recording his judgments of the profiles, a respondent rated the relative importance he thought he had given each attribute (as well as other data regarding the judgmental task itself and the importance of decisions about resistors in his job). Individual level analyses were carried out with the overall evaluations of each resistor as the dependent variable and the profiles of the resistors as the independent variables. Multiple criteria were used to assess the validity of the derived regression coefficients as estimates of cognitive weights.

These criteria included two measures of predictive validity: (1) goodness of fit for the estimated equation within the sample of observations used for estimation and (2) cross-validity of the model (the model estimated from one sample of a respondent's evaluations should accurately predict a second set). Additionally, (3) it was desired that the results possess face validity -- the signs of the coefficients indicating attribute importance should agree with expert judgment and with the self-reported measures provided by respondents. Finally, (4) weights derived for different subpopulations (purchasing agents and engineers) where between group differences in priorities were suspected should discriminate those groups. The reliability of estimates between different subsamples of a respondent's evaluations was also investigated. A comprehensive literature review provided further face validity for the research effort.
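The cross-validity criterion (2) can be sketched with hypothetical data. When profiles are effect-coded +/-1 in an orthogonal design, each regression weight reduces to a simple mean of products; weights calibrated on one set of a respondent's evaluations are then used to predict a replicated second set.

```python
# Hedged sketch of split-sample cross-validation of regression-derived
# attribute weights. Profiles, attributes, and ratings are hypothetical.

# Four profiles on two attributes (say, price and reliability),
# effect-coded +/-1; each profile evaluated twice by one respondent.
profiles = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
ratings_1 = [2.0, 6.0, 3.0, 7.5]   # first set of evaluations (calibration)
ratings_2 = [2.5, 5.5, 3.5, 7.0]   # replicated second set (holdout)

def fit_weights(profiles, y):
    """OLS for an orthogonal +/-1 design: each weight is mean(y * x_j)."""
    n = len(y)
    intercept = sum(y) / n
    weights = [sum(yi * p[j] for yi, p in zip(y, profiles)) / n
               for j in range(len(profiles[0]))]
    return intercept, weights

def predict(intercept, weights, profile):
    return intercept + sum(w * x for w, x in zip(weights, profile))

b0, w = fit_weights(profiles, ratings_1)
# Cross-validity: weights calibrated on the first set should
# reproduce the held-out second set of evaluations.
sse = sum((predict(b0, w, p) - y) ** 2 for p, y in zip(profiles, ratings_2))
print(f"weights={w}, holdout SSE={sse:.2f}")
```

Signs and relative magnitudes of the recovered weights would then be checked against expert judgment and the respondent's self-reports, as in criterion (3).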

What is noteworthy about the research is not only the number of ways validity is measured but the willingness of the authors to discuss the meaning of instances when the validity of their work is not supported. This paper in our judgment provides an excellent prototype for a research study where the authors are continually thinking about the meaningfulness of their efforts.

CONCLUSIONS

In this paper we have essentially registered a plea -- for greater concern with the achievable validity and meaningfulness of research. We have tried to indicate that complete validation is not achievable but that "partial" validation is. If one looks at validation pragmatically, one can point to numerous examples of feasible effort. Researchers are reminded that validation of one's research is still held in high regard and that such efforts not only make the research more meaningful but reflect substantially upon the researchers themselves.

REFERENCES

Richard Angelmar, Gerald Zaltman, and Christian Pinson, "An Examination of Concept Validity," in M. Venkatesan (ed.), Proceedings: Third Annual Conference of the Association for Consumer Research, (1972), 586-593.

Seymore Banks, Experimentation in Marketing (New York: McGraw-Hill, 1965).

James R. Bettman, Noel Capon, and Richard J. Lutz, "Multiattribute Measurement Models and Multiattribute Attitude Theory: A Test of Construct Validity," Journal of Consumer Research, 1 (March, 1975), 1-15.

Donald T. Campbell and Julian C. Stanley, Experimental and Quasi-Experimental Designs for Research (Chicago: Rand McNally, 1963).

Roger M. Heeler and Michael L. Ray, "Measure Validation in Marketing," Journal of Marketing Research, 9 (November, 1972), 361-370.

David F. Midgley, "A Simple Mathematical Theory of Innovation Behavior," Journal of Consumer Research, 3 (June, 1976), 31-41.

J. Paul Peter, "Reliability, Generalizability and Consumer Behavior," Working Paper (St. Louis: Washington University, 1976).

Jerome E. Scott and Peter Wright, "Modeling an Organizational Buyer's Product Evaluation Strategy: Validity and Procedural Considerations," Journal of Marketing Research, 13 (August, 1976), 211-224.

Claire Selltiz, Marie Jahoda, Morton Deutsch, and Stuart W. Cook, Research Methods in Social Relations (New York: Holt, Rinehart and Winston, 1959).

Warren S. Torgerson, Theory and Methods of Scaling (New York: John Wiley and Sons, 1958).

Gerald Zaltman, Christian Pinson, and Richard Angelmar, Metatheory in Consumer Research (New York: Holt, Rinehart and Winston, 1973).
