Incorporating Tests of Data and Model Validity in Commercial and Academic Research

Edgar A. Pessemier

Citation:
Edgar A. Pessemier (1975), "Incorporating Tests of Data and Model Validity in Commercial and Academic Research", in NA - Advances in Consumer Research Volume 02, eds. Mary Jane Schlinger, Ann Arbor, MI: Association for Consumer Research, Pages: 705-712.

Advances in Consumer Research Volume 2, 1975      Pages 705-712



[Edgar A. Pessemier is Professor of Industrial Administration, Krannert Graduate School of Industrial Administration, Purdue University.]

The important types of validity are reviewed as they apply to research in consumer and market behavior. Subsequently, two examples are presented, one from commercial research and the other from academic research. The commercial case emphasizes problem definition and measurement selection. The academic case focuses on the problem of developing a complex model where the validity of both the model and the measurements may be questioned. These two contexts illustrate the need for early consideration of validity issues, for incorporating validity tests in the research effort, and for reporting the validity of the measures and models along with other research results.

In the applied and academic realms, data collection and model selection are purposeful activities. Variables are designed to measure the properties the analyst wants to measure, and a model is selected to expose the relations of interest. The validity of the data and model must be judged together with reference to explicit criteria. An incorrect choice of variables and measurement methods, or the selection of an inappropriate model, frequently yields faulty predictions.


Reliability is concerned with the consistency and accuracy of the measurement. Unreliable measures tend to mask significant relations and increase the quantity of data required to achieve useful results. The importance of reliability has been widely recognized and a large body of techniques has been developed to assist the research worker [Lansing and Morgan, 1971; Pessemier, 1971]. Although the subject of reliability is not central to the present discussion, a lack of reliability may indicate the absence of validity. For example, rushing subject responses or asking questions that are beyond the capacity of subjects to answer can result in low reliability and validity.

Validity deals with the meaning of measurement. The development of a consistent, precise measure does not guarantee its usefulness in any specific research context. If measurements do not measure what they are intended to measure, then the association of predictor variables with the dependent variables will be weak or misleading. But how can the validity of a measure be examined? In consumer and market research, most emphasis is on predicting the future or present level of criteria from separately measured predictors. Predictive validity is tested by examining the capacity of data and a model to forecast future states of a dependent variable: will this month's consumer optimism index or sum score predict next month's automobile purchases? The score's construct validity may be demonstrated by reference to attitude theory. This theory indicates that the measured level of an attitude (consumer optimism) should be associated with subsequent behavior (purchase of an automobile). In addition, the score may have an appropriate degree of content validity if its components effectively represent the domain of consumer optimism. Given satisfactory evidence of content and construct validity, the predictive capacity of the optimism score specifies the measure's validity for a specific purpose, predicting the purchase of an automobile.

Concurrent validity measures the association between a predictor(s) and dependent measure(s) obtained at the same time: will the current occupation of the head of a household predict his (her) consumer optimism score? Here the case for construct validity is weak since it is harder to find strong theoretical grounds for believing present occupation will predict the current level of a consumer's optimism score. Also note that the two consumer optimism illustrations point up the need to examine model validity. The predictor score was obtained by simply summing several measures of consumer optimism. Even if an analyst boldly assumes all the essential variables were included, the use of equal weights may be challenged. A factor analysis of a larger set of candidate variables could alter both the variables employed and the method used to compute a score. Finally, note that no direct attention has been given to the question of how each basic variable was defined.
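The dependence of predictive validity on the scoring model can be made concrete with a small sketch. Everything in it is an assumption introduced for illustration: the six respondents, the three optimism items, the purchase outcomes and the alternative weights are invented, and the helper names `pearson` and `weighted_score` are hypothetical. It only shows that the same items, weighted differently, yield different scores whose association with a later criterion can then be compared.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def weighted_score(item_rows, weights):
    """Combine each respondent's item ratings into one score."""
    return [sum(w * v for w, v in zip(weights, row)) for row in item_rows]


# Hypothetical ratings: one row per respondent, three optimism items (1-5).
items = [
    [4, 5, 2],
    [2, 1, 4],
    [5, 4, 3],
    [1, 2, 5],
    [3, 3, 1],
    [4, 4, 4],
]
# Hypothetical criterion observed one month later (1 = bought an automobile).
bought_next_month = [1, 0, 1, 0, 0, 1]

# Equal weights reproduce the simple sum score.
equal = weighted_score(items, [1, 1, 1])
# An alternative scoring model that drops the third item (assumed weights).
unequal = weighted_score(items, [1, 1, 0])

# Predictive validity of each score, judged against the later criterion.
r_equal = pearson(equal, bought_next_month)
r_unequal = pearson(unequal, bought_next_month)
```

With real data the weights would more plausibly come from a factor analysis or a fitted model than from an analyst's guess, and the comparison should be made on respondents not used to choose the weights.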

Although the measurement of complex constructs raises the most serious validity questions, widely used variables such as income and occupation may also be troublesome [Lansing and Morgan, 1971]. Income can be defined in many ways, each more or less relevant to various criteria, and occupation classes are largely arbitrary unless the purpose for which they are defined has been specified. Even the purchase of an automobile raises measurement issues. An automobile may be purchased new or used, with or without a trade-in, for cash or credit. Should these elements be considered, and how should gifts and long-term rentals be handled?


In many types of scholarly work, the selection of measurements and models is made by the researcher. In commercial research, the analyst is often denied this luxury. If data come from a syndicated service, the variables included are frequently limited by the needs of many users, the desire for comparability through time and the cost of data collection. In addition, the commercial analyst may have no capacity to influence the sample population from which data were collected or to verify the extent to which sample and response bias influence the validity of the data for the roles they must play in his models. Furthermore, he may experience difficulty in comparing findings based on separate data bases. A company analyst studying the purchase of major appliances could use syndicated Trendex data, data from a proprietary consumer panel and consumer survey results published by the University of Michigan's Survey Research Center. Sample populations, time periods, methods of data collection, the coding of variables and applicable methods of data reduction may all inhibit his capacity to make valid comparisons. In this illustration, however, sample sizes tend to be large and well defined, and many variables are sufficiently standard to permit a variety of cross data-base comparisons.

The commercial researcher's most important problem is not simply employing valid data and models to discover significant relations. His task is more difficult. He must find valid relations that are managerially important. That means the relations must be strong and the dependent variables must have useful economic meaning and suggest effective action. Clearly, the most important criteria to marketing decision-makers concern the purchase of a particular type of product or service, or of the brands within a product class. Small differences in purchase rates or brand shares among various classes of buyers may have large managerial significance. Since these differences can occur for many reasons, it is important that the predictor variables be valid and strongly associated with these criteria. It is equally important that tests of significance be purposefully selected. As Bass, Tigert and Lonsdale (1968) have observed, testing the worth of segmentation variables by referring to measures of explained variance may greatly understate their managerial value. The manager's first concern is group differences, and the power to predict individual behavior may be of little or no interest.
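The contrast between explained variance and group differences can be illustrated with a short calculation. The two segments and their purchase counts below are invented for the purpose, and `r_squared_from_groups` is a hypothetical helper; the sketch simply computes the share of individual-level variance in a buy/no-buy outcome that segment membership accounts for, alongside the segment purchase rates.

```python
def r_squared_from_groups(groups):
    """Share of individual-level variance in a 0/1 outcome that is
    accounted for by group membership (between-group SS / total SS)."""
    all_y = [y for ys in groups.values() for y in ys]
    grand_mean = sum(all_y) / len(all_y)
    total_ss = sum((y - grand_mean) ** 2 for y in all_y)
    between_ss = sum(
        len(ys) * (sum(ys) / len(ys) - grand_mean) ** 2
        for ys in groups.values()
    )
    return between_ss / total_ss


# Hypothetical segments of 100 households each; 1 = purchased.
segments = {
    "heavy_buyer_segment": [1] * 30 + [0] * 70,  # 30% purchase rate
    "light_buyer_segment": [1] * 15 + [0] * 85,  # 15% purchase rate
}

purchase_rates = {name: sum(ys) / len(ys) for name, ys in segments.items()}
rate_ratio = (
    purchase_rates["heavy_buyer_segment"]
    / purchase_rates["light_buyer_segment"]
)
explained = r_squared_from_groups(segments)

print(f"purchase rates: {purchase_rates}")
print(f"rate ratio: {rate_ratio:.2f}, explained variance: {explained:.3f}")
```

In this constructed case one segment buys at twice the rate of the other, yet segment membership explains only about three percent of the variance in individual purchases, which is the kind of gap between managerial value and explained variance that the passage describes.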

Continuing to develop the appliance illustration, it is easy to think of a variety of strategic moves that could be associated with demographic predictors [Pratt, 1972]. They can help product planners find market segments that are attractive targets for an innovative product or that are becoming stable replacement markets for mature products. In both cases, product life-cycle theory tells the planner that product design, pricing, and other considerations hinge on identifying the changing demographic characteristics of first-time and replacement purchasers [Smallwood, 1973]. Adoption and diffusion theory also suggests that buyer profiles change in predictable ways during the product's life cycle [Rogers and Shoemaker, 1971]. For these reasons, a planner or analyst may attempt to measure the demographic properties of buyers and owners at various stages in the adoption cycle. A classification analysis of this type of data identified the following differences associated with income:



The association between income and ownership is extremely significant and managerially important, but the expected higher proportion of high income to low income buyers/owners during the early phase of a product's life cycle cannot be clearly established from the data at hand. Product Type A is owned by 24% of the subjects and has a higher percentage of owners among high income groups than the other two products. On the other hand, Product B (33% owners) has a lower percentage of owners among the high income group than does Product C (54% owners). Also, more of Product Type B's buyers come from the low income group than do Product A's or Product C's buyers. Possible explanations abound. Is the sample biased? Has the presence of a substantial number of non-respondents to the income question rendered that variable invalid? Are variables properly coded? Is the theory strongly conditional on product type, or is the theory faulty?

The analyst could examine his data from a variety of points of view, eliminating several possible sources of the observed "difficulty" by examining alternate studies to test for sample bias, comparing the non-respondents to the income question with respondents and repeating the analysis with a recoded income variable. In addition, the theory can be examined with the aid of other variables such as age and education. If the latter effort also produced "unexpected" results across the three product types, then attention should turn to ancillary measures and the statement of the research question.

An explanation of the observed results is simple to find and perhaps too obvious to have generated the initial concern. However, information is required that is not contained in the original data and cannot be directly measured. There is strong reason to believe that Product Type B is much closer to reaching the limits of its potential penetration than either Product Type A or C. In other words, the relevant populations are not common across the three product types. When percentage of ownership data were converted to percentage penetration, and the columns of the tables were rearranged, the theory was strongly supported across the predictor variables examined by the analyst.
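The conversion from ownership to penetration can be sketched as follows, under one plausible reading in which penetration is ownership expressed as a share of each product's potential market. The ownership figures are those quoted in the text; the ceiling values are purely hypothetical, chosen only so that Product B sits closest to its limit as the passage states, and the function name `penetration` is likewise an assumption.

```python
# Ownership shares quoted in the text for the three product types.
ownership = {"A": 0.24, "B": 0.33, "C": 0.54}

# HYPOTHETICAL potential-market ceilings (share of all subjects who could
# ever own the product). These values are invented for illustration; the
# source gives no figures, only that B is nearest its limit.
assumed_ceiling = {"A": 0.80, "B": 0.40, "C": 0.95}


def penetration(owned_share, ceiling_share):
    """Ownership as a fraction of the assumed potential market."""
    if not 0 < ceiling_share <= 1:
        raise ValueError("ceiling must lie in (0, 1]")
    if owned_share > ceiling_share:
        raise ValueError("ownership cannot exceed the assumed ceiling")
    return owned_share / ceiling_share


pen = {p: penetration(ownership[p], assumed_ceiling[p]) for p in ownership}

# Ordering the products by each measure shows how the columns can change.
by_ownership = sorted(ownership, key=ownership.get)
by_penetration = sorted(pen, key=pen.get)

print("ordered by ownership:  ", by_ownership)
print("ordered by penetration:", by_penetration)
```

Under these assumed ceilings Product B moves from the middle column to the most mature position, which is the sort of rearrangement the text reports; different ceilings would of course give a different ordering.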

The above illustration raised a common set of validity issues facing the commercial researcher. He must use the data at hand to investigate managerially important questions. A variety of methods can and should be used to establish the validity of his basic variables. The appropriateness of his analytical models should be examined in the light of the properties of these measurements. Finally, he must consider carefully the purpose of the analysis and satisfy himself that the research question has been correctly framed.


A somewhat more difficult problem arises when the researcher is concerned with the validity of a model of consumer or market behavior. Often these models include high degrees of abstraction or make assumptions about processes that are not observable. For example, the models of Carroll (1972) and Kruskal (1964) convert consumer perceptual judgments into spatial representations of objects and subjects. The spatial proximity of choice objects has meaning in terms of products' characteristics and consumers' power to discriminate among the competing alternatives. In addition, consumers' ideal locations can be placed in the same space, allowing the analyst to appraise how product characteristics are related to product preferences. Typically such joint-space analyses have been based on product spaces obtained by decomposing similarity judgments. Recently, Pessemier [1973; Pessemier and Jones, 1974] has adopted the discriminant model to provide an alternative method of directly deriving a single consumer's product space from judged attribute levels for each choice object. This model is directly competitive with traditional multidimensional scaling procedures.

At least two validity questions surround the single-subject discriminant model. Data are required about the possible correlation of attribute levels within choice objects. For example, if a consumer learns the safety of an automobile has been increased, in what direction and to what degree would this influence his judged level of another attribute, say cost? Data of this type can be collected in a variety of ways. The validity of each method is hard to measure because the observations are individual judgments. Of course, the same problem has troubled researchers using similarity judgments.

The validity question in such cases can be examined in several ways. First, objects with known properties can be used and the congruence of judgments about them can be examined in relation to the objective measures. Second, judges' self-reported experience in making the judgments can be appraised. Third, the stability and reliability of the judgments can be examined over time and across judges who have demonstrated a capacity to respond effectively to well understood perceptual tasks. Finally, the data can be employed in the discriminant model to develop perceptual product spaces, and the performance of the spatial representations can be compared to product maps developed by standard scaling methods that use similarity judgments.
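One simple way to compare two product maps of the same objects, offered here only as an illustration and not as the procedure used in the studies cited, is to correlate their interpoint distances. That index is unaffected by rotation, reflection, translation or uniform rescaling of either map. The coordinates, object names and function names below are all hypothetical.

```python
from itertools import combinations
from math import dist, sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def interpoint_distances(config):
    """Euclidean distances between all object pairs, in a fixed order."""
    names = sorted(config)
    return [dist(config[a], config[b]) for a, b in combinations(names, 2)]


def congruence(config_1, config_2):
    """Correlation of interpoint distances across two configurations
    that describe the same set of named objects."""
    if set(config_1) != set(config_2):
        raise ValueError("configurations must cover the same objects")
    return pearson(interpoint_distances(config_1),
                   interpoint_distances(config_2))


# Hypothetical map of four textbooks from similarity judgments.
scaling_map = {
    "book_1": (0.0, 0.0),
    "book_2": (1.0, 0.0),
    "book_3": (0.0, 2.0),
    "book_4": (3.0, 1.0),
}
# The same map rotated 90 degrees and doubled in scale: structure unchanged.
rotated_map = {name: (-2 * y, 2 * x) for name, (x, y) in scaling_map.items()}
# A map in which two books have traded places: structure altered.
swapped_map = dict(scaling_map,
                   book_3=scaling_map["book_4"],
                   book_4=scaling_map["book_3"])

same_structure = congruence(scaling_map, rotated_map)
changed_structure = congruence(scaling_map, swapped_map)
```

An index near one indicates that the two procedures place the objects in essentially the same relative positions; a full Procrustes fit would be a more thorough alternative when the orientation of the axes also matters.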

The flow diagram from a textbook study [Jones and Pessemier, 1974] which used the above approach will clarify the nature of the validity tests:



In the top phase of the analysis, two distinct classes of measurements were used, each in conjunction with an appropriate model and computer programs. Next, the congruence of these separately generated spatial configurations of textbooks was examined analytically. The reliability of the basic input data was examined directly, but the validity of the similarity judgments on one hand and the attribute level and correlational judgments on the other hand could not be directly studied. However, a comparative examination was made of the attribute data collected by different instruments. The configurations developed from these alternative data sources were also examined.

At the next level of analysis, the predictive qualities of the configurations were tested in joint space. Prior to this test, however, the reliability and validity of several preference measures were studied to ensure the integrity of the affective components of the joint-space model. With this assurance in hand, it was possible to use both types of configurations to predict the cognitive and affective placement of a "new" textbook not used to develop the original product space. This form of predictive testing is especially helpful in establishing the validity of the research procedures.

Finally, the object and attribute projections in the discriminant space were compared with separately fitted attributes in the space derived from similarity judgments. This step helped establish the cognitive compatibility of the two procedures and indicated how well the directly judged attributes spanned the space generated from similarity judgments.

Under certain limiting conditions, the single-subject discriminant model can be shown to produce the same space as a scaling model. This mathematical property is of little real comfort since the required conditions cannot be easily met. Even if they were, the validity of a configuration in the context of the joint-space model in which it is most frequently used cannot be directly established. Therefore, the models have to be judged in relation to the levels of content and predictive validity that can be demonstrated in specific research tasks. In this regard, four research efforts have been launched to study the single-subject discriminant model's comparative validity across four diverse types of choice objects: textbooks, beer, recreational programs and nuclear power facilities [Jones and Pessemier, 1974; Moore, 1974; Jones, 1974; and Braden, 1974]. The findings produced by the above procedures and similar comparisons made in the other studies cited offer considerable support for the new methodology. As the superiority of one or another of the models is demonstrated by repeated applications, increased reliance may be placed in the model and the particular type of data which it employs. Successful experience with a growing range of research contexts should increase the analyst's confidence.


The characteristics and importance of validity have been outlined as they appear in several applied and theoretical realms. The scholar has more freedom to specify the nature of his measurements and the structure of his models. On the other hand, an applied worker will typically work with larger samples and simpler, better understood measurements and analytical methods. Each type of investigator has an obligation to continuously monitor the validity of his research procedures. If analysts remain sensitive to the problem, numerous opportunities arise to test the validity of measurements and models. In doing so, it is imperative that the results be reported. A model for the scholarly community can be found in the work of the Michigan Institute for Social Research. Robinson and Shaver (1969) have documented the measurement and validity properties of a wide range of instruments in the field of social psychological attitudes. Lansing and Morgan present a useful general discussion of the appropriateness of the variables and data reduction methods employed in ISR economic surveys. Finally, there is a growing scholarly literature. Under the sponsorship of the National Bureau of Economic Research, the Annals of Economic and Social Measurement regularly publishes works on measurement of interest to the consumer and marketing research community.

In the non-academic research community, the Census Bureau (1970) and other government agencies have shown increasing concern with validity. Considerable attention is given to improving data collection and reporting the methods employed. Also, firms producing a wide range of commercial research data are finding their buyers have become more sophisticated, calling for relatively complete disclosure about the quality of data [Pessemier, 1971; Pessemier and Bruno, 1972]. The work of various professional associations has contributed to these developments. Clearly, the academic and applied research communities should continue the healthy trend towards improving validity standards in data collection and model development.


Bass, F. M., Tigert, D. J., & Lonsdale, R. T. Market segmentation: Group versus individual behavior. Journal of Marketing Research, 1968, 5, 264-270.

Braden, P. A study of attitudes toward nuclear power generating facilities. Unpublished study in progress, Bureau of Business Research, University of Michigan, August 1974.

Carroll, J. D. Individual differences and multidimensional scaling. In R. N. Shepard, A. K. Romney and S. B. Nerlove (Eds.), Multidimensional scaling. Vol. 1. Theory. New York: Seminar Press, 1972, 105-157.

Jones, W. H. & Pessemier, E. A. Single subject discriminant configuration: An examination of reliability and validity and joint-space implications. Institute Paper No. 451, Krannert Graduate School of Industrial Administration, Purdue University, March 1974.

Jones, W. H. A multidimensional joint-space approach to the design and selection of outdoor recreation facilities. Unpublished doctoral dissertation in progress, Purdue University, 1974.

Kruskal, J. B. Multidimensional scaling by optimizing goodness-of-fit to a nonmetric hypothesis. Psychometrika, 1964, 29, 1-27.

Lansing, J. B. & Morgan, J. N. Economic survey methods. Institute for Social Research, University of Michigan, May 1971.

Moore, W. L. A comparison of product spaces generated by multidimensional scaling and by single subject discriminant analysis. Institute Paper No. 477, Krannert Graduate School of Industrial Administration, Purdue University, August 1974.

Pessemier, E. A. Data quality in marketing information systems. Institute Paper No. 297, Krannert Graduate School of Industrial Administration, Purdue University, February 1971.

Pessemier, E. A. Single subject discriminant configurations. Institute Paper No. 406, Krannert Graduate School of Industrial Administration, Purdue University, April 1973.

Pessemier, E. A. & Bruno, A. V. An empirical investigation of the validity of selected attitude and activity measures. In M. Venkatesan (Ed.), Proceedings 3rd annual conference. Iowa City, Iowa: Association for Consumer Research, 1972.

Pessemier, E. A. & Jones, W. H. Joint-space analysis of the structure of affect using single-subject discriminant configurations, Part 1. Institute Paper No. 435 (Rev.), Krannert Graduate School of Industrial Administration, Purdue University, February 1974.

Pratt, R. W. Marketing applications of behavioral economics. In B. Strumpel, J. N. Morgan and E. Zahn (Eds.), Human behavior in economic affairs. San Francisco: Jossey-Bass, 1972.

Robinson, J. P. & Shaver, P. R. Measures of social psychological attitudes. Institute for Social Research, University of Michigan, August 1969.

Rogers, E. M. & Shoemaker, F. F. Communication of innovations. New York: The Free Press, 1971.

Smallwood, J. E. The product life cycle: A key to strategic marketing planning. MSU Business Topics, Winter, 1973, 29-35.

U. S. Bureau of Census, Census bureau methodological research, August, 1970.