Advances in Consumer Research Volume 4, 1977 Pages 11-18
CONVERGENT AND DISCRIMINANT VALIDITY BY ANALYSIS OF COVARIANCE STRUCTURES: THE CASE OF THE AFFECTIVE, BEHAVIORAL, AND COGNITIVE COMPONENTS OF ATTITUDE
Richard P. Bagozzi, University of California, Berkeley
A general structural equation model is derived for determining convergent and discriminant validity. Using a maximum likelihood estimation procedure, the model provides for an overall $\chi^2$ goodness-of-fit test, and it explicitly partitions the variance due to construct, method, and error. The procedure is illustrated on data from Ostrom's (1969) study of the affective, behavioral, and cognitive components of attitude.
An integral part of the concept formation, theory construction, and testing phases of scientific research is the determination of construct validity, i.e., the degree of correspondence between theoretical variables and their operationalizations. With few exceptions (cf. Heeler and Ray, 1972; Bettman et al., 1975), consumer behavior researchers have tended to ignore construct validity and assume that the variables in their models are well formed, measured without error, and not affected by systematic factors such as methods variance. Not only may the neglect of construct validity lead to erroneous conclusions, but it may also mask important relationships in one's research.
The traditional method for determining construct validity has been to use the multitrait-multimethod matrix (MM) approach (Campbell and Fiske, 1959). In attitude theory research, for example, Ostrom (1969) and Kothandapani (1971) used the method to test the utility of representing attitudes through the tripartite model consisting of affective, behavioral, and cognitive components. Unfortunately, due to certain ambiguities and a lack of explicit objective measures for determining validity in the MM approach, only "minimal support" was found for construct validity. Consumer behavior researchers need a methodology for determining construct validity that rigorously represents the degree of fidelity between theoretical variables and their operationalizations.
In general, construct validity can be determined by analyzing the following six criteria for one's constructs (Bagozzi, 1976a):
1. Theoretical Meaningfulness of Concepts
2. Observational Meaningfulness of Concepts
3. Internal Consistency of Operationalizations (reliability)
4. Convergent Validity
5. Discriminant Validity
6. Nomological Validity
The first two are semantic criteria and refer to the internal consistency of the language used to represent a concept and the conceptual relationship between a theoretical variable and its operationalization(s), respectively. The third criterion is a strictly empirical one designed to determine the degree of internal consistency and single factoredness of one's operationalizations. Criteria 4 and 5 are the traditional objects of the MM approach. Finally, nomological validity is the degree to which predictions from a formal theoretical network containing the concept under scrutiny are confirmed (Campbell, 1960). It will typically involve syntactical considerations in one's theoretical structure as well as empirical testing (for a discussion of these topics, see Bagozzi, 1976b). An empirical analysis of nomological validity may be found in Sternthal et al. (1976).
This article attempts to derive and illustrate a methodology for determining the portion of construct validity represented in criteria 4 and 5 above: convergent and discriminant validity. The discussion begins with a brief summary of the MM approach and some of its shortcomings. Next, a general model following Jöreskog's (cf. 1970, 1971) procedures is presented. Building upon the MM approach, the general model represents the pattern of relationships among theoretical variables and operationalizations. Construct validity is determined through an overall $\chi^2$ goodness-of-fit test, and the variance due to constructs, methods, and error is partitioned through the model. Finally, the procedure is illustrated through a reanalysis of a portion of the data of Ostrom (1969).
THE MULTITRAIT-MULTIMETHOD APPROACH TO CONSTRUCT VALIDITY
As developed by Campbell and Fiske (1959), the MM approach is designed to ascertain (1) the degree to which two or more attempts to measure the same concept through maximally different methods are in agreement (convergent validity) and (2) the degree to which a concept differs from other concepts (discriminant validity). The MM approach determines convergent and discriminant validity through an analysis of the pattern of correlations among two or more traits measured by two or more methods. For example, table 1, which is adapted from Campbell and Fiske (1959, p. 82), presents a typical MM matrix where three traits (A, B, C) are each measured by three methods (1, 2, 3). Taking attitudes as an example, A, B, and C might depict affective, behavioral, and cognitive components, respectively, while methods 1, 2, and 3 might represent Thurstone, Likert, and Guttman scales, respectively. In table 1, the bold face type indicates validity diagonals, heterotrait-monomethod triangles are shown as solid lines, and heterotrait-heteromethod triangles are drawn as broken lines. Campbell and Fiske (1959, pp. 82-83) summarize convergent and discriminant validity in terms of the pattern of correlations as follows:
Four aspects bear upon the question of validity. In the first place, the entries in the validity diagonal should be significantly different from zero and sufficiently large to encourage further examination of validity. This requirement is evidence of convergent validity. Second, a validity diagonal value should be higher than the values lying in its column and row in the hetero-trait-heteromethod triangles. That is, a validity value for a variable should be higher than the correlations obtained between that variable and any other variable having neither trait nor method in common. This requirement may seem so minimal and so obvious as to not need stating, yet an inspection of the literature shows that it is frequently not met, and may not be met even when the validity coefficients are of substantial size . . . A third common-sense desideratum is that a variable correlate higher with an independent effort to measure the same trait than with measures designed to get at different traits which happen to employ the same method. For a given variable, this involves comparing its values in the validity diagonals with its values in the heterotrait-monomethod triangles. . . . A fourth desideratum is that the same pattern of trait interrelationship be shown in all of the heterotrait triangles of both the mono-method and heteromethod blocks. . . . The last three criteria provide evidence for discriminant validity.
TABLE 1: A MULTITRAIT-MULTIMETHOD MATRIX FOR THREE TRAITS AND THREE METHODS
Table 2 presents the Campbell and Fiske criteria based upon the pattern of correlations. Notice that a considerable number of comparisons of correlations must be made in order to establish convergent and discriminant validity. If all of the criteria are satisfied in table 2, then one may conclude that convergent and discriminant validity have been achieved. However, the procedure is not clear when, as often happens, some patterns of correlations do not satisfy the criteria while others do. In both the Ostrom (1969) and Kothandapani (1971) studies of attitudes, for example, many of the correlation patterns for discriminant validity were not satisfied. Lacking explicit, unambiguous criteria, the authors relied on arbitrary "counting procedures" to represent the proportion of times the correlation relationships were violated. Not only do these procedures fail to provide objective means for determining whether one has achieved construct validity or not, but they do not indicate the degree to which operationalizations measure concepts. As a result, the MM approach cannot be easily used to compare findings across populations, time, and settings, nor can it be readily used to diagnose why one's constructs or measures are fallible. The model developed below can be used in these cases.
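The comparisons required by the Campbell and Fiske criteria can be made mechanical. The following is a minimal Python sketch, not the procedure used by Ostrom or Kothandapani. It takes a small set of hypothetical correlations (two traits, A and B, each measured by two methods, 1 and 2; the values are invented for illustration) and lists every case in which a validity-diagonal entry fails to exceed a heterotrait correlation in the same row or column. The function name and the data layout are assumptions, and the fourth criterion (similar patterns across triangles) is omitted because it requires at least three traits.

```python
from itertools import combinations

# Hypothetical correlations (illustrative values only, not from Ostrom, 1969).
# Each measure is a (trait, method) pair.
HYPOTHETICAL_R = {
    frozenset({("A", 1), ("A", 2)}): 0.60,  # validity diagonal, trait A
    frozenset({("B", 1), ("B", 2)}): 0.55,  # validity diagonal, trait B
    frozenset({("A", 1), ("B", 1)}): 0.30,  # heterotrait-monomethod, method 1
    frozenset({("A", 2), ("B", 2)}): 0.58,  # heterotrait-monomethod, method 2
    frozenset({("A", 1), ("B", 2)}): 0.20,  # heterotrait-heteromethod
    frozenset({("B", 1), ("A", 2)}): 0.25,  # heterotrait-heteromethod
}


def discriminant_violations(measures, r):
    """Return (validity_pair, variable, competitor, kind) for each failed comparison."""
    violations = []
    for x, y in combinations(measures, 2):
        if x[0] != y[0] or x[1] == y[1]:
            continue  # keep only same-trait, different-method pairs
        validity = r[frozenset({x, y})]
        for v, partner in ((x, y), (y, x)):
            for w in measures:
                if w[0] == v[0]:
                    continue  # a competitor must measure a different trait
                if w[1] == v[1]:
                    kind = "heterotrait-monomethod"    # third criterion
                elif w[1] == partner[1]:
                    kind = "heterotrait-heteromethod"  # second criterion
                else:
                    continue  # entry belongs to a different method block
                if validity <= r[frozenset({v, w})]:
                    violations.append(((x, y), v, w, kind))
    return violations


measures = [("A", 1), ("B", 1), ("A", 2), ("B", 2)]
for violation in discriminant_violations(measures, HYPOTHETICAL_R):
    print(violation)
```

With these invented values, the only failure is that the validity coefficient for trait B (.55) does not exceed the correlation between B and A measured by method 2 (.58). A list of this kind is the raw material of the counting procedures criticized above: it shows where the criteria fail, but not how much of each measure's variance is due to trait, method, or error.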
THE GENERAL MODEL
In the following paragraphs, a general model is derived for determining construct validity for $r \geq 2$ traits (i.e., variables) and $s \geq 2$ methods. The model is based on Jöreskog's (1970, 1973) structural analysis of covariance matrices and builds upon that author's use of the method for sets of congeneric tests and multitrait-multimethod data (cf. Jöreskog, 1971, 1974). Figure 1 presents a causal diagram for a multitrait-multimethod matrix such as that shown in table 1. The notation in the diagram corresponds to that developed below. In particular, $\xi_1$, $\xi_2$, and $\xi_3$ represent three traits (variables), and $\xi_4$, $\xi_5$, and $\xi_6$ depict three methods. Trait 1 ($\xi_1$) is shown operationalized by $Y_1$, $Y_4$, and $X_1$, and similar comments apply for traits 2 and 3. The relative correspondence between each trait and its respective operationalizations is represented through the parameters $\lambda_1$ - $\lambda_6$ and $\lambda_1'$ - $\lambda_3'$. Similarly, the relative impact of each method on each operationalization is modeled through the respective parameters $\lambda_7$ - $\lambda_{12}$ and $\lambda_4'$ - $\lambda_6'$. Error terms are shown as $\varepsilon_1$ - $\varepsilon_6$ and $\delta_1$ - $\delta_3$.
TABLE 2: CRITERIA FOR CONVERGENT AND DISCRIMINANT VALIDITY IN THE CAMPBELL AND FISKE MULTITRAIT-MULTIMETHOD MATRIX (THREE TRAITS AND THREE METHODS)
The Structural Equations
Consider the most general model consisting of a random vector of true (i.e., measured without error) dependent variables $\eta' = (\eta_1, \eta_2, \ldots, \eta_m)$ and true independent variables $\xi' = (\xi_1, \xi_2, \ldots, \xi_n)$, which may be related compactly in the following system of linear structural relations (cf. Jöreskog and van Thillo, 1972):
$$B\eta = \Gamma\xi + \zeta \qquad (1)$$
In this equation, $B$ ($m \times m$) and $\Gamma$ ($m \times n$) are matrices of coefficient parameters, and $\zeta' = (\zeta_1, \zeta_2, \ldots, \zeta_m)$ is a random vector of disturbances (i.e., errors in equations). It is assumed, without loss of generality, that $E(\eta) = E(\zeta) = E(\xi) = 0$. Furthermore, it is assumed that $\zeta$ is uncorrelated with $\xi$ and that $B$ is nonsingular. Equation (1) represents the set of all possible linear relationships among the true independent and dependent variables. The parameters contained in $B$ and $\Gamma$ model these relationships.
Since the theoretical constructs, $\eta$ and $\xi$, are not directly observed, it is necessary to form the following equations relating theoretical constructs to observable variables:
$$Y = u + \Lambda_y \eta + \varepsilon \qquad (2)$$
$$X = v + \Lambda_x \xi + \delta \qquad (3)$$
FIGURE 1: A CAUSAL DIAGRAM FOR A MULTITRAIT-MULTIMETHOD MATRIX WITH THREE TRAITS AND THREE METHODS
where $Y' = (Y_1, Y_2, \ldots, Y_p)$ and $X' = (X_1, X_2, \ldots, X_q)$. The matrices $\Lambda_y$ ($p \times m$) and $\Lambda_x$ ($q \times n$) are regression matrices of $Y$ on $\eta$ and $X$ on $\xi$, respectively. The symbols $\varepsilon$ and $\delta$ are vectors of errors of measurement in $Y$ and $X$, respectively, and are assumed uncorrelated with each other and with the theoretical constructs. In addition, $u = E(Y)$ and $v = E(X)$.
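Forming the vector $z = (Y', X')'$, the variance-covariance matrix of observed variables, $\Sigma$, may be written as (cf. Jöreskog and van Thillo, 1972):
$$\Sigma = \begin{bmatrix} \Lambda_y B^{-1}(\Gamma\Phi\Gamma' + \Psi)B'^{-1}\Lambda_y' + \Theta_\varepsilon^2 & \Lambda_y B^{-1}\Gamma\Phi\Lambda_x' \\ \Lambda_x\Phi\Gamma' B'^{-1}\Lambda_y' & \Lambda_x\Phi\Lambda_x' + \Theta_\delta^2 \end{bmatrix} \qquad (4)$$
where $\Phi$ is the variance-covariance matrix of $\xi$, $\Psi$ is the variance-covariance matrix of $\zeta$, $\Theta_\varepsilon^2$ is a diagonal matrix of error variances for $Y$, and $\Theta_\delta^2$ is a diagonal matrix of error variances for $X$. The dimension of $\Sigma$ is $[(p + q) \times (p + q)]$.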
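To make the structure of the implied covariance matrix concrete, consider only the block for the indicators of the independent constructs. Under equation (3) and the stated assumptions, that block is the loading matrix times the construct covariance matrix times the transposed loading matrix, plus the diagonal matrix of error variances. The sketch below computes it in plain Python. The function names, the two-construct layout, and all numerical values are hypothetical and are not estimates from the article.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Transpose a matrix given as a list of rows."""
    return [list(row) for row in zip(*a)]


def implied_covariance_x(lambda_x, phi, theta_delta_sq):
    """Return Lambda_x Phi Lambda_x' + diag(theta_delta_sq)."""
    common = matmul(matmul(lambda_x, phi), transpose(lambda_x))
    q = len(lambda_x)
    return [[common[i][j] + (theta_delta_sq[i] if i == j else 0.0)
             for j in range(q)] for i in range(q)]


# Hypothetical values: four indicators, two correlated constructs.
lambda_x = [[0.8, 0.0], [0.7, 0.0], [0.0, 0.9], [0.0, 0.6]]  # loadings
phi = [[1.0, 0.5], [0.5, 1.0]]                               # construct covariances
theta_delta_sq = [0.36, 0.51, 0.19, 0.64]                    # error variances

sigma_xx = implied_covariance_x(lambda_x, phi, theta_delta_sq)
for row in sigma_xx:
    print([round(value, 3) for value in row])
```

In this invented example the error variances were chosen so that every indicator has unit variance, which makes the off-diagonal entries readable as correlations: two indicators of the same construct covary through the product of their loadings, while indicators of different constructs covary through the loadings and the construct correlation.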
In order to obtain a unique solution of the parameters in $\Sigma$, it is necessary to have at least as many unique equations relating observable variables to the structural parameters as there are parameters. Typically, this will require that restrictions be placed on elements of $\Sigma$. In general, two classes of restrictions are possible: fixed parameters and constrained parameters. Fixed parameters are assigned values on an a priori basis, while constrained parameters are represented as functionally related to other parameters. The basis for restricting parameters may be due to prior theoretical knowledge, logical criteria, empirical evidence, or experimental design considerations. A third category of prior information may be accommodated when data from additional samples are provided. Rothenberg (1973) discusses a way of finding the joint probability function for multiple samples and the corresponding estimation procedures. Those parameters in $\Sigma$ that are not restricted are termed free parameters. The issue of representing a system of structural equations in a form that allows for the unique estimation of parameters is known as the identification problem. As Jöreskog and van Thillo (1972) note:
Identification depends on the specification of fixed, constrained, and free parameters. Under a given specification, a given structure $\Lambda_y$, $\Lambda_x$, $B$, $\Gamma$, $\Phi$, $\Psi$, $\Theta_\delta$, $\Theta_\varepsilon$ generates one and only one $\Sigma$ but there may be several structures generating the same $\Sigma$. If two or more structures generate the same $\Sigma$, the structures are said to be equivalent. If a parameter has the same value in all equivalent structures, the parameter is said to be identified. If all parameters of the model are identified, the whole model is said to be identified. When a model is identified, one can usually find consistent estimates of all parameters (pp. 3-4).
The general identification problem for a system of multiple independent and dependent variables and multiple operationalizations has not been solved. However, a number of researchers have derived criteria for various submodels. Fisher (1966) discusses the identification problem for the general model of multiple independent and dependent variables (i.e., exogenous and endogenous variables) but does not deal with the case of multiple operationalizations of theoretical constructs. Geraci (1974) derives a number of special cases of the most general model, including the determination of measurement error in certain instances. He does not, however, treat the case of prior restrictions on the covariance matrix of disturbances, including the condition of correlated measurement errors. Further, neither Fisher nor Geraci deals with dynamic models allowing measurement error. Hsiao (1975, 1976) derives the identification criteria for a linear dynamic system with measurement error in both endogenous and exogenous variables and for a particular contemporaneous system. None of the above approaches allows for multiple indicators of each theoretical variable, however. Until the most general identification problem is solved, the researcher will have to determine identification for each specific model tested (as shown below in the reanalysis of attitude models, for example).
Estimation of Parameters and the Goodness-of-fit Test
The estimation of parameters for the general model may be conveniently accomplished using the maximum likelihood procedure. It is assumed that the vector of random variables, $z$, has a multivariate normal distribution. Omitting irrelevant constants, the logarithm of the likelihood function may be written as
$$\log L = -\tfrac{1}{2} N \left[ \log|\Sigma| + \mathrm{tr}(S\Sigma^{-1}) \right] \qquad (5)$$
where $N = M - 1$, $M$ is the number of observations, log is the natural logarithm, and tr stands for the trace (cf. Theil, 1971, Chap. 10). In addition, the sample variance-covariance matrix is
$$S = \frac{1}{N} \sum_{a=1}^{M} (Z_a - \bar{Z})(Z_a - \bar{Z})' \qquad (6)$$
where $\bar{Z} = (\bar{Y}', \bar{X}')'$ is the maximum likelihood estimate of the mean vector of observations and $Z_a$ are the individual instances of $Z$. Maximizing $\log L$ in (5) is equivalent to minimizing
$$F = \log|\Sigma| + \mathrm{tr}(S\Sigma^{-1}) - \log|S| - (p + q) \qquad (7)$$
Jöreskog and his colleagues have derived a procedure to minimize (7) (the details may be found in Jöreskog and van Thillo, 1972; Jöreskog, 1973; and Gruvaeus and Jöreskog, 1970). Briefly, the minimization method is based on the iterative procedure of Fletcher and Powell (1963). It makes use of the first-order derivatives and large sample approximations to the elements of the matrix of second-order derivatives to achieve minimization.
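The quantity being minimized can be evaluated directly for any candidate model matrix. The sketch below assumes the maximum likelihood discrepancy takes the common form log|model| + tr(sample times inverse of model) - log|sample| - k, where k is the number of observed variables; this form is an assumption chosen so that the function is zero when the model reproduces the sample matrix exactly. It is not the LISREL program and it does not perform the Fletcher-Powell iterations; it only evaluates the function once, using a hand-written Cholesky factorization so that no external library is needed. All names and values are hypothetical.

```python
import math


def cholesky(a):
    """Lower-triangular L with L L' = a, for a symmetric positive definite matrix."""
    n = len(a)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                lower[i][j] = math.sqrt(s)
            else:
                lower[i][j] = s / lower[j][j]
    return lower


def log_det(a):
    """Natural log of the determinant, from the Cholesky diagonal."""
    return 2.0 * sum(math.log(row[i]) for i, row in enumerate(cholesky(a)))


def solve(lower, b):
    """Solve (L L') x = b by forward and back substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(lower[i][k] * y[k] for k in range(i))) / lower[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(lower[k][i] * x[k] for k in range(i + 1, n))) / lower[i][i]
    return x


def ml_discrepancy(sample_cov, model_cov):
    """Evaluate the assumed ML discrepancy between a sample and a model matrix."""
    k = len(sample_cov)
    lower = cholesky(model_cov)
    # tr(S Sigma^-1) equals the sum over j of entry j of Sigma^-1 times column j of S.
    trace = sum(solve(lower, [sample_cov[i][j] for i in range(k)])[j]
                for j in range(k))
    return log_det(model_cov) + trace - log_det(sample_cov) - k


# Hypothetical two-variable example: a model that wrongly assumes zero correlation.
sample = [[1.0, 0.5], [0.5, 1.0]]
independence_model = [[1.0, 0.0], [0.0, 1.0]]
print(ml_discrepancy(sample, independence_model))
print(ml_discrepancy(sample, sample))
```

The first value is positive because the independence model cannot reproduce the sample correlation; the second is zero (up to rounding) because the model matrix equals the sample matrix. An iterative routine would adjust the free parameters until no further reduction in this value is possible.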
A computer program, LISREL, exists for estimating the parameters of the general model (Jöreskog and van Thillo, 1972). The program also calculates an overall goodness-of-fit $\chi^2$ test. The $\chi^2$ test is derived from the likelihood ratio technique, and it tests the null hypothesis of a given model in $\Sigma$ versus the alternative hypothesis that $\Sigma$ is any positive definite matrix. Specifically,
Let $H_0$ be the null hypothesis of the model under the given specifications of fixed, constrained, and free parameters. First consider the case when the alternative hypothesis $H_1$ is that $\Sigma$ is any positive definite matrix. Then minus twice the logarithm of the likelihood ratio is $NF_0$, where $F_0$ is the minimum value of $F$. If the model holds, this is distributed, in large samples, as $\chi^2$ with
$$d = \tfrac{1}{2}(p + q)(p + q + 1) - t$$
degrees of freedom, where, as before, $t$ is the total number of independent parameters estimated under $H_0$.
Let $H_0$ be any specific hypothesis concerning the parametric structure of the general model and let $H_1$ be an alternative hypothesis. In large samples one can then test $H_0$ against $H_1$. Let $F_0$ be the minimum of $F$ under $H_0$ and let $F_1$ be the minimum of $F$ under $H_1$. Then $F_1 \leq F_0$ and minus twice the logarithm of the likelihood ratio becomes $N(F_0 - F_1)$. Under $H_0$ this is distributed approximately as $\chi^2$ with degrees of freedom equal to the difference in number of independent parameters estimated under $H_1$ and $H_0$. (Jöreskog and van Thillo, 1972, p. 8).
The meaning of a goodness of fit found for any model should be interpreted with care. Since the $\chi^2$ test depends directly on the sample size, a sufficiently large sample would cause one to reject virtually any model. However, even in this situation, meaningful results are possible if one compares similar models differing, say, by one or a small number of restrictions. In particular, each model would be estimated separately, and the respective $\chi^2$ values would be compared. The difference in $\chi^2$ is asymptotically distributed as $\chi^2$ with degrees of freedom equal to the difference between the separate degrees of freedom. The goal in such an analysis should be to set up a hierarchy of hypotheses such that each is a special case of another. As demonstrated below, these procedures play a central role in the determination of construct validity.
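The bookkeeping behind these tests is simple arithmetic, sketched below in Python. The first function applies the degrees-of-freedom rule quoted above; the second forms the difference statistic for a pair of nested models. The function names are assumptions. The parameter count in the first example (nine trait loadings, nine method loadings, nine error variances, three trait correlations, and three method correlations, for a total of 33) is likewise an assumption about how the figure 1 layout might be parameterized, although it does reproduce the 12 degrees of freedom reported for that model in the reanalysis. The values in the second example are hypothetical.

```python
def distinct_moments(k):
    """Number of distinct variances and covariances among k observed variables."""
    return k * (k + 1) // 2


def degrees_of_freedom(p, q, t):
    """d = (1/2)(p + q)(p + q + 1) - t, with t free parameters."""
    d = distinct_moments(p + q) - t
    if d < 0:
        raise ValueError("more free parameters than distinct moments")
    return d


def chi_square_difference(restricted, general):
    """Difference test for nested models; each argument is (chi_square, df)."""
    (chi_r, df_r), (chi_g, df_g) = restricted, general
    if df_r <= df_g:
        raise ValueError("the restricted model must have more degrees of freedom")
    return chi_r - chi_g, df_r - df_g


# Figure 1 layout: p = 6 Y-measures and q = 3 X-measures; t = 33 is an assumed count.
print(degrees_of_freedom(6, 3, 33))

# Hypothetical nested pair: the restricted model fixes three parameters
# that the general model leaves free.
print(chi_square_difference((25.0, 20), (12.5, 17)))
```

A necessary (though not sufficient) condition for identification follows from the same count: a model with more free parameters than distinct variances and covariances cannot be identified, which is why the sketch refuses a negative result.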
THE EXAMPLE OF THE ATTITUDE CONSTRUCT
In order to illustrate the structural equation approach to construct validity, consider the study conducted by Ostrom (1969). Ostrom measured the affective, behavioral, and cognitive components of people's attitudes toward the church using the methods of equal-appearing intervals (Thurstone), summated ratings (Likert), scalogram analysis (Guttman), and self-ratings (Guilford). Although an inspection of the MM matrix in Ostrom (1969) clearly shows that convergent validity has been achieved, the degree of discriminant validity is not easily inferred from the MM matrix since some patterns of correlations do not satisfy the three discriminant validity criteria enumerated in table 2. (The original correlation matrix from Ostrom (1969) may be found at the end of the article).
In order to determine construct validity using the analysis of covariance procedures, it is useful to analyze the data in two stages. First, consider the following structural equations [Due to space limitations, the illustration is limited to the tripartite attitude models as operationalized by the Thurstone, Likert, and Guttman methods only. Also, to be consistent with the computer program developed by Jöreskog and van Thillo (1972), the following specifications are made: $B$ ($4 \times 4$) $= I$, $\Gamma$ ($4 \times 4$) $= I$, and $\zeta = 0$.], which represent the model due to attitude components and measurement error only (termed the congeneric model):
Four hypotheses can be derived from this system of equations:
EQUATIONS (H1)- (H4)
Table 3 presents the results of hypotheses H1 - H4 applied to the data of Ostrom (1969). First note that only hypothesis H4 provides a satisfactory fit in that the p value for the $\chi^2$ test is greater than the p = .10 rule of thumb (cf. Lawley and Maxwell, 1971, p. 42). The p value gives the probability of obtaining a $\chi^2$ value larger than that actually obtained, given that the hypothesized model holds. Hypotheses H1 - H4 are each tested against the most general alternative that $\Sigma$ is unconstrained (see discussion in the previous section). The most interesting hypotheses are those based on comparisons among H1 - H4, however. As shown in the right-hand portion of table 3, the tests for parallel measures are represented by the differences between H1 and H3 or H2 and H4. The tests for parallel measures hypothesize that the components each have equal true score variances and equal error variances. Since $\chi^2_{12} = 99.11$ and $\chi^2_{12} = 99.01$ in table 3, one must reject the hypothesis of parallel measures. Because the three measurements use different scaling procedures, one would not expect parallel measurements. However, the check for parallel measurements is demonstrated here since two independent applications of the same measurement procedure are often used over time in attitude research where parallel measurements are assumed.
For construct validity, the most important hypothesis to examine is whether $\phi_{12} = \phi_{23} = \phi_{13} = 1$ or not. This test, termed congeneric in the literature, is a test of whether the three components of attitude are indeed measures of a single construct or not. The hypothesis may be tested by observing the difference between either hypotheses H1 and H2 or H3 and H4, depending on whether one assumes parallel forms or not. The evidence in table 3 indicates that a decision as to construct validity must be temporarily postponed since $\chi^2_3 = 16.47$ and $\chi^2_3 = 16.37$. When one cannot accept the hypothesis that the three components are congeneric, method factors should be introduced, and the following structural equation system should be solved:
TABLE 3: OSTROM'S (1969) MM MATRIX DATA TESTED FOR CONGENERIC AND PARALLEL HYPOTHESES-- THURSTONE, LIKERT, AND GUTTMAN METHODS ONLY
These equations represent the system of figure 1. The solution of these equations using the methods described above yields a goodness-of-fit $\chi^2 = 5.03$ (12 d.f., $p = .96$), which represents a very good fit. However, an examination of the maximum likelihood estimates reveals that the method factor due to the Guttman measure correlates unity with the other measures. Therefore, only two method factors were used, and the fit of the model was $\chi^2 = 10.16$ (14 d.f., $p = .75$), which is also a good fit. Table 4 presents a summary of the maximum likelihood estimates for this model, table 5 summarizes the residuals, and table 6 shows the variance components due to attitude components, methods, and error.
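The probability levels attached to these fit statistics can be checked without statistical tables. The sketch below computes the upper-tail probability of a chi-square variate for whole-number degrees of freedom, using the standard closed-form sums (a finite Poisson-type sum for even degrees of freedom, and a complementary error function plus a finite sum for odd degrees of freedom). The function name is an assumption, and the routine is an independent check rather than part of the LISREL program.

```python
import math


def chi2_upper_tail(x, df):
    """P(X > x) for a chi-square variate with a positive whole number of d.f."""
    if x < 0 or df < 1 or int(df) != df:
        raise ValueError("x must be non-negative and df a positive integer")
    df = int(df)
    half = x / 2.0
    if df % 2 == 0:
        # Even d.f.: exp(-x/2) * sum_{j=0}^{df/2 - 1} (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd d.f.: erfc(sqrt(x/2)) + exp(-x/2) * sum_{j=1}^{(df-1)/2} (x/2)^(j-1/2) / Gamma(j+1/2)
    total = sum(half ** (j - 0.5) / math.gamma(j + 0.5)
                for j in range(1, (df - 1) // 2 + 1))
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


# Fit statistics reported in the text.
print(round(chi2_upper_tail(5.03, 12), 2))   # three method factors
print(round(chi2_upper_tail(10.16, 14), 2))  # two method factors
# Difference statistic reported for the congeneric test (3 d.f.).
print(chi2_upper_tail(16.47, 3))
```

The first two values agree, to two decimals, with the reported p = .96 and p = .75. The third is well below .01, which is consistent with the statement that the hypothesis of a single underlying construct cannot be accepted.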
TABLE 4: STANDARDIZED MAXIMUM LIKELIHOOD ESTIMATES OF PARAMETERS FOR THE DATA OF OSTROM (1969) WITH THREE METHODS (THURSTONE, LIKERT, AND GUTTMAN) AND TWO METHOD FACTORS
TABLE 5: RESIDUALS FOR DATA OF OSTROM (1969) WITH THREE METHODS AND TWO METHOD FACTORS
TABLE 6: PARTITIONING OF VARIANCE DUE TO ATTITUDE COMPONENT, METHOD, AND ERROR FOR THE DATA OF OSTROM (1969)--THREE METHODS AND TWO METHOD FACTORS
In general, from the goodness-of-fit test, one may conclude that convergent and discriminant validity have been achieved for the data of Ostrom (1969) when one confines the analysis to the Thurstone, Likert, and Guttman methods. A number of interesting findings from table 6 deserve mention. First, note that affect is measured best by the Thurstone method and worst by the Guttman scales. Second, note that the Likert scales contain a considerable amount of method variance, and the Thurstone and Guttman measures produce significantly less method variance. Third, the Guttman scales and, to a somewhat lesser extent, the Thurstone scales show relatively high error components. The Likert method contains relatively small error variance.
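A partition of this kind can be illustrated with a short calculation. The sketch below assumes a standardized measure (unit variance) that loads on one trait factor and one method factor, with the two factors uncorrelated with each other and with the error term; under those assumptions the squared trait loading, the squared method loading, and the remainder give the three shares. The loadings used are hypothetical and are not the estimates in table 4, and the function name is an assumption.

```python
def partition_variance(trait_loading, method_loading):
    """Split the unit variance of a standardized measure into three shares.

    Assumes the trait and method factors are uncorrelated with each other
    and with the error term, so the shares add to one.
    """
    trait = trait_loading ** 2
    method = method_loading ** 2
    error = 1.0 - trait - method
    if error < 0.0:
        raise ValueError("loadings imply a negative error variance")
    return {"trait": trait, "method": method, "error": error}


# Hypothetical standardized loadings for two measures of the same component.
for label, trait_loading, method_loading in [("measure 1", 0.8, 0.3),
                                             ("measure 2", 0.6, 0.6)]:
    shares = partition_variance(trait_loading, method_loading)
    print(label, {name: round(value, 2) for name, value in shares.items()})
```

In this invented case the first measure reflects mostly trait variance, while the second carries as much method variance as trait variance. If trait and method factors were allowed to correlate, a cross-product term would enter and the shares would no longer add in this simple way.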
SUMMARY AND CONCLUSIONS
This article addressed the problem of construct validity in consumer research. After presenting the multitrait-multimethod approach to construct validity and discussing its limitations, a general model based on the analysis of covariance structures due to Jöreskog (1970, 1971, 1973) was derived for determining convergent and discriminant validity. The model provides an explicit, overall $\chi^2$ goodness-of-fit test, and it can be used to partition the variance due to construct, method, and error. The procedure was illustrated using the tripartite attitude model data of Ostrom (1969).
MULTITRAIT-MULTIMETHOD MATRIX OF CORRELATIONS FROM OSTROM (1969)
Richard P. Bagozzi, "Construct Validity in Consumer Research" (unpublished working paper), School of Business Administration, University of California, Berkeley, 1976a.
Richard P. Bagozzi, "Science, Politics, and the Social Construction of Marketing," Proceedings of the 1976 Fall Conference, American Marketing Association, 1976b.
James R. Bettman, Noel Capon, and Richard J. Lutz, "Multiattribute Measurement Models and Multiattribute Attitude Theory: A Test of Construct Validity," Journal of Consumer Research, 1 (March, 1975), 1-15.
Donald T. Campbell and Donald W. Fiske, "Convergent and Discriminant Validation by the Multitrait-Multimethod Matrix," Psychological Bulletin, 56(1959), 81-105.
Franklin M. Fisher, The Identification Problem in Econometrics (New York: McGraw-Hill, 1966).
R. Fletcher and M. J. D. Powell, "A Rapidly Convergent Descent Method for Minimization," The Computer Journal, 6 (1963), 163-169.
V. J. Geraci, "Simultaneous Equation Models with Measurement Error" (unpublished doctoral dissertation), Department of Economics, University of Wisconsin, July, 1974.
Arthur S. Goldberger and Otis Dudley Duncan (eds.), Structural Equation Models in the Social Sciences (New York: Seminar Press, 1973).
G. T. Gruvaeus and K. G. Jöreskog, "A Computer Program for Minimizing a Function of Several Variables," Research Bulletin 70-14 (Princeton, New Jersey: Educational Testing Service, 1970).
Roger M. Heeler and Michael L. Ray, "Measure Validation in Marketing," Journal of Marketing Research, 9(November, 1972), 361-370.
C. Hsiao, "Identification and Estimation of Simultaneous Equation Models with Measurement Error," International Economic Review (forthcoming).
C. Hsiao, "Identification for a Linear Dynamic Simultaneous Error-Shock Model" (working paper), Department of Economics, University of California, Berkeley, 1975.
K. G. Jöreskog, "A General Method for Analysis of Covariance Structures," Biometrika, 57 (1970), 239-251.
K. G. Jöreskog, "Statistical Analysis of Sets of Congeneric Tests," Psychometrika, 36 (June, 1971), 109-133.
K. G. Jöreskog and Marielle van Thillo, "LISREL: A General Computer Program for Estimating a Linear Structural Equation System Involving Multiple Indicators of Unmeasured Variables," Research Bulletin 72-56 (Princeton, New Jersey: Educational Testing Service, December, 1972).
K. G. Jöreskog, "A General Method for Estimating a Linear Structural Equation System," in A. S. Goldberger and O. D. Duncan (eds.), Structural Equation Models in the Social Sciences (New York: Seminar Press, 1973), pp. 85-112.
K. G. Jöreskog, "Analyzing Psychological Data by Structural Analysis of Covariance Matrices," in D. H. Krantz, R. D. Luce, R. C. Atkinson, and P. Suppes (eds.), Contemporary Developments in Mathematical Psychology, Vol. II (San Francisco: Freeman, 1974), pp. 1-56.
V. Kothandapani, "Validation of Feeling, Belief, and Intention to Act as Three Components of Attitude and Their Contribution to Prediction of Contraceptive Behavior," Journal of Personality and Social Psychology, 19(September, 1971), 321-333.
D. N. Lawley and A. E. Maxwell, Factor Analysis as a Statistical Method (London: Butterworth, 1971).
Thomas M. Ostrom, "The Relationship Between the Affective, Behavioral, and Cognitive Components of Attitude," Journal of Experimental Social Psychology, 5 (1969), 12-30.
Thomas J. Rothenberg, Efficient Estimation with a Priori Information (New Haven, Conn.: Yale University Press, 1973).
Brian Sternthal, Alice M. Tybout, C. S. Craig, and Richard P. Bagozzi, "In Search of the Holy Grail: The Validity of the Tripartite Classification of Attitudes" (unpublished working paper), Northwestern University, 1976.
Henri Theil, Principles of Econometrics (New York: Wiley, 1971).