Structural Modelling: an Application For Testing Attitude Models and Convergent Validity

Richard Staelin, Carnegie-Mellon University
ABSTRACT - The common theme of this session is validity. Both the Bagozzi-Burnkrant and John-Reve papers are concerned with estimating convergent validity, while the Cattin paper deals with the problem of predictive validity. My discussion concentrates on the first topic, i.e., convergent validity, although the discussion is done within the broader framework of testing and/or validating a theory which is stated in terms of unobserved constructs.
To cite:
Richard Staelin (1979), "Structural Modelling: an Application For Testing Attitude Models and Convergent Validity", in NA - Advances in Consumer Research Volume 06, eds. William L. Wilkie, Ann Arbor, MI: Association for Consumer Research, Pages: 307-312.

Advances in Consumer Research Volume 6, 1979      Pages 307-312



[I wish to acknowledge the helpful comments of Robert Avery, V. Joseph Hotz, Andrew Mitchell and Robert Redinger.]




A major methodological problem in consumer research is the testing and/or validation of a theory where the theory of interest is stated in terms of unobservable constructs while the variables used to test the model are fallible measures of these constructs. Until recently most analyses of consumer behavior models have not explicitly taken into account this difference between the theoretical constructs and the observed measures. However, in the last few years a new methodology, often referred to as structural equation modeling, has developed which allows consumer researchers to incorporate into their analyses both aspects, i.e., that the theory is in terms of theoretical constructs and the data used to test the theory are composed of fallible measures.

This structural equation modeling approach forms the foundation for the analyses performed by Bagozzi and Burnkrant (B-B) and John and Reve (J-R). It is my personal opinion that this approach has significant potential in testing specific consumer theories. However, as with most "sophisticated" methodological approaches, it is necessary for the user to understand the assumptions required by this approach. One of the goals of this paper is to critically review the assumptions (both implicit and explicit) made by the two sets of authors and to suggest an alternative formulation which circumvents one of the more restrictive assumptions.

Structural Equation Modeling

Both B-B and J-R are concerned with modeling the concept of attitude and the process by which attitude is measured. Both sets of authors start out with the assumption that attitude is a multi-dimensional construct and that the measures used to tap each component are fallible and are correlated either because they measure correlated constructs or because they are obtained via similar data collection methods. Using this general conceptualization, both studies rely on previously collected data to estimate the dimensionality of attitude and the validity of the specific measures used to measure each component of attitude.



This conceptualization is represented schematically in Figure 1, where the circles designate the underlying attitude constructs, the squares the fallible measures, the double-headed arrows the links between the individual constructs, the single-headed arrows the postulated causal relationships between the constructs and the fallible measures, and the e's the measurement errors associated with the measurement process. The causal model is further specified by assuming a functional form between the constructs and the observed variables. Both sets of authors make the "standard" assumption that this form is linear and additive, with the errors assumed to come from a multivariate normal distribution with mean vector equal to zero and a specified variance-covariance matrix. For example, the functional form (i.e., the structural equations) assumed by B-B is

$$y_{ijk} = \lambda_{jk} A_{ij} + e_{ijk}, \quad i = 1, 2, \ldots, n; \; j = 1, \ldots, p; \; k = 1, \ldots, m, \tag{1}$$

where $y_{ijk}$ is the kth observed measure for person i on construct j, $\lambda_{jk}$ is the coefficient of construct j for measure k, indicating the scaling between the unobserved construct $A_{ij}$ and the observed measure, and $e_{ijk}$ is the measurement error.

The general approach used by both sets of authors to estimate parameters of a model of this form is maximum likelihood. Conceptually this is done by trying to find that set of parameters which best reproduces the covariance (or correlation) matrix of the observed data. To get a feel for how this is done, let us turn our attention to a few elements in the covariance matrix of the y's, assuming that the measurement process is modeled as stated in Equation 1. For example, the variance of the first observed measure for construct 1, i.e., Var(y11), can be expressed in terms of the model's parameters as follows:

$$\mathrm{Var}(y_{11}) = \lambda_{11}^2 \mathrm{Var}(A_1) + \mathrm{Var}(e_{11}) + 2 \lambda_{11} \mathrm{Cov}(A_1, e_{11}). \tag{2}$$

Likewise the covariance between y11 and the first measure of the second construct y21 is

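$$\mathrm{Cov}(y_{11}, y_{21}) = \lambda_{11} \lambda_{21} \mathrm{Cov}(A_1, A_2) + \mathrm{Cov}(e_{11}, e_{21}) + \lambda_{11} \mathrm{Cov}(A_1, e_{21}) + \lambda_{21} \mathrm{Cov}(A_2, e_{11}), \tag{3}$$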

while the covariance between y11 and the second measure of the first construct y12 is

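$$\mathrm{Cov}(y_{11}, y_{12}) = \lambda_{11} \lambda_{12} \mathrm{Var}(A_1) + \mathrm{Cov}(e_{11}, e_{12}) + \lambda_{11} \mathrm{Cov}(A_1, e_{12}) + \lambda_{12} \mathrm{Cov}(A_1, e_{11}). \tag{4}$$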

In an analogous manner each element of the observed variable covariance matrix can be represented in terms of the unobserved model parameters. It is often the case in analyses of this type that the number of possible model parameters obtained in the above fashion exceeds the number of observations (i.e., the distinct elements of the observed covariance matrix). Consequently, it is not possible to uniquely identify the parameters without limiting the parameter space by making some assumptions about the permissible values for certain of these parameters. For example, the assumptions used in the B-B paper are as follows.

1. The variances of the unobserved constructs (e.g., Var(A1)) are equal to one. This assumption is normally costless since the constructs are never measured and thus can be arbitrarily rescaled to yield any variance desired.

2. The covariances between the error terms and the constructs (e.g., Cov(A1e11)) are equal to zero. This assumption is not costless. One might postulate conditions where the errors associated with the measurement process are a function of the level of the construct. For example, people with a high (favorable) attitude toward a product might try to please the interviewer and thus tend to give an answer which is more favorable than their true attitude while the converse would hold for respondents who have an unfavorable opinion about the product.

3. The covariance between one construct and an error term associated with another construct (e.g., Cov(A1e21)) is equal to zero. This does not seem to be too restrictive. It is hard to imagine situations where the error term associated with one construct would be correlated with another construct.

4. The covariance between any two error terms (e.g., e11 and e21) is equal to zero. This is probably the least palatable of the assumptions made by B-B, since it is easy to conjure up situations where the error terms for two measures are correlated because they either are designed to tap the same construct or are obtained by a similar scaling method. The analysis of J-R is aimed at circumventing this problem by postulating a methods factor. [It is interesting to note that the model used by J-R was previously proposed by Bagozzi (1978), although he does not use this model in the current B-B paper.] Another approach compatible with structural equation modeling postulates error factors which are correlated unobserved constructs (Jöreskog and Sörbom, 1977). I use this latter approach in developing an alternative model to that proposed by J-R.

Applying the above four assumptions to Equations 2-4 yields equations of the following form:

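$$\mathrm{Var}(y_{11}) = \lambda_{11}^2 + \mathrm{Var}(e_{11}), \tag{5}$$

$$\mathrm{Cov}(y_{11}, y_{21}) = \lambda_{11} \lambda_{21} \mathrm{Cov}(A_1, A_2), \tag{6}$$

$$\mathrm{Cov}(y_{11}, y_{12}) = \lambda_{11} \lambda_{12}. \tag{7}$$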
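
The reduction from Equations 2-4 to Equations 5-7 can be checked numerically. The following is a minimal sketch, not taken from the B-B paper: it builds the covariance matrix implied by Equation 1 under the four assumptions for a hypothetical model with two constructs and two measures each. The function name `implied_cov` and all numerical values (loadings, construct correlation, error variances) are made up for illustration.

```python
def implied_cov(loadings, phi, err_var):
    """Covariance of the y's implied by Equation 1 under assumptions 1-4.

    loadings: dict mapping measure (j, k) -> loading of construct j on measure k
    phi:      dict mapping (j, j2) -> Cov(A_j, A_j2), with unit variances
    err_var:  dict mapping measure (j, k) -> Var(e_jk)
    """
    measures = sorted(loadings)
    sigma = {}
    for a in measures:
        for b in measures:
            # Construct part: product of loadings times construct covariance.
            cov = loadings[a] * loadings[b] * phi[(a[0], b[0])]
            # Errors are uncorrelated with constructs and with each other,
            # so they contribute only to the variance of their own measure.
            if a == b:
                cov += err_var[a]
            sigma[(a, b)] = cov
    return sigma


# Illustrative values only (assumptions, not estimates from any data set).
loadings = {(1, 1): 0.8, (1, 2): 0.7, (2, 1): 0.6, (2, 2): 0.9}
phi = {(1, 1): 1.0, (2, 2): 1.0, (1, 2): 0.5, (2, 1): 0.5}
err_var = {(1, 1): 0.36, (1, 2): 0.51, (2, 1): 0.64, (2, 2): 0.19}

sigma = implied_cov(loadings, phi, err_var)
print("Var(y11)      =", sigma[((1, 1), (1, 1))])  # Equation 5
print("Cov(y11, y21) =", sigma[((1, 1), (2, 1))])  # Equation 6
print("Cov(y11, y12) =", sigma[((1, 1), (1, 2))])  # Equation 7
```

Maximum likelihood estimation works in the opposite direction: it searches for the loadings, construct covariances and error variances whose implied matrix comes closest to the observed one.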

It should be noted that these four assumptions are not the only restrictions placed on the model's parameters. Instead of using the covariance matrix, B-B use the correlation matrix for the y's. This means that the variance of each y is one. As a consequence, the authors are implicitly imposing another set of restrictions on the parameters of the structural equations (i.e., $\lambda_{jk}^2 + \mathrm{Var}(e_{jk}) = 1$). In other words, the parameter space is further restricted by requiring each $\lambda_{jk}$ to be dependent on the value of the variance of $e_{jk}$. Although this assumption is not necessarily restrictive, [I later show a case where this assumption is restrictive.] it is important to recognize this implicit assumption when determining a) the number of free parameters estimated and b) the degrees of freedom associated with any statistical tests of the model.

The next step in the analysis is to assume a specific distributional form for the y's conditional on the unobserved constructs (both sets of authors assume this distribution to be the multivariate normal) and then find that set of parameter values (e.g., the $\lambda$'s, the $\mathrm{Var}(e_{jk})$'s and the $\mathrm{Cov}(A_j, A_{j'})$'s) which maximizes the probability of getting the observed covariance matrix. These parameter estimates are often obtained via a computer program called LISREL (Jöreskog and van Thillo, 1972), which also yields statistics that can be used to test the null hypothesis that the postulated model is correct. [Specifically, the null hypothesis is that the true model generating the data is the postulated model and that the parameters of the model are equal to the values estimated via maximum likelihood estimation. The alternative hypothesis is that the null hypothesis is not true.] It is via this computer program or a variant of it that both B-B and J-R a) estimated the model parameters, b) tested for the convergent validity of specific attitude measures and c) tested for the veracity of their multiple component model of attitude.

Although it should go without saying, the above methodology cannot be used to prove that a particular theory is correct. As is true with any hypothesis testing, statistical tests of hypotheses are capable only of rejecting a given theory (subject to a type-I error) or, conversely, of saying that the observed data are consistent with the postulated theory. Thus even though the final models postulated by B-B and J-R are not rejected, it is not proper to say that these models are correct. In other words, B-B and J-R do not show that the measures used have convergent validity or that attitude is two-dimensional; they show only that these statements cannot be rejected.

In a somewhat similar vein it must be recognized that the structural modeling methodology does not prove that the model as postulated is valid (i.e., does not prove construct validity). However, the approach is consistent with the philosophy of science notion of testing a theory in that a) the structural equations explicitly state the theory to be tested and b) the methodology allows the theory to be rejected.


The B-B and J-R papers postulated a number of structural models of attitude and the measurement process. The simplest of these models is stated in Figure 1 of B-B. Using a data base collected by Fishbein and Ajzen (1974) they estimated the parameters of the model using LISREL. The parameter estimates and relevant test statistics for this model are given in Table 1 of their paper. They conclude, based on the $\chi^2$ value for the two attitude factor model, that this model cannot be rejected and thus "convergent validity is established for the two-factor affective/cognitive model of attitudes." As previously mentioned, this is too strong a statement since the $\chi^2$ test only indicated that the data were consistent with their model.

On a more general level it is interesting to look at the estimated correlation between the two components of attitude ($\omega_{A_1 A_2}$ in their notation). This correlation is estimated as being .824 using data from the self-reported behavior sample and .927 from the behavioral intention sample. In either case, the correlation is high, indicating that although there may be two components to attitude toward religion, these two components are highly correlated.

It is possible to test whether the two components of attitude can be lumped into one construct without a statistically significant loss of fit. Jöreskog and Sörbom (1977) show that, asymptotically, the difference between the $\chi^2$ value [The $\chi^2$ value is obtained from $-2 \ln \lambda$, where $\lambda$ is the ratio of the likelihood function of the model being estimated to the likelihood function of the model where the parameter values are not restricted.] associated with a particular model and the $\chi^2$ value associated with a model which is a subset of that model is also distributed $\chi^2$, with degrees of freedom equal to the difference in the number of parameters in the two models. Since B-B report the $\chi^2$ values for both the one factor (construct) model and the more general two construct model, it is possible to test whether the two highly correlated components can be combined into one scalar quantity. [B-B did not perform this type of test. Instead they looked at the $\chi^2$ value in each sample associated with the null hypothesis that attitude is composed of one component. In both cases, this $\chi^2$ value allowed them to reject the null hypothesis, i.e., they concluded the one factor model of attitude was incorrect.] The $\chi^2$ differences (which have one degree of freedom) are 21.61 (= 23.91 - 2.30) and 5.68 (= 13.98 - 8.30), respectively, for the two samples. In the first case the $\chi^2$ value is significant, indicating that the hypothesis that the true model consists of just one component of attitude should be rejected. However, for the second sample, the $\chi^2$ value is not significant at the .01 level. Thus, contrary to the conclusions of B-B, the results from the two samples do not "establish" a two factor model, but instead yield mixed interpretations.
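
The difference test described above can be reproduced from the chi-square values quoted in the text. The following is a minimal sketch that uses only the Python standard library. It relies on the fact that a chi-square variate with one degree of freedom is the square of a standard normal variate, so its upper-tail probability can be written with the complementary error function. The function and variable names are illustrative and are not taken from B-B or from LISREL.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variate with 1 degree of freedom.

    With 1 df, P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)),
    where Z is a standard normal variate.
    """
    return math.erfc(math.sqrt(x / 2.0))


# Chi-square values as quoted in the text:
# (one-factor model, two-factor model) for each of the two samples.
reported = {
    "first sample": (23.91, 2.30),
    "second sample": (13.98, 8.30),
}

results = {}
for sample, (chi2_one_factor, chi2_two_factor) in reported.items():
    # The one-factor model is nested in the two-factor model, so the
    # difference in chi-square values is itself chi-square with 1 df.
    diff = chi2_one_factor - chi2_two_factor
    results[sample] = (diff, chi2_sf_1df(diff))

for sample, (diff, p_value) in results.items():
    print(f"{sample}: chi-square difference = {diff:.2f}, p = {p_value:.4g}")
```

With one degree of freedom the conventional critical values are 2.71 (.10 level), 3.84 (.05 level) and 6.63 (.01 level), so the printed p-values can be compared against whichever significance level is chosen.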



The approach of J-R was similar to B-B in that they also postulated a model which stated that attitude was a multi-dimensional concept and that the measures used were fallible. However, their model of the measurement process reflected the fact that the observed responses could also be affected by the method used to obtain the responses. More specifically, they postulated that the observed measures yijk are influenced not only by the level of the particular construct (Aij) they were intended to measure, but also that they were influenced by the method (Mik) used to obtain the measure. This conceptualization is schematically displayed in Figure 2 where the three A's represent three components of attitude and the three M's the three different methods used to obtain the nine y responses. Thus (using their notation) the observed response for person i for measure l on trait j using method k, is

$$y_{il} = \lambda_{lj} A_{ij} + \lambda_{lk} M_{ik} + e_{il}, \quad l = 1, 2, \ldots, 9; \; j = 1, 2, 3; \; k = 4, 5, 6; \; i = 1, 2, \ldots, n, \tag{8}$$

where the unobserved constructs (these being the A's and the M's) are allowed to be correlated.

The model as stated in Equation 8 has some interesting properties. First, the term $\lambda_{lk} M_{ik}$ "corrects" for method bias by postulating that everyone has an unobserved (but existing) true score on a method which is independent of any attitude measure. I know of no theoretical reason why a respondent would have such a score, nor do the authors provide the reader with one. In other words, although there are no theoretical grounds for invalidating their model of reality, I do not find the model intuitively appealing. Second, the authors assume that the effect of this "measurement bias" is linear and additive with the attitude score. Again, there is no justification for such an assumption. [In fairness to the authors, I must acknowledge that most model builders make the assumption of linear and additive terms since in general this assumption seems to be very robust.]

The second major observation about Equation 8 is more technical, but is probably more important. Although I haven't been able to prove it, I don't think their general model (i.e., Model 1 of their paper) is identified. This means that the estimates they report are not unique. My general approach to determining whether the model was identified was to write out the covariance matrix of the y's in terms of the model parameters (in a manner similar to my Equations 2 through 4). Then, after reducing the parameter space by making assumptions similar to those made by B-B, I attempted to uniquely determine the parameters in terms of the observed y's. To the best of my ability I was not able to do this. Moreover, I don't think it is possible to uniquely identify all of the parameters. I suspect that the authors did not attempt this analysis either but instead assumed their model was identified simply because the number of parameters estimated was less than the number of observations.

The problem of identification is always difficult in models as complex as that postulated by J-R. One of the standard ways of handling the identification problem is to finesse the type of analysis I attempted and instead input the specified model into the LISREL program and let the program "determine" identification via the information matrix. For identified models the information matrix will almost certainly be positive definite. Conversely, if the information matrix is not positive definite, the specified model is almost certainly not identified (Jöreskog and Sörbom, 1977). Since positive definite matrices can be inverted while non positive definite matrices cannot, inversion of the information matrix implies that the information matrix is positive definite. Thus, inversion is taken as "proof" that the model is identified. Unfortunately, inversion of the information matrix is only a necessary condition (in contrast to a sufficient condition) for identification. Thus even though I understand why J-R might have resorted to the above "rule of thumb," this rule may have misled them into believing that their general model is identified.

As further "proof" that this model is not identified, note that when the authors further restricted the general model (yielding their Models 2, 3 and 4) the program failed to converge. Lack of convergence is normally a sign that the model is not identified. Since it is impossible for a subset of a model not to be identified when a more general model is identified, my priors are that the general model is also not identified. Thus, their Model 1 results displayed in Figure 3 of their paper are meaningless.

The net result of the above discussion is that I believe it is impossible to identify (solve) for a general three-trait, three-method model where the methods and attitudes are allowed to be correlated using the J-R approach. However, I was able to develop a model which is identified (in fact many of the parameters are over-identified) that assumes that the observed score is affected by both the respondent's attitude and the type of measurement method used. This model is presented next.


Assume the data of Ostrom (1969) (i.e., the data used by J-R) where three measures are obtained for each of three different attitude components. Let yijk represent the observed measure for person i on attitude component j using measure k. Let Aij be person i's attitude score on component j and let Mik be person i's measurement score on measure k. Next, postulate the model

$$y_{ijk} = A_{ij} + M_{ik}, \quad l = 1, 2, 3; \; j = 1, 2, 3; \; k = 3(l - 1) + j, \tag{9}$$

where $A_1$, $A_2$ and $A_3$ are allowed to covary, as are the sets $\{M_1, M_2, M_3\}$, $\{M_4, M_5, M_6\}$ and $\{M_7, M_8, M_9\}$; otherwise, the constructs are assumed to be independent. In words, Equation 9 says that a person's score is made up of two random variables, an attitude score and a "measurement score" which represents the measurement error. A person's score on any one attitude is correlated with the score on the other two attitudes but is independent of the measurement error; similarly, the measurement error is correlated with errors on other measurements using the same method but independent of those not using the same method. Measures 1, 2 and 3 represent the Thurstone method, 4-6 the Likert method and 7-9 the Guttman method. The covariance matrix of the y's for this model is given in Table 1, where $\omega_{ij}$ is the covariance between attitude components i and j and $\Psi_{ij}$ is the covariance between measurement errors i and j. Clearly, the $\omega_{ij}$'s are over-identified and the $\Psi_{ij}$'s are just identified. [To see this, note that, for example, it is possible to uniquely estimate $\omega_{11}$ from three different elements in the matrix, i.e., row 4, column 1; row 7, column 1; and row 7, column 4. Once the $\omega_{ij}$'s are estimated, the 18 $\Psi_{ij}$'s can be estimated from the 18 different entries which include these parameters.]
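
The identification claim for Equation 9 can be checked mechanically. Under this model every element of the covariance matrix of the y's is a linear function of the parameters (an attitude covariance, plus a measurement error covariance when both measures share a method), so the parameters are uniquely determined exactly when the matrix of coefficients has full column rank. The following is a minimal sketch using only the Python standard library. The helper names and the labels "omega" (attitude covariances) and "psi" (error covariances) are illustrative choices, and the rank routine is ordinary Gaussian elimination rather than anything taken from LISREL.

```python
from fractions import Fraction
from itertools import combinations_with_replacement


def component(k):
    """Attitude component j of measure k, from k = 3(l - 1) + j."""
    return (k - 1) % 3 + 1


def method(k):
    """Method l of measure k (1 = Thurstone, 2 = Likert, 3 = Guttman)."""
    return (k - 1) // 3 + 1


# The 45 distinct entries (a <= b) of the 9 x 9 covariance matrix of the y's.
entries = list(combinations_with_replacement(range(1, 10), 2))

# Free parameters: 6 attitude (co)variances, 18 within-method error (co)variances.
attitude_params = list(combinations_with_replacement(range(1, 4), 2))
error_params = [(a, b) for (a, b) in entries if method(a) == method(b)]
params = [("omega", p) for p in attitude_params] + [("psi", p) for p in error_params]


def coefficient_row(a, b):
    """Coefficients of Cov(y_a, y_b) on each parameter (the model is linear)."""
    att = tuple(sorted((component(a), component(b))))
    return [
        Fraction(int(idx == (att if kind == "omega" else (a, b))))
        for kind, idx in params
    ]


def rank(matrix):
    """Rank by Gaussian elimination with exact rational arithmetic."""
    m = [row[:] for row in matrix]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            if m[i][col] != 0:
                factor = m[i][col] / m[r][col]
                m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r


coefficients = [coefficient_row(a, b) for (a, b) in entries]
model_rank = rank(coefficients)

# Number of cross-method entries that determine each attitude covariance alone.
cross_method_uses = {
    p: sum(
        1
        for (a, b) in entries
        if method(a) != method(b)
        and tuple(sorted((component(a), component(b)))) == p
    )
    for p in attitude_params
}

print("distinct covariance entries:", len(entries))
print("free parameters:", len(params))
print("rank of coefficient matrix:", model_rank)
print("entries determining each attitude covariance:", cross_method_uses)
```

Full column rank means every parameter is identified. The count of cross-method entries per attitude covariance shows where the over-identification comes from, while each error covariance appears in exactly one entry.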

The above model has some nice properties. First, it is possible to test for the dimensionality of the unobserved construct attitude. Second, it acknowledges that the observed measures may be correlated for two reasons, i.e., the measures tap correlated constructs and they are derived from similar methods. Finally, it partitions the response variance of $y_{jk}$ into two components, i.e., $\omega_{jj}$ and $\Psi_{kk}$.

As with previously discussed models, this specification makes a number of assumptions. First, it assumes that the person's measurement error is not affected by the level of a person's attitude. Thus, for example, a person with a favorable attitude is no more likely to have a high (low) measurement error than one with an unfavorable attitude. Second, it assumes that measurement errors associated with different methods are independent. Finally, it assumes that the attitude score and measurement error are additive. Although I offer no proof as to the veracity of these assumptions they do not seem to strain one's credibility.

It should be noted that the model is stated in terms of covariances rather than correlations. Moreover, it is not possible to arbitrarily rescale the model to get the y's to have unit variances without imposing restrictions on the magnitude of the variances of the unobserved variables. Thus, if the correlation matrix were used instead of the covariance matrix, the parameter space would be further restricted so that

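$$\omega_{11} + \Psi_{11} = 1, \quad \omega_{11} + \Psi_{44} = 1, \quad \text{and} \quad \omega_{11} + \Psi_{77} = 1,$$

or $\Psi_{11} = \Psi_{44} = \Psi_{77}$.

Likewise $\Psi_{22} = \Psi_{55} = \Psi_{88}$ and $\Psi_{33} = \Psi_{66} = \Psi_{99}$.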

Clearly this is not a desired model property since there is no reason to believe that the measurement errors for a given attitude component have the same variance under all three methods.

The above indicates that it is improper to use the correlation matrix as input. Unfortunately, Ostrom reported only the correlation matrix. Although there is some indication that he tried to ensure that the variances for the observed scores for a given method are equal, use of the correlation matrix to estimate model parameters is incorrect. However, given the alternatives of doing nothing or doing something, I elected the latter since it appeared that the variances of the y's for a given method were approximately equal. Consequently, I did not feel that there was too much error in this case in using the correlation matrix to estimate the model parameters. These estimated model parameter values are reported in Table 2.



The results indicate that it is not possible to reject the hypothesis that three components of attitude exist and that the measurement process is correct as modeled. The estimated correlations between the three attitude components indicate a high degree of commonality between the components (i.e., between .79 and .91), although the estimated correlations are lower than those reported by J-R. In general, the Likert method seemed to produce the smallest error variance, with Thurstone being next best, followed by the Guttman method. Interestingly, the correlations between the scores on measures using the same method were usually very small. The only method to exhibit significant (managerially and statistically) correlations was the Likert method.


The above discussion is intended to supplement B-B's and J-R's presentation of the structural equation approach used by both sets of authors to test for convergent and discriminant validity. One of the major themes of this discussion is that if researchers are to use this approach they must first carefully think out all the assumptions which are implicitly or explicitly tied to the technique. In my opinion the J-R paper does not reflect this careful thinking, in that the authors present a model which has no theoretical justification and is not identified.

A second aim of my discussion is to put forward a model which simultaneously takes into account the influences on the observed responses of the constructs being tested, any relationships between the constructs, and the methods used to obtain these measures. This structural model is an outgrowth of a model suggested by Bagozzi (1978) and implemented by J-R. Unfortunately the B-B paper does not use either the model proposed by Bagozzi or a variation of this model when testing the validity of the two-component model of attitude, even though the measures used to test the theory were obtained using the same method (see Figure 3 of the B-B paper). Consequently, it is not possible for them to test whether the estimated relationship between constructs A2 (affect) and B (behavioral intentions) is due to the measurement technique or to the basic underlying relationship between the two variables. Hopefully, future work in this area will control for such confounding influences by incorporating my approach or some improved variation.


The emphasis of the Cattin paper is on obtaining a model which has high predictive power. This is in contrast with the two previously discussed papers, which were concerned with testing the validity of measures used to test a theory. Thus the only real commonality between Cattin's paper and the papers of B-B and J-R is the word "validity."

The Cattin paper first reviewed six formulae which estimate the correlation between an observation and the prediction of that observation (based on the regression model). It then presented an example which indicates for the data set selected that it is better to use an estimate of the population cross validation correlation than the more commonly used sample correlation estimate when selecting between regression models. Cattin offered no proof as to the generalizability of these results although he did review a series of Monte Carlo studies aimed at determining the properties of each estimator.

My major concern with the Cattin paper is the emphasis on predictive power versus theory as the criterion for selecting a model. Thus I don't believe a model normally should be selected just because it has a high population cross validation correlation. Instead, in most consumer behavior research, a researcher should start with a theory. This theory will determine the predictor variables and the functional form of the model. If the researcher wants to conduct an F test to determine if a subset of the model adequately represents the data, this is certainly permissible. However, selecting a model just because it has the highest predictive power is atheoretical and should be avoided in most consumer research settings. Thus, I am not in disagreement with Cattin over any technical issue. Instead, I question the value of the cross validation correlation coefficient in most consumer research situations. The only time this statistic would be of value is when a researcher wanted to develop a model that predicts well for different data sets. Even then, I would not just use the cross validation correlation coefficient in selecting the model, but also blend in my opinions as to the soundness of the alternative models.




R. P. Bagozzi, "The Construct Validity of the Affective, Behavioral and Cognitive Components of Attitude by Analysis of Covariance Structures," Multivariate Behavioral Research, 13 (January, 1978), 9-31.

M. Fishbein and I. Ajzen, "Attitude Towards Objects as Predictors of Single and Multiple Behavioral Criteria," Psychological Review, 81, (1974), 59-74.

K. G. Jöreskog and D. Sörbom, "Statistical Models and Methods for Analysis of Longitudinal Data." In Latent Variables in Socioeconomic Models, ed. by D. J. Aigner and A. S. Goldberger, North Holland, Amsterdam 1977.

K. G. Jöreskog and M. van Thillo, LISREL: A general computer program for estimating a linear structural equation system involving multiple indicators of unmeasured variables. Educational Testing Service Research Bulletin, (1972), 72-156.

T. M. Ostrom, "The Relationship Between the Affective, Behavioral, and Cognitive Components of Attitudes," Journal of Experimental Social Psychology, 5 (1969), 12-30.