A Comparison of Causal Path and Econometric Modeling Approaches

Leonard J. Parsons, Georgia Institute of Technology
ABSTRACT - A tutorial is given on causal path and econometric modeling in consumer behavior. The conclusion is that conventional path-analytic techniques add little to the more substantial body of econometric theory and practice.
Citation: Leonard J. Parsons (1981), "A Comparison of Causal Path and Econometric Modeling Approaches," in Advances in Consumer Research, Volume 8, ed. Kent B. Monroe, Ann Arbor, MI: Association for Consumer Research, 203-208.

Advances in Consumer Research Volume 8, 1981      Pages 203-208


INTRODUCTION

Path analysis has been the primary methodology used in a spate of recent marketing papers (Lutz 1977, Moschis and Moore 1978, Bearden, Gustafson, and Mason 1978, Teas, Wacker, and Hughes 1979, Bearden, Teal, and Crockett 1980). These papers may have been influenced by an earlier piece by Christopher and Elliott (1970).

On the other hand, econometrics has had a long tradition in the study of buyer behavior. Applications range from an evaluation of the hypothesis of a hierarchy of effects (Palda 1966) to a test of Howard and Sheth's model of behavior (Farley and Ring 1970, 1972).

The purpose of this paper is to provide a brief tutorial on econometric and path analytic modeling. The goal is to assess what, if anything, path analysis adds to conventional econometric methodology. The paper begins with a review of selected aspects of econometrics, follows with an econometric reanalysis of a marketing path-analytic investigation, reviews the technique of path analysis, redoes the previous example from the perspective of path analysis, and ends with a comparison of the approaches. The discussion will be restricted to linear models and will omit the detail underlying the methodologies.

ECONOMETRIC MODELING

Parsons and Schultz (1976) argued that the scope of econometric research as applied to marketing must be broadened to include theory development as well as measurement and testing processes. They were interested in causal modeling and set forth eight "standards" necessary for econometric studies to qualify as causal models. They suggested that many applications of regression analysis in marketing failed to meet their criteria.

The construction of causal models begins by developing a theory of marketing behavior and expressing this theory as a set of mathematical relations, that is, a model. The existence of a theory which predicts a relationship among the variables within an equation causes the equation to be called structural. This theory provides the a priori information necessary to perceive the structure to be estimated, that is, to distinguish it from the other structures capable of generating the observed data. This is known as the identification problem.

A linear model can be expressed in matrix notation as

YG + XB = E.   (1)

Only models containing as many equations as current endogenous variables will be considered. When this condition is met, the system is said to be complete. Suppose there are n observations on a system of L equations. Then the data on the current endogenous variables are placed in the matrix Y. The observations on the predetermined variables, either exogenous or lagged endogenous or both, are found in the matrix X. Since the system is complete, the matrix G is assumed to be nonsingular. The coefficients of the predetermined variables in each equation are contained in the matrix B.

EQUATION   (2)

The equations have been arranged so that the ith variable in the ith equation is that equation's dependent variable.

Finally, the disturbances are contained in the matrix E. The rows of the disturbance matrix are assumed to be stochastically independent and identically distributed with zero mean vector and an unknown but finite covariance matrix S.

EQUATION    (3)

The matrix S, the contemporaneous covariance matrix of disturbances in different equations, is the same for all periods.

Our discussion will be facilitated by a classification scheme based upon the nature of the matrix of coefficients of the current endogenous variables G and the contemporaneous covariance matrix S. The first step is to examine the matrix G. If the matrix G is diagonal, the matrix S is examined. If the matrix S is diagonal, there are no relationships among the equations and each can be treated as a separate single equation model. If it is not diagonal, there is a seemingly unrelated equations system. If the matrix G is triangular and the matrix S diagonal, there is a recursive equations system. For the remaining categories involving the matrix G and the matrix S, there is an interdependent simultaneous equations system. Figure 1 presents a diagram of this classification process.

FIGURE 1

CLASSIFICATION OF SIMULTANEOUS EQUATION SYSTEMS
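The classification just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, G and S are passed as square lists of lists, and the triangularity check takes the equations in the order given (it does not search for a reordering that would make G triangular).

```python
def _is_diagonal(m, tol=1e-12):
    """True if every off-diagonal element of the square matrix m is (near) zero."""
    n = len(m)
    return all(abs(m[i][j]) <= tol for i in range(n) for j in range(n) if i != j)


def _is_triangular(m, tol=1e-12):
    """True if m is upper or lower triangular in the given ordering."""
    n = len(m)
    below_zero = all(abs(m[i][j]) <= tol for i in range(n) for j in range(i))
    above_zero = all(abs(m[i][j]) <= tol for i in range(n) for j in range(i + 1, n))
    return below_zero or above_zero


def classify_system(G, S):
    """Classify a complete linear system from G (coefficients of the current
    endogenous variables) and S (contemporaneous disturbance covariances)."""
    if _is_diagonal(G):
        # No endogenous variable appears in another variable's equation.
        if _is_diagonal(S):
            return "separate single equations"
        return "seemingly unrelated equations"
    if _is_triangular(G) and _is_diagonal(S):
        return "recursive"
    return "interdependent"


# Illustrative (hypothetical) two-equation system with a triangular G
# and uncorrelated disturbances.
print(classify_system([[1.0, -0.2], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]]))
```

Note that a diagonal G is also triangular, so the diagonal case has to be tested first, exactly as in the verbal scheme.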

The structural equation system can be transformed into a logically equivalent system of equations in which each member equation contains only one endogenous variable:

Y = -XBG^{-1} + EG^{-1} = XP + U.  (4)

This new system is called the reduced form. Each reduced-form equation contains all the predetermined variables in an equation system. The contemporaneous covariance matrix of the reduced-form disturbances is

W = (G')^{-1} S G^{-1}.  (5)

The ability to express the structural parameters as explicit functions of the reduced-form parameters is unfortunately not automatic and, indeed, is sometimes impossible. Determination of whether there is a one-to-one correspondence between the structural parameters and the reduced-form parameters is called the identification problem. Identification is logically prior to estimation. A discussion of how to test whether the relationships in a system are identifiable, as well as of how to apply such tests to Farley and Ring's operational version of the Howard-Sheth model of buyer behavior, is given in Parsons and Schultz (1976).

If the identifiability condition is met, the parameters of the model can then be estimated. The alternative techniques include ordinary least squares, generalized least squares, seemingly unrelated equations estimation, indirect least squares, two-stage least squares, and three-stage least squares. Parsons and Schultz (1976, pp. 65-78) suggest which technique to use depending upon the structure of a model.

AN EXAMPLE

The heart of a recent investigation of life satisfaction among elderly consumers (Bearden, Gustafson, and Mason 1978) is a two-equation model. The variables can be defined in deviation form and consequently there are no intercepts shown in the equations. The first relationship says that satisfaction with level of living, X4, is a function of the level of alienation, X3, and a disturbance term, e4, or

X4 = b43 X3 + e4   (6)

The second relationship postulates that overall life satisfaction, X6, is a function of living level satisfaction (now an intervening variable), alienation, health situation, X1, and another disturbance term, e6, or

X6 = b61 X1 + b63 X3 + b64 X4 + e6    (7)

The disturbances are assumed to be independent of one another, i.e., s46 = 0. Moreover, alienation and health situation are explicitly believed to be intercorrelated. Alienation affects overall life satisfaction indirectly through living level satisfaction as well as directly. The data for this example are given in Table 1.

TABLE 1

VARIABLE CORRELATIONS, MEANS, AND STANDARD DEVIATION

Using the matrix notation introduced earlier, the key matrices for this two-equation system are

EQUATION

Given our classification scheme, this simultaneous equation system is said to be recursive.

The parameters of the corresponding reduced-form equations are

P = -BG^{-1}    (8)

or

EQUATION    (9)
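The reduced-form calculation P = -BG^{-1} can be checked numerically for this example. The sketch below rests on assumptions that are not in the text: the sign convention writes each equation with all terms moved to the left-hand side (for instance, X4 - b43 X3 = e4), Y is ordered (X4, X6), X is ordered (X1, X3), and the standardized coefficients reported in this paper (.259, .209, .313, .234) stand in for the structural parameters. The helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def inverse_2x2(m):
    """Inverse of a 2 x 2 matrix; G must be nonsingular."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def reduced_form(G, B):
    """Reduced-form coefficient matrix P = -B G^{-1} (equation 8)."""
    neg_B = [[-x for x in row] for row in B]
    return matmul(neg_B, inverse_2x2(G))


# Standardized coefficients reported in the paper, used here for illustration.
b43, b61, b63, b64 = 0.259, 0.209, 0.313, 0.234

# Assumed convention: columns are equations (X4, X6); rows of G are the
# endogenous variables (X4, X6); rows of B are the predetermined variables (X1, X3).
G = [[1.0, -b64],
     [0.0, 1.0]]
B = [[0.0, -b61],
     [-b43, -b63]]

P = reduced_form(G, B)
# Under these assumptions P works out to [[0, b61], [b43, b63 + b64*b43]].
print(P)
```

Under this convention the X3 row of the X6 column is b63 + b64 b43, the direct effect of X3 plus the effect transmitted through X4.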

Because the model is recursive, each equation can be estimated by ordinary least squares regression. The least squares estimator for a given equation is

B = (X'X)^{-1} X'y   (10)

If we write

Sxx = (X'X)/n and sxy = (X'y)/n

we can express the estimator in terms of covariances

B = (Sxx)^{-1} sxy.   (11)

For the first equation the estimated parameter is simply

b43 = (s33)^{-1}(s34) = (147.866)^{-1} (1.984) = .0134.
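The equivalence of the data form (equation 10) and the covariance form (equation 11) can be illustrated for the single-regressor case. This is a minimal sketch: the function names and the small data set are hypothetical, while the two moments 147.866 and 1.984 are the values quoted in the text for the first equation.

```python
def center(values):
    """Express a variable in deviation form (subtract its mean)."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def ols_slope_from_data(x, y):
    """Single-regressor version of (X'X)^{-1} X'y on deviation-form data."""
    xd, yd = center(x), center(y)
    return sum(a * b for a, b in zip(xd, yd)) / sum(a * a for a in xd)


def ols_slope_from_moments(s_xx, s_xy):
    """Single-regressor version of (Sxx)^{-1} sxy."""
    return s_xy / s_xx


# Hypothetical data, used only to show that the two forms agree.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)
xd, yd = center(x), center(y)
s_xx = sum(a * a for a in xd) / n
s_xy = sum(a * b for a, b in zip(xd, yd)) / n
assert abs(ols_slope_from_data(x, y) - ols_slope_from_moments(s_xx, s_xy)) < 1e-12

# Moments quoted in the text for the first equation.
b43 = ols_slope_from_moments(147.866, 1.984)
print(round(b43, 4))
```

The division by n cancels between the numerator and the denominator, which is why either form gives the same estimate.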

The estimated parameters in the second equation are

EQUATION

Sometimes there is interest in the "relative contribution" of each term in the regression and the beta coefficients are calculated. The beta coefficient is

betaij = (sj/si)bij

where xi is the dependent variable and xj is a predetermined variable. This is equivalent to running ordinary least squares regression on standardized variables. The result is that the estimator is expressed in terms of the sample correlation matrix

beta = (Rxx)^{-1} rxy.   (12)

Thus, beta43 = (1)^{-1} r34 = .259

and for the second equation

EQUATION    (13)

A summary of these results, as well as those for the reduced-form equations, is given in Table 2.

TABLE 2

REGRESSION RESULTS

CAUSAL PATH ANALYSIS

Causal path analysis was developed by a geneticist (Wright 1921, 1934) as a means of relating correlation coefficients between variables in a system to the functional relations among them. Path analysis has been used in psychology (Werts and Linn, 1970). However, its main application has been in sociology (Duncan 1966, Land 1969). The technique appears in multivariate analysis texts for the social sciences such as Van de Geer (1971, pp. 112-27).

Path analysis usually begins with the construction of a path diagram. Land (1969, pp. 6-7) states that path diagrams are drawn according to the following conventions. First, the postulated causal relations among variables of the system are represented by unidirectional arrows extending from each determining variable to each variable dependent on it. Second, the postulated noncausal correlations among the exogenous variables in the system are symbolized by two-headed curvilinear arrows. Third, residual variables are represented by unidirectional arrows going from the residual variable to the dependent variable. Finally, the quantities entered beside the arrows are the symbolic or numerical values of the path and correlation coefficients of the postulated causal and correlational relationships.

A path model refers to the set of structural equations representing the postulated causal and noncausal relationships among the variables under consideration. Traditionally, the variables are specified in standard unit form; that is, each variable is divided by its standard deviation as well as being centered. The symbolic form of the path coefficient is pij, where the first subscript i denotes the dependent variable and the second subscript j denotes the variable whose determining influence is under consideration.

The path diagram for the earlier example involving life satisfaction among elderly consumers is

FIGURE

The corresponding path model is

x4 = p43x3 + e4   (14)

x6 = p61x1 + p63x3 + p64x4 + e6   (15)

The next step in path analysis is to multiply through each equation in the model by a predetermined variable, take expected values, and express the result in terms of path coefficients and correlations. For instance, application of this procedure to the x4 equation yields

E(x1x4) = p43 E(x1x3) + E(x1e4).    (16)

Earlier the exogenous variables were assumed to be independent of the disturbances so that E(xiej) = 0. The key is the fact that the variables are in standard form. This means that E(xixj) = sij, the population correlation between xi and xj. Our last equation becomes

s14 = p43 s13    (17)

Similarly, we find that

s34 = p43.   (18)

Repeating this operation for the x6 equation and estimating sij by the sample correlation, rij, we obtain

r14 = p43r13   (19)

r34 = p43   (20)

r16 = p61 + p63r13 + p64r14   (21)

r36 = p61r13 + p63 + p64r34   (22)

r46 = p61r14 + p63r34 + p64   (23)

We now could solve for the p's in terms of the r's,

p43 = r43    (24)

and

EQUATION   (25)

But these are the same estimates that one obtains from ordinary least squares regression (see equation 13)!
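The point can be illustrated by solving equations 21 through 23 directly as a linear system in p61, p63, and p64. The sketch below is hedged in one important respect: the correlation r14 is not reproduced in this text, so the value 0.15 is an assumption chosen for illustration, not a reported figure. The remaining correlations are those quoted in the paper, and the solver is a generic Gauss-Jordan routine with a hypothetical name.

```python
def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


# Correlations quoted in the paper.
r13, r34 = 0.374, 0.259
r16, r36, r46 = 0.361, 0.452, 0.346
# ASSUMPTION: r14 is not given in this text; 0.15 is illustrative only.
r14 = 0.15

# Equations 21-23 in matrix form: Rxx p = rxy, the same normal equations
# that ordinary least squares solves on standardized variables.
Rxx = [[1.0, r13, r14],
       [r13, 1.0, r34],
       [r14, r34, 1.0]]
rxy = [r16, r36, r46]

p61, p63, p64 = solve_linear(Rxx, rxy)
print(round(p61, 3), round(p63, 3), round(p64, 3))
```

With the assumed r14, the solution comes out close to the path coefficients .209, .313, and .234 used in the decompositions that follow.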

Conversely, we could solve for the r's in terms of the p's. This would permit us, where relevant, to decompose the dependent variable. One could calculate the relative contributions of the components to the variation in the dependent variable and determine how causes affecting the dependent variable are transmitted by means of the respective intervening variables.

More specifically, the total association between two variables is given by their zero-order correlation. The total effect of one variable on another is the part of their total association which is not due to noncausal components. These noncausal components, sometimes called spurious, are due to common causes, to correlation among their causes, or to unanalyzed correlation. The total effect is composed of both direct and indirect effects. Indirect effects are effects which are transmitted or mediated by variables specified as intervening between the cause and effect under study. The direct effect is that effect which is not transmitted by means of intervening variables (Alwin and Hauser 1975).

The zero-order correlations found from manipulation of the x6 equation can be interpreted in the following manner.

r16 = p61 + p63r13 + p64p43r13   (26)

.361 > .209 + (.313)(.374) + (.234)(.259)(.374)

        > .209 +        .117      +      .023

The total association is r16; the direct effect, p61; the direct correlation with another cause, p63r13; the indirect correlation with another cause, p64p43r13. The remainder is unanalyzed correlation. The second relationship is

r36 = p61r13 + p63 + p64p43   (27)

.452 = (.209)(.374) + .313 + (.234)(.259)

        =       .078       + .313 +      .061

The total association is r36; the direct effect, p63; the indirect effect, p64p43; the correlation with another cause, p61r13. The third relationship is

r46 = p61p43r13 + p63p43 + p64   (28)

.346 > (.209)(.259)(.374) + (.313)(.259) + .234

        >       .020                 +      .081        + .234

The total association is r46; the direct effect, p64; the direct correlation with another cause, p63p43; the indirect correlation with another cause, p61p43r13. Again, the remainder is unanalyzed correlation.
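The three decompositions can be reproduced with a few lines of arithmetic. The sketch below uses only the path coefficients and correlations quoted in equations 26 through 28; the function name and the dictionary layout are illustrative choices, not part of the original analysis.

```python
def decompose():
    """Return, for each correlation, its components (equations 26-28), the
    implied total, and the remainder relative to the observed correlation."""
    p43, p61, p63, p64 = 0.259, 0.209, 0.313, 0.234
    r13 = 0.374
    observed = {"r16": 0.361, "r36": 0.452, "r46": 0.346}
    components = {
        "r16": [p61, p63 * r13, p64 * p43 * r13],   # equation 26
        "r36": [p61 * r13, p63, p64 * p43],         # equation 27
        "r46": [p61 * p43 * r13, p63 * p43, p64],   # equation 28
    }
    result = {}
    for name, parts in components.items():
        implied = sum(parts)
        result[name] = {
            "components": parts,
            "implied": implied,
            "remainder": observed[name] - implied,  # unanalyzed correlation
        }
    return result


for name, info in decompose().items():
    parts = " + ".join(f"{c:.3f}" for c in info["components"])
    print(f"{name}: {parts} = {info['implied']:.3f}, "
          f"remainder {info['remainder']:.3f}")
```

The remainders for r16 and r46 are positive, which is why those two relationships are written with inequality signs, while the decomposition of r36 reproduces the observed correlation to three decimals.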

These algebraic relationships can be read directly from the path diagram according to the following rule. Read back from variable xi, then forward to variable xj, forming the product of all paths along the traverse; then sum these products for all possible traverses. In no case can one trace back having started forward. The bi-directional correlation is used in tracing either forward or back (Duncan 1966, p. 6). Thus, the correlation between x1 and x4, r14, simply equals (1) the indirect correlation with another cause (x3), p43r13.

EQUATION

The correlation r34 is equal to (1) the direct effect, p43.

EQUATION

The correlation r16 is equal to (1) the direct effect, p61; (2) the direct correlation with another cause, p63r13; (3) the indirect correlation with another cause, p64p43r13.

EQUATION

The correlation r36 is equal to (1) the direct effect, p63; (2) the indirect effect, p64p43; (3) the correlation with another cause, p61r13.

EQUATION

Finally, the correlation r46 is equal to (1) the direct effect, p64; (2) the correlation with a common cause, p63p43; (3) the indirect correlation with another cause, p61p43r13.

EQUATION

Reading of the path diagram must be done with care. A more prudent strategy would be to depend on algebraic manipulation.

Although path analysis provides some information about the noncausal associations among variables in a system, the focus is nonetheless on the causal influences, whether direct or indirect. We have already seen that direct effects can be estimated from the structural equations using econometrics. Now we will observe that the total effects can be estimated from the reduced-form equations using econometrics (Alwin and Hauser 1975). The only indirect effect in our example occurs along the path x3 → x4 → x6, which yields the effect p64p43. The corresponding direct effect of x3 on x6 is p63. The total effect is then p63 + p64p43. Equation 9 shows that this total effect is given by the appropriate reduced-form equation.

CONCLUSIONS

There is considerable discussion in the literature about the conditions under which to use standardized measures such as path coefficients in contrast to unstandardized regression coefficients (Tukey 1954, Turner and Stevens 1959, Wright 1960a, Blalock 1967). One participant concluded that "Path coefficients are . . . appropriate if one wishes to measure the actual amount of impact that each variable has on the others in a given population. On the other hand, unstandardized regression coefficients seem more appropriate for describing . . . causal laws." (Blalock 1967, p. 675). In any event, both sets of coefficients may be obtained by econometric means.

Although attempts have been made to extend path analysis to interdependent systems (Wright 1960b), applications have focused on recursive systems. The danger is that the models have been formulated as recursive more because of tractability than because of theoretical soundness. The errant view that only recursive systems permit causal interpretation (Strotz and Wold 1960) was long ago debunked (Basmann 1963). Moreover, even where a recursive model might be the correct causal priority, the adjustment mechanism (feedback) might be rapid relative to any practical inter-measurement period. See, for instance, Farley and Lehmann (1977). Consequently, the empirical model would still be interdependent.

Path analysis adds little that is not obtainable from econometric analysis, which is more highly developed and broader in its coverage. Indeed, the founder of path analysis (Wright 1960b, p. 444) was reduced to offering the following apology: "Path analysis is an extension of the usual verbal interpretation of statistics, not the statistics themselves . . . The purpose of path analysis is to determine whether a proposed set of interpretations is consistent throughout." Even the sociologists seem to be moving away from path analysis to structural equation analysis (Duncan 1975, Heise 1975). The latter seems to be an attempt to present econometric concepts without the use of matrix algebra.

Path analysis does not seem to warrant a high priority for study by buyer behaviorists. One's time would be better spent trying to understand the strengths and weaknesses of the various approaches to making causal inferences for unobservable variables (Goldberger 1974, Griliches 1974, Wold 1975, Hui and Jagpal 1979, Jagpal and Hui 1980, Joreskog and Sorbom 1979, Bagozzi 1980).

REFERENCES

Alwin, Duane F., and Hauser, Robert M. (1975), "The Decomposition of Effects in Path Analysis," American Sociological Review, 40, 37-47.

Bagozzi, Richard P. (1980), Causal Models in Marketing, New York: John Wiley & Sons.

Basmann, Robert L. (1963), "The Causal Interpretation of Non-Triangular Systems of Economic Relations," Econometrica, 31, 439-48.

Bearden, William O., Gustafson, A. William and Mason, J. Barry (1978), "A Path-Analytic Investigation of Life Satisfaction Among Elderly Consumers," in Proceedings, ed. William L. Wilkie, Ann Arbor: Association for Consumer Research, 386-91.

Bearden, William O., Teal, Jesse E. and Crockett, Melissa (1980), "A Path Model of Consumer Complaint Behavior," in Proceedings, eds. Richard P. Bagozzi et al., Chicago: American Marketing Association, 101-4.

Blalock, H. M. (1967), "Path Coefficients Versus Regression Coefficients," American Journal of Sociology, 72, 675-6.

Christopher, M. G. and Elliott, C. K. (1970), "Causal Path Analysis in Market Research," Journal of the Market Research Society, 12, 112-24.

Duncan, Otis Dudley (1966), "Path Analysis: Sociological Examples," American Journal of Sociology, 72, 1-16.

Duncan, Otis Dudley (1975), Introduction to Structural Equation Models, New York: Academic Press.

Farley, John U. and Lehmann, Donald R. (1977), "An Overview of Empirical Applications of Buyer Behavior System Models," in Proceedings, ed. William D. Perreault, Jr., Atlanta: Association for Consumer Research, 337-41.

Farley, John U. and Ring, L. Winston (1970), "An Empirical Test of the Howard-Sheth Model of Buyer Behavior," Journal of Marketing Research, 7, 427-38.

Farley, John U. and Ring, L. Winston (1972), "On L and R and HAPPISIMM," Journal of Marketing Research, 9, 349-53.

Goldberger, Arthur S. (1974), "Unobservable Variables in Econometrics," in Frontiers in Econometrics, ed. Paul Zarembka, New York: Academic Press, 193-213.

Griliches, Zvi (1974), "Errors in Variables and Other Unobservables," Econometrica, 42, 971-98.

Heise, David R. (1975), Causal Analysis, New York: John Wiley.

Hui, Baldwin S. and Jagpal, Harsharanjeet S. (1979), "The Partial Least Squares Approach to Causal Marketing Models with Latent Variables," in Proceedings, eds. Neil Beckwith et al., Chicago: American Marketing Association, 90-2.

Jagpal, Harsharanjeet S. and Hui, Baldwin S. (1980), "Consumer Behavior Models with Unobservables: Measurement Reliability, Internal Consistency and Theory Validation," in Proceedings, eds. Richard P. Bagozzi et al., Chicago: American Marketing Association, 358-61.

Joreskog, Karl G. and Sorbom, Dag (1979), Advances in Factor Analysis and Structural Equation Models, Cambridge: Abt Books.

Land, Kenneth C. (1969), "Principles of Path Analysis," in Sociological Methodology 1969, ed. Edgar F. Borgatta, San Francisco: Jossey-Bass, 3-37.

Lutz, Richard J. (1977), "An Experimental Investigation of Causal Relations Among Cognitions, Affect, and Behavioral Intentions," Journal of Consumer Research, 3, 197-208.

Moschis, George P. and Moore, Roy L. (1978), "Family Communication and Consumer Socialization," in Proceedings, ed. William L. Wilkie, Ann Arbor: Association for Consumer Research, 359-63.

Palda, Kristian S. (1966), "The Hypothesis of a Hierarchy of Effects: A Partial Evaluation," Journal of Marketing Research, 3, 13-24.

Parsons, Leonard J. and Schultz, Randall L. (1976), Marketing Models and Econometric Research. New York: North-Holland.

Strotz, R. H. and Wold, H. O. A. (1960), "Recursive vs. Nonrecursive Systems: An Attempt at Synthesis," Econometrica, 28, 417-27.

Teas, R. Kenneth, Wacker, John G. and Hughes, R. Eugene (1979), "A Path Analysis of Causes and Consequences of Salespeople's Perceptions of Role Clarity," Journal of Marketing Research, 16, 355-69.

Tukey, John W. (1954), "Causation, Regression, and Path Analysis," in Statistics and Mathematics in Biology, eds. Oscar Kempthorne et al., Ames, Iowa: Iowa State College Press, 35-66.

Turner, Malcolm E. and Stevens, Charles D. (1959), "The Regression Analysis of Causal Paths," Biometrics, 15, 236-58.

Van de Geer, John P. (1971), Introduction to Multivariate Analysis for the Social Sciences, San Francisco: W. H. Freeman and Co., 112-27.

Werts, Charles E. and Linn, Robert L. (1970), "Path Analysis: Psychological Examples," Psychological Bulletin, 74, 193-212.

Wold, Herman (1975), "Path Models with Latent Variables: The NIPALS Approach," in Quantitative Sociology, ed. H. M. Blalock et al., New York: Academic Press, 307-57.

Wright, Sewall (1921), "Correlation and Causation," Journal of Agricultural Research, 20, 557-85.

Wright, Sewall (1934), "The Method of Path Coefficients," Annals of Mathematical Statistics, 5, 161-215.

Wright, Sewall (1960a), "Path Coefficients and Path Regressions: Alternative or Complementary Concepts?" Biometrics, 16, 189-202.

Wright, Sewall (1960b), "The Treatment of Reciprocal Interaction, With or Without Lag, in Path Analysis," Biometrics, 16, 423-45.
