# Investigating Causal Systems With Qualitative Variables: Goodman's Wonderful World of Logits

To cite: William R. Dillon (1981), "Investigating Causal Systems With Qualitative Variables: Goodman's Wonderful World of Logits", in NA - Advances in Consumer Research Volume 08, eds. Kent B. Monroe, Ann Arbor, MI: Association for Consumer Research, Pages: 209-219.

Direct URL: http://acrwebsite.org/volumes/9813/volumes/v08/NA-08

INTRODUCTION

This paper is one of three directed at the important topic of causal modeling and its role in consumer research. The stimulating papers of my colleagues Professors Bagozzi and Parsons, appearing in this session, speak well to the issues that surface when causal systems are investigated in which the data have (continuous) quantitative components with at least interval scale properties; however, in many circumstances the available measurements are categorical in nature, having less than precise qualities, yet we may still wish to consider various causal hypotheses either of the temporal or nontemporal kind.

A point I would like to make early on is that in the case of ordinal data I favor the use of standard path analysis techniques. While I am fully aware that ordinal-based results can be altered by applying a suitable transformation, the Monte Carlo work suggests that violations of the interval assumption may not be very consequential, and this, coupled with the fact that in most behavioral and consumer research settings the ordinal variables will generally have more than just two or three levels, leads me to be more sanguine than some might be about the use of path analytic methods on ordinal scaled data.

This is not to say, however, that all caution should be discarded and that causal models of the Duncan (1966, 1975) or Wright (1960) variety be applied regardless of the distributional form of the data analyzed. Quite the contrary, the implicit or explicit assumptions of these models and methods that are violated when the variables are measured with only a small number of categories appear too severe to be ignored. Thus, if more than just heuristic or descriptive information is sought, the challenge is to find alternative approaches to modeling observable phenomena more compatible with the qualitative nature of the available responses. (The issue of unobservable variables and latent class models is also taken up in a later section.)

My remarks today are restricted to the case where the qualitative nature of the data is intrinsic to the attribute under study and not induced from some arbitrary collapsing of what would otherwise be a continuous variable. (I make this distinction since in the latter case the analysis may be best carried out by leaving the variables in their natural form and applying standard path analysis techniques. ) There are a number of approaches to causal modeling with qualitative variables that are in the spirit of structural equation models; for example, Boudon (1965, 1967), Coleman (1964), Goodman (1972, 1974a), Lazarsfeld (1971) and Lazarsfeld and Henry (1968) are just a few of the social scientists who have made contributions to the area. Perhaps the closest analog to path analysis for qualitative variables is Goodman's loglinear/logit approach. Essentially, Goodman's system results in path diagrams, based on what usually is a series of logit specifications, and possesses the following attractive features:

It provides the functional equivalent of structural equation models for qualitative data as well as the statistical tools necessary for testing.

It is flexible in the sense of being able to handle temporal and nontemporal systems, and the case of unobservable variables.

It can be easily implemented with currently available canned computer software packages suitable for analyzing multiway contingency tables.

The remainder of this paper represents an attempt, albeit an ambitious one, to satisfactorily do justice to the work of Goodman. While the primary intent of the piece is expository, I will attempt to address what I consider to be some interesting issues and problem areas. In preparation for the use of logit models as vehicles for investigating causal hypotheses, I begin the discussion with some basic ideas on fitting loglinear and logit models and, in addition, present the concept of spurious correlation in the context of cross-classified data.

SOME BASIC IDEAS

This section provides the foundation for the subsequent material on causal modeling which immediately follows. My remarks will be clustered into three subsections focusing on loglinear modeling, logit specifications and the concept of spurious correlations as it relates to conditional independence-type arguments. The reader sufficiently educated in these areas is well advised to skip this material and turn to the next section which explicates Goodman's approach to causal modeling.

Loglinear Models

The most comprehensive work to date on the subject of fitting loglinear models to discrete multivariate data is the text by Bishop, Fienberg and Holland (1975). The interested reader might also wish to consult the recent text by Shelby Haberman (1978) for alternative treatments.

For illustrative purposes I will discuss the case of a four-dimensional table. However, the ideas to follow are easily generalized to lower- or higher-order tables. Let A, B, C, and D stand for the variables of interest with levels i, j, k, and l, respectively, and denote by f_{ijkl} and F_{ijkl} the respective observed and expected frequency in cell (i,j,k,l). By expected frequency I mean the estimated count induced through fitting a specified model. If we further let n denote the total sample size, then the relationships hold that

Σ_{ijkl} f_{ijkl} = Σ_{ijkl} F_{ijkl} = n. (1)

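The expected value in any cell of the IxJxKxL table formed by variables A, B, C, D can be parameterized in multiplicative form with parameters denoted by λ, or in additive form via logarithms, with parameters denoted by μ. My preference is to use the additive form, which represents the logarithm of F_{ijkl} as

ln F_{ijkl} = μ + μ^{A}_{i} + μ^{B}_{j} + μ^{C}_{k} + μ^{D}_{l} + μ^{AB}_{ij} + μ^{AC}_{ik} + μ^{AD}_{il} + μ^{BC}_{jk} + μ^{BD}_{jl} + μ^{CD}_{kl} + μ^{ABC}_{ijk} + μ^{ABD}_{ijl} + μ^{ACD}_{ikl} + μ^{BCD}_{jkl} + μ^{ABCD}_{ijkl}, (2)

where each μ-term sums to zero over any one of its subscripts; for example, Σ_{i} μ^{A}_{i} = 0 and Σ_{i} μ^{AB}_{ij} = Σ_{j} μ^{AB}_{ij} = 0.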

The right-hand side of (2) is interpretable, as in ANOVA models, in terms of individual and joint variable effects and, consequently, has similar constraints. For completeness, it should be noted that the multiplicative λ-parameters and the additive μ-terms are related by

λ = exp(μ), λ^{A}_{i} = exp(μ^{A}_{i}), ..., λ^{ABCD}_{ijkl} = exp(μ^{ABCD}_{ijkl}),

where "exp" denotes the exponential function. Finally, the representation given in (2) will be called a saturated model since there are as many parameters to estimate as cells in the table and, not surprisingly, the expected (or fitted) values are identical to the observed counts; that is, F_{ijkl} = f_{ijkl} for all (i,j,k,l).

In almost all practical discussions the class of loglinear models is restricted to those models which satisfy the __hierarchical__ property: if {φ} and {φ'} are any two sets of indices having the property {φ} ⊂ {φ'}, then μ_{φ} = 0 implies μ_{φ'} = 0 and, further, if μ_{φ'} ≠ 0 then all μ-term parameters whose subscripts form a subset of {φ'} are also not zero. Put simply, if μ^{AB}_{ij} is included in a model then the principle states that μ^{A}_{i} and μ^{B}_{j} must also be present, whereas if μ^{AB}_{ij} = 0 then we must have μ^{ABC}_{ijk} = 0. Given these constraints, hierarchical models can be completely defined by the minimal set of sufficient margin totals which represent the set of highest-order interactions. Adopting what I think now is standard notation, which encloses the requested model in brackets, the set [ABC], [D] defines the model

ln F_{ijkl} = μ + μ^{A}_{i} + μ^{B}_{j} + μ^{C}_{k} + μ^{D}_{l} + μ^{AB}_{ij} + μ^{AC}_{ik} + μ^{BC}_{jk} + μ^{ABC}_{ijk},

since the presence of the margin configuration [ABC] implies the presence of all its lower-order relatives.

Once the set of sufficient configurations is specified, and assuming that some of the μ-term parameters are set to zero (that is, an unsaturated model is requested), maximum likelihood estimates are readily found either directly, when the estimates are expressible in closed form, or through some iterative proportional fitting algorithm. Most computer algorithms use proportional fitting whether or not direct estimates exist, and rely on a result attributable to Birch (1963) which forces maximum likelihood estimates of margin totals corresponding to specified sufficient configurations to be equal to observed marginal sums. Hence, if we specify [AB] and [AC] as fitted margins, then the maximum likelihood estimates of the expected frequencies obey the relationship

F_{ij++} = f_{ij++},

F_{i+k+} = f_{i+k+}, (4)

where f_{ij++} (F_{ij++}) and f_{i+k+} (F_{i+k+}) are the observed (expected) marginal sums derived from summing over the appropriate "+" subscripts. The quality of the fitted model can be assessed by computing -2 times the logarithm of the likelihood-ratio test statistic used for testing that the model fitted is correct versus the unrestricted alternative. Under the hypothesis that the model is correct

G^{2} = 2 Σ_{ijkl} f_{ijkl} ln(f_{ijkl}/F_{ijkl}) (5)

is asymptotically χ^{2} with degrees of freedom equal to the number of cells minus the number of independently fitted parameters.
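The fitting and testing steps just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses a three-way table (a smaller analogue of the four-way case), the counts are hypothetical, and the helper names `margin`, `ipf`, and `g2` are invented here rather than taken from any package.

```python
import math

# Hypothetical 2x2x2 counts f[(i, j, k)] for variables A, B, C (not from the paper).
f = {(0, 0, 0): 40, (0, 0, 1): 20, (0, 1, 0): 25, (0, 1, 1): 35,
     (1, 0, 0): 15, (1, 0, 1): 30, (1, 1, 0): 10, (1, 1, 1): 45}

def margin(table, axes):
    """Sum a table over every axis not listed in `axes` (0 = A, 1 = B, 2 = C)."""
    out = {}
    for cell, value in table.items():
        key = tuple(cell[a] for a in axes)
        out[key] = out.get(key, 0.0) + value
    return out

def ipf(observed, margins, cycles=50):
    """Iterative proportional fitting of the hierarchical model given by `margins`."""
    fitted = {cell: 1.0 for cell in observed}
    for _ in range(cycles):
        for axes in margins:
            target = margin(observed, axes)   # observed sufficient margin
            current = margin(fitted, axes)    # same margin of the current fit
            for cell in fitted:
                key = tuple(cell[a] for a in axes)
                fitted[cell] *= target[key] / current[key]
    return fitted

def g2(observed, fitted):
    """Likelihood-ratio statistic 2 * sum f * ln(f / F)."""
    return 2.0 * sum(v * math.log(v / fitted[c]) for c, v in observed.items() if v > 0)

F = ipf(f, [(0, 1), (0, 2)])   # model [AB], [AC]
statistic = g2(f, F)           # compared with chi-square on 8 - 6 = 2 df
```

For this model the fitted [AB] and [AC] margins reproduce the observed ones, which is the Birch result in miniature; the degrees of freedom are the 8 cells minus the 6 fitted parameters (the constant, three main effects, and the AB and AC terms).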

Specific hypotheses can, under this framework, be easily tested by making use of the partitioning properties of the likelihood ratio statistic and the hierarchical principle of modeling. Consider, for example, the following listing of models which represent a nested hierarchy of loglinear models:

What characterizes the hierarchy is that each succeeding model differs from the model immediately preceding it by only one term. (The restriction to only one additional term is not a requirement.) Clearly, more than one set of nested hierarchical models can be written down. For the above hierarchy, denote by G^{2}(M_{i}) the likelihood ratio statistic for model M_{i}; then, though I omit the details, it can be shown that

G^{2}(M_{4}) = G^{2}(M_{4}|M_{3}) + G^{2}(M_{3}), (7)

where by G^{2}(M_{4}|M_{3}) I mean the conditional likelihood ratio statistic for model M_{4} given model M_{3}. It follows that

G^{2}(M_{4}) ≥ G^{2}(M_{3}) ≥ G^{2}(M_{2}) ≥ G^{2}(M_{1}). (8)

If G^{2}(M_{4}) and G^{2}(M_{3}) are both asymptotically χ^{2} random variables with degrees of freedom V_{1} and V_{2}, respectively, then G^{2}(M_{4}|M_{3}) is also asymptotically χ^{2} with degrees of freedom V_{1} - V_{2}. This property of partitioning and ordering for a nested hierarchy does not necessarily hold for the case where the Pearson statistic is used, and it is primarily for this reason that the G^{2} statistic is preferred.

The conditional breakdown of G^{2} into additive components is extremely useful in model building. For example, notice that the only difference between models M_{4} and M_{3} in the above hierarchy is the μ^{AB}_{ij}-term and, as your intuition might lead you to believe, it appears that G^{2}(M_{4}|M_{3}) could be viewed as a test statistic for determining whether μ^{AB}_{ij} = 0. (Actually, in a strict sense, we would also have to assume that variables C and D are unrelated to at least variable A or variable B in order to ensure that this test is equivalent to the test statistic used to determine whether μ^{AB}_{ij} = 0 from the marginal table for variables A and B. I will have more to say about this nuance when developing the approach to causal modeling.) It is precisely in this manner that specific effects corresponding to relevant hypotheses can be tested.
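A small numerical sketch may make the conditional test concrete. It assumes a hypothetical three-way table and two nested models with closed-form fits: the restricted model [AC], [B] (no AB term, playing the role of M_{4}) and the larger model [AB], [AC] (playing the role of M_{3}). Because C is unrelated to B given A in the larger model, the conditional statistic coincides with the independence test in the collapsed [AB] table, which is the nuance noted in the parenthetical remark above. All names and counts are illustrative.

```python
import math

# Hypothetical 2x2x2 counts f[(i, j, k)] for variables A, B, C (illustrative only).
f = {(0, 0, 0): 40, (0, 0, 1): 20, (0, 1, 0): 25, (0, 1, 1): 35,
     (1, 0, 0): 15, (1, 0, 1): 30, (1, 1, 0): 10, (1, 1, 1): 45}
n = sum(f.values())

def margin(axes):
    """Observed marginal totals over the listed axes (0 = A, 1 = B, 2 = C)."""
    out = {}
    for cell, count in f.items():
        key = tuple(cell[a] for a in axes)
        out[key] = out.get(key, 0) + count
    return out

fA, fB, fAB, fAC = margin((0,)), margin((1,)), margin((0, 1)), margin((0, 2))

# Restricted model [AC], [B]: B independent of A and C jointly (no AB term).
F_restricted = {(i, j, k): fAC[(i, k)] * fB[(j,)] / n for (i, j, k) in f}
# Larger model [AB], [AC]: adds the AB term; B and C independent given A.
F_larger = {(i, j, k): fAB[(i, j)] * fAC[(i, k)] / fA[(i,)] for (i, j, k) in f}

def g2(fitted):
    """Likelihood-ratio statistic of a fitted table against the observed counts."""
    return 2.0 * sum(c * math.log(c / fitted[cell]) for cell, c in f.items())

# Conditional statistic for the AB term (3 df minus 2 df = 1 df here).
g2_conditional = g2(F_restricted) - g2(F_larger)

# G^2 for independence of A and B in the collapsed [AB] table.
g2_marginal = 2.0 * sum(c * math.log(c * n / (fA[(i,)] * fB[(j,)]))
                        for (i, j), c in fAB.items())

# Upper-tail chi-square probability for 1 degree of freedom.
p_value = math.erfc(math.sqrt(g2_conditional / 2.0))
```

The tail probability uses the identity that a chi-square variable on one degree of freedom is a squared standard normal, so no statistics library is needed for this particular case.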

Logit Specifications

In the previous discussions I made no mention of any distinctions between any of the variables. If I can assume the response variable has two levels, then it is reasonable to model the behavior of the log odds of one level of the response to the other on the basis of the explanatory variables. Upon finding a suitable loglinear model, a secondary analysis is performed and ultimately what we arrive at is a table of log odds effects from which we can better understand how changes in the combined levels of the explanatory variables affect the response measure.

To illustrate these ideas, treat variable A in the IxJxKxL table as the response measure and the remaining variables B, C, and D as explanatory factors. In such settings we are typically interested in the effect of the set of design or explanatory variables on the response measure. Assuming that the sample sizes for each combination of the explanatory variables are fixed by design, a little thought based on the previous discussion will show that we need to include in the model the μ-term parameters relating the interactions among such variables to ensure that the estimated margin totals conform to the fixed margins. Under this setup, we usually are not interested in investigating relationships between the explanatory variables but restrict ourselves to the nature of their effects on the response variable. For convenience, we can view the table as an array of rates such that

φ_{.jkl} = f_{1jkl}/(f_{1jkl} + f_{2jkl}), (9)

with corresponding logits given by

ω^{Ā}_{.jkl} = ln(f_{1jkl}/f_{2jkl}) = ln f_{1jkl} - ln f_{2jkl}. (10)

Corresponding to these observed odds, we can, in similar fashion, also define "expected odds" pertaining to variable A; that is,

Ψ^{Ā}_{.jkl} = ln(F_{1jkl}/F_{2jkl}) = ln F_{1jkl} - ln F_{2jkl}. (11)

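With respect to model building and parameter estimation, we proceed essentially in the same way as was described in the previous section. The difference, as we stated, is that we must include the parameter μ^{BCD}_{jkl} in any model; in addition, suppose the best fitting parsimonious model is given by μ^{AB}_{ij}, μ^{AD}_{il} and, of course, μ^{BCD}_{jkl}. The loglinear model is, therefore,

ln F_{ijkl} = μ + μ^{A}_{i} + μ^{B}_{j} + μ^{C}_{k} + μ^{D}_{l} + μ^{AB}_{ij} + μ^{AD}_{il} + μ^{BC}_{jk} + μ^{BD}_{jl} + μ^{CD}_{kl} + μ^{BCD}_{jkl}. (12)

The corresponding logit model is defined by

Ψ^{Ā}_{.jkl} = β^{A} + β^{BA}_{j} + β^{DA}_{l}, (13)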

where the bar over superscript A simply is used to remind us that expected odds pertain to variable A, the dependent variable. Notice that the only interaction parameters that are included in the logit model are those terms which interact with the response variable in the corresponding loglinear specification.
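The passage from a loglinear model to its logit counterpart can be checked numerically. The sketch below is an assumption-laden miniature: a hypothetical three-way table with A as the response, and the model [AB], [BC], in which the [BC] margin plays the role that [BCD] plays in the four-way case. It computes the observed logits as in (10) and the expected logits as in (11); the variable and function names are invented for illustration.

```python
import math

# Hypothetical 2x2x2 counts f[(i, j, k)]: A is the response, B and C are explanatory.
f = {(0, 0, 0): 40, (0, 0, 1): 20, (0, 1, 0): 25, (0, 1, 1): 35,
     (1, 0, 0): 15, (1, 0, 1): 30, (1, 1, 0): 10, (1, 1, 1): 45}
levels = (0, 1)

def f_ab(i, j):
    """Observed [AB] margin, summing over C."""
    return f[(i, j, 0)] + f[(i, j, 1)]

def f_bc(j, k):
    """Observed [BC] margin, summing over A (fixed by design in the logit setup)."""
    return f[(0, j, k)] + f[(1, j, k)]

def f_b(j):
    """Observed [B] margin."""
    return f_ab(0, j) + f_ab(1, j)

# Observed logits for A at each (j, k) combination, as in (10).
omega = {(j, k): math.log(f[(0, j, k)] / f[(1, j, k)]) for j in levels for k in levels}

# Closed-form fitted values under [AB], [BC]: F_ijk = f_ij+ * f_+jk / f_+j+.
F = {(i, j, k): f_ab(i, j) * f_bc(j, k) / f_b(j) for (i, j, k) in f}

# Expected logits, as in (11). Terms not involving A cancel in the subtraction,
# so psi depends on the level of B only.
psi = {(j, k): math.log(F[(0, j, k)] / F[(1, j, k)]) for j in levels for k in levels}
```

In this sketch the expected logit at (j, k) reduces to ln(f_{1j+}/f_{2j+}), which does not vary with k: only the terms that interact with the response survive, as stated above.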

We will see shortly that logit models form the basis of Goodman's approach to causal modeling. In particular, a series of logit specifications define a system of equations which are analogous to the kinds of systems of simultaneous linear equations encountered in the causal analysis of quantitative variables. Furthermore, the estimated β's can be used to construct what are essentially path diagrams.

Spurious Correlation

Before jumping into Goodman's method for testing causal hypotheses, I would like to digress, for just a moment, in order to present a few remarks about the concept of spurious correlation when cross-classified data are analyzed. I have two reasons for doing this: first, the concept of spurious correlation is, in its own right, fundamental to the study of causal systems, and, second, a discussion of spurious correlation will, hopefully, lead to a better understanding of conditional independence type arguments used to introduce the subject of unobservable variables and latent class models.

Put simply, the association between two variables is called spurious if it vanishes in the presence of a third variable which impinges upon both. Typically, with quantitative variables, to test whether a correlation is spurious, interest centers upon the behavior of certain partial correlation coefficients. For example, a nonzero simple correlation between the variables X_{2} and X_{3} would be said to be spurious when upon entry of a third variable X_{1} into the system, the partial correlation coefficient r_{23.1} goes to zero.

In the case of qualitative variables, while we cannot couch our statements in terms of correlation coefficients, we can, nevertheless, present analogous arguments. Let us begin with the simple situation of just two variables A and B. If we now introduce a third variable C having, say, two levels, and if we find that in each of the two I x J tables formed by conditioning on the level of C, variables A and B become independent of each other, then we may say that the original relationship is spurious. In other words, we have found another variable which "explains" the relationship between variables A and B.
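A toy table illustrates the argument. The counts below are hypothetical and were constructed so that A and B are independent within each level of C while being associated in the collapsed table; the cross-product (odds) ratio is used here as a convenient measure of association for a 2x2 table, which is a choice made for this sketch rather than something prescribed in the text.

```python
# Hypothetical counts f[k][i][j]: level k of C, then A (rows) by B (columns).
f = [
    [[64, 16], [16, 4]],   # C at level 1
    [[4, 16], [16, 64]],   # C at level 2
]

def odds_ratio(t):
    """Cross-product ratio of a 2x2 table; 1.0 means no association."""
    return (t[0][0] * t[1][1]) / (t[0][1] * t[1][0])

# Association between A and B within each level of C.
conditional = [odds_ratio(t) for t in f]          # [1.0, 1.0]

# Association between A and B after collapsing over C.
collapsed = [[f[0][i][j] + f[1][i][j] for j in range(2)] for i in range(2)]
marginal = odds_ratio(collapsed)                  # 4.515625
```

The marginal table shows a clear association, yet it disappears once the level of C is held fixed, which is exactly the sense in which the original relationship is called spurious.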

Similarly, if we return to our four-way table formed by variables A, B, C, D, we can say that variable D, for example, explains the relationship among the three remaining variables if the three variables A, B, C become mutually independent when the conditional relationships among the three variables in the three-way table at each level of variable D are examined. The hypothesis that variables A, B, and C are mutually independent when the level of variable D is held constant can be denoted as

[A x B x C | D]

where x means independent of.

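The hypothesis of conditional independence can be easily tested by assessing the goodness-of-fit of the model:

ln F_{ijkl} = μ + μ^{A}_{i} + μ^{B}_{j} + μ^{C}_{k} + μ^{D}_{l} + μ^{AD}_{il} + μ^{BD}_{jl} + μ^{CD}_{kl}. (14)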

Notice that no μ-term parameters pertaining to the two-way interactions between the pairs AB, AC, or BC are included and, therefore, the system could be described by the following simple diagram

where the presence of an arrow indicates a nonzero effect.

CAUSAL ANALYSIS: THE WORLD OF LOGITS

I will, in this section, attempt to provide an exegesis of Goodman's approach to modeling causal systems composed solely of qualitative variables. My explication of these methods and models will, for the moment, be presented in the abstract with the following section devoted to substantive examples which highlight the issues raised in the ensuing discussion.

Goodman first introduced the modification of loglinear analysis which allowed causal hypotheses to be investigated in two 1973 papers. He argued, to varying degrees throughout both pieces, that the proposed methodology produces a framework for analysis analogous to the usual analysis of causal systems with quantitative variables. Indeed, Goodman's approach can be used to study both recursive as well as nonrecursive systems, though in the latter system the analogy to the quantitative variable case is by far the weakest, and generates path diagrams much in the spirit of those obtained with systems of simultaneous linear equations. There are, however, a number of subtleties which can introduce certain ambiguities into the analysis.

This section is largely expository, and is organized around nonrecursive and recursive systems. After each major subsection, I devote some time to commenting on the nuances that are inherent in Goodman's methodology.

The Analysis of Nonrecursive Systems: The Weakest Link

I will again make use of the four-way IxJxKxL multiway table assumed generated by variables A, B, C, and D. In addition, for simplicity, assume now that I=J=K=L=2, so that the table can be described as a 2^{4} contingency table. I begin with causal modeling in nonrecursive systems, though admittedly the situation provides the weakest analogy to the analysis of systems of simultaneous linear models with quantitative variables, because without any order conditions on the variables, the use of loglinear analysis to investigate causal hypotheses will naturally lead to systems having reciprocal causation and feedback loops.

For illustrative and discussion purposes, assume that the model described by the margin totals

[AB], [AC], [BC], [BD], [CD]

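provides an adequate fit to the four-way table. Recall, these margin configurations define the model

ln F_{ijkl} = μ + μ^{A}_{i} + μ^{B}_{j} + μ^{C}_{k} + μ^{D}_{l} + μ^{AB}_{ij} + μ^{AC}_{ik} + μ^{BC}_{jk} + μ^{BD}_{jl} + μ^{CD}_{kl}, (15)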

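so that we would generate nonzero estimates for five two-factor interactions and four one-factor main effects. Following the notation used in the discussion of logit specifications, we could single out each variable in turn and define the expected odds pertaining to each as

Ψ^{D̄}_{ijk.} = ln(F_{ijk1}/F_{ijk2}), (16)

Ψ^{C̄}_{ij.l} = ln(F_{ij1l}/F_{ij2l}), (17)

Ψ^{B̄}_{i.kl} = ln(F_{i1kl}/F_{i2kl}), (18)

Ψ^{Ā}_{.jkl} = ln(F_{1jkl}/F_{2jkl}), (19)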

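where again the bar over the superscript in the Ψ's signifies that the expected odds pertain to the superscripted variable, and that these expected odds are the dependent variable in each respective formula. In similar fashion we can express (16)-(19) in terms of the logit β-effects, which yields the following system of equations:

Ψ^{D̄}_{ijk.} = β^{D} + β^{BD}_{j} + β^{CD}_{k}, (20)

Ψ^{C̄}_{ij.l} = β^{C} + β^{AC}_{i} + β^{BC}_{j} + β^{DC}_{l}, (21)

Ψ^{B̄}_{i.kl} = β^{B} + β^{AB}_{i} + β^{CB}_{k} + β^{DB}_{l}, (22)

Ψ^{Ā}_{.jkl} = β^{A} + β^{BA}_{j} + β^{CA}_{k}, (23)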

where we note

and, because of the assumptions imposed

The meaning of equations (20)-(23) should be clear; for example, equation (20) states that the expected odds pertaining to variable D depend upon the explanatory variables B and C, but not upon variable A, and, according to (10) and (11), the expected odds in favor of variable D are increased when the additive effect of variable B is introduced at level 1 and decreased when variable B is at level 2.

Equations (20)-(23) provide a system of logit equations which, in spirit, are analogous to the kinds of systems of simultaneous linear equations used in the causal analysis of quantitative variables. For those of us who might be having trouble seeing the analogy, let us introduce the dummy variables X_{1}, X_{2}, X_{3}, and X_{4}, where X_{1} assumes the value 1 when variable A is at level 1, 0 otherwise, X_{2} assumes the value 1 when variable B is at level 1, 0 otherwise, and so on for variables X_{3} and X_{4} corresponding to variables C and D. We can rewrite equations (20)-(23) as a linear function of the dummy variables; for example, in the case of equation (20) we have

where

(Note, in the above formulation, β^{AD} = 0.)

Finally, we can extend the analogy even further by expressing the observed odds ω^{D̄}_{ijk.} in terms of the expected odds Ψ^{D̄}_{ijk.} by introducing an error component:

where

is the error component which, unlike in the analogous regression formulation, is not independent of the expected odds.

Systems of logit equations like those defined by (20)-(23) can also be cast in path diagrams (see Figure 1). In Figure 1, the two arrows emanating from D to C and B correspond to the effects β^{DC} and β^{DB} in (21) and (22), respectively; the three arrows emanating from C to D, B, and A correspond to the effects β^{CD}, β^{CB}, and β^{CA} in (20), (22), and (23), respectively; the three arrows emanating from B to D, C, and A correspond to the effects β^{BD}, β^{BC}, and β^{BA} in (20), (21), and (23), respectively; and finally, the two arrows emanating from A to C and B correspond to the effects β^{AC} and β^{AB} in (21) and (22), respectively. The absence of any arrows linking two or more variables signifies that the respective variables are not directly related. Because β^{AC} = β^{CA}, β^{BC} = β^{CB}, β^{AB} = β^{BA}, β^{BD} = β^{DB}, and β^{CD} = β^{DC}, a single double-headed arrow could be used, as Goodman does, instead of two separate single-headed arrows in the figure. I adopt the latter approach, two separate single-headed arrows, to emphasize the presence of reciprocal causation and because it has greater potential in characterizing more complex models.

FIGURE 1: PATH DIAGRAM FOR LOGIT EQUATIONS (20)-(23)

Some thought concerning the system of logit specifications defined in equations (20)-(23) will show that there is more than one path diagram which is consistent with model (15). Recall in model (15) the following parameters were set to zero:

μ^{AD}, μ^{ABC}, μ^{ABD}, μ^{ACD}, μ^{BCD}, μ^{ABCD}

Comparison with Table 1, which lists the μ-terms that are set to zero for each of the specifications (20)-(23), shows that any pair of equations (20)-(23), except for the pair (21)-(22), is equivalent to model (15). Further, any triplet of equations (20)-(23) is also equivalent. Thus, a number of path diagrams can be generated, all of which are consistent with the fitted model. For example, Figures 2(a) and (b) present path diagrams for the system of equations (20)-(21). In this figure, note that there is no reciprocal causation between variables B and D, B and C, or A and C and, therefore, only one arrow is used to describe each respective relationship; whereas, as in Figure 1, the two separate arrows from C to D and from D to C correspond to the effects β^{CD} and β^{DC} in (20) and (21), respectively.

TABLE 1: THE μ-TERM PARAMETERS IN THE SATURATED MODEL SET TO ZERO IN LOGIT SPECIFICATIONS (20)-(23)

Figures 2(a) and (b) are the same except for the stated relationship between variables A and B. The distinction comes about because although the μ^{AB}-term parameter is included in model (15), there is no logit parameter counterpart specified in either (20) or (21). This means that there are no arrows emanating from variables C or D to variables A or B, and my preference is to view variables A and B as "exogenous" as opposed to "endogenous" variables. However variables A and B are viewed, their relationship can be assessed by collapsing over the other endogenous variables, C and D, and analyzing the resulting two-way table. To distinguish β-effects estimated from condensed tables as opposed to those derived from complete tables, I will use a double-headed curved arrow and enclose the coefficient in parentheses as shown in Figure 2(a). That is, in the figure the double-headed curved arrow shown in part (a) signifies that, based upon the [AB] marginal table, variables A and B are not statistically independent, as contrasted to part (b) where the absence of any arrow(s) connecting variables A and B signifies that, based upon the reduced two-way table, variables A and B are statistically independent.

FIGURE 2: PATH DIAGRAM FOR LOGIT EQUATIONS (20)-(21)

Comment

Figures 1 and 2 both describe nonrecursive causal models in the usual sense that at least one pairwise relationship in each figure is characterized by two single-headed arrows which signifies reciprocal causation. With four quantitative variables Z_{1}, Z_{2}, Z_{3}, and Z_{4} the following structural equations define a nonrecursive system:

Z_{3} = b_{31}Z_{1} + b_{34}Z_{4} + e_{3}, (30)

Z_{4} = b_{42}Z_{2} + b_{43}Z_{3} + e_{4}, (31)

where e_{3} and e_{4} are random disturbance terms with zero means. Notice that in (31) variable Z_{3}, which appeared as a dependent variable in (30), appears as an independent variable used to explain variable Z_{4}. In the above setup, the "causal effects" of Z_{4} on Z_{3} can differ (in magnitude) from the causal effects of Z_{3} on Z_{4}; that is, there is no restriction that b_{34} = b_{43}. By contrast, in the Goodman approach, the reciprocal effects are by necessity equal to one another. This is particularly troublesome, and while I could suggest that one fit logit models separately (independently of the complete table), which will induce unequal estimates of the reciprocal effects, this procedure is not totally satisfactory since it seems to ignore the fundamental simultaneity of the models.
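The forced equality of reciprocal effects can be seen directly from a set of fitted frequencies. The sketch below assumes dichotomous variables and a hypothetical three-way table; it measures the effect of B on the logits of A, and of A on the logits of B, as half the difference in the relevant logits at a fixed level of C. Under a model with no three-factor term this half-difference corresponds to the β-effect at level 1, and both directions reduce to the same conditional log odds ratio. The function names are invented for illustration.

```python
import math

# Hypothetical 2x2x2 fitted frequencies F[(i, j, k)] for variables A, B, C.
F = {(0, 0, 0): 40.0, (0, 0, 1): 20.0, (0, 1, 0): 25.0, (0, 1, 1): 35.0,
     (1, 0, 0): 15.0, (1, 0, 1): 30.0, (1, 1, 0): 10.0, (1, 1, 1): 45.0}

def logit(first, second):
    """Log odds of the first frequency relative to the second."""
    return math.log(first / second)

def effect_of_b_on_a(k):
    """Half the change in the logit of A as B moves from level 1 to level 2, C fixed at k."""
    return (logit(F[(0, 0, k)], F[(1, 0, k)]) - logit(F[(0, 1, k)], F[(1, 1, k)])) / 2.0

def effect_of_a_on_b(k):
    """Half the change in the logit of B as A moves from level 1 to level 2, C fixed at k."""
    return (logit(F[(0, 0, k)], F[(0, 1, k)]) - logit(F[(1, 0, k)], F[(1, 1, k)])) / 2.0

reciprocal_pairs = [(effect_of_b_on_a(k), effect_of_a_on_b(k)) for k in (0, 1)]
```

Each pair contains two identical values (up to rounding), whatever frequencies are supplied, which is why a single loglinear fit cannot produce distinct estimates for the two directions.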

Goodman (1973b) states, in closing the section on nonrecursive systems, that while the techniques required for the analysis of nonrecursive systems when the variables are quantitative are usually more complicated than are those required for a corresponding analysis of recursive systems, when the variables are qualitative the methods and models derived from considering loglinear analysis are as easily applied to nonrecursive as to recursive systems. Though the ease of application cannot be denied, we will see very shortly that polytomous variables and/or the presence of higher-order effects can render the analysis complex. In addition, the multiplicity of path diagrams that are consistent with a given model means that a great deal of burden is placed on the analyst, and this, coupled with the fact that certain effects will be estimated from collapsed tables and others from the complete table, leads me to suggest that the technique be applied carefully so as to avoid misleading results. (I return to the issue of collapsing tables and the problem of path coefficients in the next section.)

Analysis of Recursive Systems

Let us return once again to our four-way table consisting of variables A, B, C, and D, each having two levels. If I impose order conditions such that variable A is causally prior to variable B, variable B causally prior to variable C, and variable C causally prior to variable D, then, under this setup, reciprocal causation and feedback loops have been ruled out. This assumed causal ordering would also seem to imply that any path diagram should be based on three (logit) models: (1) B given A, (2) C given A and B, and (3) D given A, B, and C, which when combined characterizes the conditional joint probability of B, C, and D, given A.

Suppose, on the other hand, that the true causal ordering is that variables A and B are causally prior to variable C, and all three variables are prior to variable D. Under this setup Goodman would recommend that one first develop a model for the [AB] marginal table; call this model M_{1}. The relationship between variables A and B would be assessed on the basis of the usual likelihood ratio χ^{2} statistic based upon the condensed table. Next, attention would be focused on the three-way table [ABC] and models would be fit which included in the set of fitted margins the [AB] margin total; call this selected model M_{2}. Finally, consider a model M_{3} for the complete four-way table [ABCD] which contains the margin total [ABC] in the set of fitted marginals.

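Goodman (1973a, 1973b) shows that, under the assumption that the models M_{1}, M_{2}, and M_{3} are all true, the test for the combined model, call it M*, can be calculated either by estimating the expected frequencies from those obtained under M_{1}, M_{2}, and M_{3} or by the partitioning property of the likelihood ratio χ^{2} statistic. In the former case, writing F^{(1)}_{ij}, F^{(2)}_{ijk}, and F^{(3)}_{ijkl} for the expected frequencies under M_{1}, M_{2}, and M_{3}, respectively, note

F*_{ijkl} = F^{(1)}_{ij} F^{(2)}_{ijk} F^{(3)}_{ijkl} / (f_{ij++} f_{ijk+}), (32)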

where F*_{ijkl} denotes the expected frequencies under M*, and thus the F*_{ijkl} can then be substituted into the formula for the likelihood ratio χ^{2} statistic and the appropriate test performed. Alternatively, in the latter case, note that, because of the simple multiplicative form of the expected frequencies shown in (32), we have

χ^{2}(M*) = χ^{2}(M_{1}) + χ^{2}(M_{2}) + χ^{2}(M_{3}), (33)

with the degrees of freedom for M* equaling the sum of the degrees of freedom for the separate χ^{2} statistics for M_{1}, M_{2}, and M_{3}.
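The additivity in (33) can be verified numerically. The sketch below makes several assumptions that should be read as illustrative only: the 2^{4} counts are hypothetical; the three stage models were picked because they have closed-form fits (M_{1}: A and B independent; M_{2}: [AB], [AC]; M_{3}: [ABC], [CD]); and the combined frequencies are taken to be the product of the three stage fits divided by the observed [AB] and [ABC] margins shared between adjacent stages.

```python
import math
from itertools import product

# Hypothetical 2x2x2x2 counts f[(i, j, k, l)] for A, B, C, D (illustrative only).
counts = [30, 12, 18, 20, 14, 16, 9, 21, 11, 19, 8, 22, 6, 14, 5, 35]
f = dict(zip(product(range(2), repeat=4), counts))
n = sum(counts)

def margin(axes):
    """Observed marginal totals over the listed axes (0 = A, ..., 3 = D)."""
    out = {}
    for cell, c in f.items():
        key = tuple(cell[a] for a in axes)
        out[key] = out.get(key, 0) + c
    return out

fA, fB, fC = margin((0,)), margin((1,)), margin((2,))
fAB, fAC, fABC, fCD = margin((0, 1)), margin((0, 2)), margin((0, 1, 2)), margin((2, 3))

# M1 on the [AB] table: A and B independent.
F1 = {(i, j): fA[(i,)] * fB[(j,)] / n for (i, j) in fAB}
# M2 on the [ABC] table: [AB], [AC], which fits the [AB] margin.
F2 = {(i, j, k): fAB[(i, j)] * fAC[(i, k)] / fA[(i,)] for (i, j, k) in fABC}
# M3 on the full table: [ABC], [CD], which fits the [ABC] margin.
F3 = {(i, j, k, l): fABC[(i, j, k)] * fCD[(k, l)] / fC[(k,)] for (i, j, k, l) in f}

def g2(observed, fitted):
    """Likelihood-ratio statistic of a fitted table against observed counts."""
    return 2.0 * sum(c * math.log(c / fitted[cell]) for cell, c in observed.items())

# Combined estimate under M*: stage fits multiplied, shared margins divided out.
F_star = {(i, j, k, l): F1[(i, j)] * F2[(i, j, k)] * F3[(i, j, k, l)]
                        / (fAB[(i, j)] * fABC[(i, j, k)])
          for (i, j, k, l) in f}

sum_of_stages = g2(fAB, F1) + g2(fABC, F2) + g2(f, F3)
combined = g2(f, F_star)   # agrees with sum_of_stages up to rounding
```

Either route therefore gives the same test of the combined model, and the stage-by-stage statistics show which part of the hypothesized structure is responsible for any lack of fit.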

The path diagram for this model is presented in Figure 3. We see from the figure that: (a) variable D is posterior to the other variables since there are no arrows pointing from D to any of the other variables; (b) the arrows emanating from variables B and C to variable D reflect the situation found in equation (20), namely,

and correspond to β^{BD}_{j} and β^{CD}_{k}, respectively; and, finally, (c) variable B is statistically independent of variable C, given the level of variable A.

FIGURE 3: PATH DIAGRAM FOR LOGIT EQUATIONS (34)-(39)

The β^{AB}_{i} and β^{BA}_{j} effect parameters are induced from fitting the saturated model to the condensed [AB] marginal table; viz.:

which generates the logits

The conditional independence of variables B and C given the levels of variable A, that is, [BxC|A], can be expressed in a form which resembles equation (21). Letting F_{ijk} denote the expected frequency in cell (i,j,k) of the marginal table [ABC], the expected odds with respect to variable C at the joint level (i,j) on variables (A,B) are given by:

Comments. Some reflection on Figure 3 might lead one to consider fitting the following loglinear model to the complete four-way table: [AB], [AC], [BC], [BD], [CD], where each two-factor effect corresponds to an arrow in the path diagram. However, though this loglinear model is consistent with the figure, it ignores the hypothesized structure among the variables and, in general, effect estimates derived from a single loglinear model fitted to the complete table will differ from those induced from considering a series of logit models.

On the other hand, collapsing variables may require that certain parameters be fitted that otherwise would have been set to zero if the full table had been analyzed; that is, it is not too uncommon to find that, say, in the condensed [ABC] table, the two-factor effect [BC] is needed (the model of conditional independence of variables B and C does not fit well), whereas in the next stage this effect is nil in that the model [ABC], [AD], [BD], [CD] provides an adequate fit. In fact, several of the examples used by Goodman exhibit the property. For instance, in Goodman (1973a) a four-way table relating the attitudes of school children is analyzed and various path diagrams based on the model M*: [AB], [AC], [BC], [BD], [CD] are discussed. However, as Goodman admits, the simpler model, M**: [AC], [BC], [BD], [CD], which does not fit the [AB] margin total, also fits the data quite well. If we follow Goodman's approach this term must always be included since the effect is estimated from the collapsed [AB] table.

In the preceding two paragraphs I have attempted to play the devil's advocate by arguing that, on the one hand, fitting a single loglinear model to a complete table ignores the structure among the variables as evinced by a series of logit models, while, on the other hand, collapsing a variable may require one to fit parameters that would be unnecessary (if the complete table was analyzed). Though each argument has merit, I prefer to follow the hypothesized causal ordering of the variables and allow the reduced tables to dictate the effect parameters included, even in the face of including "additional" parameters. Further, I would argue that when causal priorities are imposed among the variables, the complete multiway table should only be analyzed in the context of the hypothesized relationship between the antecedent and posterior variable(s).

The final point that I wish to make relates to the path coefficients themselves. Though I will insert coefficients corresponding to β-effect values into the path diagrams presented in the following section, there is no formal calculus of these coefficients as exists in the analysis of quantitative variables. Consequently, it is difficult to decide what values to assign to arrows not explicitly accounted for by the system of logits, and anything other than "direct" effects is difficult to assess.

ILLUSTRATIVE EXAMPLES

I will now present two examples of how the methods and models of Goodman can be used in modeling causal systems. Both examples deal with recursive systems. (My earlier remarks support this selection.) The examples are, however, different: the first is rather straightforward yet amply illustrates the basic features of the analysis of recursive systems discussed in the previous section; the second is intended to demonstrate some of the difficulties encountered when polytomous variables and higher-order interactions are present.

Example 1

My first example analyzes data collected in a survey of 2409 individuals on their preference for a particular branch of a savings institution. The data are displayed in Table 2; the table cross-classifies each individual according to four dichotomous (Yes, No) variables: (A) whether the person is familiar with the branch, (B) the person's opinion whether the branch is conveniently located, (C) prior experience (patronage) with the branch, and (D) whether the person strongly recommends the branch.

One plausible causal ordering is that A and B precede C and D; that is, familiarity with and opinions concerning locational convenience are antecedent to one's having previously patronized the branch, and all three affect whether a person would strongly recommend the branch. I view both variables A and B as antecedent to variable C since it is difficult to argue that one precedes the other, and I prefer to treat each as simply exogenous. The relationship between variables A and B was assessed in the condensed [AB] marginal table shown in part (b) of Table 2. Simple computation shows the cross-product ratio α = 2.7 ((485 × 1037)/(340 × 547)). Since α = 1 is consistent with the hypothesis of independence between A and B, we conclude that A and B are positively related. (We could, of course, compute a G^{2} statistic and test its significance, but given the magnitude of α this seems unnecessary.)
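The cross-product ratio and the G^{2} test of independence mentioned above can be checked with a few lines of Python. This is only a sketch: the four cell counts are assumed to be 485, 1037, 340, and 547 (values that sum to the stated sample of 2409 and reproduce a ratio of about 2.7), and their placement in rows and columns is likewise an assumption, since Table 2 is not reproduced here.

```python
import math

# [AB] marginal table. Counts and cell positions are assumptions
# (rows: A = yes/no, columns: B = yes/no).
ab_table = [[485, 340],
            [547, 1037]]

def cross_product_ratio(t):
    """Odds ratio f11*f22 / (f12*f21) for a 2x2 table."""
    return (t[0][0] * t[1][1]) / (t[0][1] * t[1][0])

def g2_independence(t):
    """Likelihood-ratio statistic G^2 = 2 * sum f * ln(f / F) under independence."""
    n = sum(sum(row) for row in t)
    row_tot = [sum(row) for row in t]
    col_tot = [sum(col) for col in zip(*t)]
    g2 = 0.0
    for i, row in enumerate(t):
        for j, f in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n  # fitted count under independence
            g2 += 2.0 * f * math.log(f / expected)
    return g2

alpha = cross_product_ratio(ab_table)
g2 = g2_independence(ab_table)
print(f"n = {sum(map(sum, ab_table))}, ratio = {alpha:.2f}, G2 = {g2:.1f} on 1 d.f.")
```

A G^{2} value above 3.84 (the 5 percent point of chi-square with 1 d.f.) would lead to rejecting independence of A and B, in line with the conclusion drawn from the ratio itself.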

Next, prior experience, variable C, is treated as the response measure and variables A and B as explanatory. With margin total [AB] fixed (the [AB] margin proved significant) there are three unsaturated loglinear models corresponding to such a logit model:

M_{1}: [AB] [AC] [BC] (1 d.f.; G^{2} = 1.01)

M_{2}: [AB] [AC] (2 d.f.; G^{2} = 31.02)

M_{3}: [AB] [BC] (2 d.f.; G^{2} = 533.84)

Clearly, the only one of these models that provides an adequate fit to the data is model M_{1}. Model M_{1} posits no second order interactions and corresponds to the logit model

The respective estimated effect parameters are 1.53 and 0.33.

The final step is to build a logit model treating variable D, whether the person strongly recommends the branch, as the dependent variable and the other three variables as explanatory. There are eight unsaturated loglinear models, since each of the three two-factor effects may be absent or present. Four of these models are listed below:

M_{4}: [ABC] [AD] [BD] [CD] (4 d.f.; G^{2} = 4.45)

M_{5}: [ABC] [BD] [CD] (5 d.f.; G^{2} = 39.86)

M_{6}: [ABC] [AD] [CD] (5 d.f.; G^{2} = 6.24)

M_{7}: [ABC] [AD] [BD] (5 d.f.; G^{2} = 8.19)

Models M_{4} and M_{6} provide reasonable fits to the data while models M_{5} and M_{7} are unacceptable. Since model M_{6} differs from model M_{4} in only one term, [BD], a test for the significance of this effect can be accomplished by simply taking the difference in the G^{2} statistics associated with each and then comparing this value to a χ^{2} distribution with 1 d.f. Noting that G^{2}(M_{6}) - G^{2}(M_{4}) = 1.79, which is nonsignificant, the preference is for model M_{6}. The corresponding logit is
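The nested-model comparison just described amounts to differencing the G^{2} values and degrees of freedom and referring the result to a chi-square distribution. The sketch below uses only the figures reported above; the helper names are hypothetical, and the closed-form tail probability `erfc(sqrt(x/2))` is valid for 1 d.f. only.

```python
import math

# (d.f., G^2) for nested logit models of variable D, as reported above.
models = {
    "M4": (4, 4.45),   # [ABC] [AD] [BD] [CD]
    "M5": (5, 39.86),  # [ABC] [BD] [CD]
    "M6": (5, 6.24),   # [ABC] [AD] [CD]
}

def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variate with 1 d.f."""
    return math.erfc(math.sqrt(x / 2.0))

def compare(restricted, full):
    """Return (difference in G^2, p-value) for two nested models differing by 1 d.f."""
    df_r, g2_r = models[restricted]
    df_f, g2_f = models[full]
    if df_r - df_f != 1:
        raise ValueError("this sketch handles a 1 d.f. difference only")
    delta = g2_r - g2_f
    return delta, chi2_sf_1df(delta)

for name in ("M6", "M5"):
    delta, p = compare(name, "M4")
    print(f"{name} vs M4: delta G2 = {delta:.2f}, p = {p:.4f}")
```

Dropping [BD] (M_{6} versus M_{4}) costs little in fit, whereas dropping [AD] (M_{5} versus M_{4}) costs a great deal, which is the basis for retaining [AD] and omitting [BD].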

The effect estimates are 0.39 and -0.12, respectively.

The path diagram characterizing this system is shown in Figure 4. Notice I use the double-headed curved arrow to indicate that no decision as to whether A causes B or B causes A has been made and have enclosed the corresponding effect estimates in ( ) to emphasize that these variables are viewed as exogenous to the system. The causal interpretation of Figure 4 is that the exogenous variables, branch familiarity and opinions concerning its locational convenience, are related; these variables, familiarity and locational convenience, affect prior patronage; and whether the branch received a strong recommendation is affected by familiarity and prior patronage. Interestingly, the β-effect associated with prior patronage and recommendation (β^{CD}) is negative, which implies that previous experience with the branch __decreases__ the likelihood of the branch receiving a strong recommendation. It seems, unfortunately, that the old adage that familiarity (in the form of prior contact) breeds contempt does actually hold.

PATH DIAGRAM SHOWING CAUSAL CONNECTIONS IMPLIED BY LOGIT MODELS (38)-(39)

Note, finally, that if we were not concerned with the hypothesized structure among the variables as expressed in terms of logit specifications, the system shown in Figure 4 might lead us to fit the loglinear model:

M_{8}: [AB] [AC] [AD] [BC] [CD]

to the four-way table, where each arrow in the figure is represented by a two-factor effect. The G^{2} statistic for this model has a value of 7.25 with 6 d.f. Hopefully, most of us can now understand why this value is identical to the numerical value obtained from the pair of recursive logits; that is, if we denote the G^{2} statistic corresponding to the combined recursive system by G^{2}_{*}, then we have:

G^{2}(M_{8}) = G^{2}_{*} = G^{2}(M_{1}) + G^{2}(M_{6})

7.25 = 7.25 = 1.01 + 6.24.
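The additivity can be verified directly from the stage-by-stage results. The sketch below assumes the usual reasoning for a recursive system: the likelihood factors into one piece per stage, so both the G^{2} values and the degrees of freedom of the retained models add, with the saturated [AB] stage contributing nothing. The variable names are hypothetical; the figures are those reported above.

```python
# (d.f., G^2) of the model retained at each stage of the recursive system.
stage_ab = (0, 0.0)    # saturated model for the [AB] margin fits exactly
stage_c = (1, 1.01)    # response C given A, B: [AB] [AC] [BC]
stage_d = (5, 6.24)    # response D given A, B, C: [ABC] [AD] [CD]

stages = [stage_ab, stage_c, stage_d]
combined_df = sum(df for df, _ in stages)
combined_g2 = sum(g2 for _, g2 in stages)

print(f"combined recursive system: G2 = {combined_g2:.2f} on {combined_df} d.f.")
```

The totals agree with the value of 7.25 on 6 d.f. quoted for the single loglinear model M_{8}.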

Example 2

The next example makes use of data, originally collected by Sewell and Shah (1968), on the relationship among five variables: (A) socioeconomic status (high, upper middle, lower middle, low), (B) intelligence (high, upper middle, lower middle, low), (C) sex (male, female), (D) parental encouragement (low, high), and (E) college plans (yes, no). The five-way table is shown in Table 3. The data present a number of problems not previously illustrated, namely, polytomous variables and logit models that will include second-order effects. I note that this example has been analyzed elsewhere and my discussion will follow Fienberg's (1977) analysis.

A causal model proposed by Sewell and Shah specifies that A, B, and C precede D, which in turn precedes E.

Two logits characterize the system: in the first, socioeconomic status, intelligence, and sex are viewed as antecedent, exogenous variables to parental encouragement; in the second, college plans is the response measure and all four remaining variables are treated as explanatory. With respect to the exogenous variables A, B, and C, the only reasonably fitting model included all three two-factor effects: [AB], [AC], [BC] (9 d.f.; G^{2} = 11.50). We thus conclude that socioeconomic status is associated with intelligence and with sex, and intelligence and sex are also positively related. Again we do not distinguish between any of these causal orderings.

Treating, first, variable D, parental encouragement as the response measure, some plausible logit models are:

M_{1}: [ABC] [AD] [BD] [CD] (24 d.f.; G^{2} = 55.81)

M_{2}: [ABC] [ABD] [CD] (15 d.f.; G^{2} = 34.60)

M_{3}: [ABC] [BCD] [ACD] (18 d.f.; G^{2} = 31.48)

M_{4}: [ABC] [ABD] [BCD] (12 d.f.; G^{2} = 22.44)

M_{5}: [ABC] [ABD] [ACD] (12 d.f.; G^{2} = 22.45)

M_{6}: [ABC] [ABD] [ACD] [BCD] ( 9 d.f.; G^{2} = 9.22)

The only model providing an acceptable fit is model M_{6} which included three second-order effects. The logit counterpart to model M_{6} is

The presence of second-order effects presents no real problem, except they are, of course, more complex, but we have yet to consider path diagrams induced from such effects.

Next, variable E, college plans, becomes the response measure. Some plausible logit models are

M_{7}: [ABCD] [E] (63 d.f.; G^{2} = 4497.51)

M_{8}: [ABCD] [AE] [BE] [CE] [DE] (55 d.f.; G^{2} = 73.82)

M_{9}: [ABCD] [BCE] [AE] [DE] (52 d.f.; G^{2} = 59.55)

M_{10}: [ABCD] [BCE] [ACE] [DE] (49 d.f.; G^{2} = 57.99)

Both models M_{9} and M_{10} provide acceptable fits. For reasons of parsimony (and since the difference in G^{2} values of these two models is nonsignificant) model M_{9} is selected; the corresponding logit is:

Note, this model includes one second-order effect corresponding to the effects of intelligence and sex on college plans, and two first-order effects which show that college plans are affected by socioeconomic status and parental encouragement.
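The degrees of freedom quoted for the logit models in this example can be checked by counting: for a dichotomous response, d.f. equal the number of cells formed by the explanatory variables minus the number of independent logit parameters implied by the hierarchy. The sketch below is a hypothetical helper, not part of Goodman's or Fienberg's presentation, and the generator lists passed to it are my reading of the models (M_{1} with [AD], [BD], [CD]; M_{9} and M_{10} with [BCE]).

```python
from itertools import combinations
from math import prod

# Number of categories per variable in the Sewell-Shah table.
LEVELS = {"A": 4, "B": 4, "C": 2, "D": 2, "E": 2}

def logit_df(response, explanatory, generators):
    """d.f. of a hierarchical logit model for a dichotomous response.

    generators: loglinear terms that involve the response, e.g. ["ABD", "CD"].
    """
    # Every effect on the response implied by hierarchy, including the
    # constant term (the empty set of explanatory variables).
    effects = set()
    for term in generators:
        others = sorted(set(term) - {response})
        for r in range(len(others) + 1):
            effects.update(combinations(others, r))
    n_params = sum(prod(LEVELS[v] - 1 for v in eff) for eff in effects)
    n_cells = prod(LEVELS[v] for v in explanatory)
    return n_cells - n_params

d_models = {
    "M1": ["AD", "BD", "CD"], "M2": ["ABD", "CD"], "M3": ["BCD", "ACD"],
    "M4": ["ABD", "BCD"], "M5": ["ABD", "ACD"], "M6": ["ABD", "ACD", "BCD"],
}
e_models = {
    "M7": ["E"], "M8": ["AE", "BE", "CE", "DE"],
    "M9": ["BCE", "AE", "DE"], "M10": ["BCE", "ACE", "DE"],
}

for name, gens in d_models.items():
    print("response D,", name, "d.f. =", logit_df("D", "ABC", gens))
for name, gens in e_models.items():
    print("response E,", name, "d.f. =", logit_df("E", "ABCD", gens))
```

Under these readings the counts match the degrees of freedom reported for both sets of models, which is a useful consistency check when transcribing model specifications.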

Though a simple path diagram like those presented earlier is not possible, a diagrammatic representation, albeit more complex, of the induced logit system is possible. Consider Figure 5 which portrays the path diagram showing the causal connections implied by models M_{6} and M_{9}. In the figure, (----) lines pertain to variable A, (....) lines to variable B, (- - -) lines to variable C, and (- . - . -) lines to variable D; association between the exogenous variables is shown by double-headed curved arrows and higher-order effects are shown by a solid single-headed arrow which emanates from the effect in question. Path coefficient effects have been included, where column and row headings have been inserted to identify particular effects.

The first-order effect estimates for variables A, B, and C are in the expected direction. The likelihood of low versus high parental encouragement decreases monotonically as we move from low to high socioeconomic status and intelligence, and it seems males get more encouragement than females. The second-order effects of sex by intelligence show low-intelligence males receiving relatively less encouragement than low-intelligence females and high-intelligence males receiving relatively more than high-intelligence females; the second-order effects of sex by socioeconomic status show males receiving less parental encouragement than their female counterparts. With respect to college plans, the first-order effects indicate that there are positive effects on college plans from being male, of high socioeconomic status, or of high intelligence. Notice the monotonic order to the effect estimates within the socioeconomic and intelligence categories. The second-order effects between sex and intelligence are also monotonically ordered, with males having low intelligence less likely to plan for college than females with comparable intelligence, whereas the reverse is true for males and females with high intelligence.

PATH DIAGRAM SHOWING CAUSAL CONNECTIONS IMPLIED BY LOGIT MODELS (41) - (42)

LATENT CLASS MODELS

I have already introduced the concept of spurious correlation with qualitative variables. The approach taken in that section was to argue that the association between two variables is spurious if it vanishes in the presence of a third variable which impinges upon both. In the context of our illustrative four-way table we said that if three variables, say, A, B, and C, become mutually independent when the levels of the remaining variable, D, are controlled for, then their original relationship is spurious.

Consider the situation where some other variable, say, variable X, not part of the original cross-classification, can explain the relationships among the four variables when this new variable is controlled for. Once one accepts the possibility of some "outside" factor affecting the apparent relationship among the system variables, the step to considering latent class models is a rather easy, direct one since one need only assume that the explanatory variable in question is unobservable (or latent).

The situation described in the preceding paragraph can be diagrammed as

where A, B, C, and D are the observable (manifest) variables and variable X is the explanatory latent variable. To model this relationship, let p_{ijkl} denote the observed proportion (probability) of individuals in cell (i, j , k, l) where

p_{ijkl} = f_{ijkl}/n. (43)

For some specified hypothesis about the system of variables in the four-way table, let P_{ijkl} be the expected proportion of individuals in cell (i, j, k, l), where the relationship holds that

F_{ijkl} = nP_{ijkl}. (44)

Assume further that variable X has T latent classes, so that P^{X}_{t} denotes the probability that an individual will be in class t (t = 1, ..., T) and P^{ABCDX}_{ijklt} denotes the probability that an individual will be at level (i, j, k, l, t) with respect to the joint variable (A,B,C,D,X). Letting P^{AX}_{it}, P^{BX}_{jt}, P^{CX}_{kt}, and P^{DX}_{lt} be the conditional probabilities that an individual will be at level i with respect to variable A, level j with respect to variable B, etc., given that s/he is at level t on variable X, we can express the hypothesized system as

and

Formula (45) simply states that the latent variable explains the relationships among the manifest variables in the sense that any relationships disappear (the manifest variables are mutually independent) when the latent variable X is held constant. It is very easy to see how path diagrams based on logit models of the form previously discussed can be used in the analysis of latent class models by simply defining uA.jklt as the odds that an individual will be at level 1 rather than level 2 on variable A given that his joint level on the remaining variables (B,C,D,X) is (j, k, l, t):

However, from (45) we have

since the hypothesis under study is that the expected odds pertaining to variable A are affected by the level of variable X, but not by the level (j, k, l) of the other variables (B, C, D).
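A small numerical sketch may make the argument concrete. It assumes the standard latent class structure described in the text: the joint probability for (A,B,C,D,X) is the class probability times the four conditional probabilities, and the manifest proportions are obtained by summing over the latent classes. All parameter values are hypothetical, chosen only for illustration, with two latent classes and four dichotomous manifest variables.

```python
from itertools import product

# Hypothetical two-class model (illustrative values, not estimates).
class_prob = [0.6, 0.4]  # probability of latent class t
# cond[v][t] = probability of level 1 on manifest variable v within class t
cond = {"A": [0.9, 0.2], "B": [0.8, 0.3], "C": [0.7, 0.1], "D": [0.6, 0.4]}

def level_prob(var, level, t):
    """Conditional probability of `level` (1 or 2) on `var` within class t."""
    p1 = cond[var][t]
    return p1 if level == 1 else 1.0 - p1

def joint(i, j, k, l, t):
    """Joint probability for (A,B,C,D,X): product form under local independence."""
    return (class_prob[t] * level_prob("A", i, t) * level_prob("B", j, t)
            * level_prob("C", k, t) * level_prob("D", l, t))

def manifest(i, j, k, l):
    """Expected proportion in cell (i,j,k,l): sum of joint over latent classes."""
    return sum(joint(i, j, k, l, t) for t in range(len(class_prob)))

def odds_a(j, k, l, t):
    """Odds of level 1 versus level 2 on A given (j, k, l, t)."""
    return joint(1, j, k, l, t) / joint(2, j, k, l, t)

for t in range(len(class_prob)):
    values = {round(odds_a(j, k, l, t), 10) for j, k, l in product((1, 2), repeat=3)}
    print(f"class {t + 1}: distinct odds on A = {sorted(values)}")
```

Within each latent class the odds on A take a single value whatever the levels of B, C, and D, which is exactly the sense in which the latent variable "explains" the manifest associations.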

Due to space constraints I will say no more about this class of models. However, at this point, the reader should be able to see the approach that Goodman will take in the analysis of such systems: Various loglinear models are fitted which describe the relationships among the manifest variables when one or more latent variables are controlled for. Corresponding logits are then formed with the estimated effect parameters inserted as coefficients in the path diagram. The interested reader wanting further details, especially with respect to maximum likelihood estimation, should consult Goodman (1974a, 1974b).

CONCLUDING REMARKS

I have restricted my remarks to the use of loglinear models in the analysis of causal systems, and specifically to the work of Goodman. In doing so, I have no doubt slighted many. For example, Bahadur (1961), Boudon (1968), Coleman (1964), and Davis (1974, 1975) are just a few of those who have made important contributions in the area. The work of Davis (1975) is particularly noteworthy in that he describes an interesting way of modeling cross-classified data which utilizes linear flow graphs based on differences in proportions instead of odds ratios. Even with respect to the methods and models championed by Goodman I have been remiss. Model testing, parameter estimation, and the whole class of latent models have only been superficially covered.

Critics of Goodman's system might cite me for not being more critical of his approach. After laboring through Goodman's papers several times (this itself is, I think, a major feat--reading Goodman can overwhelm one, especially his footnotes) I believe his system to be sound, if, of course, care and good judgment on the part of the user are exercised. Critics have been quick to point to situations where the analogy to the analysis of quantitative variables via systems of linear simultaneous equations is the weakest, namely: (1) the absence of a calculus for the path coefficients and (2) the problems caused by polytomous variables and higher-order effects. We have seen, however, that point (2) presents no serious problems; in particular, Example 2 demonstrated that path-like diagrams, although admittedly more complex, can be constructed in such a way as to convey the character of the causal system under study. With regard to point (1) the situation is, unfortunately, more serious. Because the Goodman approach does not impose a structure analogous to the usual normal equations and since the i's do not actually become independent variables in succeeding logit specifications, the Wright multiplication theorem does not operate. Nevertheless, I see no reason why the estimated β-effects cannot be inserted in the path diagram, for they serve a very important interpretive purpose. The user who, for whatever reasons, wants to employ Wright-like multiplication rules when analyzing qualitative data, especially in the case of dichotomous variables, should refer to the methods and models proposed by Boudon (1965) and Davis (1975).

The other uncomfortable feature of Goodman's approach relates to the multiplicity of loglinear models that can be fitted which are consistent with a given series of logit specifications. This, as I have indicated, places an extreme burden on the analyst. Yet this burden may well be a blessing in the sense that it can possibly provide the motivating force leading one to rely on theoretical arguments in order to generate explicit causal orderings among the variables. It is in situations where the hypothesized causal system is induced from theory that the methods and models proposed by Goodman work best; without theory to guide us, I'm afraid, the intent of most studies, no matter what the form of analysis, becomes exploratory and often subject to Procrustes-like solutions.

REFERENCES

Bahadur, R. R. (1961), "A Representation of the Joint Distribution of Responses to N Dichotomous Items," In __Studies in Item Analysis and Prediction__, edited by H. Solomon, Stanford, California: Stanford University Press.

Birch, M. W. (1963), "Maximum Likelihood in the Three-Way Contingency Tables," __Journal of the Royal Statistical Society__ (ser. B), 25, 383-400.

Bishop, Y. M. M., Fienberg, S. E., and Holland, P. W. (1975), __Discrete Multivariate Analysis__, Cambridge, Massachusetts: MIT Press.

Boudon, R. (1965), "A Method of Linear Causal Analysis: Dependence Analysis," __American Sociological Review__, 22, 677-682.

Boudon, R. (1967), __L'Analyse Mathématique des Faits Sociaux__, Paris: Plon.

Boudon, R. (1968), "A New Look at Correlation Analysis," In __Methodology in Social Research__, eds. H. M. Blalock, Jr., and A. Blalock, New York: McGraw-Hill, 199-235.

Coleman, J. S. (1964), __Introduction to Mathematical Sociology__, Glencoe, Illinois: Free Press.

Davis, J. A. (1974), "A Survey Metric Model of Social Change," Pt. 1 (February), Lithographed, Chicago: National Opinion Research Center.

Davis, J. A. (1975), "Analyzing Contingency Tables with Linear Flow Graphs: D Systems," In __Sociological Methodology 1976__, edited by David R. Heise, San Francisco: Jossey-Bass, 111-145.

Duncan, O. D. (1966), "Path Analysis: Sociological Examples," __American Journal of Sociology__, 72, 1-16.

Duncan, O. D. (1975), __Structural Equations Models__, New York: Academic Press.

Fienberg, S. E. (1977), __The Analysis of Cross-Classified Categorical Data__, Cambridge, Massachusetts: The MIT Press.

Goodman, L. A. (1972), "A General Model for the Analysis of Surveys," __American Journal of Sociology__, 77, 1035-1086.

Goodman, L. A. (1973a), "Causal Analysis of Data from Panel Studies and Other Kinds of Surveys," __American Journal of Sociology__, 78, 1135-1191.

Goodman, L. A. (1973b), "The Analysis of Multidimensional Contingency Tables When Some Variables are Posterior to Others: A Modified Path Analysis Approach," __Biometrika__, 60, 179-192.

Goodman, L. A. (1974a), "Exploratory Latent Structural Analysis Using Both Identifiable and Unidentifiable Models," __Biometrika__, 61, 215-231.

Goodman, L. A. (1974b), "The Analysis of Systems of Qualitative Variables when Some of the Variables are Unobservable, Part I: A Modified Latent Structure Approach," __American Journal of Sociology__, 79, 1179-1259.

Goodman, L. A. (1978), __Analyzing Qualitative/Categorical Data__, Cambridge, Massachusetts: ABT Books.

Haberman, S. J. (1978), __Analysis of Qualitative Data__, New York: Academic Press.

Lazarsfeld, P. F. (1970), "A Memoir in Honor of Professor Wold," In __Scientists at Work__, edited by T. Dalenius, G. Karlsson, and S. Malmquist, Uppsala: Almquist & Wiksells.

Lazarsfeld, P. F., and Henry, N. W. (1968), __Latent Structure Analysis__, Boston: Houghton Mifflin.

Sewell, W. H. and Shah, V. P. (1968), "Social Class, Parental Encouragement, and Educational Aspirations," __American Journal of Sociology__, 73, 559-572.

Wright, S. (1960), "Path Coefficients and Path Regressions: Alternative or Complementary Concepts?" __Biometrics__, 16, 423-445.
