# Theory Based Monitoring of Social Constructs

To cite:

James B. Wiley and Gordon G. Bechtel (1983), "Theory Based Monitoring of Social Constructs", in NA - Advances in Consumer Research Volume 10, eds. Richard P. Bagozzi and Alice M. Tybout, Ann Arbor, MI: Association for Consumer Research, Pages: 157-162.

Direct URL:

http://acrwebsite.org/volumes/6104/volumes/v10/NA-10

A stochastic response model is presented for addressing attitude data from national surveys and commercial sources. The attendant statistical analysis by generalized least squares adjusts for the heterogeneous variances and covariances characterizing the data in such surveys. The power of this analysis, in conjunction with that of aggregation, is illustrated using attitude data from the Netherlands' social-indicator program. This illustration suggests the types of lawful relationships that are demonstrable at the societal level of analysis.

It is evident that in future years market and consumer researchers increasingly will rely on information taken from secondary sources or they will commission primary research and, in either case, the information will be derived from sample data. Furthermore, this information will increasingly relate to "subjective" variables: attitudes, consumer satisfaction (Lingoes and Pfaff, 1977), social norms, and the like. Since subjective indicators have little meaning in an absolute sense, analysis of the information will emphasize change from one time period to another, one population segment to another, or one product to another.

Attitude theory and measurement has a long history of dealing with subjective measures based on sample data. However, despite the prima-facie relevance of this body of knowledge to analyzing societal information of a subjective nature, little work has actually been done in this regard. This paper presents a method for aggregation over subjects, stimuli, occasions and measures using data from a compact western democracy. Our illustrative data were generated from a small set of items tapping societal changes in attitude toward women's roles in the Netherlands. These survey data were gathered during a particularly volatile period in the 1970s in connection with a national program for monitoring life quality and societal change. The analysis below may be regarded as a pilot assessment of a measurement method for handling attitude data from continuing surveys conducted by governments, trade associations, subscription services, and the like. The procedure should be of interest to analysts in government agencies, corporate research divisions, academic consumer researchers, private research agencies, and so forth.

TABLE 1: ENDORSEMENT PROPORTIONS IN THE DUTCH SAMPLES

THE ENDORSEMENT DATA AND MODEL

In order to fix ideas, Table 1 presents the five attitude items employed in the 1970 and 1975 Dutch surveys. The endorsement proportions in the table, along with the response sizes for each item, have been drawn from the Netherlands' Social and Cultural Report 1976 (Social and Cultural Planning Office, 1977). The wording of each item in Table 1 is in the direction of "feminine constraint". That is, each item is assumed to be monotonically related to this particular target construct. Also presented in the table are the number of individuals responding to each of these five questions in the 1970 and 1975 surveys and the proportion of those responding who endorsed each category.

It is hypothesized that aggregate response to the categories for a given item is mediated by the following model:

H_{"} : u_{ij} = a_{i} - k_{j}, (1)

where

a_{i} = attitude toward "feminine constraint" in year i,

k_{j} = intensity of item j on the attitude continuum.

According to this hypothesis, the endorsement scale, which is demarcated by "disagree", "neither", "agree", is mediated by a latent attitude continuum. Moreover, the endorsement strength u_{ij} is postulated as the difference between the population attitude a_{i} at time i and an item intensity k_{j}. The latter is not time dependent and serves as a temporally stable reference point for monitoring attitude change upon the latent continuum.

In accordance with successive intervals scaling theory, two category boundaries upon the endorsement scale (1) are postulated; namely, that between "disagree" and "neither" and that between "neither" and "agree". Accordingly, each trio of response proportions in Table 1 may be converted to a pair of proportions by accumulating to the right above each of these boundaries. For example, the 1970 proportions for the first item are .429 (= .138 + .291), which estimates the probability of a "neither" or "agree" response, and .291, which estimates the probability of an "agree" response. In general, we will let P_{ijk} represent the cumulated proportion above boundary k for the jth item in the ith year. In the present analysis, i denotes 1970 or 1975, j represents any of the five items in Table 1, and k labels the respective boundaries. Since each of the ten rows in Table 1 produces two cumulated proportions, there are twenty of these values constituting our endorsement "data", each of which estimates a true probability \pi_{ijk} of item j being rated above boundary k in year i.
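The accumulation step can be sketched in a few lines of Python. This is only an illustration: the function name `cumulate` is hypothetical, and the "disagree" proportion of .571 is not quoted in the text but is implied by the two proportions that are (1 - .138 - .291).

```python
def cumulate(p_disagree, p_neither, p_agree):
    """Convert a trio of response proportions into the two proportions
    lying above the disagree/neither and neither/agree boundaries."""
    total = p_disagree + p_neither + p_agree
    if abs(total - 1.0) > 1e-6:
        raise ValueError("the three proportions must sum to one")
    p_above_1 = p_neither + p_agree   # "neither" or "agree"
    p_above_2 = p_agree               # "agree" only
    return p_above_1, p_above_2


# First item, 1970: .138 "neither" and .291 "agree" are quoted in the text;
# .571 "disagree" is the implied remainder (an assumption of this sketch).
p_above_1, p_above_2 = cumulate(0.571, 0.138, 0.291)
```

Applied to all ten lines of Table 1, the same step would yield the twenty cumulated proportions that serve as data.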

While the subscript i in the present analysis refers to time, there is nothing intrinsic in the scaling theory that is time specific. Thus, i could parameterize segments or products in a cross-sectional analysis. No change would be required in the model or estimation procedure. On the other hand, a cross-sectional, time-series analysis that accommodates segments, products or concepts, and time periods can be performed with an appropriately parameterized model.

A SCALING THEORY FOR THE DATA

We postulate that the probability can be accounted for with the following model:

P_{ijk} = e^{l_{ijk}} / (1 + e^{l_{ijk}}) , or (2)

l_{ijk} = u_{ij} - T_{ik} + Y_{jk} + e_{ijk} , where (3)

l_{ijk} = the logit of P_{ijk} = ln [P_{ijk} / (1 - P_{ijk})],

u_{ij} = the population endorsement strength in year i for item j, as in (1),

Y_{jk} = a quality control, or unscalability, term,

T_{ik} = boundary k's value upon the endorsement continuum in survey i.
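As a small illustration of the logit transform in (2), the following Python sketch converts the two cumulated 1970 proportions for the first item into logits and back. The function names are hypothetical and introduced only for this example.

```python
import math


def logit(p):
    """Log-odds of a proportion strictly between 0 and 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("proportion must lie strictly between 0 and 1")
    return math.log(p / (1.0 - p))


def inverse_logit(l):
    """Recover the proportion from its logit, as in equation (2)."""
    return math.exp(l) / (1.0 + math.exp(l))


# Cumulated 1970 proportions for the first item, as quoted in the text.
l_111 = logit(0.429)   # about -0.286
l_112 = logit(0.291)   # about -0.891
```

Both logits are negative because fewer than half of the respondents fall above either boundary, and the second is lower because the "agree" boundary lies further up the endorsement scale.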

With the exception of the unscalability component, all terms, including the errors e_{ijk}, range over the temporal subscript i. The parameters in (3) are identified by introducing the following "effect coded" side conditions (Kerlinger and Pedhazur, 1973, pp. 176-185):

\sum_{k} T_{ik} = 0 for each i, \sum_{j} Y_{jk} = 0 for each k, \sum_{k} Y_{jk} = 0 for each j. (4)

The errors e_{ijk} are the usual three-way interactions. Now let E(l_{ijk}) = \lambda_{ijk}. The parameters of (3) then are given by:

u_{ij} = \lambda_{ij.} ,

T_{ik} = \lambda_{i..} - \lambda_{i.k} , (5)

Y_{jk} = \lambda_{.jk} - \lambda_{.j.} - \lambda_{..k} + \lambda_{...} ,

where the dot notation indicates the average over the subscript it replaces in each of the above formulas. The first formula indicates that endorsement strengths are a linear function of the "true logits", which could be calculated by fixing time periods (i) and items (j) and averaging over logit values. The boundary value is a deviation score calculated by fixing time periods and boundaries and averaging over items. The last formula emphasizes that the Y terms are classic two-way interactions.
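The averaging operations just described can be checked numerically. The sketch below builds error-free "true logits" from made-up parameter values and recovers them by averaging. It assumes, as the Results section states, that boundaries sum to zero within each year and that unscalabilities sum to zero over items and over boundaries. All numbers are hypothetical and are not estimates from the Dutch data.

```python
def mean(values):
    values = list(values)
    return sum(values) / len(values)


I, J, K = 2, 5, 2  # years, items, boundaries

# Hypothetical parameters obeying the side conditions.
u = [[0.4, -0.6, -1.0, -0.8, -1.2], [0.1, -1.0, -1.5, -1.1, -1.6]]
T = [[-0.3, 0.3], [-0.5, 0.5]]                      # rows sum to zero
Y = [[0.05, -0.05], [-0.02, 0.02], [0.01, -0.01],
     [-0.03, 0.03], [-0.01, 0.01]]                  # rows and columns sum to zero

# Error-free logits: endorsement minus boundary plus unscalability.
lam = [[[u[i][j] - T[i][k] + Y[j][k] for k in range(K)]
        for j in range(J)] for i in range(I)]

# Endorsement strength: fix year and item, average over boundaries.
u_hat = [[mean(lam[i][j]) for j in range(J)] for i in range(I)]

# Boundary: year mean minus the mean over items at that boundary.
T_hat = [[mean(lam[i][j][kk] for j in range(J) for kk in range(K))
          - mean(lam[i][j][k] for j in range(J))
          for k in range(K)] for i in range(I)]

# Unscalability: two-way (item by boundary) interaction of the logits.
lam_j = [mean(lam[i][j][k] for i in range(I) for k in range(K)) for j in range(J)]
lam_k = [mean(lam[i][j][k] for i in range(I) for j in range(J)) for k in range(K)]
lam_all = mean(lam[i][j][k] for i in range(I) for j in range(J) for k in range(K))
Y_hat = [[mean(lam[i][j][k] for i in range(I)) - lam_j[j] - lam_k[k] + lam_all
          for k in range(K)] for j in range(J)]
```

With noisy observed logits the same averages would only be unweighted estimates; the generalized least squares analysis in the Estimation section weights the logits by their covariance matrix instead.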

Relationship to Latent Trait Theory

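It should be noted that if we set Y_{jk} = e_{ijk} = 0 in (3), we have the basic Rasch latent trait model (Masters, 1982). That is, Rasch (1966) has suggested

P_{ia} = \theta*_{i} \epsilon*_{a} / (1 + \theta*_{i} \epsilon*_{a}) (6)

as a test theory model, where \theta*_{i} is an ability parameter for the ith individual, \epsilon*_{a} is an easiness parameter for the ath item, and P_{ia} is the probability that the ith person will respond correctly to the ath item. If (6) is rewritten as

P_{ia} = e^{\theta_{i} - g_{a}} / (1 + e^{\theta_{i} - g_{a}}) , (7)

with \theta_{i} = ln \theta*_{i} and g_{a} = -ln \epsilon*_{a}, and (7) is solved for (\theta_{i} - g_{a}), the result is

\lambda_{ia} = \theta_{i} - g_{a} , (8)

where \lambda_{ia} is the logit of P_{ia}.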

The relationship between (8) and (3), with Y_{jk} = e_{ijk} =0, is evident. The two models differ in that the parameters of (8) refer to individual constructs while (3) has an additional subscript and the parameters refer to social constructs.
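The subtractive logit form of the Rasch model can be verified numerically. The sketch below assumes the usual multiplicative statement of the model (ability times easiness as the odds of a correct response); the function names and the parameter values are hypothetical.

```python
import math


def rasch_multiplicative(ability_star, easiness_star):
    """Probability of a correct response when ability * easiness is the odds."""
    odds = ability_star * easiness_star
    return odds / (1.0 + odds)


def rasch_logistic(ability, difficulty):
    """The same probability written on the log scale."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


ability_star, easiness_star = 3.0, 0.5        # hypothetical values
ability = math.log(ability_star)              # log ability
difficulty = -math.log(easiness_star)         # negative log easiness

p_mult = rasch_multiplicative(ability_star, easiness_star)
p_log = rasch_logistic(ability, difficulty)
rasch_logit = math.log(p_mult / (1.0 - p_mult))   # equals ability - difficulty
```

The logit is a person parameter minus an item parameter, which is the same subtractive structure that (3) places on year and item at the societal level.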

Relationship to Factor Analysis

In an unpublished study in which one of the authors participated, the statistic EQUATION was related to corresponding LISREL errors derived from analysis of inter-item correlations. The items were similar in format to those in Table 1. For each item j, the value EQUATION was matched with j's "error" in the LISREL analysis. The magnitudes of these two types of errors were consistently (positively) related in the study. However, no precise relationship has yet been developed linking the Rasch latent trait formulation (3) to the latent structures utilized in the LISREL approach.

ESTIMATION

The 20 logits in our analysis occur in couplets corresponding to the ten presentations recorded on each line in Table 1. Each couplet is constructed from the cumulated, observed proportions, P_{ij1} and P_{ij2}, associated with year i and item j. Due to the accumulation procedure, P_{ij1} and P_{ij2} are correlated proportions, and their correlation induces a covariation between their transforms l_{ij1} and l_{ij2}. Also, these two logits are heterogeneous in their respective variances, and therefore, the assumptions of the usual OLS procedures do not hold here.

In view of these properties of the logits, we carry out a generalized least squares (GLS) analysis upon the l_{ijk} generated from Table 1. General descriptions of this type of analysis may be found in the statistics literature (Grizzle, Starmer, and Koch, 1969) and in the econometric literature (Theil, 1970; Huang, 1970; Johnston, 1972; Maddala, 1977). As these references indicate, the GLS procedure adjusts for the variances and covariances of the observations by including their covariance matrix in the analysis. In the present case this matrix takes a block-diagonal form, where each 2 x 2 submatrix along the "diagonal" contains the variances of l_{ij1} and l_{ij2}, along with their covariance. Since these two logits correspond to a single line i, j in Table 1, we have ten of these submatrices, which, when strung out as diagonal blocks, fill out our 20 x 20 covariance matrix for the 20 logits.

The construction of each of these 2 x 2 submatrices is easily carried out from the information in Table 1. For each line i, j in Table 1 it may be shown that

var (l_{ij1}) = [n_{ij} P_{ij1} (1 - P_{ij1})]^{-1}, (9a)

var (l_{ij2}) = [n_{ij} P_{ij2} (1 - P_{ij2})]^{-1}, (9b)

cov (l_{ij1}, l_{ij2}) = [n_{ij} P_{ij1} (1 - P_{ij2})]^{-1}, (9c)

where n_{ij} is the number responding to item j in survey i (see Lehnen and Koch, 1974). The proportions P_{ij1} and P_{ij2} are cumulated from the three proportions on this same line of the table. For example, the cumulative proportions and number responding on the first line, which addresses "child rearing" in 1970, are P_{111} = .429, P_{112} = .291, and n_{11} = 1878. Hence, the 2 x 2 covariance matrix for this first line is

S_{11} = [ .00217  .00175 ]
         [ .00175  .00258 ]

The nine subsequent matrices follow S_{11} as diagonal blocks, and, with the remaining off-diagonal zeros, form our 20 x 20 covariance matrix S.
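The first covariance block can be reproduced from the three figures quoted for the first line of Table 1. The Python sketch below reads each of (9a) to (9c) as a reciprocal of the full bracketed product, a reading that reproduces the printed entries .00217, .00258 and .00175. The function names are hypothetical, and the nine repeated blocks are placeholders, since the remaining lines of Table 1 are not reproduced here.

```python
def logit_cov_block(n, p1, p2):
    """2 x 2 covariance matrix of the two logits on one line of the table.

    n  : number responding
    p1 : cumulated proportion above the first boundary
    p2 : cumulated proportion above the second boundary (p2 <= p1)
    """
    var1 = 1.0 / (n * p1 * (1.0 - p1))
    var2 = 1.0 / (n * p2 * (1.0 - p2))
    cov = 1.0 / (n * p1 * (1.0 - p2))
    return [[var1, cov], [cov, var2]]


def block_diagonal(blocks):
    """String square blocks out along the diagonal, zeros elsewhere."""
    size = sum(len(b) for b in blocks)
    out = [[0.0] * size for _ in range(size)]
    offset = 0
    for b in blocks:
        for r, row in enumerate(b):
            for c, value in enumerate(row):
                out[offset + r][offset + c] = value
        offset += len(b)
    return out


# First line of Table 1 (1970, first item), as quoted in the text.
S11 = logit_cov_block(1878, 0.429, 0.291)

# Placeholder only: the real S would use ten different blocks,
# one per line of Table 1.
S_placeholder = block_diagonal([S11] * 10)
```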

Letting l_{ijk} above be an element of the vector l, e_{ijk} an element of e, we then have the general linear model

l = XB + e (10)

where X is the (reduced, full-rank) design matrix for (3) and B is the vector of parameters on the right side of (3). A best asymptotic normal (BAN) estimate of B is then

b = (X'S^{-1}X)^{-1} X'S^{-1} l (11)

where S is the (estimated) covariance matrix of the e_{ijk}, which we estimate with (9a-c). Coding of the design matrix is illustrated in Appendix A1. A test of fit of the general model can be obtained using modified chi-square methods (Grizzle, Starmer, and Koch, 1969):

X^{2}_{SSM} = SS[E(l) = XB] = l'S^{-1}l - b'(X'S^{-1}X)b (12)

Given the fit of the model, the test of the hypothesis CB = 0 is produced by:

X^{2}_{SSH} = SS[CB = 0] = b'C'[C(X'S^{-1}X)^{-1}C']^{-1}Cb (13)

where C is an appropriately defined contrast matrix, as illustrated in Appendix A2.
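A compact NumPy sketch of the estimator and the two chi-square statistics is given below. It is an illustration of the formulas, not the authors' program. The four logits, the second covariance block, and the three-parameter design (two endorsement strengths and one shared boundary) are hypothetical and much smaller than the 20 x 16 problem analyzed in the paper.

```python
import numpy as np


def gls_fit(X, S, l):
    """GLS estimate b, its covariance matrix, and the lack-of-fit chi-square."""
    W = np.linalg.inv(S)                       # weight matrix S^{-1}
    XtWX = X.T @ W @ X
    b = np.linalg.solve(XtWX, X.T @ W @ l)     # estimator (11)
    cov_b = np.linalg.inv(XtWX)
    ssm = float(l @ W @ l - b @ XtWX @ b)      # fit statistic (12)
    return b, cov_b, ssm


def hypothesis_chi_square(b, cov_b, C):
    """Chi-square for the linear hypothesis C B = 0, as in (13)."""
    Cb = C @ b
    return float(Cb @ np.linalg.solve(C @ cov_b @ C.T, Cb))


# Hypothetical toy problem: two couplets of logits.
l = np.array([-0.286, -0.891, -0.60, -1.30])
S = np.zeros((4, 4))
S[:2, :2] = [[0.00217, 0.00175], [0.00175, 0.00258]]
S[2:, 2:] = [[0.00220, 0.00170], [0.00170, 0.00270]]   # made-up block

# Columns: u_1, u_2, T_1. Lower logit = u_i - T_1, upper = u_i + T_1.
X = np.array([[1.0, 0.0, -1.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, -1.0],
              [0.0, 1.0, 1.0]])

b, cov_b, ssm = gls_fit(X, S, l)               # ssm has 4 - 3 = 1 df here
C = np.array([[1.0, -1.0, 0.0]])               # contrast u_1 - u_2
ssh = hypothesis_chi_square(b, cov_b, C)       # 1 df
```

The degrees of freedom for the fit statistic equal the number of logits minus the number of free parameters, and those for a hypothesis equal the number of rows of C.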

HYPOTHESES

Since the origin of the attitude scale may be placed at the mean of the item values, it may be shown that the attitude and item parameters in (1) are

a_{i} = u_{i.} , (14)

k_{j} = u_{. .}- u_{.j} .

That is, attitude is the average of the separate item endorsements. This averaging over items follows from the (linear) monotone relation between u_{ij} and a_{i} postulated in hypothesis (1) and the condition that \sum_{j} k_{j} = 0, which sets the origin of the attitude scale at the mean of the items. This linear form for a_{i} in (14) is a direct analog of the well-known Likert Method of Summated Ratings (Likert, 1932). This type of redundancy is a useful measurement tool whenever several items are employed as specific instances of a social construct.

The first test associated with these parameters is that of hypothesis (1), which may be shown to be equivalent to

H'_{"} : u_{ij} - u_{i.} - u_{.j} + u_{..} = 0 . (15)

The reader will recognize the linear form on the left of (15) as a two-way interaction. Thus, our attitude hypothesis is equivalent to a statement of zero interaction in the endorsements, which is stated in the form CB = 0 of (13). The test of this hypothesis is given below.
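The relations in (14) and (15) can be illustrated with a small Python sketch. It builds a table of endorsement strengths that is exactly subtractive in made-up attitude and intensity values, then recovers those values as row means and centered column means and confirms that every interaction term vanishes. The function name and all numbers are hypothetical.

```python
def mean(values):
    values = list(values)
    return sum(values) / len(values)


def attitude_and_intensity(u):
    """Return attitudes, item intensities and interaction terms for a
    years-by-items table of endorsement strengths."""
    n_years, n_items = len(u), len(u[0])
    grand = mean(v for row in u for v in row)
    a = [mean(row) for row in u]                                  # row means
    col = [mean(row[j] for row in u) for j in range(n_items)]
    k = [grand - c for c in col]                                  # centered
    interaction = [[u[i][j] - a[i] - col[j] + grand
                    for j in range(n_items)] for i in range(n_years)]
    return a, k, interaction


# Hypothetical attitudes for two years and intensities summing to zero.
a_true = [-0.6, -1.0]
k_true = [-1.0, 0.0, 0.4, 0.2, 0.4]
u = [[a_i - k_j for k_j in k_true] for a_i in a_true]

a, k, interaction = attitude_and_intensity(u)
```

When the table is not exactly subtractive, the interaction terms are nonzero, and the chi-square (13) tests whether they depart from zero by more than sampling error.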

The second hypothesis of interest is that of zero attitude change between the first and second surveys (i = 1,2). This hypothesis is stated as

H_{C} : a_{1} = a_{2} , (16)

which is equivalent to

H'_{C} : u_{1.} - u_{2.} = 0 . (17)

Hypothesis (17) also is stated in the form CB = 0 of (13). The test of this hypothesis concerning attitude change in the Netherlands between 1970 and 1975 also is given below (also see Appendix A2).

RESULTS

Estimates and Tests for Endorsement Model (3)

The three sets of parameters in (3) are estimated by the values in Tables 2, 3, and 4. The underlined values in Tables 2 and 3 were not directly estimated, since they follow from the remaining values in each table. That is, the uniqueness conditions upon (3) require the unscalabilities to sum to zero within each row and column in Table 2 and the category boundaries in Table 3 to sum to zero within each year.

The sum of squares for fitting model (3) to the logits is calculated as SSM in (12). In the present analysis SSM = 5.59, which is interpretable as a chi-square with 4 (= 20 - 16) degrees of freedom. This nonsignificant chi-square value supports the fit of (3) to the logits.

The further evaluation of the subtractive structure (3) is carried out by testing the hypothesis that all the Y_{jk} in (3) are zero. In the present analysis this SSH for interaction is 51.97, which is interpretable as a chi-square with four degrees of freedom.

Although this highly significant chi-square indicates that these parameters depart from zero, the deviations in Table 2 are of negligible magnitude. Therefore, we regard the subtractive endorsement structure (3) as an adequate approximation to the logits.

TABLE 2: GLS ESTIMATES OF THE UNSCALABILITIES

The boundary values in Table 3 indicate the widths of the "neither" category in 1970 and 1975. The greater breadth of this middle response option in the second survey reflects the consistently greater usage of "neither" in 1975, as shown by the endorsement proportions in Table 1.

TABLE 3: GLS ESTIMATES OF THE CATEGORY BOUNDARIES

Finally, the substantive impact of this first-stage analysis is provided by Table 4. These GLS estimates indicate generally negative item endorsements, with the exception of the aggregate response to the statement "A woman is more suited to bringing up small children". However, even here endorsement strength falls from 1970 to 1975. In fact, all the endorsement strengths fall from the first to the second survey, and the chi-square value for each of these items indicates that each decrement is highly significant.

TABLE 4: GLS ESTIMATES AND CHI-SQUARE TESTS FOR THE ITEM ENDORSEMENTS u_{ij}

The Subtractive Substructure of Endorsement

Hypothesis (1) asserts that endorsement u_{ij} is itself subtractive in attitude a_{i} and item intensity k_{j}. This hypothesis is equivalently expressed as (15), which is a linear form in the u_{ij}. The hypothesis sum of squares (13) for this structure is 7.38. This value is interpretable as a chi-square upon four degrees of freedom. Since this chi-square is not statistically significant, we support hypothesis (1), and, hence, a unidimensional attitude scale mediating the item endorsements.

Attitude change

The null hypothesis (16) of no attitude change is treated in a manner similar to that for the attitude hypothesis. The assertion is converted to the equivalent statement (17) containing a linear form in the endorsements u_{ij}. The corresponding hypothesis sum of squares is 136.93 with one degree of freedom. This highly significant value summarizes the shift in attitude between 1970 and 1975 in the direction of easing "feminine constraint."
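For readers who wish to attach tail probabilities to the four chi-square values reported in this section, the following Python sketch uses the closed-form upper-tail expressions for one and four degrees of freedom. The p-values are computed here for illustration and do not appear in the original report; the function names are hypothetical, and the scalability test is taken to have four degrees of freedom, as Appendix A2 states.

```python
import math


def chi2_upper_tail_df4(x):
    """P(chi-square with 4 df > x) = exp(-x/2) * (1 + x/2)."""
    return math.exp(-x / 2.0) * (1.0 + x / 2.0)


def chi2_upper_tail_df1(x):
    """P(chi-square with 1 df > x) = erfc(sqrt(x/2))."""
    return math.erfc(math.sqrt(x / 2.0))


p_model_fit = chi2_upper_tail_df4(5.59)        # fit of model (3), about 0.23
p_scalability = chi2_upper_tail_df4(51.97)     # all unscalabilities zero
p_attitude = chi2_upper_tail_df4(7.38)         # hypothesis (1), about 0.12
p_change = chi2_upper_tail_df1(136.93)         # no attitude change
```

The first and third probabilities exceed conventional significance levels, while the second and fourth are vanishingly small, in line with the conclusions drawn in the text.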

DISCUSSION

The social construct of "feminine constraint" has been abstracted from the five items in Table 1, which address societal change in attitude toward women's roles. These data, which are aggregated over subjects, have been drawn from the Netherlands' Social and Cultural Report 1976 (Social and Cultural Planning Office, 1977). The present analysis illustrates how group data, common in survey research, may be processed in terms of a measurement model which provides other, supplemental forms of aggregation.

This "force of aggregation" (see Green, 1978) has revealed highly systematic attitude and attitude-change structures at the societal level in the Netherlands in the early 1970s. Table 4 shows a consistent liberalization in the perception of women's roles in several different areas. Moreover, a parsimonious account of the endorsements in Table 4 was sustained in a test of the hypothesis of unidimensional attitude structure. A second, attitude-change hypothesis, stated in terms of this mediating structure, revealed a highly significant shift toward alleviating "feminine constraint" between 1970 and 1975. This latter result summarizes the shifts observed for the separate items in Table 4.

These clear-cut results, although limited to the present, restricted data set, suggest the types of lawful relationships which can become visible in the aggregate. The situation here is analogous to the differential magnitudes of multiple correlation coefficients observed for cross-sectional and time-series data. Theil, for example, has observed:

It is well known that, by and large, multiple correlation coefficients tend to be smaller when the regression is computed on data which are characterized by a lower degree of aggregation. It is not too difficult to obtain an R2 larger than .9 when running a time series regression of total per capita consumption on per capita income. The R2 is usually smaller when the dependent variable is consumption of a particular commodity group, such as meat. It is lower still when we used cross-section rather than time-series data in such a meat regression, because there is then no aggregation over consumers. The R2 will typically be further reduced when the cross-section data refer to consumption during a month instead of a year, because there is then less aggregation over time. Disaggregation typically raises the importance of accidental factors and thus lowers R2. (Theil 1973, p. 133)

These econometric findings are reflected in the psychological literature by Epstein (1980), who echoes Katona's (1978) call for a macro-psychology using aggregation over subjects, stimuli, occasions, and measures. Katona has illustrated the power of aggregation in bringing into focus the relationship between attitude, as measured by the well-known Index of Consumer Sentiment, and behavior in the form of discretionary consumption.

The present paper carries this principle one step further by imposing a stochastic model upon the aggregate data. Our logistic response analysis demonstrates the effectiveness of this type of explication in ordering and summarizing the unwieldy masses of percentages typically associated with national surveys and commercial data sources. It is hoped that this kind of analysis will serve to monitor social change, identify opportunities, and diagnose problems, as well as lead us to societal laws waiting to be discovered.

APPENDIX A1

THE DESIGN MATRIX

The design matrix for a GLS analysis must be in reduced, full rank form, i.e., its columns must be linearly independent, representing only the "free" parameters in the model. In the present case of (3) we have sixteen of these parameters, which are estimated by those values in Tables 2, 3, and 4 that are not underscored. The side conditions (4) generate the underscored values in Tables 2 and 3, as well as the given rows in the design matrix X in Table A1.

The rows of this matrix, which are labeled by the twenty logits, are ordered in the couplets l_{ij1} and l_{ij2}. The first ten rows represent the five couplets corresponding to the 1970 lines in Table 1 of the text, while the last ten rows of X contain the five couplets associated with the 1975 lines. The columns of X are labeled by the sixteen "free" parameters in our analysis. The elements in each row describe the linear combination in these parameters dictated by (3) for the logit of that row. For example, (3) indicates that

E(l_{152}) = u_{15} - T_{12} + Y_{52},

which is the expected value of the logit labeling the tenth row of X. However, due to the side conditions above, T_{12} = -T_{11}, and

Y_{52} = -Y_{51}

= -(-Y_{11} - Y_{21} - Y_{31} - Y_{41})

= Y_{11} + Y_{21} + Y_{31} + Y_{41}, (18)

which express T_{12} and Y_{52} in terms of the "free" parameters labeling the columns of X. Making these substitutions, we have

E(l_{152}) = u_{15} - (-T_{11}) + (Y_{11} + Y_{21} + Y_{31} + Y_{41})

= u_{15} + T_{11} + Y_{11} + Y_{21} + Y_{31} + Y_{41},

which is the linear combination indicated in the tenth row of Table A1.
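The effect-coded design matrix can be generated programmatically. The NumPy sketch below is one way to do so, assuming a column order of ten endorsement strengths (year by item), the first boundary of each year, and the first-boundary unscalabilities of items one to four. That ordering is an assumption of this sketch and may differ from the layout of Table A1.

```python
import numpy as np

N_YEARS, N_ITEMS, N_BOUNDS = 2, 5, 2
N_PARAMS = N_YEARS * N_ITEMS + N_YEARS + (N_ITEMS - 1)   # 10 + 2 + 4 = 16


def design_matrix():
    """Rows ordered by year, item, boundary; one row per logit."""
    t0 = N_YEARS * N_ITEMS            # first boundary column
    y0 = t0 + N_YEARS                 # first unscalability column
    rows = []
    for i in range(N_YEARS):
        for j in range(N_ITEMS):
            for k in range(N_BOUNDS):
                row = np.zeros(N_PARAMS)
                row[i * N_ITEMS + j] = 1.0            # + endorsement strength
                sign = 1.0 if k == 0 else -1.0        # second boundary = -first
                row[t0 + i] = -sign                   # - boundary value
                if j < N_ITEMS - 1:
                    row[y0 + j] = sign                # + unscalability
                else:
                    # last item: minus the sum of the other four items
                    row[y0:y0 + N_ITEMS - 1] = -sign
                rows.append(row)
    return np.array(rows)


X = design_matrix()
tenth_row = X[9]          # 1970, fifth item, second boundary
rank = np.linalg.matrix_rank(X)
```

The tenth row carries a coefficient of one on the fifth 1970 endorsement strength, on the first 1970 boundary, and on each of the four free unscalabilities, matching the worked substitution above. The matrix has twenty rows and rank sixteen, which leaves the four degrees of freedom used in the test of fit.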

APPENDIX A2

THE HYPOTHESIS MATRICES

Table A2 contains the C matrices for testing the several hypotheses described in the text. The scalability hypothesis operates upon the "free" Y_{jk} parameters, with the side conditions implicitly placing this hypothesis upon the remaining Y_{jk}. For example, Y_{11} = Y_{21} = Y_{31} = Y_{41} = 0 implies Y_{52} = 0 in equation (18).

Each line in Table A2 expresses a given linear combination in the parameters, and the associated hypothesis in the text asserts the corresponding set of linear combinations to be zero. The scalability hypothesis and H_{"} each have four degrees of freedom, which is the number of rows in its C matrix. The remaining hypotheses all have a single degree of freedom.

The scalability matrix simply indicates that the four Y_{jk} are to be set equal to zero, while the matrix for H_{C} describes the linear form u_{1.} - u_{2.}, which this hypothesis equates to zero. The C matrix for "Schooling," for example, contains the coefficients of the form u_{12} - u_{22}, which, when set to zero, asserts the null hypothesis of no change in the societal endorsement of this item.

TABLE A1: THE DESIGN MATRIX X FOR MODEL (3)

The construction of the C matrix for H_{"} is slightly more involved, due to the greater complexity of the linear form contained in this hypothesis. First, we note that this form is a double interaction in the endorsement parameters and that double interactions are doubly centered (i.e., sum to zero by rows and columns). Therefore, since these endorsement interactions constitute a 2 x 5 matrix, we effect H_{"} by equating (2 - 1) x (5 - 1) = 4 of them to zero. The interactions chosen here are those associated with u_{11}, u_{12}, u_{13}, u_{14}, which, due to double centering, automatically equate the remaining six interactions to zero as well.

For example, consider the interaction

u_{12} - u_{1.} - u_{.2} + u_{..}

corresponding to the leading term u_{12}. This effect is composed of the four linear forms given in Table A3. When combined as indicated in this interaction, the rows in Table A3 generate the composite form in the second row of the C matrix for H_{"} in Table A2.
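The contrast rows can be assembled mechanically by adding the four ingredient forms. The Python sketch below does this with exact fractions, assuming the first ten of sixteen columns hold the endorsement strengths in year-by-item order (an assumption of this sketch, since Table A2 is not reproduced here). Rows of a C matrix may be rescaled by any nonzero constant without changing the chi-square, so the printed table may show multiples of these coefficients.

```python
from fractions import Fraction

N_YEARS, N_ITEMS, N_PARAMS = 2, 5, 16


def interaction_row(i0, j0):
    """Coefficients of the cell minus row mean minus column mean plus
    grand mean, for the endorsement strength in year i0 and item j0."""
    row = [Fraction(0)] * N_PARAMS
    for i in range(N_YEARS):
        for j in range(N_ITEMS):
            coef = Fraction(1, N_YEARS * N_ITEMS)          # + grand mean
            if i == i0 and j == j0:
                coef += 1                                  # + the cell itself
            if i == i0:
                coef -= Fraction(1, N_ITEMS)               # - year mean
            if j == j0:
                coef -= Fraction(1, N_YEARS)               # - item mean
            row[i * N_ITEMS + j] = coef
    return row


# Four rows for the unidimensionality hypothesis (15): items 1 to 4, year 1.
C_unidimensional = [interaction_row(0, j) for j in range(N_ITEMS - 1)]

# One row for the no-change hypothesis (17): year-1 mean minus year-2 mean.
C_change = [[Fraction(1, N_ITEMS)] * N_ITEMS
            + [Fraction(-1, N_ITEMS)] * N_ITEMS
            + [Fraction(0)] * (N_PARAMS - 2 * N_ITEMS)]

second_row = C_unidimensional[1]   # interaction led by the second 1970 item
```

In the second row the leading coefficient is 2/5, the other 1970 items receive -1/10, the same item in 1975 receives -2/5, and the other 1975 items receive 1/10.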

TABLE A2: THE SEVERAL HYPOTHESIS MATRICES C

TABLE A3: INGREDIENTS FOR THE SECOND ROW OF THE C FOR H_{"}

REFERENCES

Bechtel, G. G. (1981), "Measuring Subjective Social Indicators", Journal of Economic Psychology, 1, 165-181.

Epstein, S. (1980), "The Stability of Behavior", American Psychologist, 35 (September), 790-806.

Green, B. F. (1978), "In Defense of Measurement", American Psychologist, 33 (July), 664-679.

Grizzle, J. E., C. F. Starmer, and G. G. Koch (1969), "Analysis of Categorical Data by Linear Models", Biometrics, 25 (September), 489-504.

Huang, D. S. (1970), Regression and Econometric Methods, New York: McGraw-Hill Book Co., Inc.

Johnston, J. (1972), Econometric Methods: Second Edition, New York: McGraw-Hill Book Co., Inc.

Kerlinger, F. N. and E. J. Pedhazur (1973), Multiple Regression in Behavioral Research, New York: Holt, Rinehart and Winston, Inc.

Lehnen, R. G. and G. G. Koch (1974), "A General Linear Approach to the Analysis of Non-Metric Data: Applications for Political Science," American Journal of Political Science, 18, 283-313.

Likert, R. (1932), "A Technique for the Measurement of Attitudes", Archives of Psychology, 140, 1-55.

Lingoes, J. C., and M. Pfaff (1972), "The Index of Consumer Satisfaction: Methodology," in Venkatesan (ed.), Proceedings Association Research, Ann Arbor, 689-712.

Maddala, G. S. (1977), Econometrics, New York: McGraw-Hill Book Co., Inc.

Masters, G. N. (1982), "A Rasch Model for Partial Credit Scoring," Psychometrika, 47, 149-174.

Miller, G. (1977), Econometrics, New York: McGraw-Hill Book Co., Inc.

Rasch, G. (1966), "An Individualistic Approach to Item Analysis," in P. F. Lazarsfeld and N. W. Henry (eds.), Readings in Mathematical Social Science, Chicago: Science Research Associates.

Social and Cultural Planning Office (1977), "Social and Cultural Report, 1976", Rijswijk, The Netherlands.

Theil, H. (1970), "On the Estimation Of Relationships Involving Qualitative Variables", American Journal of Sociology, 76, 103-154.
