Varying Approaches to Data Analysis Discussant's Remarks

Stephen J. Arnold, Queen's University
Stephen J. Arnold (1990) ,"Varying Approaches to Data Analysis Discussant's Remarks", in NA - Advances in Consumer Research Volume 17, eds. Marvin E. Goldberg, Gerald Gorn, and Richard W. Pollay, Provo, UT : Association for Consumer Research, Pages: 434-437.

Advances in Consumer Research Volume 17, 1990      Pages 434-437

The title of this session reflects the diversity of the four papers. One paper models choice of ground coffee. The second paper models market share as a function of number of newspaper stories. The third paper compares two optimal stimulation level scales. The fourth paper prescribes alternatives for accounting for systematic measurement error in structural equation models.

Unable to establish a common theme among these papers, I decided to focus upon the Wahlers and Etzel paper. It raises an important issue--whether or not a scale can be multidimensional. After commenting on this paper, I will make brief remarks about the other three papers.

WAHLERS AND ETZEL

The paper by Wahlers and Etzel examines the "internal structure of two measurement models used to operationalize the optimal stimulation level (OSL) construct in consumer research." Specifically, they consider two forty-item scales, the Sensation Seeking (SS) scale and the Arousal Seeking Tendency (AST) scale. They hypothesize four and five dimensions underlie the SS scale and the AST scale, respectively.

Wahlers and Etzel attempt to confirm the expected dimensionality of each scale using a second-order factor model. In this model, OSL is the second-order construct, the four or five dimensions are the first-order constructs and the indicators are single-item measures. The research objective is "to examine both the measurement congruence between the scales and to assess the dimensionality of the measures. The results should provide consumer behavior researchers some direction in OSL scale selection--particularly when dimensionality is relevant."

Although initial analyses of the SS scale produced "unsatisfactory fit," Wahlers and Etzel make modifications until they are able to show that thirty-nine of forty items load as per the underlying four-dimensional structure. In the AST scale, however, they find that seventeen items are inappropriately cross-loaded on the five-dimensional structure and that seven items fail to load on their respective hypothesized dimensions. They conclude by saying "researchers interested in selecting a measurement model to investigate the dimensions of the OSL construct would be advised to employ Sensation Seeking rather than Arousal Seeking Tendency."

I question whether these results "provide consumer behavior researchers some direction in OSL scale selection." Specifically, I believe that finding multidimensionality in a scale is an undesirable result. Since the AST results show more unidimensionality, I conclude researchers would be advised to employ the AST rather than the SS scale.

Several arguments can be made against pursuing and finding multidimensionality in scales. One argument is that it is contrary to the unidimensionality thesis advanced by Gerbing and Anderson (1988) and others. Gerbing and Anderson (1988, p. 186), for example, state:

"that the computation of this composite score is meaningful only if each of the measures is acceptably unidimensional. Unidimensionality refers to the existence of a single trait or construct underlying a set of measures."

Other things being equal, a scale exhibiting multidimensionality will have a lower coefficient alpha, a measure of reliability, than will a unidimensional scale. The structure displayed in Figure 1 of the Wahlers and Etzel paper implies items associated with a specific first-order construct will be highly correlated with each other. However, these same items will be less correlated with items representing the other first-order constructs. Thus, the average intercorrelation among all items is less than it would be under unidimensionality. A lower average intercorrelation in turn drives down coefficient alpha. It is not surprising that reliability estimates are 0.851 for the SS scale and 0.853 for the AST scale. With a forty-item unidimensional scale, it would seem reasonable to expect reliabilities in the 0.9 range.
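The arithmetic behind this expectation can be sketched with the standardized form of coefficient alpha, alpha = k*r / (1 + (k - 1)*r), where k is the number of items and r is the average inter-item correlation. This is only a minimal sketch: it assumes the reported reliabilities behave like standardized alpha, which the paper does not confirm, and the function names are illustrative.

```python
def standardized_alpha(k: int, mean_r: float) -> float:
    """Standardized coefficient alpha for k items with average inter-item correlation mean_r."""
    return k * mean_r / (1.0 + (k - 1) * mean_r)


def implied_mean_r(k: int, alpha: float) -> float:
    """Invert the formula: the average inter-item correlation implied by alpha for k items."""
    return alpha / (k - alpha * (k - 1))


# Forty-item scales, using the two reported reliabilities and the 0.9 benchmark.
for alpha in (0.851, 0.853, 0.90):
    print(alpha, round(implied_mean_r(40, alpha), 3))
```

Under that assumption, forty items with alpha near 0.85 imply an average inter-item correlation of about 0.125, whereas an alpha of 0.90 would require about 0.18.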

There are inherent contradictions in the concept of a multidimensional scale. For example, each of the forty items is stated to be indicative of the one second-order OSL construct. This implies all of the items are related to each other. At the same time, each of the forty items is posited to be indicative of one and only one of the four or five dimensional OSL constructs. This implies the items not indicative of the same construct are unrelated.

The clear factor pattern and "good fit" of the SS multiple construct model underlying Table 3 in Wahlers and Etzel mean the four first-order scales exhibit discriminant validity. Yet, as reputedly alternative measures of OSL, these scales should instead exhibit convergent validity.

The unclear factor pattern and "several problems" of the AST results in Table 4 suggest at least some degree of convergence among the five alternative OSL scales. This provides some evidence of convergent validity for the AST scale. Thus, if the objective is to assess the construct validity of the SS and AST scales, the AST scale gives better results. This conclusion is opposite to that drawn by Wahlers and Etzel.

In general, a unidimensional set of items representative of the OSL construct should be resistant to demonstrating any stable underlying factor structure. As a consequence, any attempt to fit a model which expresses the items as a function of multidimensional constructs should result in a poor fit.

FIGURE 1

TWO-STAGE SCALE ASSESSMENT

Substituting four or five constructs for one construct defeats the criterion of parsimony in a theoretical model. A simple structure has been made more complex without any apparent gain.

Nomological validity is also not demonstrated when variations of the OSL construct, e.g., Thrill and Adventure Seeking, Experience Seeking, etc., are expressed as functions of the OSL construct. Nomological validity is demonstrated in a structural context when either (a) the OSL construct is confirmed to be a function of two or more antecedents, e.g., situation characteristics, personality traits, or (b) the OSL construct is confirmed to have certain hypothesized consequences, e.g., innovative behavior, risk taking.

Having raised questions about the Wahlers and Etzel approach to scale assessment, I feel obliged to present an alternative. Following Cronbach and Meehl, Campbell and Fiske, Nunnally, Heeler and Ray, Churchill, Bagozzi, and Peter among others, scaling research should work to establish construct validity of the scale. Construct validity, in turn, means finding evidence of convergent validity (scales indicative of the same construct are highly related), discriminant validity (scales indicative of different constructs are not related to each other) and nomological validity (expected relationships between constructs are confirmed).

To evaluate these three forms of validity, scale assessment appears to require estimation of two models rather than one, contrary to Wahlers and Etzel. Estimation of an "item analysis model" indicates discriminant validity. Estimation of a "structural model" demonstrates convergent validity and nomological validity.

The item analysis model is illustrated in Figure 1. Note that it is a confirmatory factor analysis model where only one construct is OSL-related. The other constructs are unrelated to OSL. This is a second fundamental difference from the Wahlers and Etzel approach.

The indicators in the item analysis model are groups of single-item measures indicative of each construct. The set of indicators associated with the OSL construct will be the group of single-item measures under investigation, e.g., the forty-item SS scale. Each of the other groups of single-item measures represents one of the other constructs, e.g., a group of innovativeness items represents the innovativeness construct.

When this model is estimated, the λ's are indicative of the relationship between each item and its corresponding scale. High λ's are desired. Gerbing and Anderson (1988) show that a function of the λ's also provides an expression for computing the reliability of the scale.
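One widely used expression of this kind is the composite reliability of a set of items loading on a single construct: the squared sum of the loadings divided by that squared sum plus the sum of the error variances. Whether this is exactly the expression the cited authors give is an assumption here. The sketch below also assumes a single factor with unit variance, uncorrelated errors and, when no error variances are supplied, standardized loadings.

```python
def composite_reliability(loadings, error_variances=None):
    """Composite reliability: (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances).

    If error_variances is omitted, loadings are assumed standardized,
    so each item's error variance is taken as 1 - loading**2.
    """
    if error_variances is None:
        error_variances = [1.0 - lam ** 2 for lam in loadings]
    squared_sum = sum(loadings) ** 2
    return squared_sum / (squared_sum + sum(error_variances))


# Hypothetical standardized loadings for a four-item scale.
print(round(composite_reliability([0.5, 0.5, 0.5, 0.5]), 4))
```

Higher loadings raise the numerator and shrink the error variances at the same time, which is why high loadings and high reliability go together.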

The φ's between the OSL construct and the other constructs are indicative of the discriminant validity of the SS scale (Widaman 1985).

The structural model to be estimated for scale assessment is also illustrated in Figure 1. In this model, the construct related to the scale under investigation is expressed as either an antecedent to, or consequence of, the other constructs. The indicators of the construct of interest are three alternative multiple-item scales (the SS scale, the AST scale and another OSL scale). Single-item measures are not used.

When this model is estimated, the cross-product of the λ's for the SS and AST indicators demonstrates the degree of convergent validity (Widaman 1985). The γ's between the OSL and other constructs demonstrate the existence of nomological validity.

GRUCA

Imagine a new supermarket is built in your neighborhood. What happens to your chances of shopping at each of the existing supermarkets? It seems they would either stay the same or fall. It doesn't seem logical they would rise. This is the regularity assumption which is tested in the Gruca paper. Instead of supermarket data, however, he uses the ubiquitous coffee data.
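The regularity assumption described above can be stated as a simple check: after a new alternative enters, no existing alternative's choice share should rise. The sketch below uses hypothetical shares, not figures from Gruca's tables, and the function name is illustrative.

```python
def regularity_violations(before, after, tolerance=0.0):
    """List existing brands whose choice share rose after a new brand entered.

    before and after map brand name -> choice share; the entrant appears only in after.
    """
    return [
        brand
        for brand, old_share in before.items()
        if after.get(brand, 0.0) > old_share + tolerance
    ]


# Hypothetical shares before and after entry of a new brand.
before = {"A": 0.50, "B": 0.30, "C": 0.20}
after = {"A": 0.45, "B": 0.27, "C": 0.18, "New": 0.10}
print(regularity_violations(before, after))  # → []
```

An empty list means the data are consistent with regularity; any brand returned is one whose share increased after entry.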

I believe Gruca understates how well his results support the regularity assumption. Table 4 gives the share changes for the seven brand loyal segments. The percentage change to focus on in each segment is obviously the one for the brand to which the segment is loyal. In all seven segments, the portion loyal to the brand drops after entry of the new brand.

The small number of households used in the estimation (52 in Pittsfield and 44 in Marion) raises questions of external validity. Systematically excluded are the light users and those who purchase beans, instant coffee, decaffeinated coffee, minor brands or the new brand. Also excluded are coffee purchasers who are not on either of the two panels.

Obviously, no one test of an assumption can be definitive. Continued testing is justified and the validity of the assumption at any point in time depends on the accumulated evidence.

FAN AND SHAFFER

In marketing communications courses, we always refer to the potential influence of public relations activities in addition to the effects of advertising and personal selling. However, I can't recall any empirical advertising/sales or market share models which included measures of this form of unpaid media presence or "information pressure" as it is referred to by Fan and Shaffer. Drawing our attention to this variable and describing a measurement approach is a contribution made by these authors. They also highlight the use of electronic databases which I think is an under-utilized resource in marketing research.

Whether or not information pressure will explain additional variance in measures of sales and market share after accounting for the other major variables--advertising spending, distribution and pricing--is another question. I can anticipate a high correlation between advertising spending and unpaid media presence. If this were the case, unpaid presence would not have a significant effect once the effects of advertising spending were partialled out.
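The partialling-out argument can be illustrated with the first-order partial correlation. The correlations below are hypothetical, chosen only to show what happens when advertising spending and unpaid media presence are highly correlated; they are not estimates from Fan and Shaffer.

```python
import math


def partial_correlation(r_xy: float, r_xz: float, r_yz: float) -> float:
    """Correlation between x and y after partialling z out of both."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))


# Hypothetical values: x = information pressure, y = market share, z = advertising spending.
r_info_share = 0.60
r_info_adv = 0.90
r_share_adv = 0.65
print(round(partial_correlation(r_info_share, r_info_adv, r_share_adv), 3))
```

With these illustrative numbers, a zero-order correlation of 0.60 between information pressure and market share falls to roughly 0.05 once advertising spending is controlled.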

This question in turn leads to a more general question--how valid are these results? What we see is a significant relationship between information pressure and market share when the data for one measure is related to the data for one other measure. However, any series of data with up to one inflection point can be fitted quite well with a two-parameter model. A spurious relationship is another possibility. Thus, obtaining measures for the other major variables and including them in the model is an obvious and important next step.

If obtaining these other measures is a problem, establishing predictive rather than nomological validity is an alternative. This could be done using a holdout sample. The model would be estimated on the first half of the data and tested on the second.
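A holdout test of this kind might be sketched as follows. The straight-line model is only a stand-in, since the form of the Fan and Shaffer model is not reproduced here, and the function names and the data are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def holdout_r2(x, y):
    """Estimate on the first half of the series, then score predictions on the second half."""
    split = len(x) // 2
    intercept, slope = fit_line(x[:split], y[:split])
    test_x, test_y = x[split:], y[split:]
    mean_test_y = sum(test_y) / len(test_y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(test_x, test_y))
    ss_tot = sum((yi - mean_test_y) ** 2 for yi in test_y)
    return 1.0 - ss_res / ss_tot


# Hypothetical series: information pressure (x) and market share (y) over ten periods.
pressure = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
share = [0.11, 0.13, 0.14, 0.17, 0.18, 0.20, 0.23, 0.24, 0.27, 0.28]
print(holdout_r2(pressure, share))
```

A holdout R-squared close to the in-sample fit would support predictive validity; a value near zero or negative would suggest the in-sample fit does not generalize.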

Consistency in formulation is also important, both in the current effort and in any future models that might be estimated. By this I mean that the dependent variable, market share, is a relative or proportionate measure, so the independent measures should be specified in the same way. Thus, information pressure should be specified as share of news stories rather than absolute number of news stories.
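As a minimal illustration of the respecification, story counts for each brand in a period can be converted to shares that, like market shares, sum to one. The brand names and counts are hypothetical.

```python
def story_shares(counts):
    """Convert per-brand news story counts for one period into shares of all stories."""
    total = sum(counts.values())
    if total == 0:
        # No stories in the period: every brand gets a zero share.
        return {brand: 0.0 for brand in counts}
    return {brand: n / total for brand, n in counts.items()}


# Hypothetical counts of news stories for three brands in one period.
counts = {"Brand A": 30, "Brand B": 15, "Brand C": 5}
shares = story_shares(counts)
print(shares)
```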

COTE AND GREENBERG

This paper is an important follow-up to Cote and Buckley's 1987 paper which alerted us to the potential magnitude of method variance. The current paper outlines alternative techniques for accounting for method variance as well as other forms of potential measurement error.

I have only one observation and it is to emphasize a disadvantage which was acknowledged for the first three alternative techniques: "identification may be difficult to achieve." With a few exceptions, I believe identification will be extremely difficult to achieve. This only emphasizes the importance of the correlated errors model alternative. It may also point to the utility of a fifth alternative technique which is not mentioned: perhaps we will have to revert to inspection of the multitrait, multimethod correlation matrices. For example, an inspection of the correlation matrix for the eighteen measures should reveal that y1, y2 and y3 correlate more with y4, y5 and y6 than with y7 through y18.
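Such an inspection amounts to comparing average correlations between blocks of measures. The sketch below builds a hypothetical eighteen-by-eighteen correlation matrix in which the first six measures are assumed to share a common source of variance; the values are illustrative and do not come from the Cote and Greenberg paper.

```python
def mean_block_correlation(corr, rows, cols):
    """Average correlation between two disjoint sets of measures (0-based indices)."""
    rows, cols = list(rows), list(cols)
    values = [corr[i][j] for i in rows for j in cols]
    return sum(values) / len(values)


# Hypothetical matrix: y1..y6 (indices 0..5) correlate 0.5 with each other,
# and every other pair of distinct measures correlates 0.2.
n = 18
shared = set(range(6))
corr = [
    [1.0 if i == j else (0.5 if i in shared and j in shared else 0.2) for j in range(n)]
    for i in range(n)
]

with_y4_to_y6 = mean_block_correlation(corr, range(0, 3), range(3, 6))
with_y7_to_y18 = mean_block_correlation(corr, range(0, 3), range(6, 18))
print(with_y4_to_y6, with_y7_to_y18)
```

A clearly higher average in the first block than in the second is the pattern the inspection is meant to reveal.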

In any case, determining whether or not there is method variance deals only with nomological validity. It will still be necessary to consider the convergent and discriminant validities as I discussed earlier when commenting on the Wahlers and Etzel paper.

REFERENCES

Gerbing, David W. and James C. Anderson (1988), "An Updated Paradigm for Scale Development Incorporating Unidimensionality and its Assessment," Journal of Marketing Research, 25 (May), 186-192.

Widaman, Keith F. (1985), "Hierarchically Nested Covariance Structure Models for Multitrait-Multimethod Data," Applied Psychological Measurement, 9 (March), 1-26.
