Estimating Interactions in Conjoint Analytic Tasks: the Use of Sequential Designs

Sunil Gupta, Columbia University
ABSTRACT - This paper shows how sequential designs, tailored to the responses of each subject, may be used to estimate interactions in data collected using conjoint-analytic methods. A simulation study to compare the proposed procedure versus existing methods for estimating interactions is also suggested.
[ to cite ]:
Sunil Gupta (1985), "Estimating Interactions in Conjoint Analytic Tasks: the Use of Sequential Designs", in NA - Advances in Consumer Research Volume 12, eds. Elizabeth C. Hirschman and Morris B. Holbrook, Provo, UT: Association for Consumer Research, Pages: 177-182.


INTRODUCTION

Consumers' utility measurement for multiattribute alternatives has received considerable attention in the marketing literature. In recent years the use of conjoint measurement as the research tool has been very popular (Green and Srinivasan 1978; Cattin and Wittink 1980). As Green and Devita (1975) have pointed out, most applications of the technique have "emphasized noninteractive models of either an additive or multiplicative nature". This disproportionate emphasis probably stems from two main causes:

i) The number of judgements that would be required from a subject goes up very rapidly if we try to estimate the two-factor or higher order interactions. If the number of levels for each of the factors, or the number of factors themselves is even moderately large, the task can become very burdensome for the respondent. Main effects plans alleviate this problem.

ii) Main effects alone, very often, do seem to explain a large part of the total variance. Thus it is argued that the additional explanation is not enough justification for a bigger task.

Previous Methods of Estimating Interactions

While the above arguments are often valid, situations do occur (Green and Devita 1974) when an interaction model of utility is more appropriate. Based on a simulation study, Carmone and Green (1979) report significant improvements in the prediction of first choices if two-factor interactions are estimated. The use of interactive models is especially appropriate when the purpose of the researcher is not merely to predict choice at an aggregate level but also to explain the structure of the choice model. Researchers utilizing the functional measurement approach (Anderson 1970) have repeatedly emphasized the benefits to be gained from the estimation and interpretation of at least some of the interactions. In marketing, the approach was introduced by Bettman, Capon and Lutz (1975). More recently, in a study examining preferences for apartments, Curry, Levin and Gray (1977) report the existence of significant non-crossover interactions. Similar results are reported by Tantiwong (1982) in a study of food store preferences among elderly consumers.

The most important difference between functional measurement and conjoint measurement lies in the scale quality of the dependent variable. Whereas conjoint measurement uses ranking data, functional measurement calls for ratings data. A result of this difference is that conjoint measurement uses non-metric analysis techniques such as MONANOVA to decompose the judgements, whereas functional measurement typically employs the metric ANOVA procedures. However, the latter difference is becoming less important in light of the virtual equivalence of results reported from the use of metric or non-metric techniques (Jain et al. 1979; Currim 1980).

The problem facing the researcher then, is that, on the one hand, the estimation of interactive utility models could provide meaningful insights into consumers' choice model. On the other hand, the relatively large data requirements that would be needed to estimate individual interactive utilities would be prohibitive. As has been noted earlier, the choice has often been resolved in favor of staying with main effects models requiring less data. Notable exceptions, however, exist.

Green and Devita (1974) use a complementarity model of consumer utility to estimate main effects and selected two-factor interactions using a combination of MONANOVA and MDPREF. However, the model is not necessarily usable for the usual applications of conjoint measurement since it decomposes the starting attributes (e.g., entree and dessert) into more primitive characteristics (e.g., caloricness and serving temperature). In most typical applications, the researcher begins with the most basic descriptors (e.g., miles per gallon, or price, etc.) and thus cannot gain the extra degrees of freedom needed to estimate this model.

Another approach suggested by Green and Devita (1975) involved a combination of orthogonal arrays and two-at-a-time tables. The orthogonal arrays are used to estimate the main effects and the trade-off tables to uncover possible two-way interactions. As the authors themselves note, if the number of factors to be studied is substantial (8 to 12), the number of two-way tables required can make the task very tedious. Other shortcomings of the approach are also noted by the authors (the inappropriateness of significance tests for interactions, the judgmental decomposition of interactions using MDPREF, and context effects of using two-way tables). More recently, Green, Goldberg and Montemayor (1981) have proposed the hybrid utility estimation model. In this approach, self-explicated weights are combined with profile ratings (as in functional measurement) to estimate main effects at the individual level and two-way interactions at the segment level. A similar procedure has been used by Tantiwong (1982).

The shortcoming of this approach is in the use of purely additive data to cluster respondents. If in fact there are significant interactions, we would like to be able to use that information in forming the clusters. In addition, no information on interactions is gained at the individual level. Thus, it would seem helpful to have a procedure that allows us to estimate, idiosyncratically, the most important interactions for each individual without burdening the respondent with profiles that would be helpful only in estimating other potentially less significant interactions, or only to help reduce the error variance for already estimated effects. The sequential fractional factorial does this. The availability of micro-computers and their rapidly increasing role in data collection make such an approach feasible. The basic concept required is that of estimability of parametric functions and its relation to sequential fractional factorials. In the rest of this paper we shall:

i) define estimability and how the concept can be used to selectively estimate interactions;

ii) give two examples, using some popularly employed plans, of how to use the technique; and

iii) propose a Monte Carlo simulation to allow us to compare the proposed approach with those reviewed earlier in this section.

ESTIMABILITY

The use of orthogonal arrays and other reduced plans has become very popular among consumer researchers employing conjoint-like procedures since Green (1974) introduced them to the marketing literature. In particular, the main effects plans provided by Addelman (1962) have been quite popular (Malhotra 1982; Krishnamurthy 1981). The economy achieved, in terms of the number of judgements required of a respondent, is gained at the expense of the number of effects that may be estimated. Typically, the higher order interactions are confounded among themselves or, in some cases, with the main effects. Though such plans have found common use, very little attention has been paid to identifying the confounding patterns. For example, in several of Addelman's plans, second order interactions are confounded with main effects. Thus, a large main effect could be the result of a significant two-factor interaction, and yet be interpreted as an additive main effect.

There is, therefore, a need to deduce precisely the estimable functions in a given design. A parametric function c'b is estimable if and only if there exists a linear combination a'y of the observations such that E(a'y) = c'b. Here y is the vector of observations, and c'b is the contrast of constants being investigated. Scheffe (1959) has proved the theorem that a parametric function c'b is estimable if and only if there exists a vector a such that c' = a'X (i.e., c' is in the row space of X, where X is the design matrix). An immediate consequence of this is that the maximum number of functions that may be estimated is equal to the rank of the design matrix.

Thus, if the rank of the design matrix is equal to the number of parameters of interest, each one of them is estimable. (This can be seen from the definition of estimability by noting that the least squares estimate of c'b, namely c'(X'X)^-1 X'y, is a linear function of y with expected value c'b.) This is the case when we have a complete factorial design.

When the rank of the design matrix is less than the number of parameters of interest (the number of columns of X), some, but not all, parametric functions are estimable. Consider the design matrix shown below:

        b0   b1   b2

         1   -1    1
         1   -1    1
X =      1   -1    1
         1    1    1
         1    1    1
According to Scheffe's theorem, a parametric function c'b is estimable if and only if c is a linear combination of the rows of X. It can clearly be seen that b0 and b2 are completely confounded. Thus, the estimable functions are b0 + b2 and b1 (b0 is aliased with b2).

Suppose we have already made the runs given above and have had a chance to analyze the data. Now, suppose it turns out that b0 + b2 is significant. We would then like to de-alias b0 from b2. A possible solution would be to add at least one run where b0 and b2 have opposite signs. Thus, (-1, 1, 1) could be one such run. Now, the rank of the matrix is equal to the number of parameters of interest (3); therefore, they are all estimable. On the other hand, if b0 + b2 were not significant, we need not have made the additional run.
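This rank check is easy to carry out numerically. The following sketch (in Python with NumPy, using the five-run design shown above) verifies that the extra run raises the rank of the design matrix to 3, making all three parameters estimable:

```python
import numpy as np

# The five-run design from the text: the b0 and b2 columns are identical,
# so only b0 + b2 and b1 are estimable.
X = np.array([[1, -1, 1],
              [1, -1, 1],
              [1, -1, 1],
              [1,  1, 1],
              [1,  1, 1]])
print(np.linalg.matrix_rank(X))       # 2

# Append one run where b0 and b2 take opposite signs to de-alias them.
X_aug = np.vstack([X, [-1, 1, 1]])
print(np.linalg.matrix_rank(X_aug))   # 3: all three parameters estimable
```

The same check can be run after each candidate follow-up profile, so only runs that actually raise the rank need to be shown to the respondent.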

What we have just described are the rudiments of planning and conducting sequential fractional factorial designs. The important steps in the process are:

i) Determine which functions are estimable in the smaller model (e.g. the main effects only model).

ii) Run an experiment based on this smaller design.

iii) Analyze the results to see if there is any need to make some additional runs (e.g. if we need to de-alias a main effect from some two-factor interactions).

iv) Reanalyze the data, after incorporating the new data.

In some cases (e.g., the example given above, or fractions of 2^n plans), it is relatively easy to uncover the confounding pattern (Green, Carroll and Carmone 1978). However, for the more saturated designs (e.g., Plackett and Burman 1946, or the more commonly used Addelman 1962 Basic Plans), the confounding patterns are more involved, and the researcher needs to pay greater attention to them. In such cases, a simple way to determine which functions are estimable has been suggested by Box and Wilson (1951).

Suppose a consumer's utility could be represented exactly (up to experimental/measurement error) by a model involving L constants (the main effects, 2-factor interactions, 3-factor interactions, etc.). However, the researcher wishes to assume a simpler model involving only M < L constants (e.g., main effects only). He then asks the respondent to judge N > M profiles. Then we have:

n = X1b1 (researcher's assumption)

n = X1b1 + X2b2 (true model)

where

b1 is the set of parameters of interest and X1 the corresponding design matrix;

b2 is the set of parameters deemed insignificant and X2 the design matrix corresponding to these parameters;

n = E(y).

The least squares estimate b1 = (X1'X1)^-1 X1'y will in general be biased, since

E(b1) = (X1'X1)^-1 X1'n

      = (X1'X1)^-1 X1'X1 b1 + (X1'X1)^-1 X1'X2 b2

      = b1 + (X1'X1)^-1 X1'X2 b2

Thus, (X1'X1)^-1 X1'X2 is the alias matrix of interest. We illustrate the procedure with the help of the following examples:
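As a quick numerical check of this alias formula, consider again the five-run design from the previous section, with X1 holding the fitted columns (b0, b1) and X2 the omitted column b2 (a sketch in Python with NumPy):

```python
import numpy as np

# Fitted model: intercept b0 and effect b1; the confounded column b2 is omitted
X1 = np.array([[1., -1], [1, -1], [1, -1], [1, 1], [1, 1]])
X2 = np.array([[1.], [1], [1], [1], [1]])   # omitted b2 column

# Alias matrix (X1'X1)^-1 X1'X2: row j gives the bias of the j-th fitted effect
alias = np.linalg.solve(X1.T @ X1, X1.T @ X2)
print(alias.ravel())   # [1. 0.]: E(b0_hat) = b0 + b2, while b1 is unbiased
```

The computed alias matrix reproduces what was deduced by inspection: the fitted intercept absorbs b2 in full, while b1 is estimated cleanly.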

Example 1: We first consider the use of this approach with Addelman's Basic Plan 2 for a 3^1 x 2^4 experiment in 8 runs.

Under the usual side conditions, the three-level factor A will be represented by two columns (A1 and A2). Each two-level factor is represented by a single column.

b' =   u   A1   A2    B    C    D    E

       1   -1    1   -1   -1   -1   -1
       1   -1    1    1    1    1    1
       1    0   -1   -1   -1    1    1
X1 =   1    0   -1    1    1   -1   -1
       1    1    1   -1    1   -1    1
       1    1    1    1   -1    1   -1
       1    0   -1   -1    1    1   -1
       1    0   -1    1   -1   -1    1

The matrix of 2-factor interactions, X2, can be written by multiplying, element by element, each pair of columns above.

Then, (X1'X1)^-1 X1'X2 can easily be computed to give the following results:

E(A1) = A1 - BC - DE - BE - CD

E(A2) = A2 + BD + CE

E(B)  = B - A1C - A1E + A2D

E(C)  = C - A1B - A1D - A2E

E(D)  = D + A2B - A1C - A1E

E(E)  = E - A1B - A2D + A2C

Suppose A and B are the only significant effects. Then, we might be interested in determining whether the reported effect of B is due to A's interaction with C, D, or E, or a main effect. So we need some run where we would have B + A1C + A1E - A2D. (Added to B - A1C - A1E + A2D, this would give us B unconfounded.) One run that could give us the desired result would be

A hi B hi C hi D lo E hi

Another is

A lo B hi C lo D hi E lo

This would give us the following additional rows in the design matrix.

"1   A2    B     C    D    E   A1C   A1E   A2D

1      1     1      1   -1    1     1        1      -1

-1    -1    -1    -1    1   -1     1        1      -1

Possibly both these runs could be included and an estimate of B made. If B is significant, we would have some more faith in the additive model. On the other hand, if B is not significant, then the interactions between A and C and between A and E are of importance. If desired, one or two additional runs could be used to resolve which of these two is significant.

Example 2: Next we consider a 2^(7-4) Resolution III design (main effects are confounded with second order interactions). The alias pattern for this design is given by:

A = A + BD + CE + FG      B = B + AD + CF + EG

C = C + AE + BF + DG      D = D + AB + EF + CG

E = E + AC + DF + BG      F = F + BC + DE + AG

G = G + CD + BE + AF

Suppose the main effect of D was the largest one. We might, then, wish to de-alias all the 2 factor interactions involving D to determine if some significant interactions exist. Using the same sort of reasoning as in the above example, we add runs which, when combined with the previous sets of runs, will give us unconfounded estimates of D. A simple way to do this would be to repeat the fraction with the levels of D switched. The alias pattern then for the 16 runs combined would be:

A = A + CE + FG      B = B + CF + EG      C = C + AE + BF

D = D                E = E + AC + BG      F = F + BC + AG

G = G + BE + AF      AB = AB + EF + CG

BD = BD    CD = CD    DE = DE    DF = DF    DG = DG

On the other hand, if no single main effect seems to dominate, a second fraction could be run with signs switched in all the columns. This will give the main effects clear of 2-factor interactions.

Thus, based on the results, we can choose to investigate the most promising aspect of the choice structure further.
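The foldover logic of Example 2 can be checked numerically. The sketch below (Python with NumPy; the generators D=AB, E=AC, F=BC, G=ABC are one common choice that yields the alias pattern listed above) verifies that in the original fraction D is aliased with EF, while in the combined 16 runs D is clear of every two-factor interaction:

```python
import numpy as np
from itertools import product, combinations

# Build the 2^(7-4) Resolution III fraction from a full factorial in A, B, C
base = np.array(list(product([-1, 1], repeat=3)))
A, B, C = base.T
frac = np.column_stack([A, B, C, A*B, A*C, B*C, A*B*C])   # columns A..G

d = 3  # column index of factor D
assert frac[:, d] @ (frac[:, 4] * frac[:, 5]) == 8   # in 8 runs, D is aliased with EF

# Fold over on D: repeat the fraction with the levels of D switched
fold = frac.copy()
fold[:, d] *= -1
runs16 = np.vstack([frac, fold])

# In the combined design, D is orthogonal to all two-factor interactions
for i, j in combinations(range(7), 2):
    if d not in (i, j):
        assert runs16[:, d] @ (runs16[:, i] * runs16[:, j]) == 0
print("D de-aliased from all two-factor interactions")
```

Switching the signs in all columns instead of just D's can be checked the same way, and clears every main effect of two-factor interactions at once.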

IMPLEMENTATION

In the previous section, we have shown how sequential use of fractional factorial designs can be employed to estimate some of the interactions at the individual level. Addelman (1969) has provided a whole series of sequences which can be used with various 2n plans. If this procedure is combined with the blocking technique, suggested by Green, Goldberg and Montemayor (1981), and implemented in Tantiwong (1982), we should be able to make much more accurate assessments of choice structures. In addition to the information regarding the additive effects that we usually have, we would also have some information about the most important interactions at the individual level. This information could be used to develop better clusters. Having developed the clusters we could then estimate all the interactions of interest at the cluster level. In this section, we briefly describe how the plan might be implemented.

The availability of microcomputers as data collection instruments is an important development. It is, in effect, possible to tailor the conjoint task to best elicit data from each respondent. Most microcomputers have fairly efficient regression programs available. Since we are asking for ratings data, at the end of the first set of profiles a regression could be run and the beta weights examined. Branching routines needed to de-alias the effects of interest (based on this information) could be pre-programmed. Thus, from the point of view of the respondent, there will be no real difference between the first set of profiles and the additional runs.

The branching rules to be used could be based on strictly statistical criteria such as "If the effect is significant at the .1 level, get judgements to de-alias the 2-factor interactions." Alternatively, a judgmental statement such as "Ask for additional judgements only if the effect is 4 times as large as the next biggest effect," could be used. The branching rules could be made more or less conservative based upon the researcher's objectives. Finally, an absolute stopping rule could be included which set a maximum limit for the number of profiles to be judged by any one respondent.
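A branching rule of the judgmental kind could be pre-programmed along the following lines (a sketch only; the function name, the 4x threshold, and the 32-profile limit are illustrative choices, not from the original):

```python
def effects_to_dealias(effects, ratio=4.0, n_done=8, max_profiles=32):
    """Return the effects worth de-aliasing after the first block of profiles.

    effects: dict mapping effect name -> estimated beta weight.
    Judgmental rule: follow up only on an effect at least `ratio` times
    as large (in absolute value) as the next biggest effect.
    """
    if n_done >= max_profiles:                     # absolute stopping rule
        return []
    ranked = sorted(effects, key=lambda k: abs(effects[k]), reverse=True)
    top, runner_up = ranked[0], ranked[1]
    if abs(effects[top]) >= ratio * abs(effects[runner_up]):
        return [top]
    return []

print(effects_to_dealias({"D": 4.5, "A": 1.0, "B": 0.4}))   # ['D']
print(effects_to_dealias({"D": 2.0, "A": 1.5, "B": 0.4}))   # []
```

A strictly statistical rule would replace the ratio test with a significance test on the beta weights; the stopping-rule skeleton stays the same.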

Empirical tests of the usefulness of these procedures are needed. In the next section, we propose a Monte Carlo simulation to test the proposed approach versus the three approaches reviewed earlier.

A PROPOSED MONTE CARLO STUDY

We would like to test how well the procedures outlined above can uncover the important interactions and to compare the results with those from the use of the three models discussed in the review section (complementarity, utility interaction, and the hybrid model). In addition, to make the comparison more complete, we would like to test the simpler additive model and the more complex complete model. Though it would be extremely desirable to collect data from respondents regarding their choices for multiattribute alternatives, a Monte Carlo study is suggested at this stage. The greater control afforded is desirable. Simulation studies have been found to be useful in similar contexts in the past (Carmone and Green 1979).

Independent Variables

To get a reasonable range of situations in which such estimation procedures might actually be employed, we suggest the following independent variables for the simulation:

Choice Structure: Preferences for the hypothetical respondents will be simulated using the five utility generating models listed below.

i) A main effects only model

ii) A main effects model with noncrossover interactions

iii) A limited main effects model with noncrossover interactions

iv) A main effects model with crossover interactions

v) A limited main effects model with crossover interactions

These are the first five models listed in Carmone and Green (1979). Their sixth model is not appropriate to our task.

Based on the above generating models, we shall simulate 5 samples of 180 respondents each. After having estimated the parameters using each of the models, we shall randomly select various proportions from these samples. The purpose of the latter step is to test the quality and nature of the clusters that would be formed if the parameters estimated were to be used as the clustering variables.

Error Distribution: We shall use two forms of error distribution. The sequential approach suggested here should be favored by normally distributed error, since the tests of significance are based on this assumption. However, very often the data do not conform to the normality assumption. So, we shall generate the data using both normally distributed and uniformly distributed error.

Error Variance: Here, once again, following Carmone and Green (1979) we shall use three levels of error variance to simulate the measurement error usually encountered when collecting such data from respondents. The three levels are:

i) Zero-level error: no error is added to the individual observations

ii) Medium-level error: while the mean of the error will be zero, the standard deviation will be such that the coefficient of variation (standard deviation of error term divided by the average response over the full design) is equal to 5%.

iii) High level error: the mean is again zero, but the coefficient of variation is 10%.
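The error conditions above can be generated as follows (a sketch in Python with NumPy; the function name and seed are illustrative, and the uniform branch matches the normal one in standard deviation rather than range):

```python
import numpy as np

rng = np.random.default_rng(42)

def add_error(ratings, cv, dist="normal"):
    """Add zero-mean error with sd = cv * (average response over the full design)."""
    sd = cv * np.mean(ratings)
    if dist == "normal":
        return ratings + rng.normal(0.0, sd, size=len(ratings))
    # uniform error with the same standard deviation: half-width = sd * sqrt(3)
    half = sd * np.sqrt(3.0)
    return ratings + rng.uniform(-half, half, size=len(ratings))

ratings = np.full(8, 10.0)              # hypothetical error-free responses
noisy = add_error(ratings, cv=0.05)     # medium-level error: CV of 5%
```

Setting cv=0.0 reproduces the zero-level condition, and cv=0.10 the high-level condition.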

Factors and Levels: The utility generating models will use the following two factorial structures as the basis for simulating the choices of the respondents:

i) a 3^1 x 2^4 design

ii) a 2^8 design.

The Dependent Measures

For each sample, and combination of samples, parameters will be estimated using the following models:

i) a main effects only model. This is the simplest model that could be used;

ii) a complete model. This is the most complex model that could be used;

iii) the interactive utility model;

iv) the complementarity model;

v) the hybrid utility model; and

vi) the sequential approach suggested in this paper.

The criteria by which we shall judge the performance of each of these models are:

i) correlation between actual rating and predicted rating of holdouts.

ii) prediction of first choice from among the holdouts.

iii) number of judgments required from the subjects.

iv) the quality of the cluster analysis solution based on the derived parameters.
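Criteria (i) and (ii) could be computed as in this illustrative snippet (the holdout ratings are made-up numbers, used only to show the calculation):

```python
import numpy as np

actual    = np.array([7.0, 3.0, 9.0, 5.0])   # hypothetical holdout ratings
predicted = np.array([6.5, 3.5, 8.0, 5.5])   # a model's predictions for them

# i) correlation between actual and predicted holdout ratings
r = np.corrcoef(actual, predicted)[0, 1]

# ii) first-choice prediction: does the model pick the truly top-rated holdout?
hit = actual.argmax() == predicted.argmax()
print(round(r, 3), hit)   # 0.992 True
```

Criterion (iii) is simply the count of profiles shown (which the sequential approach makes respondent-specific), and criterion (iv) would be assessed with standard cluster-recovery measures.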

SUMMARY

In this paper we have noted the usefulness and importance of studying the possibility of there being meaningful interactions in data collected using conjoint-analytic methods. The concept of estimability was explained, and examples of how it might usefully be employed were given. The notion of sequential designs, tailored to the responses of each subject, was then introduced. Such designs were not feasible in marketing contexts in the past because it was not possible to observe some of the responses given by a respondent before deciding which question to ask next. So, the researcher was forced to adopt an all or nothing approach. Now, with the availability and increasing use of microcomputers in the data collection phase, this approach has become feasible. If the results from the simulation study suggested are encouraging, then field tests of the methodology should be conducted.

The programming effort for the simulation study is currently under way. The best results will probably be realized if the method is incorporated at the individual level in the hybrid utility model. This will give us information about the most important interactions at the individual level and other interactions at the segment level.

REFERENCES

Addelman, S. (1962), "Orthogonal Main-Effect Plans for Asymmetrical Factorial Experiments," Technometrics, 4 (February), 21-45.

Addelman, S. (1969), "Sequences of Two-Level Fractional Factorial Plans," Technometrics, 11 (August), 477-509.

Anderson, N.H. (1970), "Functional Measurement and Psychophysical Judgment," Psychological Review, 77, 153-70.

Bettman, J.R., N. Capon and R.J. Lutz (1975), "Cognitive Algebra in Multiattribute Utility Models," Journal of Marketing Research, 12 (May), 151-164.

Box, G.E.P. and K.B. Wilson (1951), "On the Attainment of Optimum Conditions," Journal of the Royal Statistical Society, Series B (Methodological), 13 (1), 1-38.

Carmone, F.J. and P.E. Green (1979), "Model Misspecification in Multiattribute Parameter Estimation," Working Paper, Wharton School, University of Pennsylvania.

Cattin, P. and D.R. Wittink (1981), "Commercial Use of Conjoint Analysis: A Survey," Research Paper No. 596, Graduate School of Business, Stanford University.

Currim, I.S. (1980), "Disaggregation Issues Within a Consumer Based Brand Decision Support Model," Unpublished dissertation, Graduate School of Business, Stanford University.

Curry, D.J., I.P. Levin and I.J. Gray (1977), "Comparison of Conjoint Measurement and a Functional Measurement Analysis of Apartment Preferences," Working Paper, University of Iowa.

Green, P.E. (1974), "On the Design of Choice Experiments Involving Multifactor Alternatives," Journal of Consumer Research, 1 (September), 61-68.

Green, P.E., J.D. Carroll and F.J. Carmone (1978), "Some New Types of Fractional Factorial Designs for Marketing Experiments," in Research in Marketing, Vol. 1, ed. J.N. Sheth, Greenwich, CT: JAI Press.

Green, P.E. and M.T. Devita (1974), "A Complementarity Model of Consumer Utility for Item Collections," Journal of Consumer Research, 1 (December), 56-67.

Green, P.E. and M.T. Devita (1975), "An Interaction Model of Consumer Utility," Journal of Consumer Research, 2 (September), 146-153.

Green, P.E., S.M. Goldberg and M. Montemayor (1981), "A Hybrid Utility Estimation Model for Conjoint Analysis," Journal of Marketing, 45 (Winter), 33-41.

Jain, A.K., F. Acito, N.K. Malhotra and V. Mahajan (1979), "A Comparison of the Internal Validity of Alternative Parameter Estimation Methods in Decompositional Multiattribute Preference Models," Journal of Marketing Research, 16, 313-22.

Krishnamurthy, L. (1981), "Modelling Joint Decision Making Through Relative Influence," Unpublished dissertation, Graduate School of Business. Stanford University.

Malhotra, N.K. (1982), "Structural Reliability and Stability of Nonmetric Conjoint Analysis," Journal of Marketing Research, 19 (May), 199-207.

Plackett, R.L. and J.P. Burman (1946), "The Design of Optimum Multifactorial Experiments," Biometrika, 33, 305-325.

Scheffe, H. (1959), The Analysis of Variance, New York: John Wiley and Sons.

Tantiwong, D. (1982), "Shopping Habits, Store Evaluation and Preferences of the Elderly Consumers," Unpublished dissertation, Graduate School of Business Administration, University of California, Berkeley.
