# Attribute Importance Weights in Conjoint Analysis: Bias and Precision

[To cite]:

Sanjay Mishra, U. N. Umesh, and Donald E. Stem, Jr. (1989), "Attribute Importance Weights in Conjoint Analysis: Bias and Precision," in NA - Advances in Consumer Research Volume 16, ed. Thomas K. Srull, Provo, UT: Association for Consumer Research, 605-611.

[Direct URL]:

http://acrwebsite.org/volumes/6970/volumes/v16/NA-16

Consumer researchers have used conjoint analysis to evaluate the importance of an attribute in forming preferences. Although past researchers have tested the validity and reliability of the overall conjoint analysis results, some of the properties of the individual importance weights have remained unknown. Using a simulation, the current paper estimates the bias and precision of the importance weights. The bias and precision are each found to vary as a function of the estimation algorithm, judgmental error level, evaluation strategy used, number of profiles and attributes in the evaluation task and the number of attribute levels. When the dominant attribute evaluation strategy is used, the estimates of the importance weights have large biases and poor precision.

INTRODUCTION

Conjoint analysis is used extensively to model consumer preference. It is also widely used in other areas of consumer behavior (Green and Srinivasan 1978). One of the principal applications of conjoint analysis is the assessment of attribute importance weights (Jaccard, Brinberg and Ackerman 1986; Klein and Bither 1987). Consumer behavior researchers treat the magnitude of the importance weight as a measure of the effect of an attribute on the consumer's preference structure. They generally assume that the importance weights are not biased. This implies that the estimated importance weights will converge to the true importance weights when measured repeatedly over different samples. Consumer researchers would prefer an estimator used to study any behavioral phenomenon to be unbiased.

A second desirable quality of estimated parameters is precision or low variance (Bickel and Doksum 1977). If the variance is low, the estimated importance weights lie in a narrow range around the mean. This increases the confidence in the results. Ideally in any study, the estimated importance weights should be unbiased and have a high precision. Some researchers suggest the mean squared error as an appropriate criterion for evaluating a good estimator where mean square error is a combination of the bias and the variance:

Mean squared error = bias^{2} + variance.

However, past research in the conjoint analysis area has not systematically evaluated importance weights on the basis of these two properties, which provides the motivation for this study.
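The bias-variance decomposition of mean squared error stated above can be checked numerically. A minimal sketch in Python (not part of the original study; the weight values and the biased estimator are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)

true_weight = 0.25                     # hypothetical true importance weight
# A deliberately biased, noisy estimator observed over 2000 replications
estimates = true_weight + 0.05 + rng.normal(0.0, 0.02, size=2000)

bias = estimates.mean() - true_weight
variance = estimates.var()             # population variance (ddof=0)
mse = np.mean((estimates - true_weight) ** 2)

# MSE = bias^2 + variance holds as an algebraic identity
assert abs(mse - (bias**2 + variance)) < 1e-12
```

Note that the identity is exact only when the variance is computed with the same averaging as the MSE (i.e., dividing by the number of replications, not n - 1).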

PAST RESEARCH

Past research evaluating the quality of conjoint analysis results has focused on measuring the reliability and the validity of the estimated importance weights. Carmone, Green and Jain (1978), Acito and Jain (1980), and Wittink and Cattin (1981) studied the robustness of conjoint analysis algorithms in recovering the observed rank order of the product profiles. McCullough and Best (1979) and Malhotra (1982) established the structural reliability and the stability of the parameters estimated using conjoint analysis.

Leigh, MacKay and Summers (1984) compared the predictive validity of self-explicated weights and conjoint analysis weights and found the former to be marginally superior. Segal (1982) and Parker and Srinivasan (1976) used the alternate forms method to establish reliability.

However, none of these studies address two of the most important issues relevant to the applicability of a psychometric measure in consumer behavior - the bias of the estimate and its precision (or standard deviation) as a function of the conjoint design. A simulation was used in this study to estimate the bias and precision by comparing observed estimates against true values. In empirical studies, where the consumer's true preference is never known, the bias cannot be calculated, making simulation the preferred method for this study.

The aim of this study is to compare the bias and the precision of the importance weights across conjoint design conditions. The conditions include the estimation algorithm, the evaluation strategy, the judgmental error level, the number of attribute levels, and the number of attributes and profiles. The expected effects of these conditions are discussed in the next section.

BACKGROUND

Several factors are expected to influence the bias and the precision of the estimated importance weights.

Algorithm. Three algorithms are used in this study: LINMAP (Srinivasan and Shocker 1973), MONANOVA (Kruskal 1965), and OLS. When used in conjoint analysis, OLS is sometimes referred to as rank regression because it uses ranks as the dependent variable (Carmone, Green and Jain 1978). Past studies have indicated that LINMAP produces slightly better results than MONANOVA and OLS (Green and Srinivasan 1978). Jain et al. (1979) found that the estimated importance weights of attributes were different across estimation algorithms used. Therefore, the estimation algorithm is expected to influence the bias and precision of the attribute importance weights.

Evaluation strategy. Two evaluation strategies are considered. The first is an equal weighted linear compensatory strategy, henceforth referred to as an equal weighted model, where a consumer places the same importance on each attribute. The second is a dominant attribute linear compensatory strategy, henceforth referred to as a dominant attribute model, where one of the attributes has an overwhelming influence on the preferences (Wittink and Cattin 1981). The evaluation strategy has been found to affect the recovery of the preference order of the profiles. In a simulation study by Wittink and Cattin (1981), preference predictions using OLS were superior for the equal weighted models, whereas predictions using LINMAP were superior for the dominant attribute models. In this study, in a dominant attribute model, 90 percent of a consumer's preference was determined by one attribute, while the remaining attributes together contributed 10 percent of the model's explanatory power.
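The two strategies can be written as weight vectors. A minimal sketch in Python (the function name and the choice of which attribute dominates are illustrative assumptions, not from the paper):

```python
import numpy as np

def importance_weights(n_attributes: int, dominant: bool) -> np.ndarray:
    """Return true importance weights for one simulated consumer.

    Equal weighted model: every attribute gets 1/n.
    Dominant attribute model: one attribute carries 90 percent of the
    preference; the rest share the remaining 10 percent equally,
    following the paper's description.
    """
    if not dominant:
        return np.full(n_attributes, 1.0 / n_attributes)
    weights = np.full(n_attributes, 0.10 / (n_attributes - 1))
    weights[0] = 0.90  # which attribute dominates is arbitrary here
    return weights

# Five-attribute dominant model: one weight of 0.9, four of 0.025 each
print(importance_weights(5, dominant=True))
```

With five attributes, the four minor weights of 0.025 match the worked example given later in the results section.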

Judgmental error. Judgmental error in conjoint analysis is the difference between consumers' "true" preferences and their self-reported preferences. Based on measurement theory, the precision of any judgment-based estimate is expected to decrease with increasing judgmental error (Nunnally 1978). Mean error levels used in this study were five percent and twenty-five percent of the variance of the true preferences.

Number of Levels. Researchers have found that increasing the number of levels within an attribute increases the estimated importance weight of the attribute (Creyer and Ross 1987; Currim, Weinberg and Wittink 1981; Wittink, Krishnamurthi and Nutter 1982). Three-level and two-level attributes were investigated in this study.

Number of attributes and profiles. Four different designs were used to simulate the data collection. The choice of designs was intended to represent product profile and attribute combinations commonly used in conjoint simulation studies (Wittink and Cattin 1981). A profile is a description of a product that is provided to the consumers in a conjoint analysis task. Two attribute conditions were used: five-attribute profiles with two of the attributes having three levels and three having two levels, and eight-attribute profiles with three of the attributes having three levels and the other five having two levels. Profiles in the evaluation set were defined using the orthogonal sets presented by Addelman (1962). Sets of sixteen and thirty-two profiles were described using five attributes. Similarly, two more sets of sixteen and thirty-two profiles were described using eight attributes.
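To get a feel for the profile space these designs sample from, the full factorials can be enumerated (a sketch only; the study itself drew orthogonal subsets from Addelman's 1962 main-effect plans rather than using full factorials):

```python
from itertools import product

# Five-attribute condition: two 3-level and three 2-level attributes
levels5 = [3, 3, 2, 2, 2]
full_factorial5 = list(product(*(range(n) for n in levels5)))
print(len(full_factorial5))  # 72 = 3*3*2*2*2 possible profiles

# Eight-attribute condition: three 3-level and five 2-level attributes
levels8 = [3, 3, 3, 2, 2, 2, 2, 2]
full_factorial8 = list(product(*(range(n) for n in levels8)))
print(len(full_factorial8))  # 864 = 3^3 * 2^5 possible profiles
```

The orthogonal sets of 16 or 32 profiles are thus small fractions of the full factorials, which is what makes the evaluation task feasible for respondents.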

METHOD

Simulation

A Monte Carlo simulation was conducted to test the influence of the design factors on the bias and precision of the estimated importance weights. As described in the background section, utility functions were specified for each of the conjoint designs at two levels of experimental error. Individuals indicating their preference for the hypothetical product profiles could be expected to make some intransitive judgments. Therefore, it was appropriate to add some error to the true assessment of preferences before calculating the rank order (Carmone, Green and Jain 1978). The following additive linear model was used for defining consumer preference.

(1) U_{j} = Σ_{i} a_{i}X_{ij}

where:

U_{j} is the utility of profile j,

a_{i} is the importance weight of attribute i,

X_{ij} is the level of the attribute i for profile j.

To this model the specified levels of error, e_{j}, were added, which results in the preference function given in Equation 2.

(2) U'_{j} = U_{j} + e_{j}

In the simulations, the error, e_{j}, was generated as a normal distribution N(0, S^{2}), with mean zero and standard deviation S, using the GGNML subroutine (IMSL 1982). The normality of the generated error corresponding to each set of stimuli was tested using the Shapiro-Wilk statistic (Shapiro and Wilk 1965). Less than five percent of the sets failed this test. The GGNML subroutine was used to regenerate these sets.

After the preference, U'_{j}, was calculated from Equation 2, the profiles were ranked. Two thousand sets of rankings were simulated for each treatment condition. For this study, mean error levels were set at five percent and twenty-five percent. Thus, an added error of 25% in Equation 2 implies that the ratio of variance (e_{j}) to variance (U_{j}) is 0.25.

Measures

Bias. Bias is the deviation of the estimated importance weight from the true weight. However, this measure cannot be averaged across all attributes within a model since the sum of the biases equals zero. Therefore, an alternative measure of bias was defined by taking the absolute value of the bias of each attribute. The absolute measure of bias is commonly used by researchers in marketing (Chapman and Staelin 1982; Lee and Sabavala 1987; Malhotra 1987 and Srinivasan and Weir 1988). For this study bias is expressed as a percentage, as shown in Equation 3.

(3) B = (Absolute(E - T) / T) * 100

where:

B is the percentage absolute bias of an importance weight,

E is the estimated importance weight, and

T is the true importance weight.

This statistic was averaged across the attributes to obtain a mean absolute bias for each treatment combination. The true importance weight is the value used to simulate the preference process. Division by the magnitude of the importance weight helps to compare the bias across the dominant attribute and equal weighted models.
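Equation 3 and the averaging across attributes might be computed as follows (a sketch; the function name is mine, and the worked numbers come from the dominant attribute example discussed in the results):

```python
import numpy as np

def mean_absolute_bias(estimated: np.ndarray, true: np.ndarray) -> float:
    """Equation 3 averaged across attributes:
    mean of Absolute(E - T) / T * 100 over the importance weights."""
    return float(np.mean(np.abs(estimated - true) / true) * 100)

# A minor attribute with a true weight of 0.025 estimated as 0.05
# carries a 100 percent absolute bias.
print(mean_absolute_bias(np.array([0.05]), np.array([0.025])))  # 100.0
```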

Precision. In order to avoid magnitude effects, precision was defined as the ratio of the standard deviation of the importance weight to the magnitude of the importance weight, expressed as a percentage. This statistic is sometimes referred to as the coefficient of variation (Snedecor and Cochran 1974, p. 62). The advantage of using this coefficient is that it adjusts the standard deviation for the magnitude or scaling of the variable of interest. It is computed as given in Equation 4.

(4) P = (S / E) * 100

where:

P is the precision of the estimate,

S is the standard deviation of the estimate, and

E is the estimated importance weight.
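Applied across simulated replications, Equation 4 might look like the sketch below. Taking E as the mean estimate over replications is my assumption; the text does not spell out how E is aggregated.

```python
import numpy as np

def precision_cv(estimates: np.ndarray) -> float:
    """Equation 4: coefficient of variation of one importance weight,
    (S / E) * 100, computed across simulated replications."""
    return float(estimates.std() / estimates.mean() * 100)

rng = np.random.default_rng(1)
w = rng.normal(0.20, 0.02, size=2000)  # 2000 replications of one weight estimate
cv = precision_cv(w)                    # near 10, since sd/mean = 0.02 / 0.20
```

Because the standard deviation is divided by the weight's magnitude, a small dominant-model weight (e.g., 0.025) with even modest absolute scatter yields a large coefficient, which is consistent with the poor precision reported for dominant attribute models.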

RESULTS

The bias for each attribute importance weight was calculated using Equation 3 and averaged across all the attributes. This mean absolute bias was then averaged over 2000 replications for each design condition. An analysis of variance was conducted to test the effect of the design factors on the mean absolute bias. A visual inspection of the data indicated that the bias was distributed as a truncated normal distribution. Since the ANOVA model is extremely robust with regard to the normality assumption, the use of ANOVA was justified (Tiku 1971). The results of this analysis are presented in Table 1. The most important result is that all three algorithms provide biased estimates of the importance weights (p < 0.05). These biases exist even after averaging over 2000 replications and therefore cannot be attributed to random effects. All the main effects, except for judgmental error, are statistically significant (p < 0.05). The evaluation strategy had the greatest impact in determining mean absolute bias. In order to avoid saturation, only those interactions that were expected, on an a priori basis, to have an effect on bias were tested in the ANOVA model. For instance, the interaction of algorithm and evaluation strategy was tested based on the results on predictive validity obtained by Wittink and Cattin (1981). This interaction is statistically significant (p < 0.05), as shown in Table 1. Similarly, the interaction of the number of attributes and the number of profiles had a significant impact on the mean absolute bias (p < 0.05). These results clearly indicate that the magnitude of the mean absolute bias is strongly influenced by the design factors. The extent of the mean absolute bias corresponding to the design factors can be observed by examining the means for each level of the design factors and their significant interactions. These results are presented in Table 2, separately for each algorithm.

MEAN PERCENT OF ABSOLUTE BIAS OF ESTIMATED IMPORTANCE WEIGHTS FOR LINMAP, MONANOVA, AND OLS

For all three algorithms, the mean absolute bias is substantially larger (17 to 23 times larger) for the dominant attribute model than for the equal weighted model. The large percentage bias for the dominant attribute model results primarily from the bias in the less important attributes. For instance, in the dominant attribute model with five attributes, the true importance weight of each of the four less important attributes is 0.025 (= (1 - 0.9)/4). If the estimated importance weight turns out to be 0.05, the bias is 100%. Such extreme biases were not observed for the equal weighted model. A noteworthy result is that the mean absolute bias is quite low for the equal weighted models for all three algorithms.

The mean absolute bias of the importance weights using eight attributes is worse than when using five attributes. The larger the number of attributes, the higher the chance for one or two attributes to have a particularly large bias. Further, the bias is lower when using 32 profiles than when using 16 profiles. In standard regression, where the estimates are unbiased, a large sample size leads to a decline in the standard error of estimation. Apparently, the effect of sample size on non-metric estimation and rank regression (OLS) is to reduce the bias. Differences in judgmental error do not result in changes in the biases. All the above results hold across the three estimation algorithms. Further, the mean absolute bias, when averaged across all treatments, is about equal for LINMAP and MONANOVA. However, the mean absolute bias for OLS is about twice that for the other two algorithms (86.7% vs. 44.5%).

The precision for each importance weight was calculated using Equation 4 and averaged across all the attributes. This mean precision was averaged across 2000 replications for each design condition. An ANOVA was used to test the effect of the design factors on the precision of the estimated importance weights. The results are presented in Table 3. The algorithm, the evaluation strategy used, and the interaction of the two had statistically significant effects on the precision of the estimated importance weights (p < 0.05). The interaction of the number of attributes and the number of profiles was only marginally significant (p < 0.10). Comparing Tables 1 and 3, the design factors appear to have a stronger influence on the mean absolute bias of the importance weights than on the precision of the estimated weights. These results are noteworthy since they differ considerably from the results expected from ordinary regression, where estimates are unbiased and the judgmental error is the principal determinant of precision.

For each algorithm, the mean levels of precision are presented in Table 4.

The precision of the estimated weights for the dominant attribute models is substantially worse (i.e., higher deviation) than for the equal weighted models. However, unlike the results for mean absolute bias in Table 2, the precision for the equal weighted model is about equal for LINMAP, MONANOVA, and OLS. The precision of the estimated weights in the dominant attribute model is somewhat better for OLS as compared to LINMAP. MONANOVA appears to be the worst in estimating the dominant attribute model. Across all the treatment conditions, the mean values of precision appear to be consistently worse for MONANOVA.

Using the criterion of minimizing both mean absolute bias and precision (Tables 2 and 4), LINMAP appears to be the best algorithm. As compared to LINMAP, OLS produces a much larger bias and MONANOVA provides worse precision. Therefore, as a general-use algorithm, LINMAP appears to be well suited.

In equal weighted models, two-level attributes have a lower mean absolute bias than three-level attributes (p < 0.05). Further, the estimated importance weights for the two-level attributes are downward biased and those for the three-level attributes are upward biased. This result conforms to prior observations by researchers that in most conjoint analysis studies, the attributes with more levels appear to be more important (cf. Wittink, Krishnamurthi and Nutter 1982).

CONCLUSIONS

The results show that the importance weight estimates obtained in any of the conjoint analysis algorithms, used in this study, are biased. In any study, the researcher must note that the estimated importance weights will have an associated bias and standard deviation. If the bias is large, increasing the number of replications or the sample size is unlikely to yield accurate estimates of importance weights.

MEAN PERCENT OF PRECISION OR ADJUSTED STANDARD DEVIATION OF ESTIMATED IMPORTANCE WEIGHTS

Both bias and precision are influenced by design factors; the former to a greater extent.

Several recommendations can be made using the results obtained in this study. Overall, in order to minimize the bias and increase the precision, LINMAP is the preferred algorithm. This recommendation conforms to past research suggesting superior performance of LINMAP based on predictive validity of preference rank order. However, if precision is of principal interest, OLS should be used. If minimizing the bias is of concern, larger profile sets should be designed. However, this number should not exceed the information overload threshold of the consumer. Researchers must be particularly concerned with the quality of the estimates when the dominant attribute evaluation strategy is used by the consumers. On a percentage basis, the more important attribute has a much lower bias than the less important attributes when such a strategy is used.

REFERENCES

Acito, Franklin and Arun K. Jain (1980), "Evaluation of Conjoint Analysis Results: A Comparison of Methods," Journal of Marketing Research, 17 (February), 106-12.

Addelman, Sidney (1962), "Orthogonal Main Effect Plans for Asymmetrical Factorial Experiments," Technometrics, 4 (February), 21-58.

Bickel, Peter J. and Kjell A. Doksum (1977), Mathematical Statistics: Basic Ideas and Selected Topics, San Francisco, CA: Holden-Day, Inc.

Carmone, Frank J., Paul E. Green, and Arun K. Jain (1978), "Robustness of Conjoint Analysis: Some Monte Carlo Results," Journal of Marketing Research, 15 (May), 300-3.

Chapman, Randall G. and Richard Staelin (1982), "Exploiting Rank Ordered Choice Set Data Within the Stochastic Utility Model," Journal of Marketing Research, 19 (August), 288-301.

Creyer, Elizabeth H. and William T. Ross (1987), "The Effects of Range-Frequency Manipulations on Conjoint Importance Weight Stability," in Advances in Consumer Research, Vol. 15, ed. Michael J. Houston, Provo, UT: Association for Consumer Research, 505-509.

Currim, Imran S., Charles B. Weinberg, and Dick R. Wittink (1981), "The Design of Subscription Programs for a Performing Arts Series: Issues in Applying Conjoint Analysis," Journal of Consumer Research, 8 (June), 67-75.

Green, Paul E. and V. Srinivasan (1978), "Conjoint Analysis in Consumer Research: Issues and Outlook," Journal of Consumer Research, 5 (September), 103-23.

IMSL (1982), IMSL Library Reference Manual, vol. 2, 9th ed., Houston, TX: International Mathematical and Statistical Libraries.

Jaccard, James, David Brinberg, and Lee J. Ackerman (1986), "Assessing Attribute Importance: A Comparison of Six Methods," Journal of Consumer Research, 12 (March), 463-68.

Jain, Arun K., Franklin Acito, Naresh K. Malhotra, and Vijay Mahajan (1979), "A Comparison of the Internal Validity of Alternative Parameter Estimation Methods in Decompositional Multiattribute Preference Models," Journal of Marketing Research, 16 (August), 313-22.

Klein, Noreen M. and Steward W. Bither (1987), "An Investigation of Utility-Directed Cut off Selection," Journal of Consumer Research, 14 (September), 240-55.

Kruskal, J. B. (1965), "Analysis of Factorial Experiments by Estimating Monotone Transformations of the Data," Journal of the Royal Statistical Society, Series B, 27, 251-63.

Lee, Jack C. and Darius J. Sabavala (1987), "Bayesian Estimation and Prediction for the Beta-Binomial Model," Journal of Business and Economic Studies, 5 (July), 357-367.

Leigh, Thomas W., David B. MacKay, and John O. Summers (1984), "Reliability and Validity of Conjoint Analysis and Self-Explicated Weights: A Comparison," Journal of Marketing Research, 21 (November), 456-62.

Malhotra, Naresh K. (1982), "Structural Reliability and Stability of Nonmetric Conjoint Analysis," Journal of Marketing Research, 19 (May), 199-207.

Malhotra, Naresh K. (1987), "Analyzing Marketing Research Data with Incomplete Information on the Dependent Variable," Journal of Marketing Research, 24 (February), 74-84.

McCullough, James and Roger Best (1979), "Conjoint Measurement Temporal Stability and Structural Reliability," Journal of Marketing Research, 16 (February), 26-31.

Nunnally, Jim C. (1978), Psychometric Theory, 2nd ed., New York: McGraw Hill Book Co.

Parker, Barnett R. and V. Srinivasan (1976), "A Consumer Preference Approach to the Planning of Rural Primary Health Care Facilities," Operations Research, 24, 991-1025.

Segal, Madhav N. (1982), "Reliability of Conjoint Analysis: Contrasting Data Collection Procedures," Journal of Marketing Research, 19 (February), 139-43.

Shapiro, S. S. and M. B. Wilk (1965), "An Analysis of Variance Test for Normality (Complete Samples)," Biometrika, 52, 591-611.

Snedecor, George W. and William G. Cochran (1974), Statistical Methods, 6th ed., Ames, IA: The Iowa State University Press.

Srinivasan, V. and Allan D. Shocker (1973), "Linear Programming Techniques for Multi-Dimensional Analysis of Preferences," Psychometrika, 38, 337-69.

Srinivasan, V. and Helen A. Weir (1988), "A Direct Aggregation Approach to Inferring Microparameters of the Koyck Advertising-Sales Relationship from Macro Data," Journal of Marketing Research, 25 (May), 145- 156.

Tiku, M. L. (1971), "Power Function of the F-test Under Non-Normal Situations," Journal of the American Statistical Association, 66, 913-916.

Wittink, Dick R. and Philippe Cattin (1981), "Alternative Estimation Methods for Conjoint Analysis: A Monte Carlo Study," Journal of Marketing Research, 18 (February), 101-6.

Wittink, Dick R., Lakshman Krishnamurthi, and Julia B. Nutter (1982), "Comparing Derived Importance Weights Across Attributes," Journal of Consumer Research, 8 (March), 471-74.
