Importance Weight Effects on Self-Explicated Preference Models: Some Empirical Findings

Paul E. Green, University of Pennsylvania
Catherine M. Schaffer, University of Denver


[The authors would like to acknowledge support of the Citibank Fellowship from the Sol C. Snider Entrepreneurial Center and the SEI Center for Advanced Studies in Management at the Wharton School.]

ABSTRACT

The self-explicated preference model (in which preference is assumed to be a simple additive function of attribute importances times attribute-level desirabilities) has received renewed attention in recent developments of hybrid conjoint and computer-based (adaptive) conjoint models. This paper explores the effect of varying attribute importance weights on the self-explicated model's predictive validity across conjoint-based full-profile evaluations.

Counter to conventional wisdom, we find that modifying importance weights, so that the less important attributes are reduced even further in relative importance, reduces predictive validity. That is, individuals' originally stated importances are the best predictors of their subsequent full-profile evaluations.


The venerable self-explicated preference model, in which preference is assumed to be a simple additive function of attribute importance weights times attribute-level desirabilities, has recently undergone a resurgence of interest. At the academic research level, new classes of models, such as hybrid conjoint (Green, Goldberg, and Montemayor 1981; Cattin, Hermet, and Pioche 1982; Akaah and Korgaonkar 1983), have utilized self-explicated data as a first stage in the development of a compositional/conjoint model.

At the industry application level, Sawtooth Software's Adaptive Conjoint Analysis (Johnson 1987) incorporates self-explicated data as a first step prior to "Bayesian" updating, based on individuals' evaluations of paired-comparison stimuli. M/A/R/C, a national marketing research firm, has developed CASEMAP (Srinivasan 1988), a telephone interviewing method that collects only self-explicated data for preference modeling (i.e., there is no conjoint data collection stage).

Self-explicated preference models frequently predict conjoint-based profile responses reasonably well (Leigh, MacKay, and Summers 1984; Srinivasan 1988; Green and Helsen 1989). Of course, one could describe sets of conditions (e.g., preference evaluations of "holistic" stimuli such as food/beverage formulations, package designs, and physical stimuli in general) where the collection of self-explicated data would not make much sense.

Still, there are applied contexts in which the self-explicated model may be appropriate. Since the inception of linear additive models in marketing (see Wilkie and Pessemier 1973), questions have been raised about whether importance weights are really necessary in self-explicated judgments and, if so, how they should be applied (Dawes and Corrigan 1974; McClelland 1978). Green and Krieger (1986) have examined this question in the context of marketing, as have Curry and Faulds (1986).

The question addressed in the present paper, however, appears to be almost as old as the self-explicated model itself. In the early 1960s, Shepard (1964), citing research on self-explicated models by Hoffman (1960) and Pollack (1962), stated (p. 266) that:

In both cases subjects were asked not only to make an overall evaluation of each stimulus but also to judge the extent to which each attribute of the stimuli was subjectively weighted, on the average, in making these evaluations. The degree of "insight" of a subject could then be assessed by comparing his announced subjective weights with the weights that were in reality controlling his overall evaluations (as determined by multiple regression procedures, for example). The results suggest that although the weights controlling the subjects' responses are usually concentrated on only one or two attributes, the subjective weights reported by the subjects tended to be more evenly distributed over the whole set of attributes. Indeed, there is some indication in Pollack's findings that the announced subjective weights tended to err in the opposite direction of ascribing too much importance to the less important variables.

Shepard's comments seem plausible and in accord with one's own intuitions. They have remained a part of the conventional wisdom to this day.

The purpose of the study reported here is to examine the issue of whether appropriate transformations of individuals' self-explicated importances can actually improve cross-validities in sets of holdout stimuli where respondents evaluate conjoint profiles holistically from a preference standpoint. We first briefly review the main characteristics of the self-explicated model. We then describe the empirical study and report cross-validation findings. We conclude the paper with a further discussion of the problem and areas for future research.


THE SELF-EXPLICATED PREFERENCE MODEL

The self-explicated preference model has been described by Green (1984). Following his notation, we let

i = (i1, i2, ... , ij, ... , iJ)

denote a multi-attribute profile in which the vector component ij denotes level ij (ij = 1, ..., Ij) of attribute j (j = 1, ..., J). Next, we let

uijk = respondent k's (k = 1, ..., K) self-explicated desirability (or acceptability) score for level ij of attribute j;

wjk = respondent k's self-explicated importance weight for attribute j, wjk > 0.

Then

Uik = Σ(j = 1, ..., J) wjk uijk

denotes respondent k's overall preference score, or utility U, for profile i, as a weighted sum of the desirability scores uijk.

The u's are usually obtained as rating scale values on (say) a 0 to 10 scale. Depending upon the number of attributes, the w's may be obtained from constant sum tasks or from rating scales (where the non-negative importance ratings are later normalized to sum to unity). However, Johnson (1987) obtains desirability scores as integer rank numbers across the levels of each attribute and importance weights as ratings on a 4-point scale.

Data collection techniques also differ on how the u's are normalized. Here we shall first assume that, within attribute, the original ratings are simply transformed by a multiplicative constant to vary between 0 and 1.0. In some cases, however, the researcher may translate and stretch the original scale so that, within attribute, the lowest desirability scale value is coded 0 and the highest is coded 1, with interpolated intermediate values. (We comment later on the advisability of this transformation.)
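To make the model concrete, here is a minimal Python sketch (ours, not the authors'; the function and array names are purely illustrative) that computes Uik under each of the two desirability scalings just described:

```python
import numpy as np

def scale_multiplicative(ratings):
    """Scale 0-10 acceptability ratings to [0, 1] by a multiplicative
    constant (here, dividing by 10); each attribute keeps its origin."""
    return np.asarray(ratings, dtype=float) / 10.0

def scale_stretched(ratings):
    """Within each attribute (row), translate and stretch the ratings so
    the least desirable level is coded 0 and the most desirable 1, with
    intermediate levels linearly interpolated."""
    r = np.asarray(ratings, dtype=float)
    lo = r.min(axis=1, keepdims=True)
    hi = r.max(axis=1, keepdims=True)
    return (r - lo) / (hi - lo)

def utility(profile, u, w):
    """Self-explicated utility Uik = sum over j of wjk * uijk.

    profile -- index of the chosen level for each attribute j
    u       -- J x L matrix of scaled desirabilities uijk
    w       -- length-J vector of importance weights wjk (summing to 1)
    """
    return float(sum(w[j] * u[j, lvl] for j, lvl in enumerate(profile)))
```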


THE EMPIRICAL STUDY

The stimuli of this experiment consisted of privately offered, unfurnished apartment descriptions, a rather popular area for academic research in conjoint analysis (Johnson and Meyer 1984; Green, Helsen, and Shandler 1988). Subjects for the experiment were business students, most of whom were already living in a student apartment or were considering renting one during the next school year. Complete data were obtained from 177 respondents; all data were collected during March-April 1989.

Table 1 shows the attributes and levels used in the study. Data collection entailed four phases. Phase I consisted of the self-explicated task. For each attribute level of Table 1 the respondent was asked to rate its acceptability on a 0-10, equal-interval rating scale, ranging from completely unacceptable to completely acceptable. Following this, each respondent was asked to allocate 100 points across the six attributes, so as to reflect their relative importance in selecting an apartment for rent.

In phase II, each respondent received (in randomized order) 18 full-profile cards developed from an orthogonal main effects plan (see Table 2). In each case the respondent was asked to indicate the likelihood (on a 0-100 scale) of renting an apartment of that description, assuming he or she were in need of an apartment within walking distance of the university.

In phase III each respondent received 16 profiles, based on an orthogonal design, utilizing levels 1 and 3 of the attributes shown in Table 1. The same 0-100 likelihood-of-renting scale was used again. The data of phases I-III were all collected in one sitting. (Respondents received class credit for their participation.)

Two weeks later, each respondent participated in phase IV. A different experimental design (one that was constructed to make the profiles easier to judge) was used to generate 16 apartment profiles, again drawn from levels 1 and 3 of Table 1. Respondents rated these 16 "easier" profiles on the same 0-100 likelihood-of-renting scale used earlier. Hence, for each of 177 respondents, data were available for constructing an individual-based self-explicated model and then validating that model on three separate holdout samples of 18, 16, and 16 profiles, respectively.

Analysis of the Individual Self-Explicated Models

All analyses were made at the individual subject level and then summarized. The principal experimental variable under study was the vector of self-explicated importances, i.e., the w's described earlier.

In keeping with the spirit of Shepard's comments, we operationalized the idea of giving differential weight to the most important attributes by means of a power transformation (with exponent B) of the original wjk's:

w'jk = (wjk)^B / Σj' (wj'k)^B

for B = 1, 2, 4, 8, and 16. For example, suppose a subject's original importance weights (from the constant-sum task) were 0.15, 0.03, 0.42, 0.10, 0.08, and 0.22 for walking time, noise level, safety, condition, size, and rent, respectively. If B = 2, the respective weights (after normalization) would be 0.085, 0.003, 0.667, 0.038, 0.024, and 0.183. As noted, the most important attribute (safety) receives differentially high importance after the power transformation.
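A few lines of Python (ours; `power_weights` is an illustrative name) reproduce the worked example, and also show that B = 0 collapses to the equal-weights condition discussed next:

```python
import numpy as np

def power_weights(w, B):
    """Raise each importance weight to the power B, then renormalize so
    the transformed weights again sum to one."""
    wB = np.asarray(w, dtype=float) ** B
    return wB / wB.sum()

w = [0.15, 0.03, 0.42, 0.10, 0.08, 0.22]
print(np.round(power_weights(w, 2), 3))  # [0.085 0.003 0.667 0.038 0.024 0.183]
print(np.round(power_weights(w, 0), 3))  # B = 0 gives equal weights of 1/6 each
```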

In addition, we set up an experimental condition where B = 0, thereby constructing a set of unit (equal) weights in accord with the research of Dawes and Corrigan (1974). In this case an effect opposite to the one in Shepard's remarks occurs: respondents' transformed weights are made more nearly equal (indeed, precisely equal) than their original, self-stated weights.

Response Measures

Two kinds of tests of the effect of importance weight modifications were run. First, the control condition (B = 1) and the five test conditions (B = 0, 2, 4, 8, 16), along with the acceptability ratings, were used to obtain self-explicated part worths. Each of the six self-explicated models was then used to predict responses to the phase II, III, and IV full-profile evaluations. As measures of cross-validity we employed the Pearson product-moment correlation between actual and predicted likelihoods of renting and the incidence of predicted first-choice hits.
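For concreteness, the correlation measure can be computed per respondent as in the sketch below (ours; the first-choice measure, with its tie adjustment, is sketched in a later subsection):

```python
import numpy as np

def pearson_r(actual, predicted):
    """Pearson product-moment correlation between a respondent's actual
    holdout ratings (0-100 likelihoods) and the model's predicted utilities."""
    a = np.asarray(actual, dtype=float)
    p = np.asarray(predicted, dtype=float)
    a = a - a.mean()
    p = p - p.mean()
    return float(a @ p / np.sqrt((a @ a) * (p @ p)))
```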


RESULTS

We first report results involving the cross-validation of each importance-weight power transformation with the three holdout samples obtained from phases II, III, and IV. However, by way of background, Figure 1 shows the average acceptability scores for the sample of 177 respondents (in their original, 0-10 rating scale units). As might be surmised, all attribute-level ratings, on average, maintained monotonicity.

We now turn to the cross-validation results (Table 3) in which, for each individual, the appropriate self-explicated model was used to predict the actual evaluations of the holdout profiles of phases II, III, and IV.

Product Moment Correlations

As Table 3 shows, the effect of the power transformation (for B = 2, 4, 8, and 16) is to reduce the self-explicated model's validity as B increases beyond B = 1. This pattern holds true across all three sets of holdout profile evaluations. We note that the use of equal weights (i.e., B = 0) also fails to improve on the original, self-stated attribute importances. (This result is less surprising since the gist of Dawes and Corrigan's remarks relates to errors in regression-estimated weights rather than self-stated weight transformations.) The more important point is that cross-validation performance is diminished as one departs on "either side" of the self-stated weights, via a specified monotonic function of the original weights.

With the exception of the B = 0 condition, we note that the correlations are always in the rank order of phase IV (best), followed by phase III, and finally by phase II. To a large extent this ordering is due to the fact that the stimuli in phase IV (which were not constructed by an orthogonal design) were easier to judge. In phase IV there were two highly desirable profiles (where five out of six attributes were all at their high levels) and two extremely poor profiles (only one of the six attributes was at its high desirability level). The remaining 12 profiles were all "intermediate," with exactly three high-level and three low-level attributes each.

The orthogonally-designed stimuli of phase III had only one extremely good profile (five out of six attributes at their high levels) and one extremely poor profile (one out of six at its high level). Of the 14 remaining stimuli, five profiles had two attributes at their high levels, six profiles had three attributes at their high levels, and three profiles had four attributes at their high levels.

Phase IV and phase III stimuli were expressly constructed to examine the differential impact that more easily evaluated profiles might have on various validation measures. (In this regard the phase II orthogonally-designed profiles were even more difficult to judge, given that three levels of each attribute varied independently.)

First-Choice Hits

Table 3 also shows the counterpart results for the first-choice hit measure, by phase. As noted, for the control case there is a 0.371 chance that the respondent's actual first choice among the 18 profiles of phase II is predicted by his/her self-explicated model. Because of the presence of ties in the actual ratings and/or the self-explicated predictions, all first-choice incidences have been adjusted to allow for multiple first-choice hits. For example, if in phase III the subject's highest rated profile is #14 and the highest self-explicated predictions show ties among profiles #1, #11, and #14, the first-hit incidence is recorded as 0.33 instead of 1.0. (This procedure adjusts for the fact that as B increases, there is a greater incidence of tied self-explicated predictions.)
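The tie adjustment can be expressed compactly; the sketch below (ours, purely illustrative) reproduces the 0.33 example from the text:

```python
import numpy as np

def first_choice_hit(actual, predicted):
    """Tie-adjusted first-choice hit: the share of the model's (possibly
    tied) top-predicted profiles that are also among the respondent's
    (possibly tied) top-rated profiles."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    top_pred = np.flatnonzero(predicted == predicted.max())
    top_actual = np.flatnonzero(actual == actual.max())
    return float(np.isin(top_pred, top_actual).mean())

# Example from the text: the actual top profile is #14, while the model
# ties profiles #1, #11, and #14 at the top, so the hit is 1/3, i.e., 0.33.
```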

We note from Table 3 that the incidence of first-choice hits also decreases as B departs on either side of its control value of 1.0. Similar to the correlations pattern, we note that the predictive accuracies of phase IV dominate their counterparts in phases II and III. With one exception (B = 0), phase III accuracies are higher than their phase II counterparts. In short, both validation measures (correlations and first-choice hits) result in similar conclusions.

Significance Tests

The three-way matrix (of order 177 X 6 X 3) of correlations was analyzed by repeated measures ANOVA. Both main effects (levels of B and the three sets of holdout profiles) were significant beyond the 0.01 level.
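Such an analysis could be run today with, for example, statsmodels' repeated-measures ANOVA; the sketch below is ours (not the authors' original procedure) and assumes a 177 x 6 x 3 NumPy array `corr` of per-respondent correlations:

```python
import pandas as pd
from statsmodels.stats.anova import AnovaRM

def rm_anova(corr):
    """Repeated-measures ANOVA on a respondents x B-conditions x phases
    array of cross-validation correlations (here 177 x 6 x 3)."""
    K, nB, nP = corr.shape
    rows = [(k, b, p, corr[k, b, p])
            for k in range(K) for b in range(nB) for p in range(nP)]
    df = pd.DataFrame(rows, columns=["resp", "B", "phase", "r"])
    return AnovaRM(df, depvar="r", subject="resp",
                   within=["B", "phase"]).fit()
```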

The incidence of first-choice hits (Table 3) was also analyzed. Significant results were also found for the effect of B and the effect of holdout sample (phases II, III, and IV).

Data Rescaling

The preceding analyses were then repeated with the original acceptability data rescaled so that, within subject, each attribute's acceptability scores were "stretched" to range from 0 (least acceptable) to 1.0 (most acceptable); the original intermediate scale values were linearly interpolated on the new scales. The resulting correlations and first-choice hits were uniformly lower than their counterpart values in Table 3, which were based on the original acceptability scaling. Given this poor performance, no additional analyses were carried out with the alternative scaling. For this data set, at least, the alternative scaling (stretching each acceptability range so that it is anchored at zero and one) resulted in significant information loss.


DISCUSSION

To the best of our knowledge, this is the first study to examine the empirical consequences of Shepard's (1964) comments. We have found that transformations which make the attribute importances more disparate (e.g., higher values of B) lead to poorer cross-validations. Assigning equal importance weights also leads to lower cross-validations.

Moreover, although not shown here, part-worth model misspecification also shows effects at the market share level, particularly if choice rules approximating the maximum utility rule are applied. Finally, transformations that alter the original acceptability ratings so that all acceptabilities exhibit the same range (i.e., 0 to 1) significantly lower the cross-validations, at least for this data set.

Clearly, the findings of this study are limited. First, we have considered only six attributes. It is possible that Shepard's comments could be appropriate for larger numbers of attributes and levels within attributes. More complex attribute combinations, including non-monotonic functions (e.g., ideal points), could also prompt the greater use of simplifying strategies that focus on a reduced number of attributes and attribute-level variations.

What about other transformations of the original importances that also have the effect of making the less important attributes even less influential? To this end, we examined another transformation rule, referred to as a "zeroing-out" rule. Under this procedure one takes the attribute that is least important (from the self-explication task) and sets its importance weight to zero. The remaining attribute importances are renormalized to sum to unity. The procedure is repeated for the two least important attributes, and so on.

This rule is somewhat gentler than the B procedure described earlier, inasmuch as the B rule effectively lowers all attribute importances except that of the attribute carrying the highest importance. The zeroing-out rule was followed for all possible cases, leaving out 1, 2, 3, 4, or 5 of the least important attributes. We then ran validation correlations similar to those shown earlier in Table 3.
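A sketch of the zeroing-out rule (ours; `zero_out` is an illustrative name):

```python
import numpy as np

def zero_out(w, m):
    """Set the weights of the m least important attributes to zero and
    renormalize the remaining weights to sum to one."""
    w = np.asarray(w, dtype=float).copy()
    w[np.argsort(w)[:m]] = 0.0
    return w / w.sum()

# With the earlier example weights, zero_out([0.15, 0.03, 0.42, 0.10,
# 0.08, 0.22], 2) drops noise level (0.03) and size (0.08) and
# renormalizes the remaining four weights.
```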

Table 4 shows the results of the zeroing-out rule. Again we note the same pattern of decrease in predictive accuracy as predictions are made on fewer and fewer attributes. We also note the same pattern in which phase IV correlations exceed those of phase III which, in turn, are higher than those of phase II. In sum, the "gentler," zeroing-out rule produced similar results, i.e., lower validities compared to the control case in which all attribute importances were retained.

While we have examined (in Table 3) cross-validation behavior for both correlations and the incidence of first-choice hits, we have a predilection for the former measure. Correlations consider the whole set of predictions, not just the top choice (under a maximum utility rule). Correlations are also more sensitive to the types of transformations considered here. With the increasing application of logit-type models (Johnson 1987) and other varieties of share-of-utility rules, we feel that sole reliance on first-choice prediction could present, for some data sets and contexts, an incomplete description of cross-validation performance.


REFERENCES

Akaah, Ishmael P. and Pradeep K. Korgaonkar (1983), "An Empirical Comparison of Predictive Validity for the Self-Explicated, Huber-Hybrid, Traditional Conjoint and Hybrid Conjoint Models," Journal of Marketing Research, 20 (May), 187-97.

Cattin, Philippe, Gerard Hermet, and Alan Pioche (1982), "Alternative Hybrid Models for Conjoint Analysis: Some Empirical Results," in Raj Srivastava and Allan D. Shocker (eds.), Analytical Approaches to Product and Market Planning: The Second Conference. Cambridge, MA: Marketing Science Institute (October), 142-52.

Curry, David J. and David J. Faulds (1986), "Indexing Product Quality: Issues, Theory and Results," Journal of Consumer Research, 13 (June), 134-45.

Dawes, Robyn M. and Bernard Corrigan (1974), "Linear Models in Decision Making," Psychological Bulletin, 81, 95-106.

Green, Paul E. (1984), "Hybrid Models for Conjoint Analysis: An Expository Review," Journal of Marketing Research, 21 (May), 155-9.

Green, Paul E., Stephen M. Goldberg, and Mila Montemayor (1981), "A Hybrid Utility Estimation Model for Conjoint Analysis," Journal of Marketing, 45 (Winter), 33-41.

Green, Paul E. and Kristiaan Helsen (1989), "Cross-Validation Assessment of Alternatives to Individual-Level Conjoint Analysis," Journal of Marketing Research, 26 (August), 346-350.

Green, Paul E., Kristiaan Helsen, and Bruce Shandler (1988), "Conjoint Internal Validity Under Alternative Profile Presentations," Journal of Consumer Research, 15 (December), 392-7.

Green, Paul E. and Abba M. Krieger (1986), "The Minimal Rank Correlation, Subject to Order Restrictions, with Application to the Weighted Linear Model," Journal of Classification, 3, 67-96.

Hoffman, Paul J. (1960), "The Paramorphic Representation of Clinical Judgment," Psychological Bulletin, 57, 116-31.

Johnson, Eric J. and Robert J. Meyer (1984), "Compensatory Choice Models of Noncompensatory Processes: The Effect of Varying Context," Journal of Consumer Research, 11 (June), 528-41.

Johnson, Richard M. (1987), "Adaptive Conjoint Analysis," Sawtooth Software Conference on Perceptual Mapping, Conjoint Analysis, and Computer Interviewing. Ketchum, ID: Sawtooth Software.

Leigh, Thomas W., David B. MacKay, and John O. Summers (1984), "Reliability and Validity of Conjoint Analysis and Self-Explicated Weights: A Comparison," Journal of Marketing Research, 21 (November), 456-62.

McClelland, Gary H. (1978), "Equal Versus Differential Weighting for Multiattribute Decisions: There Are No Free Lunches," Center Report No. 207, Boulder, CO: Institute of Behavioral Science, University of Colorado.

Pollack, I. (1962), "Action Selection and the Yntema-Torgerson 'Worth' Function," paper presented at the 1962 Meeting of the Eastern Psychological Association, April.

Shepard, Roger N. (1964), "On Subjectively Optimum Selections Among Multiattribute Alternatives," in M. W. Shelley, II and G. L. Bryan (eds.), Human Judgments and Optimality. New York: John Wiley, 257-81.

Srinivasan, V. (1988), "A Conjunctive-Compensatory Approach to the Self-Explication of Multi-attributed Preferences," Decision Sciences, 19 (Spring), 295-305.

Wilkie, William L. and Edgar A. Pessemier (1973), "Issues in Marketing's Use of Multiattribute Attitude Models," Journal of Marketing Research, 10 (November), 428-41.