A Cross-Validation Test of Hybrid Conjoint Models

Paul E. Green, University of Pennsylvania
Stephen M. Goldberg, University of Texas at Austin
James B. Wiley, Temple University
ABSTRACT - Hybrid conjoint models (Green, Goldberg, and Montemayor 1981; Green and Goldberg 1981) are conjoint data collection and analysis procedures that retain individual differences while significantly reducing respondent evaluation time. An experiment is reported that compares the predictive performance of different types of hybrid models with self-explicated and full-profile conjoint procedures. The results indicate that hybrid models strongly outperform the self-explicated model and also compare favorably with the predictive validity of full-profile conjoint models.
Paul E. Green, Stephen M. Goldberg, and James B. Wiley (1983), "A Cross-Validation Test of Hybrid Conjoint Models," in NA - Advances in Consumer Research Volume 10, eds. Richard P. Bagozzi and Alice M. Tybout, Ann Arbor, MI: Association for Consumer Research, Pages: 147-150.





Judged by the widespread reception that conjoint analysis (Green and Rao 1971; Johnson 1974) has received from applied researchers, it seems fair to say that the methodology has now established itself as a major research tool for marketing planning and analysis (Cattin and Wittink 1981). However, if the methodology is to continue to fulfill its potential, a number of practical problems still need to be addressed.

One of these problems concerns the time and effort that respondents must expend in carrying out the conjoint tasks in either the full-profile or two-at-a-time (tradeoff matrix) procedure (Green and Srinivasan 1978). In response to this problem, hybrid models (Green, Goldberg, and Montemayor 1981; Green and Goldberg 1981) have been proposed as a way to simplify the data collection task while retaining individual differences in utility functions.

Hybrid conjoint models adapt an old idea--self-explicated utility assessment (Wilkie and Pessemier 1973)--to conjoint analysis. While a number of hybrid models have been proposed, each procedure entails some type of self-explicated utility stage, where respondents evaluate the levels of each attribute (one attribute at a time) on some type of desirability scale. This is followed by an evaluation of the attributes themselves on an importance scale. In the second stage of the hybrid procedure, the respondent evaluates a small set (usually four to nine) of full profiles on some type of intentions-to-buy or overall preference scale.

Hybrid models combine these two types of information to estimate a utility function that contains some parameters at the individual level and some at the subgroup level. While hybrid models have appeared to work well in several industrial applications, no comparative test of their predictive ability has been reported as yet. The purpose of the present paper is to report the results of an experiment in which the predictive performance of several different varieties of the hybrid model is compared to the self-explicated model and the traditional full-profile conjoint model.


An opportunity to implement the experiment occurred in conjunction with a commercial conjoint study of consumer preferences for a new household appliance, whose attributes consisted of: [To respect the sponsor's wishes, all attribute-level descriptions have been suppressed.]


The basic (full) factorial design consisted of a $4^5 \times 2^2$ design of 4,096 combinations. However, a partial factorial design was constructed, with only 32 combinations, in which the main effects of all seven attributes and a set of selected two-way interactions were estimable.
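As a quick check on the design arithmetic, the sketch below enumerates the full factorial and counts the main-effect degrees of freedom. It reads the design as five four-level attributes and two two-level attributes (which reproduces the 4,096 combinations quoted above); the variable names are illustrative only.

```python
from itertools import product

# Assumed layout: five four-level attributes and two two-level attributes.
levels_per_attribute = [4, 4, 4, 4, 4, 2, 2]

# Enumerate every profile in the full factorial design.
full_factorial = list(product(*(range(n) for n in levels_per_attribute)))
n_profiles = len(full_factorial)

# An attribute with I levels uses I - 1 degrees of freedom for its main effect.
main_effect_df = sum(n - 1 for n in levels_per_attribute)

print(n_profiles)      # 4096
print(main_effect_df)  # 17 (18 parameters once an intercept is added)
```

With 32 profiles in the partial design, 18 main-effect parameters (including the intercept) leave 14 degrees of freedom for selected interactions and error.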

The sample consisted of 476 respondents (drawn from three major cities) who were personally interviewed in central shopping mall facilities. Respondents were recruited on the basis of their interest in purchasing the new appliance over the next 12 months.

The data were collected in two stages, suitable for application of a hybrid model. In stage one each set of attribute levels was rated--one attribute at a time--on a four-point, equal-interval scale, with "4" assigned to the most desired level and "1" to the least desired level. Each respondent was then asked to allocate 100 points across the seven attributes, so as to reflect their relative importance. The self-explicated utility U is defined as:

$$U_{i_1 i_2 \ldots i_J} = \sum_{j=1}^{J} w_j u_{i_j} \qquad (1)$$

where $U_{i_1 i_2 \ldots i_J}$ denotes the self-explicated utility of some stimulus profile defined by level $i_j$ ($i_j = 1, \ldots, I_j$) of attribute $j$ ($j = 1, \ldots, J$); $w_j$ denotes the self-explicated importance weight for attribute $j$; and $u_{i_j}$ is the desirability rating of level $i_j$ of attribute $j$. All self-explicated utilities (the $w_j u_{i_j}$'s) are then normalized within respondent to total 1.0; this constitutes stage one of the hybrid model.
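The stage-one computation can be sketched as follows. This is a minimal illustration rather than the authors' program: the function names, the three-attribute respondent, and the numbers are all hypothetical, and levels are indexed from 0 for convenience.

```python
def self_explicated_parts(importances, desirabilities):
    """Return the normalized w_j * u_ij terms for one respondent.

    importances:    list of J importance points (the 100-point allocation)
    desirabilities: list of J lists; desirabilities[j][i] is the 1-4 rating
                    of level i of attribute j
    """
    raw = [[w * u for u in levels] for w, levels in zip(importances, desirabilities)]
    total = sum(sum(row) for row in raw)
    # Normalize within respondent so that all part utilities sum to 1.0.
    return [[x / total for x in row] for row in raw]


def profile_utility(parts, profile):
    """Equation (1): sum the part utilities of the levels that make up a profile."""
    return sum(parts[j][level] for j, level in enumerate(profile))


# Hypothetical respondent with three two-level attributes.
parts = self_explicated_parts([50, 30, 20], [[4, 1], [1, 4], [2, 3]])
u_best = profile_utility(parts, (0, 1, 1))   # most desired level of every attribute
u_worst = profile_utility(parts, (1, 0, 0))  # least desired level of every attribute
print(round(u_best, 2), round(u_worst, 2))   # 0.76 0.24
```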

In stage two, each respondent was presented with a set of 32 cards, with each card containing a profile description of the appliance, drawn from the partial factorial design. The respondent was asked to sort the cards along a seven-point scale board, ranging from "like least" (assigned the integer "1") to "like most" (assigned the integer "7"). This task provided the basic input data for parameterizing either the traditional conjoint model (based on the full set of 32 responses) or stage two of the hybrid model (based on only a subset of eight of the 32 responses). [The subset of eight responses was drawn in such a way that each level of each attribute was balanced within respondent. Across all respondents, each of these subsets of eight profiles appeared an equal number of times.]

In the case of the traditional conjoint procedure, utilizing all 32 evaluations, the model being fitted at the individual level is:

$$V_{i_1 i_2 \ldots i_J} \cong \sum_{j=1}^{J} v_{i_j} + \sum_{j<j'} t_{i_j i_{j'}} \qquad (2)$$

where $V_{i_1 i_2 \ldots i_J}$ denotes the response to the sort board rating task for some profile described by level $i_j$ ($i_j = 1, \ldots, I_j$) on attribute $j$ ($j = 1, \ldots, J$); $v_{i_j}$ is the main effect utility of level $i_j$ of attribute $j$; $t_{i_j i_{j'}}$ is one of the two-way interaction utilities (for the subset of two-level attributes whose interactions were estimable); and $\cong$ denotes least squares approximation. The v's and t's are found from dummy-variable regression.
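To make the dummy-variable coding concrete, the sketch below builds the main-effects row of the regression design matrix for a single profile. The function name, the choice of the last level as reference category, and the example profile are assumptions made for illustration; the attribute layout (five four-level, two two-level) follows the design described earlier.

```python
def dummy_row(profile, levels_per_attribute):
    """Code one profile as an intercept plus I_j - 1 indicators per attribute.

    The last level of each attribute serves as the (arbitrary) reference level.
    """
    row = [1.0]  # intercept
    for level, n_levels in zip(profile, levels_per_attribute):
        row.extend(1.0 if level == k else 0.0 for k in range(n_levels - 1))
    return row


levels_per_attribute = [4, 4, 4, 4, 4, 2, 2]
row = dummy_row((0, 3, 1, 2, 0, 1, 0), levels_per_attribute)
print(len(row))  # 18
```

Stacking one such row per rated profile gives a 32 x 18 matrix for the main-effects-only model; any interaction contrasts would be appended as extra columns.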

In the case of stage two in the hybrid procedure, the corresponding model--herein called the single b-weight model--is:

$$V_{i_1 i_2 \ldots i_J} \cong a + b\,U_{i_1 i_2 \ldots i_J} + \sum_{j=1}^{J} v_{i_j} + \sum_{j<j'} t_{i_j i_{j'}} \qquad (3)$$

where the intercept term is denoted by a; b denotes a regression weight (assumed to be common across the subgroup); and the v's and t's are also estimated, via regression, at the subgroup level.

Equation (3) can be estimated at either the total sample level or at the subgroup level. In the latter case the sample is first clustered on the basis of similarities in self-explicated utilities from stage one. [There are $\sum_{j=1}^{J} I_j = 24$ self-explicated utilities (normalized within respondent) for each respondent.] Then equation (3) is estimated separately for each cluster after a model comparison test is run to confirm that the subgroups differ with regard to regression weights and/or intercepts in stage two of the hybrid procedure.
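A minimal sketch of subgroup-level estimation is given below for the simplest case of equation (3), in which only the intercept a and the weight b are fitted (no v's or t's). The function names, the cluster labels, and the toy data are hypothetical; observations are pooled across the respondents and profiles within each cluster.

```python
from collections import defaultdict


def fit_single_b(u_values, v_values):
    """Least-squares fit of V ~ a + b*U over pooled observations."""
    n = len(u_values)
    mean_u = sum(u_values) / n
    mean_v = sum(v_values) / n
    sxy = sum((u - mean_u) * (v - mean_v) for u, v in zip(u_values, v_values))
    sxx = sum((u - mean_u) ** 2 for u in u_values)
    b = sxy / sxx
    a = mean_v - b * mean_u
    return a, b


def fit_by_subgroup(observations):
    """observations: iterable of (cluster_label, U, V) triples."""
    grouped = defaultdict(lambda: ([], []))
    for label, u, v in observations:
        grouped[label][0].append(u)
        grouped[label][1].append(v)
    return {label: fit_single_b(us, vs) for label, (us, vs) in grouped.items()}


# Toy data: cluster "A" follows V = 1 + 10U, cluster "B" follows V = 2 + 20U.
observations = [
    ("A", 0.1, 2.0), ("A", 0.2, 3.0), ("A", 0.3, 4.0),
    ("B", 0.1, 4.0), ("B", 0.2, 6.0), ("B", 0.3, 8.0),
]
fits = fit_by_subgroup(observations)
```

Estimating at the total sample level amounts to giving every observation the same label.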

The last phase of evaluations entailed the respondent's preference ranking of four holdout profiles. In order to make the prediction task fairly difficult, the four profiles were chosen, on the basis of prior judgment, to be close to one another in terms of total utility. [While the usual background data on current brand perceptions, psychographics, and demographics were also collected, our discussion is confined to the utility data only.]


As noted above, input data are available for testing, at the individual level, the self-explicated utility model, the traditional conjoint model, and various types of hybrid models. In all cases, the test of each model is based on predicting the respondent's preference ranking of the holdout sample of four items. Kendall's rank correlation (tau) was used as a summary measure of predictive performance.
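For reference, Kendall's tau on a four-item ranking can be computed directly from concordant and discordant pairs. The sketch below assumes no tied ranks; the function name and the example rankings are illustrative.

```python
from itertools import combinations


def kendall_tau(rank_a, rank_b):
    """Kendall's tau for two rankings of the same items (no ties assumed)."""
    concordant = discordant = 0
    for i, j in combinations(range(len(rank_a)), 2):
        sign = (rank_a[i] - rank_a[j]) * (rank_b[i] - rank_b[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(rank_a) * (len(rank_a) - 1) // 2
    return (concordant - discordant) / n_pairs


actual = [1, 2, 3, 4]
predicted = [1, 3, 2, 4]  # one adjacent pair swapped
tau = kendall_tau(actual, predicted)
print(round(tau, 3))  # 0.667
```

With four holdout items there are only six pairs, so an individual respondent's tau can take only the values -1, -2/3, -1/3, 0, 1/3, 2/3, and 1.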

The Self-explicated Model

The self-explicated utility model, defined by equation (1), was the simplest model tested. [This model is sometimes described as the adequacy-importance model (Myers and Shocker 1981).] In this case, one simply finds the U value for each of the four holdout items and ranks them. This ranking is then compared, for each individual separately, to the respondent's actual ranking of the holdout sample.
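Turning utilities into a predicted ranking is a one-step operation, sketched below. The function name and the four utility values are hypothetical, and ties (not discussed in the paper) are broken here by item order.

```python
def ranks_from_utilities(utilities):
    """Rank 1 = highest utility; ties are broken by original item order."""
    order = sorted(range(len(utilities)), key=lambda i: -utilities[i])
    ranks = [0] * len(utilities)
    for position, item in enumerate(order, start=1):
        ranks[item] = position
    return ranks


holdout_utilities = [0.52, 0.61, 0.48, 0.55]  # hypothetical U values
predicted_ranks = ranks_from_utilities(holdout_utilities)
print(predicted_ranks)  # [3, 1, 4, 2]
```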

The Full Profile Conjoint Model

The full profile conjoint model is defined by equation (2). The full set of 32 rating responses was used for this model. Two separate models were fitted--a main-effects-only model and a main effects plus two-way interactions model. [In the conjoint model involving both main effects and two-way interactions, the candidate two-way interactions were entered on a stepwise basis after the main effects were forced into the regression equation. In most cases, the interaction effects entailed (in total) 4 to 6 additional single-degree-of-freedom contrasts. Thus, given the 18 degrees of freedom needed for main effects, a total of approximately 23 parameters were fitted, leaving 9 degrees of freedom for error.] Again, each model was estimated individually and predictions of the ranking of the holdout sample were made and compared to the actual ranking.

The Hybrid Models

There are a variety of hybrid models that can be fitted (Green and Goldberg 1981), depending upon scaling assumptions, and the desired number of terms in the composite function. In the present example, five different models were fitted--first at the total sample level, and then at the subgroup level (following a cluster analysis of the self-explicated utilities).

The five models were first distinguished by whether the general model was of the form shown in equation (3)--i.e., the single b-weight model--or the following alternative:

$$V_{i_1 i_2 \ldots i_J} \cong a + \sum_{j=1}^{J} b_j w_j u_{i_j} + \sum_{j=1}^{J} v_{i_j} + \sum_{j<j'} t_{i_j i_{j'}} \qquad (4)$$

where, as can be noted, a separate $b_j$ regression weight is estimated for each attribute's self-explicated utility. This extended model will be called the multiple b-weight model.
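The stage-one-only version of the multiple b-weight model is an ordinary multiple regression of the ratings on the attribute-level self-explicated utilities. The sketch below solves the normal equations in plain Python; the helper names and the two-attribute toy data are assumptions made for illustration, and a production analysis would more likely rely on a numerical library.

```python
def solve_linear(a, y):
    """Solve a @ x = y by Gauss-Jordan elimination with partial pivoting."""
    n = len(y)
    m = [row[:] + [rhs] for row, rhs in zip(a, y)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * p for x, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_multiple_b(part_utilities, responses):
    """Least-squares fit of V ~ a + sum_j b_j * (w_j u_ij).

    part_utilities: one row per observation, one column per attribute
    responses:      sort-board ratings V for the same observations
    """
    x = [[1.0] + list(row) for row in part_utilities]
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(x, responses)) for i in range(k)]
    a, *b = solve_linear(xtx, xty)
    return a, b


# Toy data generated from V = 1 + 10 * part_1 + 5 * part_2 (no noise).
parts = [[0.1, 0.3], [0.2, 0.1], [0.3, 0.2], [0.4, 0.4], [0.25, 0.15]]
ratings = [3.5, 3.5, 5.0, 7.0, 4.25]
a_hat, b_hat = fit_multiple_b(parts, ratings)
```

Unequal b weights are what allow the model to adjust the stated importances differentially, which a single common b cannot do.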

The models were then distinguished on the basis of the terms included in equations (3) or (4):

Stage one only--no v's or t's are estimated. [Note that in the case of the single b-weight model, this case yields the same ranking predictions as the self-explicated utility model, since one is finding a simple linear transformation of the self-explicated utilities; see equation (3). Hence, the two cases involving this version of the hybrid model were omitted in the analysis.]

Stage one plus main effects only--no t's are estimated.

Full hybrid--all (significant) effects in either equation (3) or equation (4) are estimated.

Again, all predictions were made at the individual level and compared with the holdout sample. A total of 10 hybrid models--five each for the total sample and the subgroup level--were fitted.

In the set of five subgroup models a K-means cluster analysis revealed three clusters of respondents whose self-explicated utilities were similar within cluster and different across clusters. Hence, hybrid models were fitted separately for each cluster. [The cluster sizes were 224, 170, and 82 respondents.]
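The paper does not say which K-means variant or starting configuration was used, so the following is only a generic Lloyd-style sketch. The function name, the caller-supplied starting centroids, and the two-dimensional toy points (standing in for the full vectors of self-explicated utilities) are all assumptions.

```python
def k_means(points, centers, n_iter=20):
    """Plain Lloyd-style K-means; `centers` are the caller's starting centroids."""
    centers = [list(c) for c in centers]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(len(centers)),
                key=lambda c: sum((x - m) ** 2 for x, m in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Four hypothetical respondents described by two utilities each.
points = [[0.40, 0.10], [0.38, 0.12], [0.10, 0.42], [0.12, 0.40]]
labels, centers = k_means(points, centers=[points[0], points[2]])
print(labels)  # [0, 0, 1, 1]
```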


We first discuss the test results descriptively, followed by some brief comments on their statistical significance.

Table 1 shows the descriptive results--the average tau correlation, computed, in each case, over the total sample of 476 respondents.

Self-explicated Model

The tau correlation of the self-explicated model is only 0.342. This is the lowest correlation in the experiment, suggesting the need to modify the self-explicated importances (as is done in the stage-one-only, multiple b-weight models) and/or to include additional terms from the stage two conjoint procedure. As noted, the stage-one-only, multiple b-weight models (with no additional terms) do an excellent job; they raise the tau measure from 0.342 to a level between 0.745 and 0.761.

Traditional Conjoint Models

In contrast to the self-explicated model, the traditional conjoint models perform quite well. Interestingly enough, the main-effects-only model (tau = 0.650) outperforms the case of main effects plus interactions (tau = 0.614). However, such cross validation findings are not all that unusual, given that all of the parameters are sample based and subject to estimation error.

As has been noted by Dawes and Corrigan (1974) and Schmidt (1971) in the related case of unit weights versus regression-based weights, simple models (in the presence of estimation error) can sometimes cross validate higher than more complex models.

Hybrid Models

As noted from Table 1, the hybrid models perform quite well in this specific example. Descriptively, they do appreciably better than the self-explicated model and somewhat better than the traditional conjoint models.



The multiple b-weight models (both the main-effects-only and the full model) slightly outperform their single b-weight counterparts. Models based on subgroup data slightly outperform their counterparts based on total-sample data. Again, we note that the main-effects-only hybrid performs slightly better than the full-model hybrid.

These descriptive results are not too surprising. First, stage one of the hybrid procedure estimates the within-attribute utilities directly--probably with less error than would be found from traditional conjoint procedures. Second, use of the multiple b-weight model allows for differential adjustment of the self-explicated importances. Third, if there is heterogeneity in the self-explicated utilities, the subgroup-based models should provide better predictions than the total-sample-based models.

Statistical Significance

The tau measures (transformed to Fisher's Z values) were next tested for significance (using an alpha of 0.05) via a one-way ANOVA. Both the hybrid and traditional conjoint models significantly outperformed the self-explicated model. However, the descriptive findings regarding: (a) main-effects-only versus main effects with interactions (within class of model); (b) multiple b-weight versus single b-weight models; and (c) subgroup estimation versus total sample estimation did not lead to significantly different results at the 0.05 alpha level.
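The two ingredients of this test, Fisher's Z transformation and a one-way ANOVA F statistic, can be sketched as follows. Several details are assumptions, since the paper does not report them: taus of exactly +1 or -1 are clipped (their Z value would otherwise be infinite), the groups are treated as independent, and the function names and toy groups are illustrative.

```python
import math


def fisher_z(r, eps=1e-6):
    """Fisher's Z = atanh(r); r is clipped because atanh(+/-1) is infinite."""
    r = max(-1 + eps, min(1 - eps, r))
    return math.atanh(r)


def one_way_f(groups):
    """F statistic for a one-way ANOVA over lists of observations."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Toy check with three small groups (not data from the study).
f_stat = one_way_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(round(f_stat, 1))  # 13.0
```

In use, each group would hold the Z-transformed taus for one model, and the F statistic would be referred to an F distribution with the corresponding degrees of freedom.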

From a practical standpoint, the differences related to main-effects-only versus full; multiple b-weight versus single b-weight; and subgroup versus total sample estimation are not large. Usually the differences amount to only one or two percentage points, on average.


In this particular data set, the hybrid model outperformed both the self-explicated and traditional conjoint models in terms of predicting the rank order of a small holdout sample. Obviously, results based on a single data set are to be taken with a large grain of salt. While we are not surprised that both the traditional conjoint and the hybrid models outperformed the self-explicated model, we are reluctant to claim anything more than a good beginning for hybrid models versus the traditional conjoint model.

Within the set of the ten hybrid models tested here, the multiple b-weight model versus single b-weight model and the subgroup versus total sample estimation produced descriptive results in the anticipated direction--but the differences failed to reach statistical significance. Moreover, the jury is still out on the relative merits of including main effects only versus main effects with selected two-way interactions.

Clearly, additional cross validation tests on other data sets are needed before any definitive statements can be made. About all that can be said is that the comparative validity of hybrid models looks encouraging at this point.


Cattin, Philippe and Wittink, Dick R. (1981), "Commercial Use of Conjoint Analysis: A Survey," research paper, Graduate School of Business, Stanford University.

Dawes, Robyn M. and Corrigan, Bernard (1974), "Linear Models in Decision Making," Psychological Bulletin, 81, 95-106.

Green, Paul E. and Goldberg, Stephen M. (1981), "A Non-metric Version of the Hybrid Conjoint Analysis Model," paper presented at the Third ORSA/TIMS Market Measurement Conference, New York University (March).

Green, Paul E., Goldberg, Stephen M., and Montemayor, Mila (1981), "A Hybrid Utility Estimation Model for Conjoint Analysis," Journal of Marketing, 45 (Winter), 33-41.

Green, Paul E. and Rao, Vithala R. (1971), "Conjoint Measurement for Quantifying Judgmental Data," Journal of Marketing Research, 8, 355-63.

Green, Paul E. and Srinivasan, V. (1978), "Conjoint Analysis in Consumer Research: Issues and Outlook," The Journal of Consumer Research, 5 (September), 103-23.

Johnson, R. M. (1974), "Trade-off Analysis of Consumer Values," Journal of Marketing Research, 11, 121-27.

Myers, James H. and Shocker, Allan D. (1981), "The Nature of Product-Related Attributes," in Research in Marketing, Vol. 5, ed. J. N. Sheth, JAI Press, pp. 211-36.

Schmidt, F. L. (1971), "The Relative Efficiency of Regression and Simple Unit Predictor Weights in Applied Differential Psychology," Educational and Psychological Measurement, 31, 699-714.

Wilkie, William L. and Pessemier, Edgar A. (1973), "Issues in Marketing's Use of Multi-Attribute Attitude Models," Journal of Marketing Research, 10, 428-41.