Market Shares Estimates Based on Conjoint Analysis of Concepts

James B. Wiley, University of Florida
Robert Bushnell, Wayne State University
ABSTRACT - This paper describes a market share model for estimating the proportion of "first choices" that will be received by each of an arbitrary number of concepts. The model is particularly useful for estimating market shares of multiattribute alternatives (products) using conjoint analysis.
[ to cite ]:
James B. Wiley and Robert Bushnell (1979), "Market Shares Estimates Based on Conjoint Analysis of Concepts", in NA - Advances in Consumer Research Volume 06, eds. William L. Wilkie, Ann Arbor, MI: Association for Consumer Research, Pages: 582-586.

Advances in Consumer Research Volume 6, 1979      Pages 582-586




Conjoint analysis is a recent development in mathematical psychology that is concerned with measuring the effect of two or more variables on the ordering of a dependent variable. [Work on this paper was supported by ERDA grant "Future Transportation Systems of the Great Lakes Area: Energy and Economics", Contract E(11-1)-4136.] The "conjoint" in conjoint analysis refers to the notion that it may be possible to measure the relative effects of two or more variables jointly even though their individual effects may be unmeasurable. Conjoint analysis differs from related procedures, such as dummy variable regression or ANOVA, in that only ordinal assumptions are made regarding the dependent variable. Consumer psychologists have found conjoint analysis very useful in the study of consumer preferences for multiattribute alternatives.

The task addressed is to model, fit, and, if successful, predict choices among alternatives, where each alternative is a concept described by the levels it possesses on a list of attributes. By a concept we mean a conceptualization of a real object as distinct from the object itself. The attribute dimensions and the descriptions of the attribute levels that concepts may possess are all prepared by the interrogator.

It should be noted that the term "attribute" has several interpretations in consumer research. First, the use of the term may suggest a physical characteristic of an object, such as the sweetness of a drink or the color of a car. Second, its usage may imply criteria for membership in a category, as, for example, "a 'strike' in baseball might be defined as either a pitch that the batter swings at and either misses (A) or hits in foul territory (B), or a pitch to the batter in the area above home plate (C), above the batter's knees (D), and below his armpits (E); that is,

$A \cup B \cup (C \cap D \cap E)$    strike". (Wyer, 1974, p. 20.)

Finally, it may be used as a way of classifying specific perceptual attributions about members of another class, such as a product. With this usage, "sportiness" is interpreted not as something cars have in common, but rather something that attributions regarding cars may have in common.

Whatever the usage, multiattribute alternatives, or concepts, may range in abstraction from verbal descriptions, through pictorial representations, to physical "mock-ups" (Green and Tull, 1978). Regardless of the level of abstraction, in the mind of the interrogator each concept is regarded as a combination of attributes, each possessed at a specified level. The set of concepts presented to respondents is therefore designed so that the individual concepts, or combinations of attribute levels, vary systematically from one another. Further, restricting the choice situation by controlling the stimuli assures that each concept is evaluated with respect to the same information. Ambiguous and equivocal cues are removed, and all respondents are certain to have the same information at their disposal and no more, so that any inferences made beyond this point have their origins only in the data provided (Hoffman, 1960).

Of course, attributes are not the only things that influence individuals' preferences. As a matter of practice, however, those who use conjoint analysis generally divide the factors that influence consumer preferences into two types. Attributes of concepts which influence alternatives' relative "favorableness" (or utility) comprise the first type. The following assumptions are made regarding attributes:

(1) The consumer can distinguish "levels" or states of attributes (representing specific attributions).

(2) The consumer attaches part-utilities (or partworths) to these levels or states.

(3) The overall utility (or favorableness) of a concept is a function of its part-utilities.

(4) In any set of alternatives, the concept having the greatest utility will be most preferred.

(5) Choice will be a function of preference.

Factors of the second type include characteristics or states that are not attributed to concepts but which may influence preferences. Examples include the context in which the consumer will use the concept, the needs that are fulfilled through its use, institutional constraints (such as distribution channels), and policies of parties that can influence choice. Factors of this type can influence the outcome of conjoint analysis through two mechanisms. First, they can operate to exclude potential alternatives from consideration. Second, they can operate by influencing the way in which the consumer chooses among the concepts presented. For example, the relative importance of an attribution may depend on the context of the choice.

Three sets of procedures are necessary to implement conjoint analysis (Johnson, 1974):

(1) A technique of data collection requiring consumers to consider "trade-offs" among attributions,

(2) A computational method which derives partworths by accounting as nearly as possible for each consumer's choice behavior, and

(3) A model which allocates "market share" to concepts according to the relative utility of the competing concepts.

To date, the majority of work in the area has focused on the first two sets of procedures.

It is not sufficient, however, to account for the preferences of a single consumer. Market share is the result of many decisions made by many consumers. Each of these consumers may differ in the utilities ascribed to attribute levels and, hence, consumers may disagree in their preferences for specific alternatives. This paper presents a model for estimating the market share that will be captured by each of an arbitrary number of concepts when choices are made by a heterogeneous group of decision-makers.


In order to infer the part-utility of attributions, conjoint analysis requires the assumption of a composition model which specifies the mechanism by which the attribute scores interact and are related. An additive model is frequently adopted. It is then assumed that a consumer bases his choices on the utility of alternatives and that utility is determined according to the following model (the additive conjoint analysis model):
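One common way to write an additive model of this kind is sketched below. The notation is assumed here for illustration and is not taken from the authors: $l_{jk}$ denotes the level of attribute $k$ possessed by concept $j$, and $w_{ik}(\cdot)$ denotes the part-utility that individual $i$ attaches to a level of attribute $k$.

```latex
% Assumed notation: K attributes, individual i, concept j.
U_{ij} = \sum_{k=1}^{K} w_{ik}\left(l_{jk}\right)
```

Under this form the utility of a concept depends only on the part-utilities of its own levels, with no interaction terms between attributes.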


Note the similarity between model (1) and main effects analysis of variance in that in both cases an overall score is assumed to be the sum of a set of effects defined on attribute levels (Green and Wind, 1973).

If the hypothesized rule holds to a satisfactory degree of approximation, we consider the matrix U with elements Uij defined as in (1):

$U = \{U_{ij};\ i = 1, \ldots, n;\ j = 1, \ldots, q\}$  (2)
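As a minimal sketch of how such a matrix might be assembled under the additive rule, the following Python fragment sums part-utilities over the levels of each concept. The attribute names, levels, and part-utility values are hypothetical and chosen only for illustration.

```python
# Hypothetical part-utilities: one dict per respondent, keyed by attribute, then level.
partworths = [
    {"price": {"low": 0.6, "high": 0.1}, "size": {"small": 0.2, "large": 0.5}},
    {"price": {"low": 0.3, "high": 0.2}, "size": {"small": 0.4, "large": 0.1}},
]

# Hypothetical concepts: each one assigns a level to every attribute.
concepts = [
    {"price": "low", "size": "small"},
    {"price": "high", "size": "large"},
    {"price": "low", "size": "large"},
]


def additive_utility(respondent_partworths, concept):
    """Sum the part-utilities of the levels that the concept possesses."""
    return sum(respondent_partworths[attr][level] for attr, level in concept.items())


def utility_matrix(all_partworths, all_concepts):
    """Rows are respondents (i = 1..n); columns are concepts (j = 1..q)."""
    return [[additive_utility(pw, c) for c in all_concepts] for pw in all_partworths]


U = utility_matrix(partworths, concepts)
```

Each row of `U` then holds one respondent's utilities for all concepts, matching the layout described for the matrix in (2).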

Each row of this matrix holds the ith individual's utilities for the q alternative concepts. Each column holds the utility scores registered by the n respondents for the jth concept. Given this matrix, we require a procedure which will estimate, for each concept, the mean utility score $\bar{U}_j$, its standard deviation $s_j$, and the correlation of its utility scores with the utility scores of the other concepts. Pekelman and Sen (1977) discuss one approach for estimating these parameters. A weighted least squares procedure is also possible. Both procedures assume attributes are measured at the interval level.
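The quantities required from the matrix can be illustrated with plain sample moments. This is only a sketch under the assumption that the utilities are already available as interval-level scores; it is neither the Pekelman and Sen procedure nor the weighted least squares procedure, and the data are hypothetical.

```python
import math


def column(matrix, j):
    """Utility scores of concept j across all respondents."""
    return [row[j] for row in matrix]


def mean(xs):
    return sum(xs) / len(xs)


def sample_sd(xs):
    m = mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))


def correlation(xs, ys):
    """Pearson correlation between two columns of utility scores."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)
    return cov / (sample_sd(xs) * sample_sd(ys))


# Hypothetical utility matrix: 3 respondents (rows) by 3 concepts (columns).
U = [
    [1.0, 2.0, 3.0],
    [2.0, 4.0, 1.0],
    [3.0, 6.0, 2.0],
]

q = len(U[0])
means = [mean(column(U, j)) for j in range(q)]
sds = [sample_sd(column(U, j)) for j in range(q)]
corr = [[correlation(column(U, a), column(U, b)) for b in range(q)] for a in range(q)]
```

The lists `means` and `sds` and the matrix `corr` are the inputs that a market share model of the kind described in the next section would need.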


We now turn to the problem of predicting market share, i.e., the proportion of times each concept will be chosen when decision-makers are given a choice of one out of n concepts. If $U_i$ is the utility of the ith alternative ($i = 1, \ldots, n$), then alternative $i^*$ is chosen if $U_{i^*} > U_i$ for all $i \neq i^*$, that is, $U_{i^*} = \max\{U_i;\ i = 1, \ldots, n\}$. If, for simplicity, we consider initially the problem of predicting the proportion of choices allocated to one concept among three, then for concepts $C_i$, $C_j$, and $C_k$ we require the probability of the event (Bock and Jones, 1968):

$(U_i > U_j) \cap (U_i > U_k)$  (3)

If we designate the differences between the utilities within the above parentheses as $u_{ij} = U_i - U_j$ and $u_{ik} = U_i - U_k$, respectively, the required probability may be written as $P[(u_{ij} > 0) \cap (u_{ik} > 0)]$. If, in the population of consumers, the utilities assigned to $C_i$, $C_j$, and $C_k$ are assumed to be distributed $N(\bar{U}_i, s_i)$, $N(\bar{U}_j, s_j)$, and $N(\bar{U}_k, s_k)$, with $\mathrm{Cov}(U_i, U_j) = r_{ij} s_i s_j$, etc., the joint distribution of $u_{ij}$ and $u_{ik}$ is bivariate normal. This distribution may be denoted $N(\bar{u}_{ij}, \bar{u}_{ik}, s_{ij}, s_{ik}, r_{ij,ik})$, where $\bar{u}_{ij}$ is the difference between the mean utility assigned $C_i$ and the mean utility assigned $C_j$, $s_{ij}$ is the standard deviation of the utility differences, and $r_{ij,ik}$ is the correlation between the difference scores for ($C_i$, $C_j$) and ($C_i$, $C_k$). The proportion of times $C_i$ will be judged to exceed both $C_j$ and $C_k$ by a randomly selected consumer is given by the positive quadrant of the bivariate normal distribution.
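For three concepts, the positive-quadrant probability can be evaluated numerically. The sketch below is one possible approach, not the authors' procedure: it derives the parameters of the two difference scores by ordinary covariance algebra and then integrates the bivariate normal density with Simpson's rule. The function names and the input values are assumptions made for illustration.

```python
import math


def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def difference_params(means, sds, corr, i, j, k):
    """Means, SDs, and correlation of u_ij = U_i - U_j and u_ik = U_i - U_k."""
    def cov(a, b):
        return corr[a][b] * sds[a] * sds[b]

    s_ij = math.sqrt(sds[i] ** 2 + sds[j] ** 2 - 2.0 * cov(i, j))
    s_ik = math.sqrt(sds[i] ** 2 + sds[k] ** 2 - 2.0 * cov(i, k))
    cov_diff = sds[i] ** 2 - cov(i, j) - cov(i, k) + cov(j, k)
    return means[i] - means[j], means[i] - means[k], s_ij, s_ik, cov_diff / (s_ij * s_ik)


def positive_quadrant(m1, m2, s1, s2, r, steps=2000):
    """P(u1 > 0 and u2 > 0) for a bivariate normal with |r| < 1 (Simpson's rule)."""
    lower = max(-m1 / s1, -10.0)      # u1 > 0 in standardized units
    upper = max(lower, 0.0) + 10.0    # the density is negligible beyond this point
    h = (upper - lower) / steps
    scale = math.sqrt(1.0 - r * r)

    def integrand(z):
        # density of standardized u1 times P(u2 > 0 | standardized u1 = z)
        return normal_pdf(z) * normal_cdf((m2 / s2 + r * z) / scale)

    total = integrand(lower) + integrand(upper)
    for n in range(1, steps):
        total += (4.0 if n % 2 else 2.0) * integrand(lower + n * h)
    return total * h / 3.0


def first_choice_share(means, sds, corr, i):
    """Share of first choices for concept i among exactly three concepts."""
    j, k = [c for c in range(3) if c != i]
    return positive_quadrant(*difference_params(means, sds, corr, i, j, k))


# Hypothetical inputs: three concepts with uncorrelated utilities.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
equal_shares = [first_choice_share([0.0, 0.0, 0.0], [1.0, 1.0, 1.0], identity, i) for i in range(3)]
mixed_shares = [first_choice_share([1.0, 0.5, 0.0], [1.0, 1.5, 0.8], identity, i) for i in range(3)]
```

With identical, uncorrelated utilities each concept should receive one third of the first choices, which gives a simple check on the integration.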

Integration of the multivariate normal distribution when the number of variables exceeds two is extremely difficult. While such problems can be solved using numerical analysis, it generally is not practical to apply a multivariate normal market share model when the number of concepts exceeds three, since this would involve integration of a trivariate (or higher dimensioned) distribution of utility differences. Fortunately, if the correlations between utilities can be assumed equal to one-half, a useful approximation is available for cases involving four or more concepts.


The bivariate logistic distribution in its reduced form is defined as:

$F(x, y) = [1 + e^{-x} + e^{-y}]^{-1}$  (4)

Gumbel (1961) has shown that the marginal expectations are zero and the marginal standard deviations are $\pi/\sqrt{3}$. Furthermore, the covariance between the marginal scores is $\pi^2/6$ and hence the correlation coefficient has the fixed value one-half (1/2).

Bock and Jones (1968) suggest that the generalized multivariate logistic distribution provides a convenient vehicle for calculating the proportion of first choices that will be received by each of the n alternatives. To use this function, it is only necessary to adjust the variates to fit the normal distribution by equating variances in the marginal distributions; that is, by using $\pi/\sqrt{3}$ times the unit normal deviates as logistic deviates. When this is done in terms of the utility difference variables $u_{ij}$ defined in the previous section (e.g., $u_{n1} = U_n - U_1$), the probability that $C_n$ will be chosen first from among n concepts is approximated by:
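A form consistent with equation (4) and with the rescaling just described is sketched below. It is a reconstruction offered as an assumption, not a verbatim copy of the authors' equation (5); $\bar{u}_{nj}$ denotes the difference between the mean utilities of $C_n$ and $C_j$, and $s_{nj}$ the standard deviation of that difference.

```latex
P(C_n) \approx \left[ 1 + \sum_{j=1}^{n-1}
    \exp\!\left( -\frac{\pi}{\sqrt{3}} \, \frac{\bar{u}_{nj}}{s_{nj}} \right) \right]^{-1}
```

Each standardized mean difference $\bar{u}_{nj}/s_{nj}$ plays the role of a unit normal deviate, and the factor $\pi/\sqrt{3}$ converts it to a logistic deviate.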



Figures 1, 2, and 3 illustrate the effect of various parameter changes on the estimated proportion of first choices. From these figures, it is evident that the function has at least three desirable characteristics. First, whatever the number of concepts or their utilities, the "market shares" assigned each will sum to one. Second, the greater the difference between the mean utilities of the concepts, the greater the difference between their market shares (assuming equal standard deviations).

This property is illustrated in Figure 1. Note that concepts "2" and "3" have equal mean utilities in the first data set and that they are assigned equal market shares.
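The first two properties can be checked with a short sketch of a logistic share calculation. The functional form used here is an assumed reading of the model (standardized mean utility differences scaled by pi/sqrt(3)), and the mean utilities and the common standard deviation of the differences are hypothetical; they are not the values plotted in Figure 1.

```python
import math

SCALE = math.pi / math.sqrt(3.0)  # converts unit normal deviates to logistic deviates


def logistic_share(mean_utils, diff_sd, target):
    """Approximate first-choice share of concept `target`.

    mean_utils[j] is the mean utility of concept j and diff_sd[i][j] is the
    standard deviation of the utility difference U_i - U_j.
    """
    total = 1.0
    for j, m in enumerate(mean_utils):
        if j == target:
            continue
        z = (mean_utils[target] - m) / diff_sd[target][j]
        total += math.exp(-SCALE * z)
    return 1.0 / total


# Hypothetical data: concepts 2 and 3 (indices 1 and 2) have equal mean utilities.
means = [1.0, 0.5, 0.5]
diff_sd = [[1.0] * 3 for _ in range(3)]  # equal SDs for every pairwise difference

shares = [logistic_share(means, diff_sd, i) for i in range(3)]
```

In this equal-standard-deviation case the shares sum to one, the two concepts with equal mean utilities receive equal shares, and the concept with the larger mean utility receives the larger share.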



The third desirable property of the model is illustrated in Figure 2. This figure illustrates that the more the respondent population agrees about the utilities of the concepts (the smaller the variances), the more the model will discriminate between concepts on the basis of small utility differences. For example, the same utilities are assumed in both data sets; however, the standard deviations of the utilities in the second data set are four times the standard deviations in the first data set. Market share estimates in the small variance data set range from 1.5 percent to 72 percent, while market share estimates in the large variance data set range from 7.5 percent to 54 percent, even though both are based on the same mean utilities.
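The effect of agreement among respondents can be sketched numerically. The fragment below assumes that every pairwise utility difference has the same standard deviation, in which case the logistic share expression reduces to a normalized exponential; the mean utilities and standard deviations are hypothetical and do not reproduce the data behind Figure 2.

```python
import math

SCALE = math.pi / math.sqrt(3.0)  # converts unit normal deviates to logistic deviates


def shares_common_sd(mean_utils, diff_sd):
    """Logistic first-choice shares when all difference SDs equal `diff_sd`."""
    weights = [math.exp(SCALE * m / diff_sd) for m in mean_utils]
    total = sum(weights)
    return [w / total for w in weights]


means = [0.0, 0.25, 0.5, 1.0]              # hypothetical mean utilities
tight = shares_common_sd(means, 0.5)       # respondents largely agree
loose = shares_common_sd(means, 2.0)       # standard deviation four times larger

spread_tight = max(tight) - min(tight)
spread_loose = max(loose) - min(loose)
```

With the same mean utilities, the smaller standard deviation produces a wider spread between the largest and smallest shares, which is the pattern described for the two data sets.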

Figure 3 illustrates further the interesting market share dynamics that can occur as a consequence of interaction between the utility estimates and their variances. Here concepts having the same mean utilities have somewhat different market shares as a consequence of their differing standard deviations. In general, a concept will pick up market share relative to other concepts as the variance of its utility increases. This suggests that it is possible for a concept to have a smaller mean utility than alternative concepts and yet capture the largest "first choice" or market share.




It is interesting to compare the behavioral assumptions of the logistic model with other models. (We follow Bock and Jones, 1968, in these matters.)

Relationship to Thurstone Case V

The psychologist Thurstone developed a number of models which link subjective properties of stimuli to observable choice proportions. The most restrictive model, which is also the most practical, incorporates the so-called Case V assumptions. Torgerson (1958, pp. 159-165) provides an excellent discussion of the Case V assumptions. The covariance between the difference scores can be represented as (Bock and Jones, 1968, p. 249):

EQUATIONS   (6) - (8)

It is the term $r_{ij,ik}$, which Bock and Jones call the "comparatal correlation," that is assumed equal to 1/2 when using the multivariate logistic model. The central assumptions of Thurstone Case V are that $s_i^2 = s_j^2$ for all i and j and that $r_{ij} = r_{ik}$ for all i, j, and k. With these assumptions, equations (6), (7), and (8) can be written:

EQUATIONS  (6a) -   (8a)

Equation (6a) may then be solved for $r_{ij,ik}$, giving $r_{ij,ik} = 1/2$.


That is, if the Thurstone Case V assumptions hold, then the "comparatal correlations" will equal one-half, consistent with the properties of the logistic model.
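The algebra behind this result can be sketched from the definitions of the difference scores. The expressions below follow from ordinary covariance rules; their exact form and numbering in the original are assumed, with $s$ and $r$ denoting the common standard deviation and common correlation under Case V.

```latex
\begin{align*}
% General case
r_{ij,ik} &= \frac{s_i^2 - r_{ij} s_i s_j - r_{ik} s_i s_k + r_{jk} s_j s_k}{s_{ij}\, s_{ik}} \\
s_{ij}^2 &= s_i^2 + s_j^2 - 2 r_{ij} s_i s_j \\
s_{ik}^2 &= s_i^2 + s_k^2 - 2 r_{ik} s_i s_k \\[6pt]
% Case V: s_i = s for all i, and all correlations equal to r
r_{ij,ik} &= \frac{s^2 (1 - r)}{s_{ij}\, s_{ik}}, \qquad
s_{ij}^2 = s_{ik}^2 = 2 s^2 (1 - r) \\[6pt]
% Substituting the Case V standard deviations
r_{ij,ik} &= \frac{s^2 (1 - r)}{2 s^2 (1 - r)} = \frac{1}{2}
\end{align*}
```

The common variance and the common correlation cancel, so the value one-half does not depend on either of them.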



Relationship to the Bradley-Terry-Luce (B.T.L.) Model

Bradley and Terry (1952) and Luce (1959) have suggested a model to link probability of choice and preference. Their models assert that:


where P(j) = probability that an individual will choose brand j, V(j) = the consumer's ratio scale preference for brand j, and k = 1, 2, . . ., j, . . ., q. Equation (9) may be rewritten:


This equation may be converted to the logistic model by setting:


Although the Bradley/Terry/Luce model (9) normally is applied to individual choice behavior, Bock and Jones (1968, sec. 6.5.1) provide an interpretation of model parameters that allows the model to be applied to populations of individuals. The B.T.L. model, therefore, can be considered a special case of (5).
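The expressions referred to in this subsection can be sketched as follows. The first line is the usual statement of the B.T.L. choice rule; the rewritten form and the substitution linking it to the logistic model are reconstructions under assumed notation, with $\bar{u}_{jk}$ the mean utility difference between concepts $j$ and $k$ and $s_{jk}$ its standard deviation.

```latex
\begin{align*}
P(j) &= \frac{V(j)}{\sum_{k=1}^{q} V(k)} \\[4pt]
P(j) &= \left[ 1 + \sum_{k \neq j} \frac{V(k)}{V(j)} \right]^{-1} \\[4pt]
\frac{V(k)}{V(j)} &= \exp\!\left( -\frac{\pi}{\sqrt{3}} \, \frac{\bar{u}_{jk}}{s_{jk}} \right)
\end{align*}
```

Under this substitution the logarithm of each preference ratio corresponds to a scaled, standardized utility difference.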

Because the B.T.L. model incorporates a probabilistic version of the axiom of independence of other alternatives, however, it cannot account for the effect that similarity between alternatives may have on choice. For example, suppose an individual is given a choice of a can of Green Giant peas (U = .3), a can of Del Monte peas (U = .3), and a can of Del Monte peas with a $.10 rebate coupon (U = .4). Assuming the alternatives have the same price, the B.T.L. model suggests the Del Monte can with the coupon has a .40 probability of selection and the regular Del Monte can a .30 probability. If a Del Monte can is selected, however, almost everybody would agree that the version with the coupon would be selected. Note that whatever the valued characteristics of the regular can may be, they are shared with the coupon-bearing can. Consequently, a large and positive covariance between the utilities assigned these two alternatives is expected (10). This will result in a small standard deviation of difference scores, and a large difference between "market shares" will be predicted for these two alternatives by the logistic market share model.
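The peas example can be worked through in a few lines. The sketch below uses the utilities given in the text for the B.T.L. shares; the standard deviations and correlations used to illustrate the shrinking standard deviation of the difference scores are hypothetical.

```python
import math


def btl_shares(values):
    """B.T.L. choice probabilities: each value divided by the sum of all values."""
    total = sum(values.values())
    return {name: v / total for name, v in values.items()}


def difference_sd(sd_a, sd_b, r):
    """Standard deviation of U_a - U_b when the two utilities correlate r."""
    return math.sqrt(sd_a ** 2 + sd_b ** 2 - 2.0 * r * sd_a * sd_b)


peas = {"Green Giant": 0.3, "Del Monte": 0.3, "Del Monte with coupon": 0.4}

full = btl_shares(peas)
reduced = btl_shares({k: v for k, v in peas.items() if k != "Green Giant"})

# Under B.T.L. the ratio between the two Del Monte cans does not depend on
# which other alternatives are offered.
ratio_full = full["Del Monte with coupon"] / full["Del Monte"]
ratio_reduced = reduced["Del Monte with coupon"] / reduced["Del Monte"]

# Hypothetical SDs of 1.0: a higher correlation between the two utilities
# gives a smaller SD for their difference.
sd_moderate = difference_sd(1.0, 1.0, 0.5)
sd_similar = difference_sd(1.0, 1.0, 0.95)
```

The fixed ratio is what prevents the B.T.L. model from reflecting similarity, while the smaller `sd_similar` means that the same mean utility difference counts for more once it is standardized, as the logistic market share model requires.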


This paper discusses a logistic market share model that can be used in connection with conjoint analysis. The model appears to provide a practical vehicle for estimating market shares that would be captured by abstract alternatives, or concepts.

We make the following assumptions in developing the market share model:

(1) decision-makers attach part-utilities (or part-worths) to concept characteristics,

(2) the overall utility (or "favorableness") of a concept is a function of the sum of its part-utilities,

(3) the most favorable concept will be selected.

Thus, individual choice is assumed to be deterministic. If all decision-makers agree on concept characteristics and the part-worths of these characteristics, then a single concept will receive all the choices, ceteris paribus.

However, all factors will not be equal. In particular, it is likely that decision-makers disagree on the part-worths they assign characteristics. If decision-makers disagree on partworths, then a distribution will exist for the favorableness of each concept. Therefore, choice is stochastic in the aggregate.

The logistic market share model is somewhat more general than Thurstone Case V in that unequal standard deviations for utility difference estimates can be accommodated. It is somewhat more general than the B.T.L. model in that it is not necessary to assume that pair-wise ratios of choice probabilities are independent from other alternatives.

It should be noted that McFadden (1970) has proposed a probabilistic individual choice model similar in formulation to the market share model outlined. It appears likely that the two formulations could be combined in a model having both individual and aggregate stochastic elements.

The logistic market share model as currently formulated does not allow external information to be incorporated into market share estimates. For example, a subset of the concepts presented to individuals may describe actual alternatives for which market share data is available. A useful extension of the current formulation would enable the researcher to "calibrate" the model using prior data on a subset of concepts. It would also be desirable to include in the calibration process covariate information on the subset of alternatives, such as their distribution share, advertising share, promotion share, and so forth. Validation of the model against subsequent choices is of ultimate concern. A series of such studies is contemplated.


R. D. Bock and L. V. Jones, The Measurement and Prediction of Judgment and Choice, (San Francisco: Holden-Day, 1968).

R. A. Bradley and M. E. Terry, "Rank Analysis of Incomplete Block Designs, I, The Method of Paired Comparisons," Biometrika, 39(1952), 324-345.

P. E. Green and Y. Wind, Multiattribute Decisions in Marketing: A Measurement Approach, (Hinsdale, Illinois: The Dryden Press, 1973).

E. J. Gumbel, "Bivariate Logistic Distributions," Journal of the American Statistical Association, 56(1961), 335-349.

P. J. Hoffman, "The Paramorphic Representation of Clinical Judgment," Psychological Bulletin, 57(1960), 116-131.

R. M. Johnson, "Trade-Off Analysis of Consumer Values," Journal of Marketing Research, 11 (May 1974), 121-127.

R. D. Luce, Individual Choice Behavior, (New York: John Wiley & Sons, 1959).

D. McFadden, "Conditional Logit Analysis of Qualitative Choice Behavior," in P. Zarembka (ed.) Frontiers in Econometrics, (New York: Academic Press, 1970), 105-142.

D. Pekelman and S. Sen, "Regression Versus Interpolation in Conjoint Analysis," Advances in Consumer Research IV, W. D. Perreault, Jr. (ed.), (1977), 29-34.

W. S. Torgerson, Theory and Methods of Scaling, (New York: John Wiley & Sons, 1958).

R. S. Wyer Jr., Cognitive Organization and Change: An Information Processing Approach, (New York: John Wiley & Sons, 1974).