Two Models For Representing Unrestricted Choice Data

Paul E. Green, University of Pennsylvania
Wayne S. DeSarbo, Bell Laboratories
ABSTRACT - The collection of unconstrained choice data (in which respondents choose products, TV shows, etc. from a reference set that is left unspecified) is a common practice in marketing research. This paper describes and applies two models for representing such data as points in a multidimensional space. The models are illustrated in the context of preferences for sports cars.
[ to cite ]:
Paul E. Green and Wayne S. DeSarbo (1981), "Two Models For Representing Unrestricted Choice Data", in NA - Advances in Consumer Research Volume 08, eds. Kent B. Monroe, Ann Arbor, MI : Association for Consumer Research, Pages: 309-312.

Advances in Consumer Research Volume 8, 1981      Pages 309-312


In marketing research it is common practice to collect what may be called unrestricted choice data. That is, respondents are not instructed to choose a fixed number of alternatives, nor is the set of possible alternatives explicitly listed. The interesting thing about this class of data is that ambiguity surrounds the alternatives not picked by a given respondent. One cannot tell whether nonchosen alternatives are viewed less favorably than chosen ones because the respondent is not required to list all of the items considered in choosing those that are preferred.

Recently, Levine (1979) has described an interesting model for scaling such data that leads to a joint space representation of stimuli (items chosen) and respondents (choosers). This is carried out by means of an internal MDS (multidimensional scaling) analysis (Carroll 1972) in which the stimulus space is not specified beforehand. Levine also mentions--but does not attempt to develop--a companion external analysis in which the stimulus space is prespecified.

The purpose of the present research note is twofold. We first show how Levine's technique can be applied to problems arising in marketing research. Second, we describe how his method can be adapted to external analysis problems, based on the inclusion of prespecified attribute ratings data and application of Carroll and Chang's PREFMAP-2 model (1971). The net result of this extension is the development of a joint space consisting of three sets of points--stimuli, attributes of stimuli, and respondents' ideal points. An empirical application of the two models is presented, followed by a brief discussion of potential industry applications.

THE MODELS

We start out by describing the Levine model and PREFMAP-2. Since detailed descriptions of each model appear in the references cited above, our discussion is brief.

The Levine Model

Basic input data to the Levine model consist of I respondents' choices of J stimuli (e.g. brands, political candidates) in terms of preference, endorsement, appropriateness, or any other construct of interest to the researcher. The J stimuli are defined as the set union of all I individuals' choices. The basic idea of the Levine model is motivated by the following argument. If one had available an a priori stimulus configuration (as in external MDS analysis), a natural procedure would be to compute any given respondent's ideal point as the centroid of the stimuli he/she picks. By the same token, if an a priori respondent configuration were available one could position a stimulus point at the centroid of the points representing all respondents who happened to pick that stimulus.

What Levine's model does is to find two sets of coordinates, one set for respondents and one set for stimuli, that satisfy the centroid criterion. In Levine's notation, given respondents 1, 2, ..., i, ..., I and stimuli 1, 2, ..., j, ..., J, let E denote a symmetric (I + J)x(I + J) matrix in which the entries are

$$ e_{i,I+j} = e_{I+j,i} = 1 \tag{1} $$

if and only if respondent i picks stimulus j (with zero otherwise).
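As a concrete illustration of definition (1), the following minimal sketch builds E from a respondent-by-stimulus matrix of 0-1 picks. The function name `build_e` and the toy data are assumptions made for illustration; they do not come from Levine's paper.

```python
def build_e(picks):
    """Return the symmetric (I + J) x (I + J) matrix E as nested lists.

    picks[i][j] is 1 if respondent i picked stimulus j, else 0.
    Respondents occupy indices 0..I-1 and stimuli indices I..I+J-1
    (zero-based, so entry e_{i,I+j} of the text is e[i][n_resp + j]).
    """
    n_resp, n_stim = len(picks), len(picks[0])
    size = n_resp + n_stim
    e = [[0] * size for _ in range(size)]
    for i, row in enumerate(picks):
        for j, picked in enumerate(row):
            if picked:
                e[i][n_resp + j] = 1   # respondent i is linked to stimulus j
                e[n_resp + j][i] = 1   # mirrored entry keeps E symmetric
    return e


# Hypothetical toy data: 2 respondents (rows), 3 stimuli (columns).
toy_picks = [[1, 0, 1],
             [0, 1, 1]]
E = build_e(toy_picks)
row_sums = [sum(row) for row in E]   # number of links for each respondent or stimulus
```

The respondent-respondent and stimulus-stimulus blocks of E stay at zero, since links exist only between a respondent and a stimulus.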

Next, let D denote an (I + J)x(I + J) diagonal matrix with general entry d_kk equal to the sum of the corresponding row of E. Let x denote a vector of (I + J) coordinates corresponding to the I respondents and the J stimuli. The vector x is a solution to the problem if, for every entry x_k of x, x_k is proportional to the centroid of the coordinates representing the entities to which k is linked by virtue of either choosing or having been chosen. In scalar notation, the solution is:

$$ \lambda x_k = \frac{1}{d_{kk}} \sum_{m=1}^{I+J} e_{km} x_m \tag{2} $$

where 1 ≤ k ≤ (I + J) and λ > 0. Expressing (2) in matrix notation, we have:

$$ \lambda x = (D^{-1} E) x \tag{3} $$

and, hence, the desired coordinates are the eigenvectors of the nonsymmetric matrix D^{-1}E. Various ways are available to recast the problem in a symmetric form that permits standard eigenstructure computer routines to be used in solving for the eigenvectors of D^{-1}E. As in any kind of eigenstructure decomposition, the researcher may wish to retain only the first few eigenvectors, accounting for the greatest proportion of the trace of D^{-1}E.
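One such symmetric form can be sketched as follows. The sketch assumes that every respondent and every stimulus has at least one link, and it uses the similarity transform S = D^{-1/2} E D^{-1/2}, a standard symmetrization that is not necessarily the one Levine used. The function name `levine_coordinates` and the toy data are hypothetical.

```python
import numpy as np


def levine_coordinates(picks):
    """Eigenvalues and eigenvectors of D^{-1}E, largest eigenvalue first."""
    picks = np.asarray(picks, dtype=float)
    n_resp, n_stim = picks.shape
    size = n_resp + n_stim

    # Assemble E: picks in the off-diagonal blocks, zeros elsewhere.
    E = np.zeros((size, size))
    E[:n_resp, n_resp:] = picks
    E[n_resp:, :n_resp] = picks.T

    d = E.sum(axis=1)                      # diagonal of D (row sums of E)
    if np.any(d == 0):
        raise ValueError("every respondent and stimulus needs at least one link")

    # S = D^{-1/2} E D^{-1/2} is symmetric and shares its eigenvalues with D^{-1}E.
    d_inv_sqrt = 1.0 / np.sqrt(d)
    S = d_inv_sqrt[:, None] * E * d_inv_sqrt[None, :]
    eigvals, U = np.linalg.eigh(S)         # eigh returns ascending eigenvalues

    order = np.argsort(eigvals)[::-1]      # reorder to descending
    eigvals, U = eigvals[order], U[:, order]

    # If S u = lambda u, then x = D^{-1/2} u satisfies D^{-1}E x = lambda x.
    X = d_inv_sqrt[:, None] * U
    return eigvals, X, E, d


# Hypothetical toy data: 3 respondents (rows), 3 stimuli (columns).
toy_picks = [[1, 1, 0],
             [0, 1, 1],
             [1, 0, 1]]
eigvals, X, E, d = levine_coordinates(toy_picks)

# Check the eigen-equation column by column; the result should be numerically zero.
residual = (E / d[:, None]) @ X - X * eigvals
```

The columns of `X` are determined only up to sign and scale, so any plotted configuration can be reflected or rescaled without changing its fit to the centroid criterion.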

In sum, the Levine model takes an (I + J)x(I + J) matrix of 0-1 entries and solves for an (I + J) x r matrix of coordinates in some r-dimensional space, according to the criterion of equation (2).

PREFMAP-2

Carroll and Chang's PREFMAP-2 algorithm was motivated by the desire to define a stimulus configuration from preference-like data alone. In PREFMAP-2 the preferences are typically interval-scaled (e.g. ratings), not 0-1 data. The key idea of the PREFMAP-2 model is to allow the preference data to be used twice. For example, if I respondents are asked to rate J stimuli with respect to preference, then it can be shown that a singular value decomposition of the doubly centered I x J matrix (i.e. the I x J matrix with both row and column means removed) yields both a stimulus space and a set of ideal points for respondents. Unfortunately, however, each is defined only up to a linear transformation. PREFMAP-2 finds the desired linear transformation of the stimulus space (which satisfies an explicit least squares criterion) so that the stimulus configuration is appropriately linked to the ideal point configuration.
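The double centering and singular value decomposition step can be sketched as below. This illustrates only that first step, not the PREFMAP-2 program itself. The ratings are invented, and assigning the singular values to the respondent-side factor is an arbitrary convention assumed here.

```python
import numpy as np


def double_center(P):
    """Remove row means and column means (adding back the grand mean)."""
    P = np.asarray(P, dtype=float)
    return (P - P.mean(axis=1, keepdims=True)
              - P.mean(axis=0, keepdims=True) + P.mean())


# Hypothetical ratings: 3 respondents (rows) by 4 stimuli (columns).
ratings = np.array([[7.0, 5.0, 2.0, 1.0],
                    [2.0, 6.0, 7.0, 3.0],
                    [4.0, 1.0, 5.0, 7.0]])
C = double_center(ratings)

U, s, Vt = np.linalg.svd(C, full_matrices=False)
r = 2                                  # number of dimensions retained
respondent_side = U[:, :r] * s[:r]     # I x r factor (assumed scaling convention)
stimulus_side = Vt[:r].T               # J x r stimulus coordinates, not yet transformed
```

For any invertible r x r matrix T, replacing `stimulus_side` by `stimulus_side @ T` and `respondent_side` by `respondent_side @ np.linalg.inv(T).T` leaves their product unchanged, which is the indeterminacy that PREFMAP-2 resolves.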

Having found this linear transformation, the specific ideal point locations are found by quadratic regression in exactly the same manner used in the original PREFMAP algorithm. Somewhat more formally, we write:

$$ p_{ij} \cong a_i d_{ij}^2 + c_i \tag{4} $$

where p_ij denotes the preference scale value of respondent i for stimulus j; d_ij^2 is the squared Euclidean distance between stimulus j and ideal point i; a_i and c_i are parameters of a linear function for respondent i; and ≅ denotes least squares approximation.
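The role of quadratic regression can be seen by expanding the squared distance in equation (4). The notation below is introduced here for illustration and is not taken from the original paper: x_{jt} is the coordinate of stimulus j on dimension t, z_{it} is the corresponding coordinate of respondent i's ideal point, r is the number of dimensions, and the b's are regression weights.

```latex
\begin{aligned}
a_i d_{ij}^2 + c_i
  &= a_i \sum_{t=1}^{r} (x_{jt} - z_{it})^2 + c_i \\
  &= a_i \sum_{t=1}^{r} x_{jt}^2
     \;-\; 2 a_i \sum_{t=1}^{r} z_{it} x_{jt}
     \;+\; \Bigl( a_i \sum_{t=1}^{r} z_{it}^2 + c_i \Bigr)
\end{aligned}
```

The right-hand side is linear in the r stimulus coordinates and in the single quadratic term \sum_t x_{jt}^2. Regressing p_ij on those r + 1 predictors therefore gives weights b_{it} = -2 a_i z_{it} on the coordinates and b_{i,r+1} = a_i on the quadratic term, so that the ideal point can be recovered as z_{it} = -b_{it} / (2 b_{i,r+1}), provided a_i is not zero.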

In PREFMAP-2, equation (4) is constrained to a_1 = a_2 = ... = a_I = a, so that all respondents have the same slope coefficient. The first step in the computational procedure is to factor P, the preference data matrix with general entry p_ij, into the matrix product:

$$ P = Y X^{*\prime} \tag{5} $$

where X* is the stimulus matrix obtained from the factoring, defined only up to a linear transformation. The object is then to find the linear transformation T that yields the desired stimulus matrix X:

$$ X = X^* T \tag{6} $$

PREFMAP-2 solves for T by means of a least squares criterion; that is, T is chosen to give the best fit to the preference values. Details can be found in Carroll and Chang (1971).

Comparing the Two Models

PREFMAP-2 and the Levine model were developed independently for different types of data--0-1 responses in the case of the Levine model and interval- (or ratio-) scaled data in the case of PREFMAP-2. What is proposed here is to use PREFMAP-2 in developing a joint space of stimuli and a third set of entities--attributes of the stimuli. In this case the list of attributes is prespecified. It is further assumed that the respondents represent a homogeneous sample with regard to their perceptions of the stimuli that they associate with specified attributes.

Having found a joint space of stimuli and attributes, each respondent's unconstrained stimulus choices (i.e. Levine-type pick k/n data) are used to compute that person's ideal point in the joint space of stimuli and attributes. Ideal points are computed as centroids of the stimuli that each respondent picks. This step leads to three sets of points in a common space--attributes, stimuli and ideal points. An attractive property of the approach is that, if all distances are comparable, one can consider ideal point to attribute distances. These latter distances can be substantively interpreted as the desirability that each respondent has for each stimulus attribute (as inferred from consideration of the respondent to stimulus associations and the stimulus to attribute associations).
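The centroid step and the resulting ideal point to attribute distances can be sketched in a few lines. The coordinates, picks, and function names below are hypothetical; in practice the stimulus and attribute coordinates would come from the PREFMAP-2 solution.

```python
import math


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[t] for p in points) / n for t in range(len(points[0])))


def ideal_point(pick_row, stimulus_coords):
    """Centroid of the stimuli that one respondent picked (0-1 row of picks)."""
    chosen = [xy for picked, xy in zip(pick_row, stimulus_coords) if picked]
    if not chosen:
        raise ValueError("respondent picked no stimuli, so no centroid exists")
    return centroid(chosen)


# Hypothetical joint space: three stimuli and two attributes in two dimensions.
stimuli = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]
attributes = [(1.0, 0.0), (0.0, 0.0)]
picks = [[1, 1, 0],    # respondent 0 picked stimuli 0 and 1
         [0, 0, 1]]    # respondent 1 picked stimulus 2 only

ideals = [ideal_point(row, stimuli) for row in picks]

# distances[i][k]: Euclidean distance from respondent i's ideal point to attribute k.
distances = [[math.dist(z, a) for a in attributes] for z in ideals]
```

Under the interpretation given above, a smaller entry in `distances` corresponds to an attribute that the respondent is inferred to find more desirable.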

In sum, we address the problem of modeling pick k/n data for the case of an externally supplied stimulus configuration (as mentioned by Levine). This is accomplished by the PREFMAP-2 procedure. We then apply the centroid feature of Levine's approach to modify PREFMAP-2, so as to include ideal points derived from 0-1 preference responses.

A PILOT APPLICATION

The two-stage approach just described was applied, on a pilot basis, to a set of data obtained from 35 respondents drawn from a university population. Each respondent was asked to complete three tasks in the order shown:

1.  List all foreign sports cars that come to mind that he/she would like to own, if money were not a major constraint. (No specified number or list of choices was provided.)

2.  For each chosen car, pick as few or as many attributes as desired, from a specified list of 20 (see Table 1), that are most highly associated with the choice.

3.  List three car attributes from the list of 20 that he/she would most like to have in a foreign sports car.

Table 1 shows the nine foreign sports cars whose names appeared most frequently across the set of 35 respondents.

TABLE 1

SPORTS CAR NAMES MENTIONED MOST FREQUENTLY AND PRESPECIFIED CAR ATTRIBUTES

Application of the Levine Model

Let us consider the Levine model first. In this case we focus only on the 35x9 matrix of 0-1 entries in which a 1 denotes stimulus j being picked by respondent i; 0 denotes otherwise. (As described earlier, this matrix is then transformed to a 44x44 symmetric E matrix.) A computer program was prepared to solve for the respondent and stimulus matrix of coordinates, as described in equation (3).

The first 3 eigenvalues accounted for 55 percent of the trace. For illustrative purposes, Figure 1 shows a plot of dimensions 2 and 3. [In eigenstructure solutions of the particular matrix product in equation (3), the first eigenvector is always a constant vector, and hence, is deleted in the spatial plots.] Each eigenvector has been scaled by the square root of its associated eigenvalue. The horizontal axis might be described as a "value" axis that separates cars of lower price (e.g. MGB) from those of higher price (e.g. BMW). The vertical axis might be considered as an appearance axis that separates highly elegant cars such as Ferrari and Jaguar XJ6 from the more functional cars, like BMW and Porsche. (However, these interpretations are speculative, at best.)
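The bracketed remark about the constant first eigenvector follows directly from the definition of D as the matrix of row sums of E. A short check, writing \mathbf{1} for the vector of ones (notation introduced here for illustration):

```latex
(D^{-1} E \, \mathbf{1})_k
  = \frac{1}{d_{kk}} \sum_{m=1}^{I+J} e_{km}
  = \frac{d_{kk}}{d_{kk}}
  = 1 ,
\qquad\text{so}\qquad
D^{-1} E \, \mathbf{1} = \mathbf{1} .
```

The constant vector thus satisfies equation (3) with eigenvalue 1 whatever the data, which is why it carries no information about the configuration. Because every row of D^{-1}E is nonnegative and sums to one, no eigenvalue exceeds 1 in absolute value, so this eigenvector is also the leading one.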

Although omitted (for clarity) from Figure 1, the ideal points obtained from the Levine internal analysis showed a fairly high concentration around Mercedes, BMW, and Porsche; still, each car had one or more respondents for whom it represented the closest point.

The Two-Stage Analysis

The next step in the analysis was to apply the PREFMAP-2 model to the 9x20 matrix of associations data in which each cell entry denotes the frequency, across respondents, with which each attribute (column) is perceived to be associated with each sports car (row). Solutions, involving the fitting of the simple ideal point model to the optimally rotated and differentially stretched stimulus space, were obtained in both three and two dimensions. For illustrative purposes we comment only on the two-dimensional solution. [The root-mean-square fit across all nine sports cars was 0.76 for the two-dimensional solution and 0.83 for the three-dimensional case.]

All multiple correlations between the dependent variable (frequency with which each attribute was associated with a given sports car) and the stimulus coordinates of the attributes were significant at the 0.05 level or better. Figure 2 shows the joint space of sports cars and attributes.

Comparison of the sports car positions in Figures 1 and 2 shows the influence of the attribute-sports car associations on the placement of the sports car points. [In future studies the researcher may wish to rotate (or otherwise transform) the two solutions to maximal congruence via Cliff's (1966) algorithm or some other such procedure. Here, our interest was primarily to examine the character of the solutions just as they appeared in their respective computer printouts.] As noted, Mercedes, BMW, and Porsche are still relatively near each other (as are Jaguar and Ferrari) but Fiat and MGB are much more distant from each other in Figure 2 than in Figure 1. [Still, the canonical correlation between the sports car coordinates in Figure 1 and their counterparts in Figure 2 turned out to be 0.84, showing reasonably good correspondence between the two solutions.]

We note that such attributes as prestigious, sleek and racy, plush interior, beautiful lines, and comfortable describe the Ferrari and Jaguar, while the Porsche is described as having high resale value, good braking, high acceleration, good cornering and being well engineered.

Computing the Ideal Points

Respondent ideal points were next found, for the analysis summarized in Figure 2, by simply computing the centroid of all sports cars that each respondent picked. As a type of simple model validation, the next step in the analysis was to compute, for each respondent in turn, the average Euclidean distance of the three attributes that were picked as most highly desirable from his/her ideal point. This average was then compared to the average Euclidean distance of the 17 attributes that were not picked by that respondent.
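The per-respondent comparison can be sketched as follows. The function name and the toy numbers are hypothetical (the study itself used 20 attributes, of which each respondent picked three), and the ideal point is taken as given.

```python
import math


def mean_distances(ideal, attributes, picked_idx):
    """Mean distance from an ideal point to picked and to non-picked attributes."""
    picked_idx = set(picked_idx)
    picked = [math.dist(ideal, a)
              for k, a in enumerate(attributes) if k in picked_idx]
    others = [math.dist(ideal, a)
              for k, a in enumerate(attributes) if k not in picked_idx]
    return sum(picked) / len(picked), sum(others) / len(others)


# Hypothetical respondent: ideal point at the origin, four attributes, two picked.
toy_attributes = [(1.0, 0.0), (0.0, 1.0), (3.0, 4.0), (6.0, 8.0)]
picked_mean, other_mean = mean_distances((0.0, 0.0), toy_attributes,
                                         picked_idx=[0, 1])

# The respondent counts in favor of the hypothesis when picked attributes are closer.
supports_hypothesis = picked_mean < other_mean
```

Repeating this for every respondent gives one yes/no outcome per person, which is the input a sign test needs.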

Our hypothesis, of course, is that picked attributes will be closer, on the average, to a person's ideal point than those not picked. For 24 of the 35 respondents this was, in fact, the case. A sign test indicated that the result was significant beyond the 0.05 level. Hence, there is some evidence to support the reasonableness of the two-stage modeling approach proposed in this paper.
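The sign test result can be checked with an exact binomial calculation. The sketch below assumes a one-sided test, a success probability of 0.5 under the null hypothesis, and no ties. The paper does not state which variant of the test was used, so these are assumptions.

```python
from math import comb


def sign_test_upper_tail(successes, n):
    """Exact P(X >= successes) for X ~ Binomial(n, 0.5)."""
    return sum(comb(n, k) for k in range(successes, n + 1)) / 2 ** n


# 24 of the 35 respondents had picked attributes closer to their ideal point.
p_one_sided = sign_test_upper_tail(24, 35)
p_two_sided = min(1.0, 2 * p_one_sided)   # doubling is valid because p = 0.5 is symmetric
```

Under these assumptions the one-sided probability is roughly 0.02, and the doubled value remains below 0.05, consistent with the significance level reported above.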

POTENTIAL APPLICATIONS

As mentioned earlier, marketing research surveys abound with questions dealing with the choice of unconstrained options. For example, such data as top-of-mind brand recall (list those brands that you recall having tried at least once in the past six months), non-aided listings of product use occasions, free associations data, and reasons for liking (disliking) a new product concept lend themselves well to analysis by the original Levine model.

Moreover, if respondents can be meaningfully classified on some type of a priori basis (e.g. favorite brand, heavy versus light product usage), additional analyses can be carried out to see if their ideal points occupy different regions of the joint space.

The two-stage approach, incorporating PREFMAP-2, followed by Levine-type external analysis (computing the centroid of picked items), is also applicable to a number of situations. For example, the stimuli could be print advertisements and the attributes could be characteristics of the ads--type of theme, believability of claims, copy points, etc. In other cases the stimuli could be political candidates and the attributes could be issues on which the various candidates are perceived to have a position. In still other cases the stimuli could be vendors of telecommunications services and the attributes could be various features of those services, such as excellent technical support, high equipment reliability, and so on. [It is also possible to apply PREFMAP-2 to cases where the attributes are obtained by (say) free association, conditioned on each object that the respondent evokes. In this case unrestricted choice can apply to both brands (or other stimulus objects) and the attributes of those brands.]

In short, it is not difficult to think of many classes of marketing problems for which the models described here may be applicable. It is hoped that additional empirical work will be carried out in the future regarding the pragmatic value of these models in consumer and industrial buyer research. Some initial research on the applicability of the Levine model has already been carried out by Holbrook, Moore, and Winer (1980).

FIGURE 1

JOINT SPACE CONFIGURATION OBTAINED FROM THE LEVINE MODEL

FIGURE 2

JOINT-SPACE CONFIGURATION FROM PREFMAP-2

REFERENCES

Carroll, J. Douglas (1972), "Individual Differences and Multidimensional Scaling," in R. N. Shepard, A. K. Romney, and S. B. Nerlove (eds.), Multidimensional Scaling. Vol. I. New York: Seminar Press, 105-155.

Carroll, J. Douglas and Chang, Jih-Jie (1971), "An Alternate Solution to the Metric Unfolding Problem," paper presented at the annual meeting of the Psychometric Society.

Cliff, Norman (1966), "Orthogonal Rotation to Congruence," Psychometrika, 31 (March), 33-42.

Holbrook, Morris B., Moore, William L. and Winer, Russell (1980), "Using 'Pick Any' Data to Represent Competitive Positions," paper presented at the TIMS/ORSA Conference on Market Measurement in Austin, TX, March.

Levine, Joel H. (1979), "Joint-Space Analysis of 'Pick-Any' Data: Analysis of Choices from an Unconstrained Set of Alternatives," Psychometrika, 44 (March), 85-92.
