Covariance Bias of Thurstone Case V Scaling As Applied to Consumer Preferences and Purchase Intentions

Joel Huber, Duke University
ABSTRACT - A possible source of bias in Thurstone Case V scaling as applied to buyer preference and intention to purchase data is examined. An empirical test for the existence of spatially related covariance bias is proposed and applied to data from two field studies.
Joel Huber (1979), "Covariance Bias of Thurstone Case V Scaling As Applied to Consumer Preferences and Purchase Intentions", in NA - Advances in Consumer Research Volume 06, eds. William L. Wilkie, Ann Arbor, MI: Association for Consumer Research, Pages: 578-581.

Advances in Consumer Research Volume 6, 1979, Pages 578-581


Joel Huber, Duke University

Murphy A. Sewall, the University of Connecticut  [Also Senior Research Associate, Cambridge Marketing Group of New York City.]




In spite of its use by marketing professionals (Green and Tull, 1978; Sewall, 1978a), there are good reasons to believe that the usual assumptions of Thurstone Case V scaling (1927) are systematically violated by the data collected in many buyer behavior studies. In a typical marketing context, Thurstone Case V analysis is applied across subjects to preference judgments on competing brands. The scale then provides a model of the proportion of respondents that will prefer one brand over another. Thus, as a model, it reduces the n(n-1)/2 original pairwise proportions to a more readily manageable scale with n values.
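The reduction from pairwise proportions to n scale values can be sketched as follows. This is a minimal illustration of the textbook Case V least-squares solution (row means of the z-scores), not necessarily the exact procedure used in the studies cited; the function names and the three-brand proportions are hypothetical.

```python
from statistics import NormalDist


def case_v_scale(p):
    """Return Case V scale values from a matrix of pairwise proportions.

    p[j][k] is the proportion of respondents preferring object j to object k.
    Diagonal entries are ignored and contribute a z-score of 0.
    Proportions of exactly 0 or 1 have no finite z-score and raise an error.
    """
    n = len(p)
    inv_cdf = NormalDist().inv_cdf
    scale = []
    for j in range(n):
        z_row = [0.0 if j == k else inv_cdf(p[j][k]) for k in range(n)]
        # The row mean of the z-scores estimates the scale value of object j,
        # measured from the mean of all scale values.
        scale.append(sum(z_row) / n)
    return scale


def predicted_proportion(scale, j, k):
    """Model-implied proportion preferring object j to object k."""
    return NormalDist().cdf(scale[j] - scale[k])


# Hypothetical proportions for three brands (illustrative numbers only).
p = [
    [0.50, 0.70, 0.90],
    [0.30, 0.50, 0.75],
    [0.10, 0.25, 0.50],
]
scale = case_v_scale(p)
```

Here three pairwise proportions are summarized by three scale values; with 22 or 29 objects the saving is much larger, since the number of pairs grows quadratically.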

The underlying concept, dubbed a "discriminal process," is simple and intuitively compelling. Without getting into details that may be found in Edwards (1957), Torgerson (1958), or Bock and Jones (1968), the process assumes that each evaluation of an object is drawn from a distribution with fixed mean and variance. When a pair of items is compared, the item with the higher momentary value is chosen. In order to estimate the hypothesized mean affective values that make up the scale, it is necessary to make the assumption, among others, that pairs of objects have momentary values with equal covariances. It is this assumption that is consistently violated in many applications of Case V across subjects.

Briefly, covariance bias arises if there is heterogeneity of preferences across subjects that is related to a common perception of the similarities between the objects under study. Under such conditions, the result is a distortion in the predicted pairwise probabilities that can be directly related to the degree of pairwise dissimilarity.


There are two forms of bias which may occur as a result of violating the Case V assumption of equality of covariances. The first form derives from cases where individuals' affective values are random deviations from the group mean. In this model, pairs that share an item in common will be correlated for a respondent since a high probability of choosing that item on one pair implies a higher probability of choosing that item when it is paired with other objects. Bock and Jones (1968, p. 144ff) provide a relatively complex, but usable, procedure to adjust for this source of heterogeneity of covariances.

The second form of bias currently is not analytically tractable. This source of bias arises when the variation in preferences across subjects is not random, as in the case above, but instead is based upon a spatial model. That is, respondents who tend to like one object also tend to like other objects which are spatially similar. This is not a new idea; it is the necessary condition for most attribute models and has severe implications with respect to several probabilistic choice models. To quote Bock and Jones (1968, p. 133):

"Indeed the principle of independence from irrelevant alternatives has the same effect as constant correlation of discriminal processes for all pairs of stimuli. For, if the correlations are constant and the discriminal dispersions are equal, the difference process for pairs of objects, not sharing an object in common are uncorrelated. This implies that the conditional probability of a subject's choice between two objects, given his choice between any other two objects, is equal to the unconditional probability. It is doubtful that the principle is strictly true in practice, since personal preferences for given objects usually extend to similar objects. (A person who likes turnips will probably also like rutabaga and may even have a taste for parsnips.) Nevertheless, the assumptions of Luce's model and Thurstone's Case V may appear to be well enough approximated in many applications to allow reasonably accurate predictions of choice."

It is this last assertion that may be questionable in a consumer behavior context. There is good reason to believe that systematic distortions in choice due to perceptual similarities are not insignificant. Indeed, Coombs (1964, ch. 8) bases an entire joint-space scaling on the assumption that correlations in affective values across subjects enable one to recover the latent dimensionality of the space. [It should be noted that Coombs based his derivation of the joint space of subjects and objects on the correlation of affective values between subjects. But since the joint space is symmetric between subjects and objects, it could be identically defined from the correlation of affective values between objects, as above.]

Furthermore, any model of preference based on attributes implies that the correlation between objects will depend upon attribute similarity. That is, a subject's liking for one object implies a greater probability of liking similar items. This means that similar items will tend to be positively correlated while dissimilar items will tend to be negatively correlated.

Thurstone Case V assumes that all pairs of objects have identical covariances. If, contrary to this assumption, similar pairs are more highly correlated, a consistent bias will result, moving predictions for these pairs closer to 0.5. This bias can be clearly seen by examining an extreme case.

Suppose two similar items have a (covert) correlation of nearly 1.0. In such a situation, one item would almost always dominate the other because its momentary value would increase whenever the other's did. Case V, however, assumes that all pairs have the same covariance. Thus, the predicted probability for this pair would be biased closer to 0.5. Similar logic applies to dissimilar items, except the bias is expected to be away from 0.5 and towards either 0.0 or 1.0.
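The extreme case can be checked numerically. The sketch below assumes that the momentary values of a pair are bivariate normal with equal variances, so that the difference has standard deviation sigma * sqrt(2(1 - rho)). The mean difference, the variance, and the three correlations are hypothetical, and the mean difference is held fixed across cases purely for illustration.

```python
from math import sqrt
from statistics import NormalDist


def choice_probability(mean_diff, sigma, rho):
    """P(j chosen over k) when momentary values are bivariate normal.

    mean_diff is the mean of j minus the mean of k, sigma the common
    standard deviation, and rho the correlation between the two values.
    """
    sd_of_difference = sigma * sqrt(2.0 * (1.0 - rho))
    return NormalDist().cdf(mean_diff / sd_of_difference)


mean_diff, sigma = 0.5, 1.0   # hypothetical values
common_rho = 0.0              # single correlation applied to every pair

case_v = choice_probability(mean_diff, sigma, common_rho)
similar = choice_probability(mean_diff, sigma, 0.9)       # highly similar pair
dissimilar = choice_probability(mean_diff, sigma, -0.9)   # highly dissimilar pair
```

With these numbers the similar pair has a choice probability farther from 0.5 than the common-correlation prediction, and the dissimilar pair has one closer to 0.5, which is the direction of bias described above.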

A Test of This Distortion

Modeling this distortion is complicated by the fact that the direction of the predicted bias reverses when the predicted probability is below, rather than above, 0.5. One way of eliminating this problem is to recode the predicted proportions on all pairs to be greater than 0.5. Because, by definition, the predicted probability of choosing 'j' over 'k' is the complement of the probability of choosing 'k' over 'j' ($p_{jk} = 1 - p_{kj}$), it is a simple matter to make all predicted probabilities greater than 0.5 by reordering the subscripts.

To avoid the problem of scale effects wrought by probabilities, all probabilities are transformed to standard normal deviates (z-scores) by the inverse of the cumulative normal probability distribution. Assuming that the relationship between the bias and some measure of dissimilarity between pairs is approximately linear in the z-scores, the strength of the relationship can be tested by

$$ z_{jk} - \hat{z}_{jk} = a + b\,d_{jk}, \qquad \hat{z}_{jk} > 0.0 \tag{1} $$

where $z_{jk}$ is the observed and $\hat{z}_{jk}$ the predicted standard normal deviate for the pair (j, k), and $d_{jk}$ is the dissimilarity between the two objects.

If bias is systematically in the predicted direction, the coefficient b of the dissimilarity between two objects ($d_{jk}$) should be negative. This sign occurs because a highly dissimilar pair will tend to have a negative (covert) covariance, and the predicted value will be biased upward, creating a negative residual term. Conversely, a highly similar pair ($d_{jk} = 0$) will result in a $\hat{z}_{jk}$ that is biased downward so that residuals will be positive.
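The test in equation 1 can be sketched as an ordinary least-squares fit of the z-score residual on dissimilarity. The function name and the four data points are hypothetical, and dropping pairs whose predicted proportion is exactly 0.5 is an assumption made here to respect the constraint that the predicted deviate be positive.

```python
from statistics import NormalDist


def covariance_bias_regression(observed_p, predicted_p, dissimilarity):
    """Fit (z - z_hat) = a + b * d by ordinary least squares.

    Each argument holds one entry per pair of objects. Returns (a, b).
    """
    inv_cdf = NormalDist().inv_cdf
    residuals, d_values = [], []
    for p_obs, p_hat, d in zip(observed_p, predicted_p, dissimilarity):
        if p_hat == 0.5:
            continue  # assumption: pairs with z_hat = 0 are left out
        if p_hat < 0.5:
            # Reorder the subscripts so the predicted proportion exceeds 0.5.
            p_obs, p_hat = 1.0 - p_obs, 1.0 - p_hat
        residuals.append(inv_cdf(p_obs) - inv_cdf(p_hat))
        d_values.append(d)
    n = len(d_values)
    mean_d = sum(d_values) / n
    mean_r = sum(residuals) / n
    sxx = sum((d - mean_d) ** 2 for d in d_values)
    sxy = sum((d - mean_d) * (r - mean_r) for d, r in zip(d_values, residuals))
    b = sxy / sxx
    a = mean_r - b * mean_d
    return a, b


# Hypothetical pairs: similar pairs (small d) are observed to be more extreme
# than predicted, dissimilar pairs (large d) less extreme.
observed = [0.90, 0.70, 0.62, 0.35]
predicted = [0.80, 0.68, 0.70, 0.25]
distance = [0.5, 1.0, 2.0, 2.5]
a, b = covariance_bias_regression(observed, predicted, distance)
```

A negative slope b in such a fit is the signature of spatially related covariance bias; a significance test on b would be needed before drawing conclusions from real data.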

The spatially related covariance distortion in Thurstone scales also is expected to be an increasing function of the affective heterogeneity of the underlying subjects. That is, bias will be greater if individuals have widely different preferences for the objects under study.


A marketing application of Thurstone Case V scaling was reported at the 1977 Association for Consumer Research meeting (Sewall, 1978a). The scaling reported in that study is based on derived paired comparisons calculated from five-point intention to purchase ratings. It was pointed out at the meeting that, since derived paired comparisons suppress intransitive pairs, the usual statistical test for the adequacy of Thurstone Case V (Mosteller, 1951) produces an exaggerated estimate of accuracy. Hence, the test proposed in this paper appears to be a more appropriate method of evaluating scales based upon derived paired comparisons.
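One way derived paired comparisons might be computed from ratings is sketched below. The tie rule (a tied pair counts as half a preference each way) is an assumption made for illustration, as are the function name and the ratings. Because each subject's comparisons are implied by a single set of ratings, they can never be intransitive.

```python
def derived_proportions(ratings):
    """Derive pairwise preference proportions from rating data.

    ratings[s][j] is subject s's rating of object j (for example, 1 to 5).
    Returns p, where p[j][k] is the proportion of subjects rating j above k.
    """
    n_subjects = len(ratings)
    n_objects = len(ratings[0])
    p = [[0.5] * n_objects for _ in range(n_objects)]
    for j in range(n_objects):
        for k in range(n_objects):
            if j == k:
                continue
            wins = 0.0
            for row in ratings:
                if row[j] > row[k]:
                    wins += 1.0
                elif row[j] == row[k]:
                    wins += 0.5  # assumed tie rule, not taken from the source
            p[j][k] = wins / n_subjects
    return p


# Hypothetical ratings from three subjects on three designs.
ratings = [
    [5, 3, 1],
    [4, 4, 2],
    [2, 5, 3],
]
p = derived_proportions(ratings)
```

Proportions of exactly 0 or 1, which arise easily in small samples, have no finite z-score and would need separate handling before Case V scaling.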

Sewall (1978b) also has developed a procedure for dividing a population into intention to purchase or preference segments based upon the rank order correlations between individual subjects and a group scale. Because the procedure requires that all subjects have a significant rank order correlation with the Thurstone scale for the segment, one would expect much greater affective homogeneity for subjects within a segment than for the population as a whole. Thus, covariance distortion is expected to be much greater for a scale based on all subjects than for scales for each segment.

Comparison Data

Intention to purchase ratings for two different product types were collected for this study. Both data sets are from mall shopper intercept interviews taken in four major metropolitan areas in different sections of the United States. Five-point intention to purchase ratings on 22 bed linen (sheet) designs were collected in the spring of 1977. Data on 29 drapery designs were gathered in the summer of 1977. All interviewing was done by paid professional interviewers.

The data from the two surveys provide contrasting prior expectations about the severity of covariance distortion for total population scales. The two largest segments found in the drapery study are of nearly equal size and quite divergent in their purchase intentions. The responses to the sheet survey are dominated by a large segment containing about sixty percent of the subjects.

Table 1 contains the correlation matrices between the segment Thurstone scales for each survey. Because the scales for the two large segments (numbered 1 and 2) in the drapery survey are significantly negatively correlated, greater affective heterogeneity is implied in this population than for the sheet survey respondents. The one large segment in the sheet survey (numbered 1) has a Thurstone scale that is less divergent from the scales for the two smaller segments. Hence, the population scale for the sheet survey should be dominated by the purchase intentions of the largest segment, and less covariance distortion should be found in the group scale for the subjects responding to the sheet survey than for respondents to the drapery survey.



Another measure of the relative homogeneity of subjects upon which a Thurstone scale is based is the average rank order correlation between the subjects' ratings and the scale values. Scales for groups with lower average subject correlations are expected to produce greater evidence of covariance distortion than scales for groups with higher average subject correlations.
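The homogeneity measure can be sketched as an average of rank order correlations between each subject's ratings and the group scale. The sketch assumes Spearman's coefficient with average ranks for ties, which is one common choice but is not specified in the text; all function names are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for m in range(i, j + 1):
            ranks[order[m]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation, computed as Pearson correlation of ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / (sxx * syy) ** 0.5


def average_subject_correlation(ratings, scale):
    """Mean rank correlation between each subject's ratings and the scale."""
    return sum(spearman(row, scale) for row in ratings) / len(ratings)
```

A subject who gives every object the same rating has no rank variation, so the coefficient is undefined for that subject; such cases would have to be excluded or handled separately.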

Dissimilarities Data

The independent variable for each data set is based upon dissimilarities in physical attributes of the products. Several obvious characteristics of the designs (for example, weave, pattern, lining, and variety of color for the draperies) were coded as dummy variables (either the characteristic is present in the design or it is not). The matrix of designs by attributes was subjected to principal components analysis, and the dissimilarity between a pair of designs is operationalized as the Euclidean distance between the designs in the reduced space. For the drapery designs, the first two components (containing 79.4 percent of the variance in the four attributes) are used as a basis for dissimilarities.

Ten attributes were identified for the 22 sheet designs. The first three principal components (containing 60.8 percent of the attribute variance) are used as a basis for calculating dissimilarities.
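The construction of the dissimilarity measure can be sketched with NumPy. The sketch assumes principal components of the mean-centered (unstandardized) dummy variables and unscaled component scores; the text does not say which variant was used. The function name and the attribute matrix are hypothetical.

```python
import numpy as np


def attribute_dissimilarities(attributes, n_components):
    """Euclidean distances between designs in a reduced component space.

    attributes is a designs-by-attributes array of 0/1 dummy codes.
    Returns the distance matrix and the share of variance retained.
    """
    x = np.asarray(attributes, dtype=float)
    centered = x - x.mean(axis=0)
    # Singular value decomposition; the rows of vt are the principal axes.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[:n_components].T
    explained = float((s[:n_components] ** 2).sum() / (s ** 2).sum())
    # Pairwise Euclidean distances between the rows of the score matrix.
    diff = scores[:, None, :] - scores[None, :, :]
    distances = np.sqrt((diff ** 2).sum(axis=2))
    return distances, explained


# Hypothetical dummy codes for five designs on four attributes.
attributes = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 1],
    [1, 1, 1, 0],
]
distances, explained = attribute_dissimilarities(attributes, 2)
```

The off-diagonal entries of the distance matrix would serve as the dissimilarity values in equation 1, and the retained share of variance corresponds to figures such as the 79.4 and 60.8 percent reported above.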

Least-Squares Analysis

Values of the dependent and independent variables (equation 1) were plotted to determine if a nonlinear transform of the data might be appropriate. The plots indicate the relationship between bias and dissimilarity is approximately linear.

Table 2 contains correlation results and calculations of average subject to Thurstone scale rank correlations. Significant evidence of spatially related covariance distortion is indicated. The results are generally consistent with expectations about the behavior of such distortion. In all instances, the direction of the bias is as predicted. Distortion is greatest in the group scale from the drapery survey. The dominance of segment 1 in the sheet survey is indicated by a higher average subject correlation for the total sample compared to the average subject correlation for the drapery survey respondents. Within surveys, distortion is greater for scales based on total samples than for scales based on segments of relatively homogeneous individuals. However, evidence of significant distortion appears in the segment scales, and the average subject correlation does not appear to be a general indicator of the severity of this distortion.




Under the assumptions of Thurstone Case V, the probability that a particular object will be chosen over another can be derived from the cumulative normal probability distribution. Although the logic is quite different and the estimation methods differ slightly, the estimated probabilities are virtually identical to those from models which assume the logistic function (Luce, 1959) or the arcsin transform distribution (Torgerson, 1958; Bock and Jones, 1968; David, 1963). Therefore, covariance bias applies to these models as well. [An analysis was run using arcsin transformed variables (Mosteller, 1951) instead of z-values with results which were not significantly different from those reported in Table 2.]

Subjective dissimilarities on objects may be more appropriate for detecting distortion than the objective dissimilarities used in this study. Consumers are likely to react to some attribute differences more strongly than others. It is possible that evidence of spatially related covariance distortions might be even stronger if subjective dissimilarity data were available.

From a marketing management perspective, the existence of covariance distortion on Thurstone Case V predictions for pairs will have little effect on the average affective values relevant to such questions as:

1. What types of designs are favored by various consumer segments?

2. What types of designs are unlikely to appeal to any segment of consumers?

Thus, although predictions of choice of one object over a dissimilar one may be exaggerated, the bias is unlikely to alter a conclusion about which type of dissimilar objects is favored.

If management wishes to choose between similar objects, distortion may cause a problem. Because predictions understate choice probabilities, the scale may not indicate the degree to which one object actually is favored over another, similar one.

It should be noted, however, that the covariance distortion found to exist in affective judgments is less likely to be a problem when applied to perceptual judgments; that is, judgments that one item is larger, creamier, or zestier than another. Perceptual judgments should be less susceptible to covariance bias because they are more homogeneous across subjects than preference judgments.


Spatially related covariance distortion can exist in Thurstone Case V scales derived in consumer preference or purchase intention studies. Significant distortion can occur in data that appear acceptable on the basis of an established test for the adequacy of Thurstone Case V scaling (Mosteller, 1951). The adjustment for covariance distortion requires separate data on inter-object dissimilarities, but is particularly appropriate for Thurstone scales of preference based upon derived paired comparison data aggregated across consumers.

Procedures that cluster respondents into segments which express similar preference or purchase intention orderings appear to reduce the magnitude of this spatially related covariance distortion. Whether this distortion should concern management users of Thurstone scale information depends specifically on whether management is concerned with overall affective values or with the differences between individual items. In the latter case, steps should be taken to reduce covariance distortion.

References


R. Darrell Bock and Lyle V. Jones, The Measurement and Prediction of Judgment and Choice (San Francisco: Holden-Day, 1968).

C. H. Coombs, A Theory of Data (New York: Wiley, 1964).

H. A. David, The Method of Paired Comparisons (New York: Hafner, 1963).

A. L. Edwards, Techniques of Attitude Scale Construction (Englewood Cliffs, New Jersey: Prentice-Hall, 1957), 1982.

P. E. Green and D. S. Tull, Research for Marketing Decisions, 4th edition (Englewood Cliffs, New Jersey: Prentice-Hall, 1978), 258-9.

R. D. Luce, Individual Choice Behavior (New York: Wiley, 1959).

Frederick Mosteller, "Remarks on the Method of Paired Comparisons: III. A Test of Significance Assuming Equal Standard Deviations and Equal Correlations," Psychometrika, 16(June 1951), 207-18.

Murphy A. Sewall, "Nonmetric Unidimensional Scaling of Consumer Preferences for Proposed Product Designs," in H. Keith Hunt, ed., Advances in Consumer Research, vol. 5 (Ann Arbor, Michigan: Association for Consumer Research, 1978), 22-5.

Murphy A. Sewall, "Market Segmentation Based on Consumer Ratings of Proposed Product Designs," Journal of Marketing Research (1978), forthcoming.

L. L. Thurstone, "A Law of Comparative Judgment," Psychological Review, 34 (1927), 278-86.

Warren S. Torgerson, Theory and Methods of Scaling (New York: Wiley, 1958).