
EVALUATION PROCESS MODELS AND THE PREDICTION OF PREFERENCE

Frederick A. Russ, University of North Carolina

This paper briefly reviews the use of evaluation process (EP) models to predict consumer preferences and, based on research currently under way, discusses some of the problems facing those trying to determine which EP model is best.

Evaluation is the process of determining the (relative) positions of one or more alternatives with respect either to some criterion or to each other. Because evaluation is the step in the decision process which immediately precedes choice, it is of major importance in predicting choice and it may also help to explain a number of post-choice phenomena (e.g., the amount of cognitive dissonance).

EP models are based on the way in which individuals purportedly evaluate alternatives facing them. They assume that an individual's preference among alternatives will be some function of his preference for the various attributes of each alternative and how important these attributes are to him. Thus, these are models of multiple-attribute decision making.

THE MODELS

EP models are numerous in the normative literature (MacCrimmon, 1968). They find some use (often in simulation models) in the descriptive literature, but with few exceptions (Russ, 1971), only one type of model - the additive - has been used to predict preferences.

All of the major evaluation process models which have been generated to date are based on one or more of three notions about the way evaluations are made.

1. Alternatives are compared by "collapsing" preferences for individual attributes into a single overall value for each alternative. The function which collapses the attribute values is presumed to be additive.

2. Alternatives are evaluated on the basis of their values for the attribute which is most important to the decision-maker. This is essentially a lexicographic approach.

3. Alternatives are evaluated by comparing their attribute values with a set of goals or standards for these attributes. Any alternative (but usually the first one discovered) which meets or exceeds all (or some prespecified subset) of these standards is chosen. This is basically the satisficing approach suggested by Simon (1955).

We shall focus our attention on five models based on these approaches.

Additive Weighting (ADD)

The ADD model is based on the first approach suggested above. A decision maker chooses the alternative which has the best score on some weighted additive evaluation function.

The additive utility model preferred by decision theorists asserts that the utility of an alternative is equal to the sum of the utilities of the attributes of that alternative. (The weights are implicit in the attribute utility judgments.) Adams and Fagot (1959), Dickson (1970), and Tversky (1967) provide good examples of empirical research designed to test this model.

A different additive model is preferred by the clinical judgment theorists (Einhorn, 1969). It is suggested for use with alternatives described by numerical attributes. Weights for the model are determined by using multiple regression with the overall preference ranking as the dependent variable and the numerical attribute values as the independent variables.
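As a minimal sketch of this regression approach, assuming hypothetical numerical attribute values and preference scores (none of which come from the studies cited), the weights can be recovered by ordinary least squares:

```python
# Illustrative sketch of the regression approach with hypothetical data:
# regress overall preference scores on numerical attribute values and use
# the fitted coefficients as the additive weights.
import numpy as np

# Rows are alternatives; columns are numerical attribute values.
X = np.array([[5.0, 2.0, 4.0],
              [3.0, 4.0, 1.0],
              [4.0, 5.0, 5.0],
              [2.0, 1.0, 3.0]])
# Overall preference scores (higher = more preferred), e.g. reversed ranks.
y = np.array([4.0, 2.0, 3.0, 1.0])

# Least-squares fit of y = X @ w; w then serves as the attribute weights.
w, *_ = np.linalg.lstsq(X, y, rcond=None)
print(w)                        # fitted attribute weights
print((X @ w).argsort()[::-1])  # predicted preference order (best first)
```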

A third model which is rapidly becoming more popular among marketing scholars (Bass and Talarzyk, 1969; Hansen, 1969) is based on attitude theory suggested by Fishbein (1967). This model states that an individual's attitude toward an object (which may be interpreted as a preference for it) will depend on (1) how satisfactorily the object possesses certain attributes and (2) how important these attributes are to that individual. We shall focus our attention on the last of these three ADD models.
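As a minimal sketch, the Fishbein-type ADD rule can be written out directly; the brands, attributes, and scale values below are invented for illustration and are not data from any of the studies cited:

```python
# Illustrative sketch of the Fishbein-type ADD model: an alternative's score
# is the importance-weighted sum of its attribute ratings. All names and
# numbers here are hypothetical.

importances = {"decay prevention": 5, "taste": 3, "whitening": 2}

brands = {
    "Brand A": {"decay prevention": 4, "taste": 2, "whitening": 5},
    "Brand B": {"decay prevention": 3, "taste": 4, "whitening": 3},
    "Brand C": {"decay prevention": 5, "taste": 3, "whitening": 2},
}

def add_score(ratings):
    """Weighted additive evaluation: sum of importance * rating."""
    return sum(importances[a] * ratings[a] for a in importances)

# Predicted preference order: highest score first.
ranking = sorted(brands, key=lambda b: add_score(brands[b]), reverse=True)
print(ranking)  # ['Brand C', 'Brand A', 'Brand B']
```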

Regular Lexicography (LEX)

The LEX model suggests that the decision maker chooses among alternatives on the basis of their values on the attribute most important to him. If more than one alternative exhibits the same value for the most important attribute, the tie is broken by looking at the second most important attribute...and so on until there are either no more ties or no more attributes. LEX seems to be substantiated as an explanatory model in protocols reported by Alexis, Haines, and Simon (1968); Bettman (1969); Clarkson (1963); and Russ (1970).
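A minimal sketch of the LEX rule, again with invented data; Python's tuple comparison is itself lexicographic, which makes the model nearly one line:

```python
# Illustrative sketch of the LEX model with hypothetical data. Attributes
# are listed from most to least important; ties on one attribute are broken
# by the next.

attribute_order = ["decay prevention", "taste", "whitening"]

brands = {
    "Brand A": {"decay prevention": 5, "taste": 2, "whitening": 4},
    "Brand B": {"decay prevention": 5, "taste": 4, "whitening": 1},
    "Brand C": {"decay prevention": 3, "taste": 5, "whitening": 5},
}

def lex_key(ratings):
    """Ratings as a tuple in importance order; Python compares tuples
    lexicographically, which is exactly the LEX rule."""
    return tuple(ratings[a] for a in attribute_order)

ranking = sorted(brands, key=lambda b: lex_key(brands[b]), reverse=True)
print(ranking)  # ['Brand B', 'Brand A', 'Brand C']
```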

Lexicographic Semiorder (LS)

The LS model is a modification of the LEX model. Rather than suggesting that the second most important attribute is considered only if two or more alternatives have equivalent values for the most important attribute, it suggests that the decision maker turns his attention to the second most important attribute so long as the difference between two values of the most important attribute is not significant or not noticeable.

Such a model is obtained by applying a just noticeable difference structure to a lexicographic ordering (Luce, 1956; Tversky, 1969) or by assuming that a decision maker is unwilling to downgrade alternatives when differences are insignificant, even though preferences are defined (Yntema and Torgerson, 1961).
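A minimal sketch of the LS comparison, assuming a hypothetical one-point just noticeable difference on the rating scale (the data are again invented):

```python
# Illustrative sketch of the LS comparison with hypothetical data. The JND
# threshold is assumed; differences at or below it are "not noticeable" and
# the comparison falls through to the next most important attribute.

attribute_order = ["decay prevention", "taste", "whitening"]
JND = 1  # hypothetical just-noticeable difference on the rating scale

def ls_prefers(x, y):
    """Return 1 if x is preferred to y, -1 for the reverse, 0 if indifferent."""
    for attr in attribute_order:
        diff = x[attr] - y[attr]
        if abs(diff) > JND:           # a noticeable difference decides
            return 1 if diff > 0 else -1
    return 0                          # no noticeable difference anywhere

brand_a = {"decay prevention": 5, "taste": 2, "whitening": 4}
brand_b = {"decay prevention": 4, "taste": 4, "whitening": 1}

# LEX would pick brand_a (5 > 4 on the most important attribute), but that
# one-point difference is within the JND, so LS decides on taste instead.
print(ls_prefers(brand_a, brand_b))  # -1: brand_b is preferred
```

Note that such pairwise preferences can be intransitive (Tversky, 1969), so unlike LEX they do not always yield a complete preference order.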

SATISLEX

The SATISLEX model is a combination of the LEX model with the satisficing notion developed by Simon. It suggests that alternatives which fail to meet certain goals or standards will be eliminated from further consideration; those that remain will be ranked lexicographically.
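A minimal sketch of SATISLEX, with invented standards and data:

```python
# Illustrative sketch of SATISLEX with hypothetical standards and data:
# screen out alternatives that miss any minimum standard, then rank the
# survivors lexicographically.

attribute_order = ["decay prevention", "taste", "whitening"]
standards = {"decay prevention": 3, "taste": 2, "whitening": 1}  # minima

brands = {
    "Brand A": {"decay prevention": 5, "taste": 1, "whitening": 4},
    "Brand B": {"decay prevention": 4, "taste": 4, "whitening": 1},
    "Brand C": {"decay prevention": 3, "taste": 5, "whitening": 5},
}

def satisfices(ratings):
    """True if the alternative meets or exceeds every standard."""
    return all(ratings[a] >= standards[a] for a in standards)

survivors = [b for b in brands if satisfices(brands[b])]
survivors.sort(key=lambda b: tuple(brands[b][a] for a in attribute_order),
               reverse=True)
print(survivors)  # ['Brand B', 'Brand C']; Brand A fails the taste standard
```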

SATISLS

The SATISLS model combines the LS model with satisficing in the same way: alternatives that fail to meet the standards are eliminated, and the survivors are compared by the LS rule.

PREDICTIVE ACCURACY OF EP MODELS

Are EP models any good? The answer must depend on the purposes of the researcher. If predicting first preferences correctly is the appropriate measure, the answer would be that all of them are good. For example, Bass and Talarzyk used the Fishbein additive model and correctly predicted first preferences for well-known, inexpensive branded products between 54 and 75 percent of the time for over 1100 respondents; random predictions would have yielded 20 percent predictive accuracy. Russ used all five of the models to predict housewives' choices among small appliances. In the eighty choice situations, the first choice was correctly predicted between 55 and 66 percent of the time (depending upon the model used), when 10 percent predictive accuracy would have been expected by chance.

First preference accuracy measures are comparable from one study to the next, but such is not the case for measures of preference order accuracy. Measures include a "confusions" matrix used by Bass and Talarzyk, Spearman's rho, and a Preference Accuracy Index developed by Russ and based on Kendall's tau. Only the last of these takes into consideration both the order in which the preferences are predicted and the importance of predicting a particular preference correctly. (For example, it is more important to predict the first preference accurately than it is to predict the fourth.) Some standardization needs to occur before models can be compared across the different situations used in various studies. Nevertheless, preferences below the first do seem to be predicted considerably more accurately than would be expected by chance.
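Russ's index itself is not reproduced here, but the idea of weighting rank agreement so that errors near the top of the preference order count more can be sketched as follows. This is a minimal illustration in the spirit of a weighted Kendall's tau; the particular weighting function is an assumption, not the published index:

```python
# Illustrative sketch only: Russ's Preference Accuracy Index is not
# reproduced here. This shows one hypothetical way to weight rank agreement,
# in the spirit of a weighted Kendall's tau, so that pairs involving
# top-ranked preferences count more. The weighting function is an assumption.

def weighted_rank_agreement(predicted, actual):
    """Score two preference orders (lists of the same items) between -1
    (exactly reversed) and +1 (identical), weighting each item pair by the
    better actual rank it involves."""
    pos = {item: i for i, item in enumerate(actual)}  # 0 = first preference
    score = max_score = 0.0
    for i in range(len(predicted)):
        for j in range(i + 1, len(predicted)):
            a, b = predicted[i], predicted[j]
            weight = 1.0 / (1 + min(pos[a], pos[b]))  # assumed weighting
            max_score += weight
            score += weight if pos[a] < pos[b] else -weight
    return score / max_score

# A transposition low in the order costs less than one at the top would.
print(weighted_rank_agreement(list("ABCD"), list("ACBD")))  # about 0.77
```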

Despite the exceptionally good predictive results obtained with all of the evaluation process models, it should be pointed out that the descriptive ability of the popular ADD model must be seriously questioned. There is practically no evidence that decision makers (except, for example, purchasing agents who develop additive formulae for evaluating suppliers) make evaluations in an additive fashion. Instead, most evidence leads to the inference that the satisficing and lexicographic notions, taken together, are extremely accurate indications of what actually transpires during the evaluation of alternatives.

Now we turn to the main issue: which model is best? (And we shall define "best" on the basis of predictive accuracy rather than external validity.) In the only previous comparative study of predictive accuracy (Russ, 1971), four of the five models proved to be comparably accurate, with LEX ranking considerably lower because of its inability to downgrade alternatives on the basis of particularly poor values on attributes of less than maximum importance.

There are at least two possible reasons for the lack of difference among the predictive models: the quality of model inputs may be poor, or the models may actually be relatively close in predictive accuracy, at least for the situations in which they were tested.

Exploratory research was conducted to try to determine whether either reason was the proper explanation. The research involved examination of additional data collected during the research previously reported by Russ, re-examination of other reported research, and some new research dealing with toothpaste preferences of college students. This new research is described briefly below.

Seventy-seven college juniors, seniors, and graduate students taking undergraduate marketing courses at UNC were asked to rank their preferences among ten leading brands of toothpaste. Next they were asked to rank each of six attributes of those products according to the attributes' importance to them. Ratings of importance were also obtained on an eleven-point scale assumed to be an interval scale. Comparable rankings and ratings were obtained for how much difference the ten brands of toothpaste exhibited on each of these attributes. Finally, the subjects were asked to rate each brand according to how well it possessed each of the six attributes.

Sixty-four usable questionnaires were obtained. These have been used to assess the predictive accuracy of three of the models: ADD, LEX, and SATISLEX. There were no significant differences among the predictions of first preference made by each of the models. SATISLEX, the most accurate, made 51 correct predictions, with six additional ties; if ties are randomly allocated, SATISLEX is correct on approximately 84 percent of its predictions. LEX was the second best predictive model, and ADD was third. Furthermore, it was discovered that using importance and difference ratings multiplicatively as a measure of importance (suggested by Alpert, 1971) significantly lowered the predictive accuracy of all of the models.

From a re-examination of previous research and the new research reported above, a number of tentative conclusions about model input quality may be drawn. (It should be noted that these conclusions are based on relatively small samples which dealt with certain types of people in relatively few situations.)

1. A major problem with model inputs which could lead to smaller differences among models is the practice (except in Russ, 1971) of having subjects rate or rank alternatives on the basis of each attribute rather than rating their preference for attribute values apart from the alternatives. DeSoto (1961) has produced evidence which suggests that a "halo effect" exists in such measurements: preferred alternatives are likely to be rated more highly than less preferred alternatives on any particular attribute even though objectively they should not be. For example, in the study of toothpastes reported above, 25 of the 64 subjects produced ratings for their most preferred brand which indicated strong or weak dominance over all other brands; any predictive model based on attribute ratings would have predicted their first preferences correctly (see the first sketch following this list).

2. Using the Alpert multiplicative measure of importance may produce an improvement in predictive accuracy when the alternatives in the decision situation presented to the subject are substantially different from those with which he is familiar, but it may also lead to "double counting" of the effect of attribute value differences on the importance of that attribute. The importance rating may implicitly consider differences in attribute values and thus duplicate their explicit consideration in the difference ratings (see the second sketch following this list).

3. Although the use of ordinal scale values as if they were interval scale values in the ADD model is a mathematical faux pas, in those studies where predictions could be compared, the use of ordinal data in the ADD model did not significantly reduce its predictive accuracy.

4. Input data can almost always be improved upon, but a more fruitful approach seems to be to try to devise situations where, say, ADD and LEX models cannot - or are unlikely to - lead to the same predictions. If past experience is any indication, creating such situations will be a difficult task.
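The dominance pattern described in point 1 is easy to state precisely. A minimal sketch, with invented ratings, of the check for whether a subject's favorite brand weakly dominates all others:

```python
# Illustrative check for the halo pattern in point 1, with invented ratings:
# does a subject's most preferred brand come out at least as good as every
# other brand on every attribute? If so, any rating-based model must predict
# that first preference correctly.

def weakly_dominates(x, y):
    """True if x is rated at least as high as y on every attribute."""
    return all(x[a] >= y[a] for a in x)

ratings = {
    "Favorite": {"decay prevention": 5, "taste": 5, "whitening": 4},
    "Other 1":  {"decay prevention": 4, "taste": 5, "whitening": 2},
    "Other 2":  {"decay prevention": 3, "taste": 4, "whitening": 4},
}

halo = all(weakly_dominates(ratings["Favorite"], ratings[b])
           for b in ratings if b != "Favorite")
print(halo)  # True: the favorite weakly dominates all other brands
```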
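Point 2 concerns the Alpert-style multiplicative measure, in which an attribute's weight is the product of its importance rating and its perceived-difference rating. A minimal sketch with invented scale values:

```python
# Illustrative sketch of the Alpert-style multiplicative measure from point
# 2: an attribute's weight is its importance rating times the rating of how
# much the brands differ on it. Scale values are hypothetical.

importance = {"decay prevention": 10, "taste": 7, "whitening": 4}
difference = {"decay prevention": 2,  "taste": 8, "whitening": 6}

weights = {a: importance[a] * difference[a] for a in importance}
print(weights)  # {'decay prevention': 20, 'taste': 56, 'whitening': 24}

# Note the double-counting risk: "decay prevention" is rated most important,
# but because the brands barely differ on it, it receives the lowest weight.
```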

This leads us to consider the other possible explanation for the lack of differences in the predictive accuracy of the models. Perhaps, in most situations, it really doesn't make much difference which model we use. But how is it possible that different EP models can lead to almost exactly the same predictions?

1. Perhaps a halo effect actually does occur in the individual's mind as he decides, preventing, by anticipation, the existence of postchoice dissonance. That is, perhaps the decision maker looks at alternatives and makes an initial judgment on the basis of only one or a few attributes; he then "molds" his views of the values of other attributes to make them conform. He is forced to change his initial evaluation only when he discovers an attribute value which is clearly unacceptable.

2. Most decisions are based on only a few significant attributes which exhibit only a few values each. The high likelihood of tied values on a given attribute would force the lexicographer to look at several important attributes, each of which would get high weights in an ADD model. This would promote the possibility of small or no differences between predictions.

3. Decision making could be construed as a matter of making trade-offs. LEX allows no trade-offs between values on the most important attribute and any other attribute. SATISLEX allows them only when some other attribute exhibits exceptionally poor values. If no such poor values exist, then LEX and SATISLEX will make identical predictions. ADD just makes the trade-off relationships explicit, and it has been shown that an ADD model can be derived which can duplicate the predictions of any LEX model (Russ, 1971); one such construction is sketched below.
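A minimal sketch of such a construction: if every attribute takes integer values from 0 to B-1, weighting the k-th most important attribute by a power of B turns the additive score into a base-B number whose ordering is exactly the lexicographic ordering. The data are invented:

```python
# Illustrative sketch of the construction mentioned in point 3: if every
# attribute takes integer values 0..B-1, weighting the k-th most important
# attribute by B**(n-1-k) makes the additive score a base-B number whose
# ordering is exactly the lexicographic ordering. Data are hypothetical.

B = 10  # one more than the largest possible attribute value
attribute_order = ["decay prevention", "taste", "whitening"]

def lex_equivalent_add_score(ratings):
    """Additive score whose ordering reproduces the LEX ordering."""
    n = len(attribute_order)
    return sum(ratings[a] * B ** (n - 1 - k)
               for k, a in enumerate(attribute_order))

x = {"decay prevention": 5, "taste": 2, "whitening": 9}
y = {"decay prevention": 5, "taste": 3, "whitening": 0}

# LEX prefers y: tied on the first attribute, y wins on the second. The
# additive scores agree, since 529 < 530.
print(lex_equivalent_add_score(x), lex_equivalent_add_score(y))  # 529 530
```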

The upshot of the discussion so far is that in many situations we may reasonably expect no significant differences in predictive accuracy among EP models. But this does not mean that the researcher should flip a coin when making his decision as to which model to use.

This decision is a multiple-attribute decision itself. Because EP models seem essentially equivalent on the most important attribute (predictive accuracy), other attributes should also be considered.

LEX- and SATISLEX-type models offer a considerable advantage over ADD models because of the requirements placed on the inputs gathered from subjects: ordinal scale data on importance, preference, and acceptability are required rather than interval scale data on importance and preference. In the original study conducted by Russ, the ordinal data were considerably more reliable over time. Furthermore, informal research made it clear that subjects were more comfortable and less likely to be irritated when required to provide rankings rather than ratings. Finally, to return to a point made earlier, LEX- and SATISLEX-type models are clearly more descriptive of the evaluation process than is the ADD model.

The implication is clear if we can generalize from these few studies: in predicting consumer preferences, it would be very worthwhile if the marketing researcher were to turn his attention from additive models to lexicographically based models.

REFERENCES

Adams, E. W., and Fagot, R. A model of riskless choice. Behavioral Science, 1959, 4, 1-10.

Alexis, M., Haines, G. H., Jr., and Simon, L. Consumer information processing: the case of women's clothing. In R. L. King (Ed.), Marketing and the new science of planning. Chicago: A. M. A., 1968.

Alpert, M. I. Identification of determinant attributes: a comparison of methods. Journal of Marketing Research, 1971, 8, 184-191.

Bass, F. M., and Talarzyk, W. W. A study of attitude theory and brand preference. In P. R. McDonald (Ed.), Marketing involvement in society and the economy. Chicago: A. M. A., 1969.

Bettman, J. R. Behavioral simulation models in marketing systems. Unpublished doctoral dissertation, Yale University, 1969.

Clarkson, G. P. E. A model of the trust investment process. In E. A. Feigenbaum and J. Feldman (Eds.), Computers and thought. New York: McGraw-Hill, 1963.

DeSoto, C. B. The predilection for single orderings. Journal of Abnormal and Social Psychology, 1961, 62, 16-23.

Dickson, G. W. A generalized model of administrative decisions: an experimental test. Management Science, 1970, 17, 35-47.

Einhorn, H. J. The use of nonlinear, noncompensatory models in decision making. Unpublished doctoral dissertation, Wayne State University, 1969.

Fishbein, M. (Ed.), Readings in attitude theory and measurement. New York: Wiley, 1967.

Hansen, F. Consumer choice behavior: an experimental approach. Journal of Marketing Research, 1969, 6, 436-443.

Luce, R. D. Semiorders and a theory of utility discrimination. Econometrica, 1956, 24, 178-191.

MacCrimmon, K. R. Decision-making among multiple-attribute alternatives: a survey and consolidated approach. Rand Memorandum RM-4823-ARPA, 1968.

Russ, F. A. Consumer evaluation of alternative product models. Paper presented at the A. M. A. Fall Educators' Conference, Boston, 1970.

Russ, F. A. Consumer evaluation of alternative product models. Unpublished doctoral dissertation, Carnegie-Mellon University, 1971.

Simon, H. A. A behavioral model of rational choice. Quarterly Journal of Economics, 1955, 69, 99-118.

Tversky, A. Additivity, utility and subjective probability. Journal of Mathematical Psychology, 1967, 4, 175-202.

Tversky, A. Intransitivity of preferences. Psychological Review, 1969, 76, 31-48.

Yntema, D. B., and Torgerson, W. S. Man-computer cooperation in decisions requiring common sense. IRE Transactions on Human Factors in Electronics, 1961, HFE-2, 20-26.
