An Index of Consumer Satisfaction



Citation:

Anita B. Pfaff (1972), "An Index of Consumer Satisfaction", in SV - Proceedings of the Third Annual Conference of the Association for Consumer Research, eds. M. Venkatesan, Chicago, IL: Association for Consumer Research, Pages: 713-737.

AN INDEX OF CONSUMER SATISFACTION

Anita B. Pfaff, Wayne State University, and University of Augsburg

[This paper reports the interim results of an ongoing study of consumer and citizen satisfaction. The research was financed by Contract No. 1217-D5-1-6107 from the U.S. Department of Agriculture.]

[I would like to express my appreciation to Professor Martin Pfaff, who was instrumental in the conception and design of this project; to Professor James Lingoes, without whose cooperation the development of scoring systems and aggregation procedures would have been impossible; to Dr. Terry Cooper and Dr. Lily Huang, who performed the computer analysis; and to Mr. William K. Jackson who assisted in the design, administration and analysis of pretest questionnaires of the Indexes of Consumer and Citizen Satisfaction.]

[An earlier version of this paper was presented at the 84th Annual Meeting of the American Economic Association jointly with the Association for the Study of the Grants Economy, New Orleans, December 27, 1971.]

[Assistant Professor of Quantitative Methods, Wayne State University; Wissenschaftlicher Rat und Professor, University of Augsburg.]

I. CONSUMER SATISFACTION IN A MARKET ECONOMY

The ultimate claim and justification of a market economy is its ability to satisfy individual consumers better than any other type of economic system. In the face of differing ideologies and organizational forms competing for man's allegiance, this is no small claim and no mean achievement. Yet, in a period when product markets have vastly expanded the offering of goods, gadgets, and services, and when factor markets have provided for increased affluence to the majority of Americans, there has arisen a consumer restlessness and even outright revolt against the system. Indeed, the voices of criticism have come not only from our urban ghettos or rural enclaves which have been bypassed by this economic up-swell like islands of misery in an ocean of affluence. The suburban housewife, favored most by the spectacular increase in living standards of the 1960's, sets the tone in the growing chorus of protest. No doubt, general economic forces associated with inflation and unemployment have made the consumer pessimistic and more aware of the limits of the household budget. But complaints often pertain to specific facets of product or service offerings and to specific parts of the whole distribution process. Even more pronounced, seemingly, is the voice of criticism leveled at the quality of services offered in the market, ranging from poor automobile servicing and repairs to more sophisticated services offered by professionals.

This is a reaction not only against the general economic scene, or the institutions and individuals constituting the retail levels, but against the whole mass-marketing process.

Marketers generally take great pride in their strategies of product differentiation which promise a plethora of product types to satisfy every whim of the consumer, as long as it is backed up with the necessary purchasing power. Strategies of market segmentation, in turn, are geared to tailoring product and service offerings to the desires of specific consumer groups. In the face of all this one may be somewhat surprised at the discrepancy that exists between the images of suppliers and the images of consumers about the performance of products and markets.

The aim of the research project reported in this paper is not to level further criticism at the "goose that lays the golden eggs." Rather, it attempts to marshal modern psychometric techniques--particularly nonmetric scaling techniques and related breakthroughs in computational technology--to the task of formulating social indicators of market performance. These indicators may serve the ends of business and industry aiming at improving their own performance, and of government charged with the task of providing happiness to its citizens. But we hope that such information will ultimately benefit consumers, who will be the beneficiaries of improved market performance. An index of consumer satisfaction will not replace marketing research, but rather indicate a point of departure for such efforts.

II. MEASURES OF ECONOMIC PERFORMANCE

1. Objective Measures

Over the last four decades, economists have become increasingly concerned with the evaluation of economic performance. Such evaluation is contingent on two prerequisites: First, the goals of the economic system have to be known; and second, measurements on variables representing these goals, or operational "proxies" of these goals, have to be carried out, to determine to what degree these goals have been achieved.

As a first step in this direction, a number of familiar economic indicators have been developed--such as gross national product and related concepts on the output side, or labor force participation and investment rates on the input side. Consumers' living costs are estimated with the aid of the Consumer Price Index. These economic indicators monitor economic processes at regular intervals. No doubt, these indicators provide valuable information about how well our economy performs. They suffer, however, from a monetary and physical bias, as all effects have to be expressed in monetary or physical form. They fail in helping answer questions such as: "Is enough produced to provide everybody with the most essential goods and services? Are the goods and services produced that people want most? Are people satisfied with market offerings?"

Objective measures of economic performance can act as reasonable proxies of subjective welfare only if certain questions are excluded from the menu of economic topics. To do this we have to work with certain economic ideal types: Monetary and physical measures of performance will reflect subjective welfare provided, first, we assume the distribution of income to be given and outside the area of concern of the economist; and second, the price mechanism operates in a perfectly competitive market and guarantees that relative prices reflect the demand patterns of the consumers. The demand function, moreover, is an expression of "effective" demand, i.e., willingness backed by ability to pay for the goods and services. With a given income distribution it is quite conceivable that a price structure results which provides essentials to the poor in far too small amounts, and luxuries to the rich in relative abundance. An increase in the output of luxuries may result in an upward movement of the objective economic indicators, suggesting an improvement in social welfare; while by many social norms, and in the perception of most people, the change would not necessarily be considered desirable.

In a perfectly competitive market the price mechanism will ensure that the right products will be produced in the right amounts. Imperfections in the market are, however, the rule rather than the exception. This means that the "feedback action" of the market is considerably weakened. The prevailing price structure is influenced by differential power, special skills in negotiation, and a variety of factors extraneous to true demand. While the objective economic indicators are thus measurable at least in principle, they may not satisfy the first requirement of an evaluative theory of the socio-economy, since they fail to measure what the actual goals are.

Considerable attention has recently been given to national goals and the development of social indicators. Increasingly these studies do not only concern themselves with physical or objective measures, but with subjective measures as well.

2. The Index of Consumer Satisfaction: A Measure of Subjective Welfare

The paper reports on a study designed to develop a "subjective" indicator of consumer satisfaction, as a complement to the Consumer Price Index (CPI). The CPI measures variations in the prices of a general "market basket." The Index of Consumer Satisfaction (ICS) is to supplement this "cost measure" with a "benefit measure." It tries to determine the benefits that consumers receive--or the degree of satisfaction they experience--from the operation of the market. Since no convenient aggregate measure of utilities exists that provides an easy and reasonably unambiguous measure of satisfaction, consumers' opinions have to be explored on the subject. An index, or a profile of indices, has to be based on their avowed satisfaction. In the absence of weight changes which express the quantitative composition of the "market basket," the CPI is of the form:

(1)    EQUATION

where Goj is the weight of the jth product in the market basket in the base period, and Pij and Poj are the average prices of the jth product at time i and the base period, respectively. [For detailed descriptions of the Consumer Price Index see The Consumer Price Index: A Short Description, 1967, U.S. Department of Labor, Bureau of Labor Statistics, and Consumer Price Index: History & Techniques, Bulletin No. 1517, U.S. Department of Labor, Bureau of Labor Statistics.]
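
A form of equation (1) consistent with these definitions can be written out as follows. This is a sketch that assumes the standard base-weighted (Laspeyres) construction and reads the G's as base-period quantities; it is not taken from the paper itself.

```latex
\[
  \mathrm{CPI}_i \;=\; \frac{\sum_{j} G_{oj}\, P_{ij}}{\sum_{j} G_{oj}\, P_{oj}}
\]
```

Under this form the base period itself has a ratio of 1, conventionally reported as 100.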

An analogous Index of Consumer Satisfaction of the form:

(2)    EQUATION

may be suggested where Woj represents the weight and Sij the satisfaction scores of the jth product, in current and base periods, respectively. The weights Woj and Wij would not generally coincide with Goj and Gij. The latter relate to average amounts purchased. The former constitute "importance weights"--the weights consumers associate subjectively with given goods or services. Some items tend to be purchased in large quantities. Yet, they may not rank very highly in terms of subjective importance in the eyes of the consumer.
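
Because the ICS is described as analogous to the CPI, one way to picture it is as a base-weighted ratio of satisfaction scores. The sketch below assumes that ratio form; the function name, the weights, the scores, and the "higher means more satisfied" orientation are all illustrative assumptions rather than values from the study.

```python
def weighted_index(base_weights, current_scores, base_scores):
    """Base-weighted ratio index: sum(W_oj * S_ij) / sum(W_oj * S_oj)."""
    if not (len(base_weights) == len(current_scores) == len(base_scores)):
        raise ValueError("all inputs must have one entry per product")
    numerator = sum(w * s for w, s in zip(base_weights, current_scores))
    denominator = sum(w * s for w, s in zip(base_weights, base_scores))
    return numerator / denominator


# Hypothetical importance weights W_oj for three products.
W_o = [0.5, 0.3, 0.2]
# Hypothetical satisfaction scores (assumed here: higher = more satisfied).
S_o = [4.0, 5.0, 3.0]  # base period
S_i = [4.4, 5.0, 2.7]  # current period

ics = weighted_index(W_o, S_i, S_o)
print(f"ICS relative to base period: {ics:.3f}")
```

A value above 1 would indicate higher weighted satisfaction than in the base period, a value below 1 lower satisfaction.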

Similar to the G's in the Consumer Price Index, the W's would be revised only occasionally, to reflect a change in attitudes or distribution of income and wealth. They are likely to differ between different socioeconomic groups.

There is no prima facie basis for assuming additivity of satisfaction scores across products. Yet in a situation of changing relative prices, the simple additivity assumption underlying the CPI may be questioned for precisely the same reason: This is due to the interdependence of products with regard to substitutability or complementarity. The additivity assumption is no less heroic when applied to the CPI than to the ICS.

III. THE DESIGN OF THE PILOT STUDY

A pilot study was conducted in order to explore whether the approach to measurement of consumer satisfaction was technically and economically feasible and whether it yielded results which have interesting policy implications.

The sampling procedure and the conduct of the field surveys were carried out by the Survey Research Center of the Institute for Social Research of the University of Michigan. A sample was taken in spring of 1971, consisting of 574 family units with both husband and wife present, and with small children no older than ten. This random sample was drawn on the basis of birth records and other public records. Subsequent cross-checks indicated that there were essentially similar distributions among the demographic descriptors as in the 1971 Survey of Consumer Finances. A small income bias was noted in our sample due to a slight under-representation of low income families.

A limited number of products were chosen for inclusion in the pilot questionnaire. A duplication of the entire CPI market basket would have resulted in much too long a questionnaire. The development of an index based on a more extensive product or service selection would have to follow an extension of the common CPI practice. This entails the collection of responses on different items from different but similar subsamples.

It does not suffice to determine only the consumers' overall satisfaction with a product. A consumer acquires a bundle of goods and service combinations in any purchase. Various marketing aspects--packaging, display, etc.--represent relevant dimensions of the product. The product itself may exhibit a number of attributes. For example, a housewife who purchases a refrigerator may consider not only the objective or engineering-economic attributes, such as size, internal arrangement, economy of operation, or special features such as automatic ice maker. She also looks at marketing attributes such as price, availability, convenience of store location, availability of repair services, image, and so on. Finally, a model is likely chosen which is perceived as particularly satisfactory with regard to some of these attributes, and not so satisfactory with regard to others. The rational decision-maker will choose a product which conforms more to his desires in those attributes that he perceives as important to him. His overall satisfaction with the product may thus be seen as a composite of his satisfaction with the various attributes.

The more salient aspects are likely to dominate the picture. Random fluctuations in the product image are likely to occur as a consequence of particular experiences involving the product. But over time, a fairly stable image is likely to emerge. The stability of the product image will also depend on the frequency of purchase and, therefore, the product life.

For purposes of this analysis a very limited number of different products was included. The study was directed at the investigation of products with different purchase frequencies and life. The overall sample was divided into two subsamples--A and B. In subsample A, husbands were interviewed about their overall satisfaction with the house they lived in and their degrees of satisfaction with various attributes of the house. In subsample B, husbands responded to questions about their car and its attributes. In subsample A, wives answered questions about food products; and in subsample B, wives were asked about clothing. In both subsamples the wives were also interviewed about their general satisfaction with all products they purchased, with breakfast cereals, and five attributes of breakfast cereals, luncheon meats, and five attributes of luncheon meats, as well as women's clothing, such as dresses, blouses, slacks, shirts. They were also asked about their satisfaction with food in general, and appliances.

Satisfaction was measured on a seven-point scale ranging from "very satisfied" (A) to "not at all satisfied" (G). Respondents were asked how important each of the attributes was for the purchasing decision. Importance was measured on a seven-point scale as well, ranging from "very important" (A) to "not at all important" (G). Letters rather than numbers were associated with the scale in order not to suggest a particular order or specific quantitative relation between points on the scales. A "satisfaction" scale (rather than a scale ranging from very satisfied to very dissatisfied) was employed: It was felt that a scale of the latter type is a mixed scale, derived from two separate scales where one measures satisfaction and the other dissatisfaction. These two scales are monotonically related, but they may not coincide.

The choice of products was based on the different purchase frequencies and durability (or span of life). "House" was chosen as a product with very long life span, characterized by infrequent purchasing (or renting) decision. Cars occupy an intermediate position. They are shorter-lived than houses. Yet, they are not purchased very frequently. Moreover, the high expense associated with their purchase makes the decision also important in terms of the commitment of the family budget. Clothing occupies the next position on a life span scale. Finally, luncheon meats and cereals are purchased frequently and have a very short life span.

The overall sample was reduced to eliminate all individuals and their spouses who gave responses such as "do not know," "not applicable," "inappropriate," or "uncertain" on any of the 47 satisfaction scales (subsequently called "internal" variables). To be included in the analysis, both husband and wife had to respond to all the internal satisfaction questions put to them on the 7-point scale. This reduced the sample to 342 couples (342 men and 342 women), with subsample A consisting of 189 and subsample B of 153 couples.

Complete ratings on a 7-point scale regarding 37 questions about importance of product attributes were also obtained for the reduced sample of 342 dyads.

IV. RESULTS OF THE ANALYSIS OF SATISFACTION SCORES

Both raw scores and optimal weighted scores were employed for measuring consumer satisfaction.

1. Raw Scores

The first step, consisting of a metric measurement of satisfaction, was taken by associating numerical values with each alphabetic response. Arbitrarily, numbers 1 through 7 were assigned to the letters A (very satisfied) through G (not at all satisfied). These are subsequently called "raw scores." This particular scale was used for all individuals. It implies that respondents perceived adjoining points as "equidistant." In other words, individuals perceived a movement from 1 to 2 as psychologically equally easy--or difficult--as a movement from 5 to 6. It also implies that no semantic differences with respect to the two end points (or any other point with verbal association) exist.
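
The raw-score assignment amounts to a fixed lookup from letters to integers. A minimal sketch follows; the function name and the choice to map unusable answers to `None` are illustrative assumptions.

```python
import string

# Letters A..G on the questionnaire map to raw scores 1..7
# (A = very satisfied, G = not at all satisfied).
RAW_SCORE = {
    letter: rank
    for rank, letter in enumerate(string.ascii_uppercase[:7], start=1)
}


def to_raw_scores(responses):
    """Convert letter responses to raw scores; anything else becomes None."""
    return [RAW_SCORE.get(response) for response in responses]


scores = to_raw_scores(["A", "C", "G", "do not know"])
print(scores)
```

Every step on this scale is one unit wide, which is exactly the equidistance assumption described above.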

2. Optimal Weighted Scores

A short digression is needed to show why raw scores appeared not very useful. On a priori grounds two extreme hypotheses on the nature of satisfaction may be considered. One may hypothesize that individual products and attributes are independent of each other in terms of the satisfaction they yield. Alternatively, one may postulate that satisfaction generalizes: Persons who are more satisfied with one product are also more likely to be satisfied with another. If the former hypothesis were true, the satisfaction created by each product would be an independent random variable; and an Index of Consumer Satisfaction would be based on the sum or a mean derived from the set of products and individuals. In an aggregation across products, information would obviously be lost, since one index number may result from a large number of score distributions. If the second hypothesis were true, this would be reflected in high correlations between satisfaction scores of attributes and products, suggesting a generalization of satisfaction. In the latter case an additive or averaging aggregation across products would not entail too great an information loss. It would also have important policy consequences. We did not perform a rigorous statistical test on the significance of the simple linear correlation coefficients of satisfaction ratings based on raw scores. Simple inspection sufficed to reject the hypothesis of a generalization of consumer satisfaction. Most correlations had a positive sign; but most of them were also near zero. Due to the restrictive assumptions of the raw scores, such a result may well have been a statistical artifact.
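
The inspection described here can be pictured as computing all pairwise linear correlations between satisfaction scales and looking at their average. The sketch below does this for a few invented ratings; the data, the scale names, and the helper names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Simple linear (Pearson) correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mean_intercorrelation(columns):
    """Average correlation over all distinct pairs of scales."""
    pairs = list(combinations(columns, 2))
    return sum(pearson(x, y) for x, y in pairs) / len(pairs)


# Hypothetical raw scores (1..7) of six respondents on three scales.
ratings = {
    "cereal": [1, 2, 2, 3, 5, 4],
    "luncheon_meat": [2, 2, 3, 3, 4, 6],
    "clothing": [3, 1, 4, 2, 2, 3],
}

avg_r = mean_intercorrelation(list(ratings.values()))
print(f"mean intercorrelation: {avg_r:.3f}")
```

An average near zero would point toward independence of the scales, an average near one toward generalized satisfaction.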

To investigate this possibility, a set of optimal scores was derived by application of Multivariate Analysis of Contingencies III (MAC III). [James Lingoes, "The Multivariate Analysis of Qualitative Data," Multivariate Behavioral Research, January, 1968, Vol. 3, No. 1, pp. 61-94.] This involved a rescaling of the raw scores in such a fashion as to maximize the (new) coefficients of correlation. Thus a maximum degree of homogeneity (= average correlation) will be achieved.
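
The idea of rescaling categories to raise a correlation can be illustrated with a much simpler stand-in than MAC III, which is not reproduced here. With a single criterion variable, replacing each category code by the mean of the criterion among respondents in that category gives the category scoring with the highest attainable linear correlation with that criterion. The data and names below are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def pearson(x, y):
    """Simple linear (Pearson) correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def category_means(categories, criterion):
    """Mean of the criterion within each response category."""
    totals, counts = defaultdict(float), defaultdict(int)
    for category, value in zip(categories, criterion):
        totals[category] += value
        counts[category] += 1
    return {category: totals[category] / counts[category] for category in totals}


# Hypothetical data: raw category (1..5) on one scale, and a criterion score.
x = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
y = [2, 3, 2, 2, 4, 5, 4, 4, 3, 3]

new_scores = category_means(x, y)      # one new score per category
rescaled = [new_scores[c] for c in x]  # respondents re-scored

r_raw = pearson(x, y)
r_rescaled = pearson(rescaled, y)
print(f"raw: {r_raw:.3f}  rescaled: {r_rescaled:.3f}")
```

Nothing in this procedure keeps the new scores in the original category order; in the invented data the score for category 2 falls below that for category 1.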

TABLE 1

OPTIMAL WEIGHTED SCORES (BASED ON MAC III FOR 25 SATISFACTION SCALES, 5 CATEGORIES, FEMALE RESPONDENTS)

MAC III does not guarantee a monotonic rescaling. However, in the absence of a more complex theory of interdependence of product and attribute satisfaction involving non-linear relations, a non-monotonic rescaling of raw scores appears void of meaning. In other words, it may be difficult to interpret scores on a scale ranging from "very satisfied" to "not at all satisfied" (such as, -101, 12, -23, 8, 61, 20, 22). On the other hand the imposition of a monotonicity constraint on the new scores results in a lower degree of homogeneity. [n2 measures the square of the average correlation of the scaled variable with all other variables in the system. It is used as a measure of homogeneity.] The degree of homogeneity or average intercorrelation achieved by MAC III scores depends on sample size, number of categories, and number of individuals falling in each category.

The original seven categories proved inconvenient: A rather small number of individuals fell into raw-score categories 6 and 7 (very low satisfaction scores). This tended to distort the scales; and it led to non-monotonic scales. Categories 5, 6, and 7 were therefore collapsed into one category. The 5-point scale proved quite appropriate.
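
Collapsing the sparse upper categories is a one-line recoding. The sketch below uses invented responses; the function name and the `cutoff` parameter are illustrative.

```python
from collections import Counter


def collapse(raw_score, cutoff=5):
    """Merge every raw score at or above `cutoff` into the cutoff category."""
    return min(raw_score, cutoff)


# Hypothetical raw scores with only a few responses in categories 6 and 7.
raw = [1, 2, 2, 3, 4, 5, 6, 7, 7]
collapsed = [collapse(score) for score in raw]

print(Counter(raw))        # sparse counts in 6 and 7
print(Counter(collapsed))  # pooled into category 5
```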

Of the 47 satisfaction scales 28 are perfectly monotonic and the others show only minor deviations. The degree of homogeneity could be improved; but the average intercorrelations are still not very high.

Table 1 shows the optimal MAC III scores for a system of 25 satisfaction responses of females on (1) standard of living, (2) breakfast cereals plus five attributes, (3) luncheon meat plus five attributes, (4) women's clothing plus seven attributes, (5) food, (6) clothing in general, and (7) appliances. Inspection of the rightmost column (n2) shows that the system is not homogeneous. The highest average correlation (n) can be observed for clothing in general; even this amount is less than .4 (n2 ≤ .131).

A much higher degree of homogeneity can be achieved by analyzing a smaller system of more closely related variables. Table 2 shows, for example, the results of the analysis of satisfaction with luncheon meat and its attributes. Higher values of n2 can be noted therefrom.

TABLE 2

MAC III SCORES OF SATISFACTION WITH LUNCHEON MEAT (5 ATTRIBUTES; FEMALE RESPONDENTS)

However, even in this system the highest average correlation n is less than .6 ("taste"). With the exception of "availability" all variables have strictly monotonic scores.

All MAC III scores are generated in such a fashion as to have a mean of zero and equal variance. This has a distinct advantage in comparing relative well-being when absolute comparisons between scales are difficult. For example, being "very satisfied with the price of luncheon meat" is indicated by a score of 109, whereas being "very satisfied with packaging" is denoted by a score of 63. The higher value results from a relatively small number of respondents in the first category, and from the relatively low mean satisfaction with price of luncheon meat (4.39 in terms of raw scores, as compared to a mean raw score of satisfaction with packaging of 2.63). The person who is "very satisfied" with "price" generally may be better off than the person who is "very satisfied with packaging." The higher score of 109 for "satisfaction with price" would thus be justified. MAC III scales of the first four variables are rather similar. The "availability" scale reflects a rather high level of satisfaction. This is evident from the large number of negative signs; a response in the second category in terms of this population means relatively low satisfaction. This scale is slightly nonlinear: Category 5 has a higher score than category 4.

Table 3 exhibits MAC III scores of male respondents for more general satisfactions. The four variables under analysis are responses to the following questions: (1) "Generally speaking, how satisfied are you with the quality of all the products you buy?"; (2) "Compared to 5 years ago, would you say the quality of all the products you and your family buy is better now, worse, or about the same?"; (3) "What about the quality of service you get in the stores where you and your husband shop, is it better now than five years ago, about the same, or worse?"; (4) "What about getting things repaired properly; is it easier to have things repaired now than it was five years ago, harder, or about the same?" While variable 1 was measured on a 7-point scale (collapsed to 5 categories), variables 2-4 contained an implicit satisfaction rating; it was assumed that the response "better now" or "easier now" corresponded to very satisfied (category 1), "same" to category 3, and "worse now" or "harder now" to category 5. Obviously no MAC scores were derived for categories 2 and 4, since no responses fall into these categories.

TABLE 3

OPTIMAL WEIGHTED SCORES (MAC III) FOR GENERAL SATISFACTION, 342 MALE RESPONDENTS

The low level of n2 indicates a high degree of heterogeneity in the system. Scale (1), moreover, exhibits slight nonlinearity. Responses to variables 2-4 were concentrated on "same" and "worse"; the fact that the score for the "same" category is above average (0) reflects this phenomenon.

Optimal weighted scores for importance responses of product attributes were derived. Since attributes had been selected for inclusion in the questionnaire on the basis of their salience, there occurred very few responses in categories 5-7 (not important). Moreover, the mode of all importance scores was in category 1 ("very important"). MAC III scores (not shown in this paper) proved not very useful, since most scales were non-monotonic with the raw scores.

3. Optimal Monotonic Scores

The definite disadvantage accompanying the use of optimal weighted (MAC III) scores is their potential lack of monotonicity to the implied order of the raw scores. In order to avoid this, a rescaling of raw scores has to incorporate the constraint of maintaining the original order.

The Conjoint Measurement III (CM III) algorithm (Lingoes, 1968) provides for such a rescaling: While the original order is retained, the homogeneity (or average linear correlation) between variables is maximized. Generally, the degree of homogeneity achieved by the use of CM III scores will be less than that of the MAC III scores. In particular, if large digressions from the original order occur in MAC III scores, the retention of the original order will involve an opportunity loss in homogeneity.
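
The paper does not spell out the CM III algorithm. As a rough picture of what an order constraint does to unconstrained category scores, the sketch below uses a different and simpler technique, weighted isotonic regression by the pool-adjacent-violators method: categories whose scores are out of order are pooled to their frequency-weighted mean. It is not CM III, and the scores and counts are invented.

```python
def pool_adjacent_violators(values, weights):
    """Weighted least-squares non-decreasing fit to `values`."""
    # Each block holds [weighted mean, total weight, number of categories].
    blocks = []
    for value, weight in zip(values, weights):
        blocks.append([value, weight, 1])
        # Merge backwards while the order constraint is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            v2, w2, n2 = blocks.pop()
            v1, w1, n1 = blocks.pop()
            total = w1 + w2
            blocks.append([(v1 * w1 + v2 * w2) / total, total, n1 + n2])
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


# Hypothetical unconstrained scores for categories 1..5,
# with the number of respondents in each category.
unconstrained = [-1.2, -0.3, -0.5, 0.9, 0.6]
counts = [40, 30, 10, 15, 5]

monotone = pool_adjacent_violators(unconstrained, counts)
print(monotone)
```

The pooled scores respect the original order but fit the unconstrained solution less closely, which is the kind of trade-off the opportunity loss refers to.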

The experience with MAC III scales, viz. the low degree of homogeneity achieved by combining rather diverse goods or concepts, suggested that satisfaction does not seem to generalize. Scales were, therefore, jointly developed for closely related sets of measures such as a product and its attributes.

Like MAC III scores, CM III scores depend on sample size and frequency distribution. The attribute variable scores are, however, standardized. This implies that the mean equals zero; and the standard deviation equals one. Since monotonicity is no problem, the original seven categories were retained for the raw score input. The product variable is rescaled to a zero mean as well.
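
Standardization of category scores depends on how many respondents fall into each category. The sketch below rescales a set of ordered category scores so that the respondent-level mean is zero and the standard deviation is one. The frequency distribution is invented (it sums to 342 only to echo the sample size), and the use of the population standard deviation is an assumption.

```python
from math import sqrt


def standardize_category_scores(scores, counts):
    """Rescale category scores to respondent-level mean 0 and std. dev. 1."""
    n = sum(counts)
    mean = sum(s * c for s, c in zip(scores, counts)) / n
    variance = sum(c * (s - mean) ** 2 for s, c in zip(scores, counts)) / n
    sd = sqrt(variance)
    return [(s - mean) / sd for s in scores]


# Hypothetical distribution over seven ordered categories,
# with most respondents near the "satisfied" end.
categories = [1, 2, 3, 4, 5, 6, 7]
respondents = [60, 90, 80, 50, 30, 20, 12]

z = standardize_category_scores(categories, respondents)
print([round(value, 2) for value in z])
```

Because few respondents sit in category 7, its standardized score lies much farther from zero than the score for category 1.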

The product "luncheon meat" and five attributes (packaging, taste, nutritional value, availability, price) were analyzed accordingly.

Table 4 summarizes some of the results of this analysis. The left-hand side of the table shows the CM III scores for "luncheon meat" and 5 attributes; the scores run from negative figures ("very satisfied") to positive figures ("very dissatisfied").

TABLE 4

OPTIMAL MONOTONE (CM III) SATISFACTION SCORES OF LUNCHEON MEAT AND 5 ATTRIBUTES

Inversion of the signs would result in a "satisfaction" rather than the present "dissatisfaction" scale. The larger number of positive values, and their higher absolute values, indicate a positively skewed distribution: Most consumers reported being fairly satisfied. The attribute "price," for example, shows a higher concentration in negative scores; this indicates that fewer people were "very satisfied" with price. The mean raw score of satisfaction with luncheon meat was 2.49, while the mean raw score of satisfaction with price of luncheon meat was 4.05. This denotes a higher level of dissatisfaction. Since the means of all scores are zero, these scores are useful in investigating relative distributions of satisfaction, rather than absolute levels.

Generally, CM scores of adjacent categories will be closer if a relatively large number of respondents fall into these categories. If only few respondents fall into an extreme-value category, the extreme values will differ by a large amount from the next value--and the mean: The scale has to discriminate between rather few consumers and the rest of the consumer group. This implies that a move to extreme values (or from extreme values) may be much harder than between other levels. The interval between adjoining categories can, therefore, be interpreted as a measure of the perceived distance between these response levels. With raw scores all intervals were of the same size. This may be an inadequate reflection of the true perception of the meaning of different degrees of satisfaction.

The right-hand side of Table 4 shows the correlation coefficients between the product variable measured in raw scores (y) and each attribute variable measured in raw scores (xj, j=1, ...,5), in the column ryxj. The average of these coefficients (.5297) is reported in the bottom line ("average"). The column headed by ry*xj* reports the corresponding correlation between satisfaction measured in terms of CM III scores. While some correlations are smaller than corresponding raw score correlations, an overall improvement in homogeneity is reflected in the somewhat higher average of .5306.

Both the small increase in homogeneity and the fact that the new scales almost exclusively exhibit little unequal stretching or shrinking of intervals suggest that in the instance of luncheon meat CM III scores, in fact, did not provide a great improvement in homogeneity. The almost equal intervals do not provide a great increase in sensitivity.

4. A Comparison of Scoring Systems

Raw scores have the advantage of allowing absolute comparisons between attributes or products. However, optimal weighted scores and optimal monotone scores provide higher homogeneity of measures and higher sensitivity to extreme responses.

It is questionable whether reporting being "very satisfied" (raw score 1) with a car and with a breakfast cereal really means the same thing, and whether they should therefore be denoted by the same score. MAC III or CM III will generally render different scores for the two scales. This results from the different frequency distributions for the two products. Both will yield a mean score of zero and the same standard deviation for all scales. This implies that the mean satisfaction with luncheon meat (2.98 in terms of raw scores) will be identically zero, whether measured in MAC III or CM III scores.

A satisfaction scale with an anchored mean may, however, be very useful for policy making. At present we have no generally accepted satisfaction scale such as the Fahrenheit scale for temperature, or a Murk Index for air pollution. Therefore, we can measure the present state of satisfaction and associate with it an average score of zero. Future deviations will be reflected in positive or negative averages. Similarly one can make comparisons at a given point in time between different subpopulations. It could be argued that the same may be achieved by using raw scores, by measuring their deviation from the mean. Such a measure recommends itself on the grounds of simplicity. But the drawback of such a system is the lower homogeneity of scales. If we were to contemplate a reduction in the number of products or attributes to be included in a survey, then we would have to choose one or a few out of a group of very homogeneous products or attributes.
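
The simple raw-score variant mentioned here, measuring deviations from a base-period mean, can be sketched as follows. The two sets of responses are invented, and dividing by the base-period standard deviation is an added assumption to keep the units comparable.

```python
from statistics import mean, pstdev


def anchored_scores(scores, base_scores):
    """Express scores as deviations from the base-period mean, in base-period SD units."""
    base_mean = mean(base_scores)
    base_sd = pstdev(base_scores)
    return [(score - base_mean) / base_sd for score in scores]


# Hypothetical raw scores (1 = very satisfied .. 7 = not at all satisfied).
base_period = [1, 2, 2, 3, 3, 4, 6]
later_period = [2, 3, 3, 4, 4, 5, 6]

base_avg = mean(anchored_scores(base_period, base_period))
later_avg = mean(anchored_scores(later_period, base_period))
print(f"base: {base_avg:.3f}  later: {later_avg:.3f}")
```

With raw scores oriented so that higher numbers mean less satisfaction, a positive later average signals a decline in satisfaction relative to the anchor.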

5. Prediction and Aggregation of Satisfaction Across Attributes

"Mixed feelings" about a product are not uncommon. While some attribute appears uncommonly pleasing, another one may delight us not at all. An overall reaction to the item can still be elicited, which in all probability reflects and aggregates these "mixed feelings."

The simplest approach to arrive at an aggregate satisfaction measure would be to sum satisfaction scores across all attributes. Or, in order to facilitate comparison between different products with a different number of attributes, an average across attributes may be more appropriate. One might even include the general satisfaction score in the averaging process.

Under the assumption of a simple additive relationship the average ought to be close to the general product satisfaction score for any one individual. The simple averaging process assumes approximately equal importance of all attributes. Any scoring system may be used to derive such an average.
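The simple averaging described above can be sketched as follows. This is a minimal illustration only: the respondent's scores are hypothetical (not taken from the survey), and the function name `average_satisfaction` is an assumption introduced here. Raw scores are coded as in the study, with 1 standing for "very satisfied."

```python
from statistics import mean

# Hypothetical raw scores for one respondent (1 = very satisfied, 7 = very dissatisfied).
# Attribute names follow those discussed for luncheon meat; the values are invented.
attribute_scores = {
    "taste": 2,
    "price": 4,
    "packaging": 3,
    "availability": 2,
    "nutritional value": 4,
}
overall_score = 2  # hypothetical stated overall product satisfaction


def average_satisfaction(attributes, overall=None):
    """Unweighted mean of attribute scores, optionally including the overall score."""
    values = list(attributes.values())
    if overall is not None:
        values.append(overall)
    return mean(values)


attributes_only = average_satisfaction(attribute_scores)
with_overall = average_satisfaction(attribute_scores, overall_score)

print("average across attributes:", attributes_only)
print("average including overall score:", round(with_overall, 3))
print("stated overall score:", overall_score)
```

Including the overall score pulls the average toward the stated response, which is why it acts as a correcting factor when the two are compared.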

In Table 5, aggregate scores of 3 (out of 342) female respondents are shown for luncheon meat. They are based on satisfaction scores for the product as well as all attributes. The numbers in parentheses are the overall product satisfaction response scores. On the right-hand side of the table the raw score responses are shown for luncheon meat and its attributes for the three respondents. As can be seen, the attribute scores are either equal to or below the product scores (underlined), which results in averages that indicate lower satisfaction than the general response does. This is, however, only typical for these three respondents. The second respondent in particular seemed inconsistent, reporting to be very satisfied with the product, but far less so with each and every one of the attributes.

A sample of average satisfaction scores for three products (luncheon meat, breakfast cereals, women's clothing), as well as averages across satisfaction with "all products," "all foods," "all clothes," and "all appliances," and the average across the previous four averages, is depicted in Table 6. Satisfaction is measured in terms of MAC III scores. The mean of each set of aggregate scores across the entire sample is zero.

TABLE 5

AVERAGE SATISFACTION WITH LUNCHEON MEAT MEASURED IN THREE DIFFERENT SCORING SYSTEMS, FOR 3 RESPONDENTS

From the very few examples in Table 5 we saw that average satisfaction sometimes differs very much from stated overall satisfaction; this was the case even though, as a correcting factor, the reported product satisfaction score was included in the averaging process. Had this not been the case, deviations would have been even larger. In other words, a simple average does not necessarily appear to be the best predictor of overall product satisfaction. The cause of this shortcoming may lie in the differential impact that different attributes may have on the evaluation of the product as a whole. One way of recognizing this differential impact may be the computation of a weighted average.

An appropriate way of inferring implicit importance weights for different attributes consists in estimating regression coefficients for a model explaining variations in overall product satisfaction through variations in attribute satisfaction. Satisfaction scores in a regression model may again be measured in any one of the three scoring systems discussed above.
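A minimal sketch of this idea is given below. It is not the procedure used in the study (which relied on stepwise multiple regression of survey responses); it fits an ordinary least-squares model with NumPy on simulated data. The attribute scores, the weights used to simulate them, and the noise level are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

names = ["taste", "price", "packaging"]
n = 200

# Simulated (hypothetical) attribute satisfaction scores, zero mean by construction.
X = rng.normal(size=(n, len(names)))

# Assumed weights used only to generate the illustrative overall scores.
assumed_weights = np.array([0.6, 0.3, 0.1])
y = X @ assumed_weights + rng.normal(scale=0.5, size=n)

# Ordinary least squares: overall satisfaction regressed on attribute satisfaction.
A = np.column_stack([np.ones(n), X])  # intercept column plus attributes
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
intercept, weights = coef[0], coef[1:]

# Multiple correlation coefficient R: correlation of observed and fitted values.
y_hat = A @ coef
R = np.corrcoef(y, y_hat)[0, 1]

# Standardized coefficients allow the attributes to be ranked by implied importance.
beta_std = weights * X.std(axis=0, ddof=1) / y.std(ddof=1)
ranking = [names[i] for i in np.argsort(-np.abs(beta_std))]

for name, w, b in zip(names, weights, beta_std):
    print(f"{name:10s} coef={w: .3f}  standardized={b: .3f}")
print("R =", round(float(R), 3))
print("rank order of implied importance:", ranking)
```

The estimated coefficients play the role of implicit importance weights; the same code applies to any of the three scoring systems once the scores have been computed.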

Regressions of product satisfaction on attribute satisfactions were performed for luncheon meat using all three scoring systems (raw, MAC III and CM III scores). The summary results are shown in Table 7.

Regression coefficients are reported in the columns headed "coef.", their standard errors in the columns headed "sb." One asterisk in the columns "sig." indicates that the coefficient is significant at the 5 percent level; two asterisks indicate significance at the 1 percent level. R denotes the multiple correlation coefficient, syx the standard error of the estimate, and "sequence" the order in which variables were entered as explanators in the stepwise multiple regression, which also coincides with the rank order of the standardized regression coefficients. The poorest predictive power is exhibited by raw scores as a consequence of the lower homogeneity of the system. MAC III scores are better predictors than CM III scores. However, MAC III scores are not fully comparable with the others since they were based on only 5 categories, as compared to 7 in the other scores. It should be noted that a higher degree of homogeneity does not necessarily entail a more accurate prediction.

TABLE 6

AVERAGE SATISFACTION SCORE PROFILES OF 10 RESPONDENTS

TABLE 7

SUMMARY RESULT OF REGRESSION OF PRODUCT SATISFACTION ON ATTRIBUTE SATISFACTION FOR LUNCHEON MEAT

The numerical impact of "taste" appears the largest in all scoring systems.

"Taste" and "price" are significant for CM III and MAC III scores; "availability" appears significant in CM III scores, and "nutritional value" in MAC III scores. In raw score measurements "packaging," "taste" and "nutritional value" appear significant.

In the absence of a more convincing theory we assume that product satisfaction depends in a linear additive fashion on attribute satisfaction. If this is the case, regression results also provide a possible clue as to which attribute is the most instrumental in determining overall satisfaction with the product. They may also provide suggestions as to which attributes have to be given a different "image" so as to change the consumer's attitude. This aspect is important in a policy-making context. In this case regression is used not only as a prediction technique but also for the purpose of control.

This technique is not only useful in explaining product satisfaction in terms of attribute satisfaction. It can analogously be employed to determine the weight of individual products in the aggregate satisfaction of product groups. A rough attempt towards this end was made by trying to explain satisfaction with the quality of all products in terms of satisfaction with breakfast cereals, luncheon meats, women's clothing, food, all clothing and appliances. Obviously the choice of independent variables is not satisfactory, as other products influence overall satisfaction which were not included in the pilot survey.

The regression results are reported in Table 8.

TABLE 8

SUMMARY RESULT OF REGRESSION OF SATISFACTION WITH ALL PRODUCTS ON 6 PRODUCT GROUPS

The regression was based on raw scores. The predictive accuracy, reflected in a coefficient of multiple correlation of only .555, is not very high. This is hardly surprising since "all products" are not composed only of the few groups represented by the independent variables. The most significant predictor appears to be "clothing."

If the model of satisfaction composition is, in fact, correct, and respondents report their importance weighting of attributes truthfully, no significant deviations in the rank order of the mean importance weights and the standardized regression coefficients should be observable.
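One way to check this expectation is a rank-order correlation between the stated mean importance weights and the standardized regression coefficients. The sketch below uses Spearman's coefficient, which is a choice made here for illustration and not a method reported in the paper; all numbers are hypothetical, and ties are assumed not to occur.

```python
# Hypothetical mean importance weights as stated by respondents.
stated_importance = {
    "taste": 0.30,
    "price": 0.25,
    "nutritional value": 0.20,
    "packaging": 0.15,
    "availability": 0.10,
}

# Hypothetical standardized regression coefficients for the same attributes.
standardized_coef = {
    "taste": 0.45,
    "price": 0.20,
    "nutritional value": 0.25,
    "packaging": 0.05,
    "availability": 0.10,
}


def ranks(values):
    """Rank values from largest (rank 1) to smallest; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(x, y):
    """Spearman rank correlation for untied data: 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


attributes = list(stated_importance)
rho = spearman_rho(
    [stated_importance[a] for a in attributes],
    [standardized_coef[a] for a in attributes],
)
print("rank-order agreement (Spearman rho):", rho)
```

A coefficient close to 1 would indicate that the two rank orders agree; a low or negative value would point to deviations of the kind discussed above.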

6. Differences in Consumer Satisfaction Among Occupational and Racial Groups

Are there significant differences in consumer satisfaction between different socio-economic groups? To answer this question an analysis of variance was performed on MAC III scores for the effects of occupational status (A), as expressed through the differences between white-collar (A1) and blue-collar (A2) workers, and for race (B), distinguishing between whites (B1) and blacks (B2).

The following patterns emerged:

1. Whites (B1) are significantly more dissatisfied with market goods than blacks (significance level of α = .01)

2. White-collar workers are significantly more dissatisfied with market goods than blue-collar workers (α = .05)

Turning to specific products, we note that:

3. Whites are significantly more dissatisfied than blacks with clothing (α = .001)

4. Whites are significantly more dissatisfied than blacks with breakfast cereal (α = .05)

These results are quite striking, particularly since they appear to be counterintuitive at first sight. Evidently, the higher socioeconomic strata are more dissatisfied because they are also more aware of the range of issues and possibilities associated with the market. Literacy and general exposure to mass media may generate a predisposition to dissatisfaction. A more general expression of this phenomenon was observed by Strumpel for the same population: A different level of attainment in terms of a hierarchy of goals or needs seems to be evidenced by responses of different professional and racial groups, suggesting that blacks and blue-collar workers tend to be more concerned -- and more satisfied -- with the attainment of material goods. [Burkhardt Strumpel, "Economic Life Styles, Values, and Subjective Welfare - An Empirical Approach," paper presented at the joint meeting of the American Economic Association and the Association for the Study of the Grants Economy, New Orleans, December, 1971; in Eleanor Sheldon (ed.) Understanding Economic Behavior, New York: Lippincott, 1973.] These results, it should not be forgotten, were obtained from a specific subpopulation: young families with employed family heads. Very different patterns may, in fact, be observed among the poor, the aged, unrelated individuals, or the unemployed.

TABLE 9

SUMMARY RESULTS OF THE ANALYSIS OF VARIANCE OF SATISFACTION RATINGS OF DIFFERENT PROFESSIONAL AND RACIAL GROUPS. (BASED ON MAC III SCORES)

The analysis of variance results add further validity to the use of MAC III scoring: While the raw scores did exhibit patterns of variance which went in the same direction as the MAC III scores, their lower power of discrimination resulted in differences that were not significant. (A test for homogeneity of variance revealed that the F-test is appropriate to these data.)
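The mechanics of a two-factor analysis of variance of this kind can be sketched as follows. The sketch assumes a balanced 2 x 2 design with an interaction term and uses invented scores; the actual design, cell sizes and data of the study are not reproduced here. The labels A1, A2, B1 and B2 follow the notation used above.

```python
from statistics import mean

levels_a = ["A1", "A2"]  # occupational status
levels_b = ["B1", "B2"]  # race

# Hypothetical satisfaction scores, 4 respondents per cell (balanced design).
cells = {
    ("A1", "B1"): [0.4, 0.6, 0.5, 0.7],
    ("A1", "B2"): [0.1, 0.2, 0.0, 0.3],
    ("A2", "B1"): [0.1, 0.3, 0.2, 0.0],
    ("A2", "B2"): [-0.3, -0.1, -0.2, -0.4],
}
n = 4  # observations per cell

all_scores = [x for scores in cells.values() for x in scores]
grand = mean(all_scores)
cell_mean = {key: mean(scores) for key, scores in cells.items()}
mean_a = {a: mean([x for b in levels_b for x in cells[(a, b)]]) for a in levels_a}
mean_b = {b: mean([x for a in levels_a for x in cells[(a, b)]]) for b in levels_b}

# Sums of squares for the two main effects, the interaction and the residual.
ss_a = len(levels_b) * n * sum((mean_a[a] - grand) ** 2 for a in levels_a)
ss_b = len(levels_a) * n * sum((mean_b[b] - grand) ** 2 for b in levels_b)
ss_ab = n * sum(
    (cell_mean[(a, b)] - mean_a[a] - mean_b[b] + grand) ** 2
    for a in levels_a
    for b in levels_b
)
ss_within = sum((x - cell_mean[key]) ** 2 for key, scores in cells.items() for x in scores)
ss_total = sum((x - grand) ** 2 for x in all_scores)

df_a = len(levels_a) - 1
df_b = len(levels_b) - 1
df_within = len(levels_a) * len(levels_b) * (n - 1)
ms_within = ss_within / df_within

f_a = (ss_a / df_a) / ms_within
f_b = (ss_b / df_b) / ms_within
f_ab = (ss_ab / (df_a * df_b)) / ms_within

print("F for occupational status (A):", round(f_a, 2))
print("F for race (B):", round(f_b, 2))
print("F for interaction (A x B):", round(f_ab, 2))
```

Each F ratio would then be compared with the tabulated critical value for its degrees of freedom at the chosen significance level.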

Correlation analyses between consumer satisfaction scores and general personality, self-efficiency/fate control, educational level, and other socio-economic characteristics were carried out. A detailed description goes beyond the scope of this paper. On the whole, the ordered external variables (i.e., variables measured on an ordinal scale) and the unordered external variables (i.e., variables expressed on a nominal scale) showed very weak patterns of association with mean consumer satisfaction scores. There may be, however, significant differences among subgroups defined on the basis of these variables. In any case, our results indicate that consumer satisfaction is relatively independent of attitudinal and general personality factors. This result strengthens the meaningfulness of these indices. We can now be more sure that we are not measuring general optimism or pessimism, or other general traits, when we pose questions on satisfaction with market goods.

V. CONCLUSION

1. A Single Aggregate Measure or a Profile of Consumer Satisfaction?

The discussion of aggregation and prediction showed that even within a product, using scores that maximize homogeneity, no perfect prediction could be achieved. This would indicate that, while a one-figure measure is very compact, it entails by necessity a great deal of information loss. An aggregate figure remaining constant from one time period to the next may conceal opposite movements of components. This paper did not deal at all with a possible use of this index as an indicator of consumer riots or boycotts. Extreme dissatisfaction with a product or product group may result in rather drastic action on the part of many consumers. Greater familiarity with a measure of subjective well-being may encompass the knowledge of a "critical value" or a "red zone". These are ranges of very low satisfaction scores that are indicative of imminent trouble. An aggregative measure may balance such "critical values" or trouble indicators by improvements in other unrelated areas.

This shortcoming may be circumvented by the use of a profile of consumer satisfaction measures. A profile involves a set of measures rather than a single value. For each group of products, a separate index would be computed.

A profile may be of further use in the composition of separate aggregative indices for different socio-economic groups, since it is likely that distinctly different groups would not attach the same relative importance weights to different products or product groups. Different weights in the averaging process may well be indicators of changes that favor one group at the expense of another group.
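The point that a constant aggregate may conceal opposite movements of its components, while a profile does not, can be illustrated with a small sketch. The product groups, weights and mean scores below are hypothetical.

```python
# Hypothetical importance weights and mean raw scores (1 = very satisfied).
weights = {"food": 0.5, "clothing": 0.5}
base_period = {"food": 3.0, "clothing": 3.0}
later_period = {"food": 2.0, "clothing": 4.0}


def aggregate(scores, weights):
    """Single weighted figure across product groups."""
    return sum(weights[group] * scores[group] for group in weights)


def profile_change(base, later):
    """Change per product group, kept separate rather than averaged."""
    return {group: later[group] - base[group] for group in base}


aggregate_base = aggregate(base_period, weights)
aggregate_later = aggregate(later_period, weights)
changes = profile_change(base_period, later_period)

print("aggregate, base period: ", aggregate_base)
print("aggregate, later period:", aggregate_later)
print("profile of changes:     ", changes)
```

In this constructed case the single figure is unchanged, whereas the profile shows satisfaction with one group improving and with the other deteriorating.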

2. Aspects of Longitudinal Studies of the Index of Consumer Satisfaction

Longitudinal studies measure changes in a phenomenon over time in absolute or relative terms. The Consumer Price Index measures percentage changes in the price level in terms of relative deviation from the level which was set equal to 100 in the base period.

This study investigated three different measurement scales of satisfaction. Each one would have different advantages and disadvantages for use in longitudinal studies.

Raw scores allow comparison in absolute levels of satisfaction between products. They are, however, not very sensitive to extreme values, and thus to change.

An index using this scoring system (or any similar one, say, using scores 7, 6, 5, 4, 3, 2, and 1 instead of the reverse order applied in the present study) would lead to a single value measure of the form:

(4)    $ICS_t = \dfrac{\sum_i W_i S_{it}}{\sum_i W_i S_{i0}} \times 100$

where $S_{i0}$ and $S_{it}$ are the mean satisfaction scores across all individuals of the ith product or product group at the base period and time t, respectively; $W_i$ are the importance weights. Using the limited number of product groups investigated in this study and a more or less arbitrary set of weights for illustrative purposes:

TABLE

The denominator of this index would amount to: (.3)(2.95) + (.38)(3.26) + (.02)(2.16) + (.1)(2.49) + (.2)(3.00) = 3.016.

Because in these raw scores 1 stands for "very satisfied," an increase in satisfaction would be reflected in a smaller number. If, for example, satisfaction had fallen at time t for the same indicators to, say, a weighted mean of 3.2, the index would be ICS_t = 3.2/3.016 x 100, or approximately 106.1.
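The arithmetic of this illustration can be retraced in a few lines. The weights and base-period means are those given in the text; the period-t means are an assumption introduced here (a uniform shift of every product group) chosen only so that their weighted mean equals the 3.2 of the example. The function names are likewise illustrative.

```python
# Illustrative weights and base-period mean raw scores from the text.
weights = [0.30, 0.38, 0.02, 0.10, 0.20]
base_means = [2.95, 3.26, 2.16, 2.49, 3.00]


def weighted_mean(weights, scores):
    """Sum of importance weight times mean score over all product groups."""
    return sum(w * s for w, s in zip(weights, scores))


def consumer_satisfaction_index(weights, base, current):
    """Ratio of weighted mean at time t to weighted mean at base period, times 100."""
    return weighted_mean(weights, current) / weighted_mean(weights, base) * 100


denominator = weighted_mean(weights, base_means)

# Assumption: every group's mean rises by the same amount, so that the
# weighted mean at time t is 3.2 (the weights sum to one).
shift = 3.2 - denominator
current_means = [s + shift for s in base_means]

index_t = consumer_satisfaction_index(weights, base_means, current_means)

print("denominator:", round(denominator, 3))
print("index at time t:", round(index_t, 1))
```

Because 1 stands for "very satisfied," a value above 100 on this raw-score index indicates a decline in satisfaction relative to the base period.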

The actual number of categories and the size of the intervals between adjoining points may have to be chosen differently, so as to provide a sufficiently sensitive indicator. Until at least two different amounts have been computed, it will not be known how sensitive a certain scoring system is.

Using optimal weighted (MAC III) scores or optimal monotone (CM III) scores results in a slightly different index. These two scoring systems will not reflect different absolute levels of satisfaction between variables, since each item would have a different scoring system. However, they would discriminate better for extreme values, particularly if the variables have unimodal distributions. Since each variable has a zero mean, the weighted average satisfaction of the base period would be a weighted sum of zeros, which amounts to zero. Evidently a ratio of new to old mean satisfaction cannot be computed in this case. An index of the form:

(5)    EQUATION

would have to be used. (The symbols have the same meaning as above, only Si are measured in terms of MAC III or CM III scores). In both time periods the same scores would be used. In other words, once a scoring system has been computed in the base period it is retained for subsequent periods.

The disadvantage of such a system is that it cannot be compared to the Consumer Price Index as readily as the 100-based system. A remedy can be found in changing all scoring systems by adding a constant to all scores, e.g., let Si* = Si + 100, then the interval sizes are retained and an index of the following form results:

(6)    EQUATION
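The effect of adding a constant can be sketched numerically. The sketch assumes that the shifted scores are inserted into the same ratio form as the raw-score index (weighted mean at time t over weighted mean at the base period, times 100); this is an interpretation, not a form confirmed by the text. The weights are the illustrative ones used earlier, and the period-t means are hypothetical.

```python
SHIFT = 100.0  # constant added to every score, as in Si* = Si + 100

weights = [0.30, 0.38, 0.02, 0.10, 0.20]          # illustrative weights from the text
base_means = [0.0, 0.0, 0.0, 0.0, 0.0]            # zero mean per variable in the base period
current_means = [0.20, 0.10, 0.00, -0.10, 0.30]   # hypothetical mean scores at time t


def shifted_index(weights, base, current, shift=SHIFT):
    """Assumed ratio form applied to shifted scores S* = S + shift."""
    numerator = sum(w * (s + shift) for w, s in zip(weights, current))
    denominator = sum(w * (s + shift) for w, s in zip(weights, base))
    return numerator / denominator * 100


base_index = shifted_index(weights, base_means, base_means)
index_t = shifted_index(weights, base_means, current_means)

print("index in base period:", base_index)
print("index at time t:", round(index_t, 3))
```

With weights that sum to one, the base-period value is exactly 100, and the size of the added constant determines how strongly a given change in mean scores moves the index.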

CM III scores established for a base period could be retained for several years until a large shift towards more outlying (negative or positive) mean satisfaction scores has occurred, suggesting a changing group perception of the distances between adjoining points on a satisfaction scale. Recomputing the scores at that point would be desirable so as to provide, in a changed situation, a more sensitive instrument again. If the index were to be used, amongst other applications, as a crisis indicator, this sensitivity to extreme values would be of utmost importance.

The significance of changes in the index can be tested by F- or t-tests, as suggested by the results of the present study.

MAC III scores, for the reason suggested earlier, seem to be less desirable and useful in the construction of an Index of Consumer Satisfaction.

This study shows that in principle, an index of consumer satisfaction can be constructed, which will

(1) reflect relative changes in satisfaction;

(2) allow for an aggregation of satisfaction measures across

(a) attributes to products,

(b) products to product groups, and

(c) product groups to overall satisfaction;

(3) provide for sufficient sensitivity, to diagnose

(a) differences in satisfaction between groups and thus, by implication

(b) between time periods.

The usefulness of such an instrument for the formulation of public and business policy and as a market performance measure will, of course, increase with extended measurement across population and time, and with experience in the use of the scoring systems discussed.

REFERENCES

Bradburn, N. M. & Caplovitz, D. Reports on Happiness, Chicago, Ill.: Aldine, 1965.

Bradburn, N. M. The Structure of Psychological Well-Being, Chicago, Ill.: Aldine, 1969.

Guttman, L. A General Nonmetric Technique for Finding the Smallest Coordinate Space for a Configuration of Points. Psychometrika, 1968, 33, pp. 469-506.

Green, P. E. & Carmone, F. J. Multidimensional Scaling and Related Techniques in Marketing Analysis. Boston, Allyn and Bacon, Inc., 1970, XV + 203.

Lingoes, J. C. An IBM 7090 Program for Guttman-Lingoes Smallest Space Analysis-III. Behavioral Science, 1966, 11, pp. 76-76.

Lingoes, J. C. An IBM 360/67 Program for Guttman-Lingoes Conjoint Measurement-III. Behavioral Science, 1968, 13, pp. 421-22.

Lingoes, J. C. A General Nonparametric Model for Representing Objects and Attributes in a Joint Metric Space. In: Jardin, J-C., (ed.) Archeologie et Calculateurs, C.N.R.S., Paris, 1970, pp. 277-98.

Lingoes, J. C. Some Boundary Conditions for a Monotone Analysis for Symmetric Matrices. Psychometrika, 1971, 36, pp. 195-203.

Lingoes, J. C. The Guttman-Lingoes Nonmetric Program Series, 1972, (in press).

Lingoes, J. C. A General Survey of the Guttman-Lingoes Nonmetric Program Series. In Shepard, R., Romney, A. K. & Nerlove, S., (eds.), Multidimensional Scaling: Theory and Applications in the Behavioral Sciences, Seminar Press, 1972, (in press).

Lingoes, J. C. & Cooper, T. PEP-I: A Fortran IV (G) Program for Guttman-Lingoes Nonmetric Probability Clustering. Behavioral Science, 1971, 16, pp. 259-261.

Lingoes, J. C. & Guttman, L. Nonmetric Factor Analysis: A Rank Reducing Alternative to Linear Factor Analysis. Multivariate Behavioral Research, 1967, 2, pp. 485-505.

Lingoes, J. C. & Roskam, E. A Mathematical and Empirical Study of Two Multidimensional Scaling Algorithms. Michigan Mathematical Psychology Program, 1971, 1, pp. 1-169.

Pfaff, M. Theories of Market Systems: Implications for the Measurement of Market Performance. Philadelphia, Pa.: Marketing Science Institute Working Paper, April, 1968.

Pfaff, M. & Lingoes, J. C. Measurement of Subjective Welfare and Satisfaction. Paper presented at the 84th Annual Meeting of the American Economic Association jointly with the Association for the Study of the Grants Economy, New Orleans, December 27, 1971.

Pfaff, A. B. & Pfaff, M. Methods of Welfare Economics in the Measurement of Market Performance. Philadelphia, Pa.: Marketing Science Institute Working Paper P-51-7, May, 1969.

Pfaff, M. & Pfaff, A. Toward an Index of Consumer Satisfaction: The Synthesis of Benefit-Cost with Nonmetric Scaling Concepts as Basis for the Measurement of Market Performance. Philadelphia, Pa.: Marketing Science Institute Working Paper P-51-8, May, 1969.

Roskam, E., & Lingoes, J. C. MINISSA-I: A Fortran IV (G) Program for the Smallest Space Analysis of Square Symmetric Matrices. Behavioral Science, 1970, 15, pp. 204-205.

Smith, P. C., Kendall, L. M. & Hulin, C. L. The Measurement of Satisfaction in Work and Retirement: A Strategy for the Study of Attitudes, Chicago, Ill.: Rand McNally and Company, 1969.
