A Reliability Problem in the Measurement of Disconfirmation of Expectations

Ved Prakash, Florida International University
John W. Lounsbury, The University of Tennessee
ABSTRACT - This study analyzes the reliability of difference scores used for the computation of the disconfirmation of expectations. Data were collected in two stages, once for expectations and then for postpurchase evaluation, for two products--fast food hamburger restaurants and beer. Results showed that the disconfirmation of expectations measures based on difference scores have low reliability (.46 and .19), which may contribute to the low correlations with overall purchase satisfaction. Formulas for correction for attenuation and illustrative tables are presented which show how validity changes as a function of changes in the reliability of the disconfirmation of expectation and overall satisfaction measures. Recommendations are made for: the development of more reliable scales, continued examination of the quality of measuring devices used by marketing researchers, and inquiry into related measurement topics such as generalizability theory.
[ to cite ]:
Ved Prakash and John W. Lounsbury (1983), "A Reliability Problem in the Measurement of Disconfirmation of Expectations", in NA - Advances in Consumer Research Volume 10, eds. Richard P. Bagozzi and Alice M. Tybout, Ann Arbor, MI: Association for Consumer Research, Pages: 244-249.

Advances in Consumer Research Volume 10, 1983      Pages 244-249


INTRODUCTION

The construct of disconfirmation of expectations has been extensively used in the study of consumer satisfaction and in understanding the bases of buyer behavior (cf. Howard and Sheth 1969; Engel and Blackwell 1982). This construct has sound theoretical support from Helson's (1959) adaptation level theory and Thibaut and Kelley's (1959) comparison level theory. The major problem with this concept is one of measurement.

There are three (or basically two) methods of operationalizing the construct of disconfirmation of expectations (Oliver 1980). The first approach involves computation of the discrepancy between expectations and postpurchase performance outcomes. Typically, expectations on a set of dimensions are measured prior to purchase and evaluations are made on the same set of dimensions after the purchase. The difference between these expectations and product performance evaluations represents the construct of disconfirmation. Some examples of the studies that have used this approach are Morris, Crull and Winter (1976), Swan and Trawick (1981) and Prakash (1981). A major finding of these studies is the positive correlation between disconfirmation scores and satisfaction. These studies also reported negative though significant correlations between expectations and disconfirmation scores. Also, with the exception of one study (Prakash 1981), none of these studies reports reliability estimates for the difference scores used to compute disconfirmation.

The second approach is basically a variation of the first approach as it involves the computation of difference scores. This approach is mainly used in laboratory studies where expectations are created on the basis of information provided and then subjects are exposed to some treatment and consumer evaluations are measured after the purchase. The difference between preexposure expectations and postexposure evaluations constitutes disconfirmation of expectations (cf. Oliver 1977; Madden, Little and Dolich 1979).

The third approach involves the use of summary judgmental scales to measure disconfirmation, such as "better than expected" to "worse than expected". This approach does not involve the computation of difference scores. Most of the studies that have used this scale have measured disconfirmation at the overall level and not over product attributes (Aiello, Czepiel and Rosenberg 1977; Oliver 1977; Oliver 1980; Oliver and Linda 1981; Westbrook 1980; Westbrook and Oliver 1981). In these studies, while expectations are measured over product attributes, disconfirmation is measured on a single-item 3-point (or longer) scale, i.e., whether product performance exceeded expectations, was the same as expected, or was less than expected. In these studies, especially Oliver (1980), it is natural to expect little or no correlation between expectations and disconfirmation because: a) the two constructs are measured at two different times; b) expectations and disconfirmation are measured by two different methods, i.e., while expectations are measured over product attributes, disconfirmation is measured at the overall product level; and c) actual disconfirmation at the overall level may have little to do with whether expectations were high, medium or low. This approach is of little use for diagnostic analysis, that is, for finding out on which product attributes performance is lagging behind expectations.

Conceptually the difference score approach is more sound because it involves actual comparison of expectations and disconfirmation and permits diagnostic analysis. The critics of this approach (Oliver 1977, 1980), being concerned with proving independence between expectations and disconfirmation, dismiss this approach because it leads to negative correlations between expectations and disconfirmation. Oliver cites Lord's (1963) study but never actually discusses in detail the real reason why the difference score approach is problematic, i.e., the low reliability of difference scores. The major purpose of this paper is to concentrate exclusively on the disconfirmation construct computed from the difference between expectations and product performance, to comment on the reliability of this construct, and to discuss the resultant correlations between disconfirmation and a measure of overall satisfaction.

To see the problem of unreliability of difference scores in a more general perspective, we first turn to the psychometric literature. As has been discussed in detail by several authors (e.g., Cronbach & Furby, 1970; Lord, 1963; Magnusson, 1965), any time the score on one variable is subtracted from the score on another variable to form the score on the variable of interest (the difference score variable), there is the potential threat of low reliability of the difference score variable.

Consider the following formula given by Lord (1963, p. 32) for estimating the reliability of a difference score:

$$r_{dd'} = \frac{r_{xx'} s_x^2 + r_{yy'} s_y^2 - 2 r_{xy} s_x s_y}{s_x^2 + s_y^2 - 2 r_{xy} s_x s_y} \qquad (1)$$

where:

$r_{dd'}$ = reliability of the difference score (y - x)

$r_{xx'}$ = reliability of the first measure (e.g., prepurchase expectations)

$r_{yy'}$ = reliability of the second measure (e.g., postpurchase evaluation)

$s_x^2$ = variance of the first measure

$s_y^2$ = variance of the second measure

$r_{xy}$ = correlation between the first measure and the second measure.

As can be seen in this formula, the reliability of the difference score decreases as the variance of either measure decreases and as the reliability of either measure decreases. The reliability of the difference score measure also decreases as the correlation between the two component measures increases. Thus, if one wanted to increase the difference score reliability, one might attempt to lower the correlation between, say, a measure of prepurchase expectations and a measure of postpurchase evaluation for the same attribute (or set of attributes). However, this would lead to a paradox since, as the correlation between the two measures approaches zero, the difference score reliability would approach a maximum value (other factors constant), but then one would have less assurance that the same attribute was, in fact, being measured. Of course, the correlation could be lowered somewhat by such factors as failure to confirm expectations (although this may merely result in a lower mean value on the postpurchase evaluation) and evaluator inexperience, but one would hardly want to strive for a lower correlation to achieve a higher difference score reliability.
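The dependence on the correlation between the components can be illustrated numerically. The sketch below assumes that formula (1) takes the standard form for the reliability of a difference, (r_xx' s_x^2 + r_yy' s_y^2 - 2 r_xy s_x s_y) / (s_x^2 + s_y^2 - 2 r_xy s_x s_y). The function name, the default unit variances, and the component reliabilities of .80 are illustrative assumptions, not values from the study.

```python
import math


def difference_score_reliability(r_xx, r_yy, r_xy, var_x=1.0, var_y=1.0):
    """Reliability of d = y - x, assuming the standard difference-score formula.

    r_xx, r_yy : reliabilities of the two component measures
    r_xy       : correlation between the two component measures
    var_x/y    : variances of the components (unit variances are an assumption)
    """
    cov_term = 2.0 * r_xy * math.sqrt(var_x * var_y)
    numerator = r_xx * var_x + r_yy * var_y - cov_term
    denominator = var_x + var_y - cov_term
    if denominator <= 0:
        raise ValueError("the difference score has no variance")
    return numerator / denominator


# Hold both component reliabilities at a hypothetical .80 and raise the
# correlation between the components: the difference-score reliability falls.
for r_xy in (0.0, 0.2, 0.4, 0.6):
    print(r_xy, round(difference_score_reliability(0.80, 0.80, r_xy), 2))
```

With uncorrelated components the difference score is as reliable as its parts; at a correlation of .60 its reliability has dropped to .50, even though each component is still measured with a reliability of .80.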

Furthermore, when the two component variables are measured at different points in time, difference score unreliability capitalizes on any effects of regression-to-the-mean over time. In fact, unreliability of the measures can contribute to regression to the mean. Such regression effects contribute to a lowering of the difference score reliability.

As will be shown in the data presented later in this text, the empirical estimates of the reliability of difference score measures are at best modest and often rather low. The problem that this poses for theory-building and for efforts to ascertain the construct validity of disconfirmation of expectation measures is straightforward. At the very least, low reliability obscures the true validity of a construct; most often it reduces the observed validity of a construct. This can be seen clearly in the following formula adapted from Lord (1963, p. 34):

$$r_{dc} = r_{Dc} \sqrt{r_{DD'}} \qquad (2)$$

where:

$r_{dc}$ = the observed (or expected observed) correlation between a difference score measure and another measure.

$r_{Dc}$ = the correlation between measure c and the difference score measure when the difference score measure is perfectly reliable (i.e., error-free).

$r_{DD'}$ = the reliability of the difference score measure.

Thus, it can be seen that at a given (unobserved) level of correlation between a measure c and a perfectly reliable difference score measure d, the observed correlation is expected to decrease directly as the reliability of the difference score decreases (and as the reliability of the other measure, c, decreases). This formula is a rewritten version of a "partial correction for attenuation," with attenuation in this context referring to any reduction in the value of a relationship owing to errors of measurement and the term partial referring to the fact that perfect reliability is assumed only for one of the two measures.

If it is of interest to the researcher to forecast how the expected observed correlation between a difference score measure and another measure c is affected additionally by unreliability in c, the following formula, based on a "full correction for attenuation," can be applied:

$$r_{dc} = r_{DC} \sqrt{r_{DD'} \, r_{cc'}} \qquad (3)$$

where, in addition to the terms defined for (2), $r_{DC}$ symbolizes the correlation between the difference score measure and the measure c assuming both measures are perfectly reliable, and $r_{cc'}$ is the reliability of the criterion measure c.
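As a minimal sketch of the two corrections, the code below assumes the usual attenuation relationship, in which the observed correlation equals the error-free correlation multiplied by the square root of the product of the two reliabilities. Setting the criterion reliability to 1.0 gives the partial form. The function names and the example values (.50, .49, .81) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math


def expected_observed_r(true_r, rel_d, rel_c=1.0):
    """Attenuate an error-free correlation by the reliabilities of both measures.

    rel_c=1.0 treats the criterion as perfectly reliable (the partial form).
    """
    for rel in (rel_d, rel_c):
        if not 0.0 <= rel <= 1.0:
            raise ValueError("reliabilities must lie between 0 and 1")
    return true_r * math.sqrt(rel_d * rel_c)


def disattenuated_r(observed_r, rel_d, rel_c=1.0):
    """Invert the relationship: estimate the error-free correlation."""
    if rel_d <= 0.0 or rel_c <= 0.0:
        raise ValueError("reliabilities must be positive to correct for attenuation")
    return observed_r / math.sqrt(rel_d * rel_c)


# Hypothetical values: an error-free correlation of .50 and a difference
# score reliability of .49.
print(round(expected_observed_r(0.50, 0.49), 3))        # criterion error-free
print(round(expected_observed_r(0.50, 0.49, 0.81), 3))  # criterion reliability .81
```

The two functions are inverses of each other, so an observed correlation can first be corrected to its error-free value and then projected to any other pair of reliabilities.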

In the example that follows, the data from a study by Prakash (1981) are used to estimate the following:

a) the reliabilities of the disconfirmation of expectations measured as a difference score variable;

b) the observed correlations between the disconfirmation of expectation measure and consumer satisfaction; and

c) the projected correlations between the disconfirmation of expectations and satisfaction measures at selected values for the reliability of both measures, using the full correction for attenuation.

EMPIRICAL STUDY

Overview

The data described and discussed below are from a larger study by the senior author of the disconfirmation of predictive brand expectations as determinants of consumer satisfaction (Prakash, 1981). Although three types of disconfirmation of expectations were analyzed in the larger study (normative, predictive, and comparative), to simplify the reliability analysis presentation, only the results for the disconfirmation of predictive expectations are presented here. The pattern of results was similar for all three types of expectations measures. The products chosen for this study were fast food hamburger restaurants (labelled FFHR) and beer.

Method

Data were collected in two stages, once for prepurchase expectations and then for postpurchase evaluation. The sample consisted of 402 students in the College of Business Administration at a large, Southeastern university, chosen for their familiarity with the products and their ability to make ratings on the scales provided.

Data on expectations and postpurchase evaluations were collected on seven-point, bipolar scales. The attributes assessed for FFHR were taste of food, "having food served the way you like it," food served hot, quality of food, menu variety, speed of service, friendliness of employees, value for price, cleanliness of restaurant, location convenience, and atmosphere/decor. The attributes assessed for beer were good taste, pleasant aftertaste, good value for price, not filling, recommendation of friends, good quality of ingredients, and good brand reputation. Also, overall satisfaction with the last purchased brand was measured using a seven-point scale ranging from 1--Extremely Dissatisfied to 7--Extremely Satisfied.

Reliability Analysis

For each product, disconfirmation of predictive expectation measures were formed by subtracting prepurchase brand expectations from the corresponding postpurchase evaluations across the set of attributes for a product. The overall disconfirmation of expectations measure was derived by summing these individual difference scores. Since multiple difference scores are combined here in forming the composite confirmation of expectation measure, one simplifying assumption must be made to permit a meaningful application of formula (1): It is assumed here that the sum of individual difference scores for two variables across a common set of dimensions (e.g., $\sum_i (y_i - x_i)$) equals the sum of the second variable across dimensions minus the sum of the first variable across dimensions (e.g., $\sum_i y_i - \sum_i x_i$).

To use formula (1) for estimating the reliability of the difference score (i.e., the confirmation of predictive expectations measure in the present case), information was needed on three types of parameters. First, the reliability of prepurchase expectations (and of postpurchase evaluations) was estimated by computing Cronbach's coefficient alpha (see Magnusson, 1965) on the composite sum of attribute ratings. Second, variance estimates were computed on the composite scores across individuals. Finally, the correlation between the prepurchase and postpurchase composites was computed for each product. Then formula (1) was applied for both products.
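The first of these parameters can be sketched in a few lines. The code below uses the standard definition of coefficient alpha, k/(k-1) times one minus the ratio of summed item variances to the variance of the composite. The function name and the five hypothetical respondents rating three attributes are assumptions for illustration only; they are not data from the study.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """Coefficient alpha for a composite formed by summing k item ratings.

    ratings: one row per respondent, each row holding that respondent's
    ratings on the same k items.
    """
    k = len(ratings[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in ratings]) for i in range(k)]
    # Sample variance of the composite (sum of items) across respondents.
    total_variance = variance([sum(row) for row in ratings])
    if total_variance == 0:
        raise ValueError("the composite score has no variance")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical seven-point ratings: 5 respondents x 3 attributes.
hypothetical_ratings = [
    [5, 6, 5],
    [3, 3, 4],
    [6, 7, 6],
    [2, 3, 2],
    [4, 4, 5],
]
print(round(cronbach_alpha(hypothetical_ratings), 2))
```

The same function would be applied once to the prepurchase ratings and once to the postpurchase ratings to obtain the two component reliabilities that enter formula (1).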

RESULTS AND DISCUSSION

For FFHR, the reliability of the prepurchase expectation composite was .83; the reliability of the postpurchase evaluation was .81; and the correlation between these two measures was .66. Thus, by formula (1), the reliability of the disconfirmation of predictive expectations variable measured as a composite difference score was .46. The test-retest reliability of the overall satisfaction with purchase measure was .94.

For beer, the reliability of the prepurchase expectation composite was .6;; the reliability of the postpurchase evaluation composite was .71; and the correlation between these two measures was .61. Thus, by formula (1), the reliability of the disconfirmation of predictive expectations variable measured as a composite difference score was .19. The test-retest reliability of the overall satisfaction with purchase measure was .9;.

These two reliability coefficients for the disconfirmation of expectations measure for FFHR and beer are substantially lower than the reliabilities usually seen for scales measured at a single point in time. In both cases, one can infer that over half the variance in observed scores is error variance. Such a state of affairs does not bode well for the validity of these measures. Thus, it is not surprising that the disconfirmation of expectations measures correlated only .33 and .19 with overall satisfaction with purchase for the FFHR and beer measures, respectively.
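The inference about error variance follows from the classical test theory definition of reliability as the proportion of observed-score variance that is true-score variance, so that the error proportion is one minus the reliability. Applied to the two coefficients reported above:

```latex
\frac{\sigma^2_{\text{error}}}{\sigma^2_{\text{observed}}} = 1 - r_{DD'}
\qquad\Rightarrow\qquad
\text{FFHR: } 1 - .46 = .54, \qquad \text{beer: } 1 - .19 = .81
```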

One might well inquire what the observed correlations between the disconfirmation measures and overall satisfaction would be if the reliability of the disconfirmation measures were higher. In fact, one could assume that both the disconfirmation of expectations and satisfaction measures were assessed with perfect reliability and estimate the "true" correlation between the two measures using the full correction for attenuation. However, this procedure would provide us only with an unobservable (until that very distant future time when we develop error-free measures of subjective states) correlation. To estimate what the expected observed correlation would be at different levels of reliability less than 1.0, one can first estimate the "true" or error-free correlation, then use formula (3) to compute the projected values. We have done so for FFHR and beer in Tables 1 and 2.

TABLE 1

ESTIMATED EXPECTED OBSERVED CORRELATIONS BETWEEN DISCONFIRMATION OF PREDICTIVE EXPECTATIONS AND OVERALL SATISFACTION USING THE FULL CORRECTION FOR ATTENUATION AT SELECTED RELIABILITY VALUES FOR THE MEASURES OF DISCONFIRMATION OF PREDICTIVE EXPECTATIONS ($r_{DD'}$) AND OVERALL SATISFACTION ($r_{cc'}$)

TABLE 2

ESTIMATED EXPECTED OBSERVED CORRELATIONS BETWEEN DISCONFIRMATION OF PREDICTIVE EXPECTATIONS AND OVERALL SATISFACTION USING THE FULL CORRECTION FOR ATTENUATION AT SELECTED RELIABILITY VALUES FOR THE MEASURES OF DISCONFIRMATION OF PREDICTIVE EXPECTATIONS ($r_{DD'}$) AND OVERALL SATISFACTION ($r_{cc'}$)

Thus, it can be seen in Table 1 that in the case of FFHR there is a substantial increase in the expected observed value of the correlation between disconfirmation of predictive expectations and overall purchase satisfaction as the reliabilities of the two measures are increased. Consider, for example, the last column of values where, under a reliability of .95 for the satisfaction measure, the observed value of the correlation could be expected to rise from .15 to .47 if the reliability of the disconfirmation measure were raised from .10 to .95. Or, if the reliability of the disconfirmation of expectation measure were raised to the level of that actually observed for the satisfaction measure in the present data set (.95), the value of the correlation between the two measures could be expected to increase from the actual observed value of .33 to a value of .47, which represents a doubling in the proportion of explained variance from .11 to .22.

A similar pattern can be observed in Table 2, where expected observed correlation values rise as the reliabilities of the disconfirmation of expectations and overall satisfaction with beer purchases increase. In this case, if the reliability of the disconfirmation of expectation measure were increased to a level commensurate with that of the actual observed value of the overall satisfaction measure (approximately .95), the value of the correlation between the two measures could be expected to increase from the empirically observed value of .19 to .43, which represents a fourfold increase in the proportion of explained variance (.04 to .16).
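The two-step projection described above can be sketched as follows: correct the observed correlation to its error-free value, then attenuate it again at a chosen pair of reliabilities. The sketch assumes the full correction takes the usual form (observed r equals error-free r times the square root of the product of the reliabilities). It uses the rounded FFHR figures reported in the text (.33, .46, .94). The function names and the grid of reliability levels are illustrative and are not the levels used in the published tables, so results can differ from the tabled values in the second decimal.

```python
import math


def project_observed_r(observed_r, rel_d, rel_c, new_rel_d, new_rel_c):
    """Expected observed correlation at new reliabilities.

    Step 1: estimate the error-free correlation from the observed one.
    Step 2: attenuate it at the new reliability levels.
    """
    true_r = observed_r / math.sqrt(rel_d * rel_c)
    return true_r * math.sqrt(new_rel_d * new_rel_c)


def projection_grid(observed_r, rel_d, rel_c, d_levels, c_levels):
    """Rows: disconfirmation reliability; columns: satisfaction reliability."""
    return [
        [round(project_observed_r(observed_r, rel_d, rel_c, d, c), 2) for c in c_levels]
        for d in d_levels
    ]


# Illustrative reliability levels (an assumption, not the published grid).
LEVELS = (0.10, 0.30, 0.50, 0.70, 0.95)

# FFHR inputs as reported in the text: observed r = .33, difference score
# reliability = .46, satisfaction test-retest reliability = .94.
ffhr_grid = projection_grid(0.33, 0.46, 0.94, LEVELS, LEVELS)
for level, row in zip(LEVELS, ffhr_grid):
    print(level, row)
```

Under these inputs, the projected value at a satisfaction reliability of .95 is about .15 when the disconfirmation reliability is .10 and just under .48 when it is .95, in line with the rise from .15 to .47 discussed for Table 1.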

IMPLICATIONS

The implications of the present study are clear. Researchers who choose to employ the difference score method to compute disconfirmation of expectations should expect to encounter low reliabilities for their measures. Reliabilities as low as the values of .46 and .19 observed in this study should not be considered acceptable to today's marketing researcher. Low reliability reduces the power of statistical tests, lowers estimates of effect sizes, and clouds our understanding of the construct validity of theoretically relevant variables. Low reliability also impedes our ability to predict important outcomes such as purchases in the applied domain.

If the disconfirmation of expectations construct is to continue to serve as a subject of study in this area, we believe that either efforts must be made (successfully) to improve the reliability of its measures when measurement is based on difference scores, or other more reliable approaches that do not involve the use of difference scores must be used to measure disconfirmation. With respect to the first option, two of the more reasonable strategies might be to: 1) increase the reliability of the two component variables, which could be accomplished by increasing the number of items or increasing the homogeneity of items in the component measures; 2) attempt to increase the number or types of disconfirmation of expectations measures used (cf. Prakash and Lounsbury 1982). Even with relatively low reliabilities for each of the individual disconfirmation measures, given enough of them, prediction can be substantially improved.

With respect to the other option of searching for more reliable measures, it is possible to measure the degree of disconfirmation through the use of a better than expected--worse than expected scale. This approach avoids the use of difference scores and has in fact been used in some previous studies (Oliver 1977, 1980; Trawick and Swan 1980; Westbrook 1980). An examination of these studies shows that they do not report on issues of reliability. They simply report positive correlations between disconfirmation and satisfaction. Oliver (1980) also reports that this correlation might be exaggerated because the two constructs, i.e., satisfaction and disconfirmation, are measured at the same time. Therefore he suggests a three-stage study: one stage for measuring expectations, the second for measuring disconfirmation, and the third for measuring satisfaction and intention to repurchase.

Furthermore, measuring disconfirmation by a summary judgmental scale does not indicate which expectations (high or low) were confirmed or disconfirmed. More importantly, the zero correlation with expectations reported by Oliver (1980) is an artifact of the way disconfirmation is measured. Supposing that high, medium or low expectations are indeed disconfirmed, the result of no correlation between expectations and disconfirmation seems very strange. High expectations might regress downward due to error of measurement and low expectations might regress upward (in memory), thus leading to a zero or strange correlation. This type of measure still requires memory and could lead to a bias on that account. To help reduce such a bias, we suggest coupling this approach with the following procedure. Measure and record prepurchase expectations for each subject and allow the subject to review these expectations immediately prior to making an evaluation of the degree of disconfirmation. Such an approach would force each subject to attend to his or her prepurchase expectations. The subjects could also be asked to justify actual confirmation or disconfirmation of expectations.

Overall, it seems that there is no satisfactory solution to the measurement of disconfirmation. The difference score approach suffers from low reliability, while the overall judgmental approach suffers from an artifactual zero correlation between expectations and disconfirmation. The latter approach also has the drawback of bias due to memory loss. Future research should compare the efficacy of the difference score approach vs. the use of summary judgmental scales. Swan and Trawick (1981) did undertake such a study but they did not address the issues of reliability and validity.

A question can be raised whether the operationalization of disconfirmation based on difference scores in this study was invalid because of negative but significant correlations between expectations and disconfirmation (in other words, expectations and disconfirmation were not shown to be independent). The answer to this criticism is that the results of this study are consistent with the results of other studies that have used the difference score approach (Swan and Trawick 1981; Oliver 1977). In fact, it is logical to expect an inverse relationship between expectations and disconfirmation because the higher the expectations, the more difficult it is to confirm them. The studies that have shown independence between expectations and disconfirmation (Oliver 1980) have measured disconfirmation differently, i.e., by use of a summary judgmental scale, and as discussed in the previous paragraphs this independence is mainly an artifact of the measurement of disconfirmation. Furthermore, it may be argued by critics that for the difference score to work successfully we have to show independence or low correlations between the two component measures, i.e., expectations and postpurchase evaluation. The answer to this criticism is that it would be desirable to have low correlations between the two component measures in order to improve difference score reliability as per Lord's (1963, p. 32) formula, given as formula (1) in this paper. But this is not a necessary prerequisite for the difference score approach to be valid. In fact, as we have also pointed out earlier in this paper, this creates a paradox in that if the correlation between the two component measures is low, it creates doubt as to whether the same product attributes are being measured. This paradox highlights the difficulties of improving the reliability of difference scores. The operationalization of disconfirmation in this study was not invalid. The purpose of the paper was to focus attention on the reliability problem in regard to the disconfirmation construct based on difference scores, and that purpose has been successfully served.

Finally, no matter what measurement strategy is employed, researchers in this area should be encouraged to examine empirically the quality of their measures in terms of classical reliability concepts as well as other factors which may impede validity such as range restriction, criterion contamination, and generalizability of measurement procedures across item pools, product types, settings, time intervals, and subject characteristics. A next logical step would be to apply generalizability theory concepts and estimation procedures (cf. Cronbach, Gleser, Nanda, & Rajaratnam, 1972) to measures of interest to marketing researchers. Our hunch is that such endeavors will expose some disappointing data on the generalizability of constructs across measurement facets, but the knowledge gained will hopefully point toward remedial efforts that will ultimately lead to enhanced prediction and more valid models.

REFERENCES

Aiello, Albert, Jr., Czepiel, John A., and Rosenberg, Larry A. (1977), "Scaling the Heights of Consumer Satisfaction: An Evaluation of Alternate Measures", in Consumer Satisfaction, Dissatisfaction and Complaining Behavior, Ralph L. Day, ed. Bloomington, Indiana: School of Business, Indiana University.

Cronbach, L.J. and Furby, L. (1970), "How Should We Measure "Change" or Should We?", Psychological Bulletin, 74, 68-80.

Cronbach, L.J., Gleser, G.C., Nanda, H. and Rajaratnam, N. (1972), The Dependability of Behavioral Measurements: Theory of Generalizability for Scores and Profiles, New York: Wiley.

Engel, J.F. and Blackwell, R.D. (1982), Consumer Behavior, Fourth Edition, Hinsdale, Illinois: Dryden.

Helson, H. (1959), "Adaptation Level Theory", in Psychology: A Study of a Science, S. Koch, ed., Volume I, New York: McGraw-Hill.

Howard, John A. and Sheth, J.N. (1969) The Theory of Buyer Behavior, New York: Wiley.

Lord, F.M. (1963), "Elementary Models for Measuring Change", in Problems in Measuring Change, C.W. Harris, ed., Madison, Wisconsin: University of Wisconsin.

Madden, Charles S., Little, Eldon L., and Dolich, Ira J. (1979), "A Temporal Model of Consumer S/D Concepts as Net Expectations and Performance Evaluations," in New Dimensions of Consumer Satisfaction and Complaining Behavior, Ralph L. Day and H. Keith Hunt, eds., Bloomington, Indiana: School of Business, Indiana University.

Magnusson, D. (1965), Test Theory, Palo Alto, California: Addison-Wesley.

Morris, Earl W., Crull, Sue R., and Winter, Mary (1976), "Housing Norms, Housing Satisfaction and the Propensity to Move", Journal of Marriage and Family, 38 (May), 309-21.

Oliver, Richard L. (1977), "Effect of Expectation and Disconfirmation on Postexposure Product Evaluations: An Alternative Interpretation", Journal of Applied Psychology, 62 (August), 480-6.

Oliver, Richard L. (1980), "A Cognitive Model of the Antecedents and Consequences of Satisfaction Decisions", Journal of Marketing Research, Vol XVII, 460-9.

Oliver, Richard L. and Linda, Gerald (1981), "Effect of Satisfaction and Its Antecedents On Consumer Preference and Intention", in Advances in Consumer Research, Kent B. Monroe, ed., Volume VIII, Ann Arbor, Michigan: Association for Consumer Research, 88-93.

Prakash, V. (1981), An Investigation of the Confirmation of Predictive Brand Expectations as a Determinant of Consumer Satisfaction, Unpublished Doctoral Dissertation, The University of Tennessee, Knoxville.

Prakash, V. and Lounsbury, John W. (1982), "An Investigation of the Confirmations of Predictive, Normative, and Comparative Expectations as Determinants of Consumer Satisfaction", in 14th Annual National AIDS Conference Proceedings, (Forthcoming).

Swan, John A. and Trawick, Frederick I. (1981), "Satisfaction Explained by Desired vs. Predictive Expectations", in The Changing Marketing Environment: New Theories and Applications, Kenneth Bernhardt et al., eds., Educator's Conference Proceedings, Chicago: American Marketing Association, 170-3.

Thibaut, J.W. and Kelley, H.H. (1959), The Social Psychology of Groups, New York: Wiley.

Trawick, Frederick I. and Swan, John A. (1980), "Inferred and Perceived Disconfirmation in Consumer Satisfaction", in Educator's Conference Proceedings, Chicago: American Marketing Association, 97-100.

Westbrook, Robert A. (1980), "Intrapersonal Affective Influences On Consumer Satisfaction with Products", Journal of Consumer Research, Vol. 7 (June), 49-54.

Westbrook, Robert A. and Oliver, Richard L. (1981), "Developing Better Measures of Consumer Satisfaction: Some Preliminary Results", in Advances in Consumer Research, Kent B. Monroe, ed., Vol VIII, Ann Arbor, Michigan: Association for Consumer Research, 94-99.
