The Effect of Varying Response Intervals on the Stability of Factor Solutions of Rating Scale Data

Roger Best, University of Arizona
Del I. Hawkins, University of Oregon
Gerald Albaum, University of Oregon
ABSTRACT - The number of response intervals used with rating scales is generally considered to have a limited impact on the obtained results. A univariate comparison of five interval and continuous interval response formats confirmed this. However, a multivariate (factor analysis) comparison produced significantly different solutions depending on the number of scale intervals used. The implications for the validity of multivariate analysis of rating scale data are discussed.
[ to cite ]:
Roger Best, Del I. Hawkins, and Gerald Albaum (1979), "The Effect of Varying Response Intervals on the Stability of Factor Solutions of Rating Scale Data," in NA - Advances in Consumer Research Volume 06, eds. William L. Wilkie, Ann Arbor, MI: Association for Consumer Research, Pages: 539-541.

Advances in Consumer Research Volume 6, 1979      Pages 539-541

INTRODUCTION

Despite the widespread use of rating scales, several issues concerning the appropriate way to use them remain unresolved. The specific issue addressed in this paper is the impact that the number of response intervals has on the results. A number of researchers have addressed this question (Guilford, 1954; Green and Rao, 1970; Jacoby and Matell, 1971; Lehmann and Hulbert, 1972; Matell and Jacoby, 1972; Masters, 1972; Albaum and Munsinger, 1973; and Bendig, 1974). These studies have focused on univariate comparisons and have generally found that the number of scale positions used has a minimal impact on the obtained results. The authors have recommended using between 3 and 25 intervals, depending on the task at hand and the nature of the respondents.

What happens when rating scales using varying numbers of response categories are subjected to a multivariate analysis such as factor analysis? A specific situation can help clarify the importance of this question. At the 1977 Association for Consumer Research conference, Vaughn et al. (1978) presented an interesting analysis of university choice criteria which utilized several seven-point rating scales as part of its methodology. The responses to these rating scales were analyzed via averaging and by factor analysis. The question now is, "Is it possible that Vaughn et al. would have reached different conclusions had they utilized a different number of response intervals?"

This question is not intended as a criticism of Vaughn et al. The same question could be addressed to any of the 15 papers at that conference which utilized rating scales, and specifically to the 3 which then factor analyzed the results. In utilizing seven intervals, Vaughn et al. were following a commonly accepted practice (Tull and Hawkins, 1976). Nonetheless, it is important to determine the possible effects of arbitrarily selecting seven or five or some other number of response intervals.

METHOD

A five-interval discrete scale which restricts response to one of five intervals and an intervally continuous scale (graphic rating scale) which provides unrestricted response were created for use with the semantic differential as shown below.

SCALE

Each scale was 125 mm in total length. A 30-item questionnaire was constructed for each treatment using ten bipolar adjective scales, which were used by 176 undergraduate students to evaluate three familiar stimuli--the university, the university bookstore, and the student union. One questionnaire utilized the five-interval discrete scale while the other incorporated the intervally continuous scale. The adjective pairs were selected from among those commonly reported in studies of retail store image that would also be relevant to the stimuli being evaluated in this study. One-half of the students were randomly assigned to complete one questionnaire and one-half to complete the other. The order of presentation of scale items and stimuli was randomized in both questionnaires to minimize the chance of order bias.

ANALYSIS AND RESULTS

Univariate analysis of the data followed two procedures. First, means were computed for each of the 30 scales (10 scales x 3 concepts) for both the discrete and continuous formats. (The responses to the continuous interval scale were coded using a metric ruler, with values of 1 to 125 mm.) The scale means for the two formats were then compared by correlation analysis. The correlation between the two formats was .92. As Figure 1 illustrates, this represents a very high level of agreement between the two approaches.

FIGURE 1

CORRESPONDENCE BETWEEN SCALE MEANS OBTAINED FROM TWO INTERVALLY DISSIMILAR SEMANTIC DIFFERENTIAL SCALES
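
The comparison of scale means described above can be sketched as follows. This is only an illustration of the procedure, not the authors' computation: the three scale names and all ratings are hypothetical, and the helper names `scale_means` and `pearson_r` are assumptions introduced here.

```python
from math import sqrt

def scale_means(responses):
    """Mean rating per scale; `responses` maps a scale name to its ratings."""
    return {name: sum(vals) / len(vals) for name, vals in responses.items()}

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical ratings: 1-5 for the discrete format, 1-125 mm for the continuous one.
discrete = {"friendly": [4, 5, 3, 4], "modern": [2, 3, 2, 1], "clean": [3, 3, 4, 4]}
continuous = {"friendly": [98, 110, 80, 102], "modern": [40, 66, 45, 30], "clean": [70, 75, 95, 90]}

means_discrete = scale_means(discrete)
means_continuous = scale_means(continuous)

# Pair the means scale by scale, then correlate across scales.
names = sorted(discrete)
r = pearson_r([means_discrete[n] for n in names],
              [means_continuous[n] for n in names])
print(f"correlation between formats: {r:.2f}")
```

Because the correlation is unaffected by linear rescaling, the 1 to 5 and 1 to 125 mm codings can be compared directly without first converting one to the other.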

A second form of univariate analysis was conducted using the Kolmogorov-Smirnov two-sample test. This test is nonparametric and thus removes any concern about the interval nature of this type of data (Labovitz, 1967; Labovitz, 1970; Dawes, 1971; Martilla and Carvey, 1975; Henkel, 1975; and Albaum et al., 1977). In addition, this test is sensitive to differences in central tendency, dispersion, and skewness. To conduct the test, the continuous scale was converted into five equal intervals that corresponded to the five intervals on the discrete scale. The responses were then assigned the appropriate value from one to five.
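
A minimal sketch of this conversion and test is given below. Several details are assumptions rather than reported facts: the 125 mm line is cut into five 25 mm intervals with each upper boundary assigned to the lower category, the critical value uses the standard large-sample approximation for the two-sample test, and the sample ratings are hypothetical.

```python
from math import ceil, log, sqrt

SCALE_LENGTH_MM = 125
N_INTERVALS = 5

def to_interval(mm):
    """Map a 1-125 mm measurement onto equal-width categories 1-5.

    Assumption: boundaries at 25, 50, 75, 100 mm fall in the lower category.
    """
    width = SCALE_LENGTH_MM / N_INTERVALS  # 25 mm per interval
    return min(N_INTERVALS, max(1, ceil(mm / width)))

def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov D: largest gap between the empirical CDFs."""
    d = 0.0
    for v in sorted(set(sample_a) | set(sample_b)):
        cdf_a = sum(x <= v for x in sample_a) / len(sample_a)
        cdf_b = sum(x <= v for x in sample_b) / len(sample_b)
        d = max(d, abs(cdf_a - cdf_b))
    return d

def ks_critical_value(n, m, alpha):
    """Large-sample approximation to the critical D at significance level alpha."""
    return sqrt(-log(alpha / 2) / 2) * sqrt((n + m) / (n * m))

# Hypothetical responses to one scale under each format.
discrete_ratings = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]
continuous_mm = [10, 30, 45, 55, 60, 90, 80, 95, 110, 120]

recoded = [to_interval(mm) for mm in continuous_mm]
d = ks_statistic(discrete_ratings, recoded)
critical = ks_critical_value(len(discrete_ratings), len(recoded), alpha=0.15)
print(f"D = {d:.2f}, critical D = {critical:.2f}, differs: {d > critical}")
```

With only five possible values there are many ties, and the Kolmogorov-Smirnov test is known to be conservative in that situation, so the approximation above should be read as a rough guide only.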

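The results of this analysis are shown in Table 1. Even with the significance level set as high as .15, only two of the 30 comparisons showed significant differences. With 30 comparisons at that level, about four or five (30 x .15 = 4.5) would be expected by chance alone, so this result is well within chance expectation.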

TABLE 1

RESULTS OF KOLMOGOROV-SMIRNOV TEST

Based on these two univariate comparisons, the data obtained from these two distinct sets of response categories appear to be equivalent. However, before concluding that there is complete equivalence, we need to take the analysis one step further; we need to examine the equivalence of multivariate analysis of these two scale formats. This is something previous studies have failed to do.

Factor analysis is perhaps the most common multivariate technique applied to rating scale data. Twenty percent of the studies using rating scales in last year's ACR conference performed factor analysis on those scales.

To test the sensitivity of factor solutions to varying numbers of response intervals, the responses from each of the two scale treatments for each concept were subjected to normalized, varimax-rotated factor analysis. The factor solutions for each concept were compared by computing the correlation between the factor loadings obtained from each treatment type. The correlations for the three concepts ranged from .45 to .55. The factor solutions for one concept, the University, are shown in Table 2 and illustrated graphically in Figure 2 (of the three concepts, this one had the highest degree of linear association between the two factor solutions).

TABLE 2

FACTOR ANALYTIC SOLUTION OF 5-INTERVAL AND CONTINUOUS SEMANTIC SCALE TYPES FOR "UNIVERSITY"

FIGURE 2

CORRESPONDENCE BETWEEN FACTOR LOADINGS OBTAINED FROM FACTOR ANALYSIS OF FIVE INTERVAL DISCRETE AND CONTINUOUS INTERVAL SEMANTIC DIFFERENTIAL SCALE DATA
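
The comparison of factor solutions can be illustrated with a small sketch. It is not the authors' procedure: it handles only the two-factor case (using Kaiser's closed-form rotation angle with row normalization), it assumes the factors in the two solutions are already matched in order and sign, and both loading matrices are hypothetical. The function names are introduced here for illustration.

```python
from math import atan2, cos, sin, sqrt

def varimax_two_factors(loadings):
    """Kaiser-normalized varimax rotation of a p x 2 loading matrix.

    Assumes every item has a nonzero communality.
    """
    # Kaiser normalization: scale each item's loadings to unit length.
    lengths = [sqrt(a * a + b * b) for a, b in loadings]
    rows = [(a / h, b / h) for (a, b), h in zip(loadings, lengths)]
    p = len(rows)
    u = [x * x - y * y for x, y in rows]
    v = [2 * x * y for x, y in rows]
    A, B = sum(u), sum(v)
    C = sum(ui * ui - vi * vi for ui, vi in zip(u, v))
    D = 2 * sum(ui * vi for ui, vi in zip(u, v))
    # Angle that maximizes the varimax criterion for a planar rotation.
    phi = atan2(D - 2 * A * B / p, C - (A * A - B * B) / p) / 4
    c, s = cos(phi), sin(phi)
    # Rotate, then undo the normalization.
    return [((x * c + y * s) * h, (-x * s + y * c) * h)
            for (x, y), h in zip(rows, lengths)]

def loading_correlation(solution_a, solution_b):
    """Pearson correlation across all corresponding loadings of two solutions."""
    xs = [val for row in solution_a for val in row]
    ys = [val for row in solution_b for val in row]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical unrotated loadings for four items under each scale format.
five_interval = [(0.70, 0.30), (0.65, 0.35), (0.40, -0.60), (0.35, -0.55)]
continuous = [(0.60, 0.45), (0.50, 0.10), (0.55, -0.40), (0.20, -0.70)]

r = loading_correlation(varimax_two_factors(five_interval),
                        varimax_two_factors(continuous))
print(f"correlation between rotated loadings: {r:.2f}")
```

In practice the factors from two independent solutions may emerge in a different order or with reversed signs, so some matching step would normally precede the correlation.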

DISCUSSION

The univariate analysis of the responses to a five-interval and a continuous-interval rating scale indicated that the two approaches produced equivalent results. However, correlations of the factor loadings obtained from each type of scale ranged from .45 to .55. Thus, the variance shared by the two multivariate solutions (the squared correlation) ranged from roughly 20 to 30 percent. This low level of association calls into question the reliability and validity of interpretations derived from factor analyzed rating scales.

Factor loadings depend on the covariance between items as well as the variance within an item, and the Kolmogorov-Smirnov tests showed that the within-item distributions were not significantly different. These results therefore suggest that changes in the number of response intervals influence the covariance between items substantially more than they influence the central tendency or variance within an item. Another explanation is that factor analysis is inherently more sensitive to minor shifts in responses than are means or aggregate tests such as the Kolmogorov-Smirnov test.

CONCLUSIONS

In this study, the conclusions one would reach based on the factor analysis would depend upon the number of response intervals used. However, two extreme forms of the rating scale were compared: a five-interval discrete scale and a continuous scale. The results indicate that the number of response intervals can influence the factor solutions even though there is no significant variation in their univariate properties. Looking back at the study by Vaughn et al. (as well as all the others that have factor analyzed rating scale data), we can say that it is possible that the factor solutions were influenced by the number of response categories. What must be determined now is how sensitive the procedure is to variations in the number of categories used. Will a shift from 5 to 6, or from 5 to 7, influence the result? In addition, one must wonder how sensitive other multivariate techniques, such as the principal components derived in discriminant analysis, are to changes in the number of response categories. Also, which scale format yields the more valid solution? Perhaps this question can be answered by examining the factor solutions of known interval or ratio scaled data. These data could then be re-measured using a varying number of scale intervals. A comparison of the various solutions would yield some insight into the accuracy of the solutions. Until questions such as these are answered, the results of multivariate analysis, particularly factor analysis, of rating scale data must be viewed with caution.

REFERENCES

Albaum, G., and G. Munsinger, "Methodological Questions Concerning the Use of the Semantic Differential," paper presented at the Southwestern Social Science Association Conference, Dallas, March 1973.

Albaum, G., R. Best, and D. I. Hawkins, "Measurement Properties of Semantic Scale Data," Journal of the Market Research Society, 19 (1977), 21-28.

Bendig, A. W., "Reliability and Number of Rating Scale Categories," Journal of Applied Psychology, 38 (1974), 38-40.

Dawes, R. M., "Suppose We Measured Height with Rating Scales Instead of Rulers," Oregon Research Institute Technical Report (1971), 2.

Green, P. E., and V. R. Rao, "Rating Scales and Information Recovery--How Many Scales and Response Categories to Use?" Journal of Marketing, 34 (1970), 33-39.

Guilford, J. P., Psychometric Methods, New York: McGraw-Hill, 1954.

Henkel, R. L., "Part-Whole Correlations and the Treatment of Ordinal and Quasi-Interval Data as Interval Data," Pacific Sociological Review, 18 (1975), 3-26.

Jacoby, J., and M. S. Matell, "Three-Point Likert Scales are Good Enough," Journal of Marketing Research, 8 (1971), 495-500.

Labovitz, S., "Some Observations on Measurement and Statistics," Social Forces, 46 (1967), 151-60.

Labovitz, S., "The Assignment of Numbers to Rank Order Categories," American Sociological Review, 35 (1970), 515-524.

Lehmann, D. R. and J. Hulbert, "Are Three-Point Scales Always Good Enough?" Journal of Marketing Research, 9 (1972), 444-446.

Martilla, J. A. and D. W. Carvey, "Four Subtle Sins in Marketing Research," Journal of Marketing, 39 (1975), 8-15.

Masters, J. R., Reliability as a Function of the Number of Categories of a Summated Rating Scale, unpublished dissertation, University of Pittsburgh, 1972.

Matell, M. S. and J. Jacoby, "Is There an Optimal Number of Alternatives for Likert-Scale Items," Journal of Applied Psychology, 56 (1972), 506-509.

Tull, D. S. and D. I. Hawkins, Marketing Research: Meaning, Measurement and Method. New York: Macmillan, 1976, 336.

Vaughn, R., J. Pitlik, and B. Hansotia, "Understanding University Choice: A Multi-Attribute Approach," in H. K. Hunt, Advances in Consumer Research, Vol. V, (Ann Arbor: Association for Consumer Research, 1978), 26-31.
