Developing Interval Scale Values Using the Normalized Rank Method: a Multiple Context, Multiple Group Methodology

James W. Hanson, University of Western Ontario
Arno J. Rethans, Pennsylvania State University
ABSTRACT - This study focuses on the derivation of interval scale values for adjectives frequently used on measurement instruments in marketing research. Housewives and students are subjects in an experimental design to test the effects of specific marketing contexts on groups' reactions to lists of adjectives. A normalized rank method is used to derive scale values. The results show that there are both group and context effects for the derived scale values, thus providing additional evidence for the utilization of "tailor-made" measurement instruments in marketing research.
[ to cite ]:
James W. Hanson and Arno J. Rethans (1980), "Developing Interval Scale Values Using the Normalized Rank Method: a Multiple Context, Multiple Group Methodology", in NA - Advances in Consumer Research Volume 07, eds. Jerry C. Olson, Ann Arbor, MI: Association for Consumer Research, Pages: 672-675.

Advances in Consumer Research Volume 7, 1980, Pages 672-675

INTRODUCTION

Researchers in marketing have done a considerable amount of work in the development of scaling methods of attitude measurement. Studies have been conducted, for example, into areas of: numbers of scale categories (Jacoby and Matell 1971); odd versus even scale categories (Green and Rao 1970); concept scale interaction (Dickson and Albaum 1977); measurement properties of semantic scales (Albaum, Best and Hawkins 1977); and reliability of scales (Best, Hawkins and Albaum 1977; Dickson and Albaum 1977). The findings of the studies in each of these areas have made valuable contributions to scale development. One particular area, although receiving some research attention, seemingly has not made much of an impact on researchers' use of scales. This area deals with the construction of instruments which have equal psychological intervals.

Constructing scales which have equal psychological intervals has been a continuing problem in the measurement and scaling areas. Various authors have approached the problem by utilizing techniques to determine the psychological scale values of words and phrases to be placed on a scale. Mean scale values have been obtained on scales of various lengths and in situations using different subject groups as well as different contexts (Jones and Thurstone 1955; Cliff 1959; Myers and Warner 1972; Mittelstaedt 1971; Bartram and Yielding 1973; Vidali 1975; Spector 1976). Despite the research findings in this area, many research studies continue to select response categories which do not have equal interval properties (Wildt and Mazis 1978). The selection of response categories seems to be made on the basis of habit, imitation or subjective judgment. The equal interval properties of the response continuum are often assumed even though this assumption may be false. Researchers should be concerned with utilizing equal interval scales for the following reasons:

(a) Many researchers treat data obtained from scales such as the Likert scale as if they were interval. The results of most standard parametric techniques applied to such data are not greatly affected by small deviations from the interval requirement, and if the deviations are in fact small, no serious damage is done. However, this does not excuse researchers from carefully checking the nature of the measurement instrument and the responses they obtain to assure themselves that they do indeed have measurements that closely approach being interval (Labovitz 1970; Tull and Hawkins 1976, p. 221).

(b) Respondents, when faced with scales which the researcher assumes to be interval, may treat the scale labels as if they were in fact equally spaced along the psychological continuum, but it is more likely that they would experience difficulty in making responses because some adjacent choices are closer together than others (Spector 1976).

(c) Researchers must be aware of the conditions under which scale values are derived, as such derivations may or may not be applicable in particular contexts and/or across different groups of subjects (Mittelstaedt 1971; Myers and Warner 1972; Sharpe and Anderson 1972; Spector 1976; Vidali 1975).

These three reasons for concern provided the rationale for the study described in this paper. In particular, the concern for group and context effects influenced the purpose and design of the study.

PURPOSE

The purpose of this study is to derive scale values utilizing a multiple group-multiple context methodology and to determine whether or not groups differ in developing equal interval scales under specific marketing contexts. This latter purpose was translated into the following three hypotheses:

H1:  There is no difference between housewives and students in their mean scale values for frequency, evaluative and agreement words or phrases. (Main effect-group.)

H2:  There is no difference between the four contexts in their mean scale values of the three groups of words or phrases across respondents. (Main effect-contexts.)

H3:  Mean scale values do not differ by group for similar response categories across contexts. (Interaction effect-groups and contexts.)

METHODOLOGY

Multiple groups of respondents were employed in this study. The first group consisted of 80 undergraduate students enrolled in either a basic marketing course or a basic quantitative methods course at a West Coast university. A total of four different course sections were used to obtain this sample size. The second group consisted of 80 housewives. A proportion of these respondents (70%) were members of a church group while the remaining proportion were enrolled in an extension course at the university.

These respondents were asked to rank order three lists of words and phrases. Each list contained fifteen words or phrases which are frequently used as response categories in rating scales (Shaw and Wright 1967). The lists refer to frequency, evaluation and agreement adjectives and are shown in Table 1. These three categories of adjectives seem to be appropriate categories for a marketing context as consumers at times evaluate products, exhibit degrees of agreement concerning product statements and use products.

The rank ordering of the three lists was done under four specific contexts. Respondents were instructed to think in terms of one of three specific products varying in conspicuousness and social significance (Sharpe and Anderson 1972, p. 432). These were slacks (an important product and consumed publicly), pens (an unimportant product and consumed publicly), and deodorants (an unimportant product and consumed privately). The fourth context was a "no product" context. The four contexts were randomly administered to each group of respondents. The position of each list on the instrument and the order of presentation of the words or phrases were also randomly determined.

TABLE 1

RESPONSE CATEGORIES

ANALYSIS AND RESULTS

Derivation of Scale Values

Once the rank order data were obtained, scale values for each of the response categories on each list were derived using the normalized rank method (Guilford 1954, Chapter 8). Under this approach, the ranks assigned to the stimuli (adjectives) are first converted into rank values. These rank values are a series, denoted by Ri, that are in reverse order to the ranks ri. Ri is related to ri by the equation Ri = n - ri + 1, where n is the number of stimuli ranked (fifteen per list in this study). Subsequently, the Ri values are transformed to a common C scale recommended by Guilford. The C scale is a scale with a mean of 5.0 and a standard deviation of 2.0.
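The two steps described above can be sketched in a few lines of Python. The rank reversal follows the equation in the text. The conversion to the C scale is an assumption on our part: the paper does not spell it out, so the sketch uses the midpoint proportion (Ri - 0.5)/n, takes the corresponding standard normal deviate, and rescales it to a mean of 5.0 and a standard deviation of 2.0. Guilford's published procedure works from a lookup table and yields rounded C values, so the continuous values here are only an approximation. The function names are illustrative.

```python
from statistics import NormalDist


def rank_values(ranks, n):
    """Convert ranks r_i (1 = ranked first) to rank values R_i = n - r_i + 1."""
    return [n - r + 1 for r in ranks]


def c_scale_value(rank_value, n, mean_c=5.0, sd_c=2.0):
    """Map a rank value R_i to a continuous C-scale value.

    Assumption (not stated in the paper): the rank value is placed at the
    midpoint proportion (R_i - 0.5) / n of a normal distribution, and the
    resulting z score is rescaled to mean 5.0 and standard deviation 2.0.
    """
    proportion = (rank_value - 0.5) / n
    z = NormalDist().inv_cdf(proportion)
    return mean_c + sd_c * z


if __name__ == "__main__":
    n = 15  # fifteen words or phrases per list
    for r in (1, 8, 15):
        R = rank_values([r], n)[0]
        print(f"rank {r:2d} -> rank value {R:2d} -> C = {c_scale_value(R, n):.2f}")
```

Under this reading, the middle rank of a fifteen-item list maps to the scale mean of 5.0, and the first and last ranks fall symmetrically above and below it.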

The results of the normalized rank method derivation of scale values are summarized in Tables 2, 3 and 4, giving the mean score and standard deviation for each word or phrase. Interest lies in the average scale value and its consistency within a rating group and among rating groups. The standard deviations indicate the amount of dispersion for each word or phrase within each group. The smaller the standard deviation, the more consistently the item was rated within the group.
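As an illustration of how such a summary might be computed, the sketch below takes one group's derived scale values and returns the mean and standard deviation for each word. The data layout (one dictionary per respondent), the function name, and the numbers in the usage example are hypothetical. The use of the sample (n - 1) standard deviation is also an assumption, since the paper does not say which form was used.

```python
from statistics import mean, stdev


def summarize_group(respondents):
    """Return {word: (mean, standard deviation)} for one rating group.

    `respondents` is a list of dicts mapping each word to the scale value
    that respondent's ranking produced. The sample standard deviation is
    used here as an assumption.
    """
    words = list(respondents[0].keys())
    summary = {}
    for word in words:
        values = [resp[word] for resp in respondents]
        summary[word] = (mean(values), stdev(values))
    return summary


if __name__ == "__main__":
    # Hypothetical scale values for three respondents and two words.
    group = [
        {"often": 6.0, "never": 1.0},
        {"often": 8.0, "never": 1.5},
        {"often": 7.0, "never": 0.5},
    ]
    for word, (m, s) in summarize_group(group).items():
        print(f"{word:6s} mean = {m:.2f}  sd = {s:.2f}")
```

A word with a small standard deviation, like the hypothetical "never" above, is one that respondents in the group placed consistently.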

The researcher can use this information in developing rating scales with equal psychological distance between items. Upon deciding on the number of points on the scale, the researcher can select words or phrases far enough apart in terms of scale values so as to be relatively independent and with values such that the psychological distances between the items are approximately the same.
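One way to make this selection step concrete is sketched below. It is not a procedure taken from the paper: it assumes the researcher wants a k-point scale anchored at the lowest and highest mean scale values, sets equally spaced target values between them, and greedily picks the unused word closest to each target. The words and scale values in the example are hypothetical, not taken from Tables 2 to 4.

```python
def pick_equal_interval_labels(mean_scale_values, k):
    """Pick k labels whose mean scale values are roughly equally spaced.

    Assumed heuristic: place k equally spaced targets between the lowest
    and highest mean scale values, then choose for each target the
    not-yet-used word whose mean scale value lies closest to it.
    """
    if k < 2 or k > len(mean_scale_values):
        raise ValueError("k must be between 2 and the number of words")
    low = min(mean_scale_values.values())
    high = max(mean_scale_values.values())
    step = (high - low) / (k - 1)
    remaining = dict(mean_scale_values)
    chosen = []
    for i in range(k):
        target = low + i * step
        word = min(remaining, key=lambda w: abs(remaining[w] - target))
        chosen.append(word)
        del remaining[word]
    return chosen


if __name__ == "__main__":
    # Hypothetical mean scale values for a frequency list.
    values = {"never": 1.0, "seldom": 2.9, "sometimes": 5.1,
              "often": 6.0, "usually": 7.2, "always": 9.0}
    print(pick_equal_interval_labels(values, 5))
```

In practice the researcher would also consult the standard deviations, preferring words that were rated consistently and whose distributions overlap little with those of neighboring labels.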

Testing for Effects

A multivariate analysis of variance (MANOVA) was used to assess differences among groups and among contexts. MANOVA resembles analysis of variance except that the dependent variable for each observation is a vector rather than a single number. In this case the vector contained thirteen scale values since the two extreme adjectives of each list were not included in the analysis. Inclusion of the values for these adjectives would have rendered the error matrix singular.
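The paper does not explain why the full set of fifteen values produces a singular matrix. One contributing reason, offered here as our own illustration, is that rank-derived scores are ipsative: every respondent uses each rank exactly once, so the fifteen scale values always sum to the same constant and are therefore linearly dependent. The sketch below checks this with randomly generated rankings. The C-scale conversion is the same assumed midpoint-proportion mapping to a normal deviate, not a formula given in the paper, and the function names are illustrative.

```python
import random
from statistics import NormalDist


def c_values_for_ranking(ranking):
    """Assumed conversion: rank -> rank value -> normal deviate -> C scale."""
    n = len(ranking)
    normal = NormalDist()
    return [5.0 + 2.0 * normal.inv_cdf(((n - r + 1) - 0.5) / n) for r in ranking]


def respondent_totals(num_respondents, n, seed=0):
    """Sum of the n scale values for each simulated respondent."""
    rng = random.Random(seed)
    totals = []
    for _ in range(num_respondents):
        ranking = list(range(1, n + 1))
        rng.shuffle(ranking)  # a random complete ranking of the n words
        totals.append(sum(c_values_for_ranking(ranking)))
    return totals


if __name__ == "__main__":
    totals = respondent_totals(num_respondents=20, n=15)
    print("smallest total:", round(min(totals), 6))
    print("largest total: ", round(max(totals), 6))
```

Because the totals are identical across respondents, any one column of scale values can be written as a constant minus the others, and a covariance matrix built from all fifteen columns cannot be inverted.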

TABLE 2

MEAN SCORES AND STANDARD DEVIATIONS OF SCALE VALUES FOR FREQUENCY SET

TABLE 3

MEAN SCORES AND STANDARD DEVIATIONS OF SCALE VALUES FOR EVALUATION SET

TABLE 4

MEAN SCORES AND STANDARD DEVIATIONS OF SCALE VALUES FOR AGREEMENT SET

The results of the MANOVA analysis are shown in Table 5.

TABLE 5

RESULTS OF MANOVA ANALYSIS

The results show no significant main effects or interaction effect for the "frequency" list. Group effects, however, are present in the list of adjectives expressing evaluation. In other words, some scale values on the evaluation list show significant differences between students and housewives. Finally, both group effects and context effects (although somewhat weak) are present in the agreement list.

Univariate tests of individual scale measurements, shown in Table 6, indicate significant (α < 0.1) group differences for six of the evaluation adjectives and four of the agreement adjectives. Furthermore, five of the agreement adjectives showed significant context effects.
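For readers unfamiliar with how a univariate F ratio is formed, the sketch below computes one for a single adjective. It is a simplification and not the analysis reported in Table 6: it treats the comparison as a one-way design (for example, students versus housewives), whereas the study crossed two groups with four contexts. The function name and the data are hypothetical.

```python
def one_way_f_ratio(groups):
    """Return (F, df_between, df_within) for a one-way analysis of variance.

    `groups` is a list of lists, each holding one group's scale values for
    a single adjective. Simplified sketch: the study's two-way
    group-by-context design is not reproduced here.
    """
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Between-groups and within-groups sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


if __name__ == "__main__":
    students = [4.0, 5.0, 6.0]     # hypothetical scale values
    housewives = [6.0, 7.0, 8.0]   # hypothetical scale values
    f, df1, df2 = one_way_f_ratio([students, housewives])
    print(f"F({df1}, {df2}) = {f:.2f}")
```

The resulting ratio would be compared with the critical value of the F distribution for the stated degrees of freedom at the chosen significance level.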

TABLE 6

UNIVARIATE F RATIOS FOR ADJECTIVES

CONCLUSIONS AND IMPLICATIONS

Although some previous researchers (Mittelstaedt 1971) have recommended immediate application of derived scale values across groups and contexts, others have called for a more situation-specific scale development procedure. The findings of this study provide additional evidence for the latter in that both group and specific marketing context effects were evident for derived scale values of frequency, evaluative and agreement adjectives. The study's contribution lies in two areas: (a) it adds to a body of research in scale development and (b) its design includes an explicit marketing context. Also, the normalized rank method is a methodological procedure not often seen in the marketing literature. Given a particular group and specific context, this procedure can be used to develop rating scales with equal psychological distance between items.

Overall, the results of the study highlight the importance of developing specific rating scales and pretesting such scales in a context area similar to the one in which they will eventually be utilized. This approach would be in contrast to the common practice of borrowing scales designed for use in nonmarketing contexts.

REFERENCES

Albaum, Gerald, Best, Roger and Hawkins, Del (1977), "Measurement Properties of Semantic Scale Data," Journal of the Market Research Society, 19, 21-28.

Albaum, Gerald, Hawkins, Del and Best, Roger (1978), "A Note on the Intervalness of the Semantic Differential," Unpublished paper, University of Oregon.

Bartram, Peter and Yielding, David (1973), "The Development of an Empirical Method of Selecting Phrases Used in Verbal Rating Scales: A Report on a Recent Experiment," Journal of the Market Research Society, 15 (July), 151-6.

Best, Roger, Hawkins, Del and Albaum, Gerald (1977), "Reliability of Measured Beliefs in Consumer Research," William D. Perreault, Jr. (ed.), Advances in Consumer Research, Vol. IV, Atlanta: Association for Consumer Research, 19-23.

Cliff, N. (1959), "Adverbs as Multipliers," Psychological Review, 66, 27-44.

Dickson, John and Albaum, Gerald (1977), "A Method for Developing Tailormade Semantic Differentials for Specific Marketing Content Areas," Journal of Marketing Research, 14 (February), 87-91.

Green, Paul E., and Rao, V. R. (1970), "Rating Scales and Information Recovery--How Many Scales and Response Categories to Use?", Journal of Marketing, 34 (July), 33-39.

Guilford, J. P. (1950), Fundamental Statistics in Psychology and Education, 2nd ed. New York: McGraw-Hill Book Company.

Guilford, J. P. (1954), Psychometric Methods, New York: McGraw-Hill Book Company.

Jacoby, Jacob and Matell, Michael S. (1971), "Three Point Likert Scales Are Enough," Journal of Marketing Research, 8 (November), 495-500.

Jones, L. V. and Thurstone, L. L. (1955), "The Psycho-physics of Semantics," Journal of Applied Psychology, 39, 31-36.

Labovitz, S. (1970), "The Assignment of Numbers to Rank Order Categories," American Sociological Review, 35, 515-24.

Mittelstaedt, Robert A. (1971), "Semantic Properties of Selected Evaluative Adjectives: Other Evidence," Journal of Marketing Research, 8 (May), 236-7.

Myers, J. H. and Warner, W. G. (1972), "Semantic Properties of Selected Evaluation Adjectives," Journal of Marketing Research, 9 (November), 409-12.

Sharpe, L. K. and Anderson, W. T., Jr. (1972), "Concept-Scale Interaction in the Semantic Differential," Journal of Marketing Research, 9 (November), 432-4.

Shaw, M. E. and Wright, J. M. (1967), Scales for Measurement of Attitudes, New York: McGraw-Hill Book Company.

Spector, Paul E. (1976), "Choosing Response Categories for Summated Rating Scales," Journal of Applied Psychology, 61, No. 3, 374-5.

Tull, D. S. and Hawkins, D. I. (1976), Marketing Research: Meaning, Measurement and Method, New York: Macmillan Publishing Co., Inc.

Vidali, J. J. (1975), "Context Effects on Scaled Evaluatory Adjective Meaning," Journal of the Market Research Society, 17 (January), 215.

Wildt, Albert and Mazis, Michael B. (1978), "Determinants of Scale Response: Label Versus Position," Journal of Marketing Research, 15 (May), 261-7.
