Investigating "Income Refusals" in a Telephone Survey by Means of Logit Analysis

Robert A. Peterson, University of Texas at Austin
Robert P. Leone, University of Texas at Austin
Mohammad R. Sabertehrani, University of Texas at Austin
ABSTRACT - A logit analysis was employed to investigate the relationship between refusal to answer an income question and four demographic variables. Data were drawn from a telephone survey of 6178 adult consumers regarding their financial attitudes and behavior. All four demographic variables were significantly related to refusal to respond; the primary reason given for refusal related to "invasion of privacy."
[ to cite ]:
Robert A. Peterson, Robert P. Leone, and Mohammad R. Sabertehrani (1981), "Investigating 'Income Refusals' in a Telephone Survey by Means of Logit Analysis," in NA - Advances in Consumer Research Volume 08, eds. Kent B. Monroe, Ann Arbor, MI: Association for Consumer Research, Pages: 287-291.

Advances in Consumer Research Volume 8, 1981      Pages 287-291

INTRODUCTION

Associated with every empirical survey is a certain amount of "error." This error can be conceptually decomposed into sampling error and nonsampling error. Sampling error is a function of data quantity--the greater the quantity of survey data, the smaller the magnitude of sampling error. Nonsampling error, on the other hand, relates to the quality of survey data, how "good" or "meaningful" the data are with regard to various criteria such as validity, reliability, and generalizability.

Traditionally most research on survey error has focused upon sampling error, perhaps due in part to reasons of analytical ease. This is in spite of the fact that non-sampling error (data quality) is likely to be more directly associated with the quality of inferences derived from the data. Recently, however, there has been an increasing interest in, and research relating to, the assessment and impact of data quality. This research has not been limited to marketing, but pervades all the behavioral sciences (see for instance, Bailar and Lanphier (1978) for several data quality issues currently being addressed).

A major nonsampling error is nonresponse or refusal to cooperate. Nonresponse may be complete, as when a sampled individual refuses to participate at all in a survey, or it may be partial, as when a study individual refuses to answer a specific question. The latter type of refusal is commonly termed an item nonresponse and is an "error of omission." It most frequently occurs for questions relating to sensitive issues such as personal hygiene or moral behavior. Item nonresponse is particularly acute in telephone interviews because study individuals are offered neither the anonymity of a mail interview nor the social pressure to answer that a personal interview exerts.

The present research is concerned with investigating item nonresponse. While there has been voluminous research conducted on refusal to participate or cooperate in a survey per se, especially in the context of mail interviews, relatively little research has been conducted on individual item nonresponse. Still, what limited research does exist suggests that item nonresponse can significantly influence data quality in several ways (see Ferber 1966, Craig and McCann 1978, Wiseman and McDonald 1979).

Specifically, the present research consists of an investigation of respondent characteristics relating to refusal to answer a household income question in a telephone survey. The household income question is frequently a crucial question in consumer surveys for several reasons. Most obviously, income commonly serves as a dependent variable (to be estimated) for certain classes or groups of consumers. Moreover, income is often used as an independent variable to predict other, often behavioral, variables. In addition, income data are typically used to evaluate the representativeness of a sample by comparing sample income data with criterion (usually population or census) income data. Finally, income is sometimes employed as a surrogate variable or in combination with other variables in index form.

Despite the importance of the income question, nonresponse to income questions tends to be relatively high. Indeed, among demographic or socioeconomic variables commonly employed in marketing research surveys it tends to possess the highest refusal rate (Herriott 1977, Locander and Burton 1976). While anecdotal evidence would explain income nonresponse as a function of the "private nature" of income, little empirical evidence exists as to who refuses to answer the income question or why study individuals refuse to answer. The purpose of this paper is to provide insights into both the who and the why questions.

METHODOLOGY

Research Data

Data for this investigation were derived from telephone interviews with 6178 adult consumers in a southwestern metropolitan area. Interviews lasted 8-10 minutes and had as their subject matter financial attitudes and behavior. Data collection employed systematic random digit dialing, a non-directory dialing technique that gives each household in the specified exchanges an equal probability of being included in the sample. Two callback attempts (three attempts in all) were made to contact study individuals, and all interviewing was conducted between the hours of 6-9 p.m. weekdays and 10 a.m. to 9 p.m. weekends. Across all study individuals interviewed, the overall refusal rate for the income question was 12.6 percent (779 of the 6178 individuals interviewed).
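As a quick arithmetic check, the reported refusal rate can be recomputed from the two counts given above. The standard error shown is a sketch that assumes simple random sampling and a binomial model; the paper itself does not report it.

```python
import math

# Counts reported in the text
n_interviewed = 6178   # completed telephone interviews
n_refused = 779        # refusals on the income question

refusal_rate = n_refused / n_interviewed

# Binomial standard error of the rate.
# ASSUMPTION: simple random sampling; this figure is not from the paper.
std_error = math.sqrt(refusal_rate * (1 - refusal_rate) / n_interviewed)

print(f"refusal rate: {refusal_rate:.1%}")
print(f"approx. standard error: {std_error:.4f}")
```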

The dependent variable was refusal to answer an income question. This question was asked in a straightforward manner as "What is your total annual household income?" with five response categories provided. Four demographic characteristics were employed as independent variables: sex, marital status, education, and age. To facilitate data analysis the latter three variables were dichotomized. Marital status response categories consisted of "married" and "not married" (the latter including individuals who were single, divorced, or widowed). The "low" education category included individuals who possessed no college education, while "low age" study individuals were those 18-34 years of age. These variables were respectively coded as follows:

TABLE

Study data are presented in Table 1, which is essentially a four way (16 cell) contingency table. Observed proportions can be interpreted as refusal probabilities. Thus the probability was .087 that a low age, low education, married male refused to answer the household income question.
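Because the logit analysis described under Analytical Approach works with log-odds rather than raw probabilities, it may help to see the conversion for the one cell quoted above. This is a minimal sketch; the helper names `logit` and `inverse_logit` are illustrative and not taken from the paper.

```python
import math

def logit(p: float) -> float:
    """Natural log of the odds p / (1 - p)."""
    return math.log(p / (1.0 - p))

def inverse_logit(x: float) -> float:
    """Map a log-odds value back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

# Observed refusal probability for a low age, low education, married male
# (the only cell value quoted in the text).
p_cell = 0.087

odds = p_cell / (1.0 - p_cell)   # refusals per non-refusal
log_odds = logit(p_cell)         # the quantity a logit model predicts

print(f"odds: {odds:.4f}")
print(f"logit: {log_odds:.3f}")
print(f"back to probability: {inverse_logit(log_odds):.3f}")
```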

TABLE 1

INCOME QUESTION REFUSAL PROBABILITIES

Analytical Approach

Due to the nature of the research data (i.e., a discrete dependent variable and a relatively large sample size) a logit model approach was selected as appropriate for data analysis. The particular approach used ECTA, a likelihood ratio analytical model derived by Goodman (1972). Such an approach permits prediction of the probability of refusal to answer the income question (R) as a function of education (E), age (A), sex (S) and marital status (M). Technically the objective of logit analysis is to reduce a "saturated" model which incorporates all independent variable effects at all levels (and which obviously fits the observed data perfectly) into a more parsimonious model that still provides an acceptable fit to the data. The saturated model can be posed in standard ANOVA format as:

EQUATION   (1)

where $L^{R}_{ijkl}$ is the natural logarithm of the odds of refusal (refusal probability divided by nonrefusal probability), that is, the logit, and the $\mu$'s represent the respective main and interaction effects of the independent variables. [In general, the logit is the logarithm of the odds, $L = \ln[p/(1-p)]$, where $p = 1/(1 + e^{-x\beta})$.]
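Written out under the usual ANOVA-style conventions, a saturated logit model for four dichotomous predictors would take roughly the following form. The superscript and subscript layout is an assumption: superscripts name the variables (E, A, S, M) and subscripts i, j, k, l index their levels; the symbols in the published equation may differ.

```latex
\begin{aligned}
L^{R}_{ijkl} ={}& \mu
  + \mu^{E}_{i} + \mu^{A}_{j} + \mu^{S}_{k} + \mu^{M}_{l} \\
 &+ \mu^{EA}_{ij} + \mu^{ES}_{ik} + \mu^{EM}_{il}
  + \mu^{AS}_{jk} + \mu^{AM}_{jl} + \mu^{SM}_{kl} \\
 &+ \mu^{EAS}_{ijk} + \mu^{EAM}_{ijl} + \mu^{ESM}_{ikl} + \mu^{ASM}_{jkl}
  + \mu^{EASM}_{ijkl}
\end{aligned}
```

With one constant, four main effects, six two-way, four three-way, and one four-way interaction, this form has 16 free terms, one per cell of the contingency table, which is why a saturated model reproduces the observed data exactly.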

To simplify a saturated model M1, the $\mu$ terms are systematically constrained to zero in a stepwise manner and the loss in explanatory power is calculated at each step. Selection of terms to be included in the final model is done by simultaneously evaluating their standardized coefficients and explanatory power (variance accounted for). This approach proceeds in a hierarchical fashion, with an independent variable's lower order terms being simultaneously evaluated with significant higher order terms.
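The loss in fit at each step is typically measured with the likelihood ratio statistic G2 = 2 * sum(observed * ln(observed / expected)). The sketch below illustrates that calculation only; the counts are hypothetical (two groups of 100 respondents), not the study data, and the function name is an assumption.

```python
import math

def likelihood_ratio_g2(observed, expected):
    """G2 = 2 * sum(obs * ln(obs / exp)); empty cells contribute zero."""
    return 2.0 * sum(
        o * math.log(o / e) for o, e in zip(observed, expected) if o > 0
    )

# HYPOTHETICAL counts, ordered as
# [group 1 refused, group 1 answered, group 2 refused, group 2 answered]
observed = [30, 70, 45, 55]

# Saturated model: fitted counts equal observed counts, so G2 is zero.
expected_saturated = list(observed)

# Restricted model with the group effect constrained to zero:
# both groups share the pooled refusal rate 75 / 200 = 0.375.
expected_restricted = [37.5, 62.5, 37.5, 62.5]

g2_saturated = likelihood_ratio_g2(observed, expected_saturated)
g2_restricted = likelihood_ratio_g2(observed, expected_restricted)

# The increase in G2 is the fit lost by dropping the term.
print(f"G2 saturated:  {g2_saturated:.3f}")
print(f"G2 restricted: {g2_restricted:.3f}")
```

A large increase in G2 relative to the degrees of freedom gained suggests the dropped term should be retained; a small increase suggests the simpler model is adequate.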

TABLE 2

LIKELIHOOD RATIO RESULTS FOR SELECTED LOGIT MODELS

RESULTS

Table 2 contains a summary of the logit analysis. The saturated model M1 has been reduced to the best fit model M6 through the retention of significant independent variables. Thus model M6 is given as:

EQUATION

This model explains the probability of income refusal as a function of variation in respondent education, age, sex, marital status and the interaction between education and age. As shown in the last column of Table 2, the model explains nearly 95 percent of the variance in income refusal probabilities. To further investigate model fit, a predicted probability for each cell in Table 1 was computed and compared with the cell's observed probability. Results from this comparison are presented in Table 3. Overall model M6 fit the data relatively well; the average absolute difference was .013. The relative importance of the independent variables in predicting refusal can be assessed both by the column "Gain in G2" in Table 2 and by the estimated standardized effect coefficients in Table 4. The larger the absolute value of the coefficient the greater the magnitude of the effect.
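Based on the verbal description above (four main effects plus the education-by-age interaction), model M6 can be sketched as follows. The notation is an assumption: superscripts name the variables E, A, S, and M, subscripts i, j, k, l index their levels, and a hat marks a model-predicted probability.

```latex
L^{R}_{ijkl} = \mu + \mu^{E}_{i} + \mu^{A}_{j} + \mu^{S}_{k} + \mu^{M}_{l} + \mu^{EA}_{ij},
\qquad
\hat{p}_{ijkl} = \frac{1}{1 + e^{-L^{R}_{ijkl}}}
```

Under the same notation, the reported average absolute difference between observed and predicted probabilities over the 16 cells corresponds to:

```latex
\frac{1}{16} \sum_{i,j,k,l} \left| p_{ijkl} - \hat{p}_{ijkl} \right| = .013
```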

Who Refused to Answer?

Table 5 presents the observed probabilities of refusal associated with the significant effects in model M6. In general:

- respondents possessing lesser amounts of education were more likely to refuse than respondents possessing higher amounts of education;

- older respondents were more likely to refuse than younger respondents;

- females were more likely to refuse than males; and

- married respondents were more likely to refuse than unmarried respondents.

Moreover, the interaction between age and education produced a significant relationship with refusal. Finally, the least likely respondent to refuse to answer the household income question was a young, highly educated, unmarried male. The most likely respondent to refuse to answer was an older, less educated, married female.

Hence, refusal to answer the income question in a telephone survey was found not to be randomly distributed across respondents. Rather it was significantly related to the demographic characteristics investigated.

TABLE 3

OBSERVED PROBABILITIES, PREDICTED PROBABILITIES, AND RESIDUALS FOR MODEL M6

TABLE 4

ESTIMATES OF MODEL M6 EFFECTS

TABLE 5

OBSERVED PROBABILITY OF REFUSAL ASSOCIATED WITH EACH EFFECT

Why Did They Refuse?

Ten percent of the study individuals refusing to answer the income question were reinterviewed by telephone and questioned as to the reason for their refusal. This resulted in 67 completed reinterviews. By far the most frequent refusal reason given related to "invasion of privacy": 76 percent of the refusals were attributable to perceived privacy invasion. While study individuals giving this answer were found in all demographic categories, nonmarried females were more likely than other study individuals to respond in this manner.

Twenty percent of the study individuals reinterviewed stated they did not know their annual household incomes. These individuals tended to be older, less educated, married females. Further exploration revealed that these married females were provided only a weekly allowance by their husbands; they were simply not informed as to their husbands' earnings.

Finally, 4 percent of the study individuals reinterviewed said they did not have time to calculate their household income since it would be necessary to take into account a spouse's income as well as "other" income sources. For them responding required too much effort.

In brief, reasons for refusing to answer the income question differed across respondents and also (albeit to a lesser degree) related to demographic characteristics. Thus the results lend support to those of Skelton (1963) with respect to why study individuals refuse to answer an income question.

CONCLUSION

The present research investigated the relationship between refusal to answer a household income question in a telephone survey and four demographic variables. Using the logit model approach of Goodman, a significant relationship was found between each of the four independent variables and refusal to answer. Moreover, reasons given for not answering an income question also differed across the individuals studied.

The implications of these research findings are twofold. First, the mere existence of a relatively large refusal rate for the income question increases the standard error of that variable and impedes investigation of relationships between income and other variables. Second, because nonresponse was not randomly distributed, any estimate of population income from the sample will be biased unless corrections are made. Moreover, any calculated relationships involving income are likely to be distorted due to the differential refusal rates.
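The second implication can be illustrated with a small worked example. All numbers below are hypothetical and chosen only to show the mechanism: when refusal rates differ between groups whose incomes also differ, the mean income among respondents drifts away from the population mean, and reweighting groups to their population shares removes the drift (under the added assumption that, within a group, refusers and respondents have the same mean income).

```python
# HYPOTHETICAL population: two demographic groups with different mean
# incomes and different income-question refusal rates.
groups = {
    "group_a": {"size": 500, "mean_income": 20_000, "refusal_rate": 0.05},
    "group_b": {"size": 500, "mean_income": 30_000, "refusal_rate": 0.20},
}

total = sum(g["size"] for g in groups.values())
true_mean = sum(g["size"] * g["mean_income"] for g in groups.values()) / total

# Number of people in each group who actually answer the question.
respondents = {
    name: g["size"] * (1 - g["refusal_rate"]) for name, g in groups.items()
}

# Naive estimate: average over whoever answered.
naive_mean = sum(
    respondents[name] * groups[name]["mean_income"] for name in groups
) / sum(respondents.values())

# Corrected estimate: weight each group's respondent mean by the group's
# population share. ASSUMPTION: respondent mean equals group mean.
weighted_mean = sum(
    (groups[name]["size"] / total) * groups[name]["mean_income"]
    for name in groups
)

print(f"true mean:     {true_mean:,.2f}")
print(f"naive mean:    {naive_mean:,.2f}")
print(f"weighted mean: {weighted_mean:,.2f}")
```

In this sketch the higher-income group refuses more often, so the naive estimate understates mean income; reversing the refusal rates would reverse the direction of the bias.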

Still, while this paper documented income question nonresponse, more research needs to be conducted on item nonresponse in general. Ultimately, strategies for effectively reducing as well as handling item nonresponse need to be developed and implemented.

REFERENCES

Bailar, Barbara and Lanphier, Michael C. (1978), Development of Survey Methods to Assess Survey Practices. Washington, D.C.: American Statistical Association.

Craig, C. Samuel and McCann, John (1978), "Item Nonresponse in Mail Surveys: Extent and Correlates," Journal of Marketing Research, 15, 285-289.

Ferber, Robert (1966), "Item Nonresponse in a Consumer Survey," Public Opinion Quarterly, 30, 399-415.

Goodman, Leo (1972), "A Modified Multiple Regression Approach to the Analysis of Dichotomous Variables," American Sociological Review, 37, 28-46.

Herriott, R. (1977), "Collecting Income Data on Sample Surveys: Evidence From Split-Panel Studies," Journal of Marketing Research, 14, 322-329.

Locander, William and Burton, John (1976), "The Effect of Question Form on Gathering Income Data by Telephone," Journal of Marketing Research, 13, 189-192.

Skelton, Vincent C. (1963), "Patterns Behind 'Income Refusals'," Journal of Marketing, 27, 38-41.

Wiseman, Frederick and McDonald, Philip (1979), "Noncontact and Refusal Rates in Consumer Telephone Surveys," Journal of Marketing Research, 16, 478-484.
