Dealing With Indecision - Should We ... Or Not?

Ian Fenwick, York University
Frederick Wiseman, Northeastern University
John Becker, Becker Research Corporation
James Heiman, Becker Research Corporation
ABSTRACT - The treatment of "don't know", "no opinion" or other nonsubstantive responses is a problem in many consumer research surveys. This paper looks at the problem in the context of 1980 Presidential election opinion polls. An unusually high proportion of the electorate were "undecided" even in the final weeks of the campaign. Discriminant analysis is used to allocate undecided voters to candidates. The method is validated by a post-election follow-up survey.

Advances in Consumer Research Volume 9, 1982      Pages 247-250




As a Presidential election year, 1980 produced the usual flood of political opinion polls. One noticeable aspect of many of these polls was the high proportion of respondents who could not, or would not, state a voting intention. Not only did there appear to be more undecided voters than in previous years, but their prevalence was particularly high in marginal states. Although the problem of "don't knows", "undecideds" and other non-substantive responses has often been discussed in the consumer research literature, and attempts have been made to characterize non-substantive responders (e.g., Ferber 1966; Bogart 1967; Sicinski 1970; Francis and Busch 1975), few methods have been developed to deal with the problem.

This paper uses discriminant analysis to allocate undecided voters to candidates. The method is applied to political opinion data collected two weeks prior to the 1980 election and is validated by post-election follow-up interviews.




Table 1 summarizes findings from some of the major national and regional polls conducted prior to the 1980 Presidential election. Statewide polls for Pennsylvania, California, Michigan and New York all show undecided voters as more than 10% of the sample. Every poll in Table 1 estimates the gap between the leading candidates as less than the undecided vote. Clearly, the behavior of the undecideds has a major impact on the final election result, and the allocation of these undecided voters is crucial for successful election predictions and effective campaign strategy.

Two methods have been used to reduce the number of undecideds - a secret ballot and a leaning question. In the former approach, respondents are actually given a ballot and asked to indicate how they would vote if the election were held on the day the poll was being taken. Respondents, after choosing their candidate, place their ballot in a ballot box, thus maintaining complete confidentiality of their preference. Perry (1960, 1979) indicates that, in Gallup's experience, using a secret ballot typically reduces the undecided percentage by as much as one third.

In the second method, undecided voters are asked the following question: "As of today, do you lean more to Candidate A or more to Candidate B?" Data from the Gallup organization indicate that this method is also effective in reducing undecideds, but not as effective as the secret ballot.

Each of the above methods has a major limitation -- the secret ballot can only be used in personal interview surveys; with the leaning approach, typically no more than 50% of the undecideds will actually express a leaning.

In summary, there is currently no satisfactory approach for allocating undecideds. The problem is growing in magnitude, since most polling organizations now rely on telephone surveys, where the more effective secret ballot approach cannot be used.

The method that we propose is applicable under all modes of data collection and is described in the next section.


Estimating the voting behavior of undecided voters is a two stage process. First, we fit a model to explain voting preferences of decided voters, using attitudinal, candidate evaluation and demographic data together with decided voters' stated voting intention. Second, this model is used to predict preferences for undecided voters given their attitudinal, candidate evaluation and demographic characteristics.

As voter preference is a categorical, or nominally scaled, variable, the appropriate analysis tool is multiple discriminant analysis. This technique uses a set of predictors to classify individuals into one of two or more mutually exclusive groups. This is done by estimating discriminant functions (linear weighted combinations of predictors) such that groups are maximally separated by their scores on these functions (discriminant scores). If there are only two groups, the procedure is analogous to regression analysis with a dummy variable as the dependent measure: a single discriminant function will be estimated and its coefficients (predictor weights) can be interpreted as regression coefficients. If there are more than two groups, the procedure is intuitively more complex, analogous to canonical correlation with dummy dependent variables, and there will be more than one discriminant function to interpret: N groups yield N-1 discriminant functions. In contrast to regression analysis, goodness of fit is measured not by variance explained but by the proportion of individuals that are correctly classified, the "hit rate".
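As a purely illustrative sketch of this two-stage procedure, the following code fits a linear discriminant model on decided voters and then applies it to the undecideds, using scikit-learn. The file name, column names and predictor list are hypothetical placeholders, not the actual survey items.

```python
# Minimal sketch of the two-stage allocation procedure.
# File and column names are hypothetical placeholders, not the actual survey items.
import pandas as pd
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

voters = pd.read_csv("survey.csv")   # attitudes, candidate evaluations, demographics
predictors = ["carter_rating", "reagan_rating", "anderson_rating", "age", "party_id"]

# Stage 1: fit the discriminant model on decided voters only.
decided = voters[voters["intention"].isin(["Anderson", "Carter", "Reagan"])]
lda = LinearDiscriminantAnalysis()   # three groups -> two discriminant functions
lda.fit(decided[predictors], decided["intention"])
print("hit rate:", lda.score(decided[predictors], decided["intention"]))

# Stage 2: apply the fitted functions to classify the undecideds.
undecided = voters[voters["intention"] == "Undecided"].dropna(subset=predictors)
shares = pd.Series(lda.predict(undecided[predictors])).value_counts(normalize=True)
print(shares)                        # share of undecideds allocated to each candidate
```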

For the 1980 Presidential election a 3-way discriminant analysis is appropriate. Decided voters can belong to any of three groups: intending Anderson voters, intending Carter voters and intending Reagan voters. Consequently two discriminant functions will be estimated.

Data for this study were obtained by telephone interview from a probability sample of 500 registered Massachusetts voters, systematically drawn from all statewide telephone directories. The sample was evenly divided between male and female voters. The actual interviewing was completed over a three-day period, October 17-19, approximately two weeks before the general election. Clearly, all the voter classifications presented here are as of that time and cannot reflect subsequent changes in voters, candidates or issues. Data collected included voting intentions ("If the November election for President were being held today and the candidates were John Anderson, Jimmy Carter and Ronald Reagan, for whom would you vote?"); candidate evaluations; opinions on a range of political issues, national and local; and demographic characteristics.


Table 2 presents the results of the voting intentions question. Eleven percent of the sample reported that they did not intend to vote. Of those intending to vote, 20% claimed they would vote for Anderson, 31% for Carter and 28% for Reagan. Twenty percent of all intending voters expressed no candidate intention. It is this 20% that we wish to classify.

A stepwise version of discriminant analysis was used, allowing the best discriminators to be selected from 18 possible predictors. The final model adopted included 11 variables. These and their standardized coefficients are shown in Table 3. By far the best discriminators were candidate favorability ratings; these alone provided almost as good discrimination as the entire model. [Using candidate favorabilities alone produces a hit rate of 82%.] As can be seen, there is a very marked division between those variables involved mainly in the first discriminant function and those involved mainly in the second. The first discriminant function relies only on variables involving Reagan and Carter; the second uses variables involving Anderson and Carter. In broad terms, respondents intending to vote for Reagan are separated out by the first discriminant function; the remainder are then divided between Carter and Anderson by the second. It is as well to point out that we make no attempt to attribute causality in this model. It is not intended to model how voters choose candidates, or even to assert that candidate preferences determine voting intention. Almost certainly there are major simultaneity problems: candidate favorabilities both influence and are influenced by voting intention. The model in Table 3 is purely descriptive of the differences between groups of decided voters.
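The stepwise selection itself is not detailed here; as a hedged illustration only, a loose modern analogue is forward sequential selection wrapped around an LDA classifier and scored on the hit rate. Variable names continue the hypothetical ones from the earlier sketch.

```python
# Loose analogue of stepwise predictor selection for the discriminant model.
# `decided` and `all_18_predictors` continue the hypothetical names used above.
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.feature_selection import SequentialFeatureSelector

selector = SequentialFeatureSelector(
    LinearDiscriminantAnalysis(),
    n_features_to_select=11,   # the final model reported in Table 3 kept 11 of 18 predictors
    direction="forward",
    scoring="accuracy",        # select predictors on the classification hit rate
    cv=5,
)
X, y = decided[all_18_predictors], decided["intention"]
selector.fit(X, y)
print(list(X.columns[selector.get_support()]))   # the retained discriminators
```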



Classifying Decided Voters

The two discriminant functions can now be used to calculate two discriminant scores for each decided voter. The mean scores for intending Carter, intending Anderson and intending Reagan voters are plotted in Figure 1. These represent the group centroids, or positions of the average voter of each type. Each decided voter is classified by plotting their discriminant scores in Figure 1 and assigning them to the group whose centroid is closest. [This assignment rule assumes groups are of equal size. If it is known a priori that groups differ in size, the allocation rule can be improved (Anderson, 1958). In all that follows, equal-sized groups are assumed; use of a priori probabilities to reflect unequal sizes could improve hit rates.] The effective boundaries between groups are shown by the broken lines in Figure 1. For example, respondent 102 (Figure 1) most closely resembles the average intending Carter voter and is therefore classified as an intending Carter voter. If this respondent had, in fact, reported the intention to vote for Carter, this is a correct classification, a "hit".
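The nearest-centroid rule can be made concrete in a few lines; this sketch continues the hypothetical objects defined earlier and, like the paper, assumes equal group sizes (equal prior probabilities).

```python
# Nearest-centroid classification in discriminant-score space (equal group sizes assumed,
# as in the paper). Continues the hypothetical `lda`, `decided`, `undecided`, `predictors`.
import numpy as np

scores = lda.transform(decided[predictors])   # two discriminant scores per decided voter
labels = decided["intention"].to_numpy()
centroids = {g: scores[labels == g].mean(axis=0) for g in lda.classes_}

def classify(score_pair):
    """Assign a voter to the candidate whose group centroid is closest."""
    return min(centroids, key=lambda g: np.linalg.norm(score_pair - centroids[g]))

assigned = [classify(s) for s in lda.transform(undecided[predictors])]
```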





Explanatory power of the discriminant analysis can be judged by the classification matrix, or "hits table", shown in Table 4. Diagonal elements give the number of correctly classified voters. Overall, 86% of decided voters were assigned to the correct candidate. This hit rate was highest for intending Reagan voters (93%) and lowest for intending Anderson voters (80%). The greatest confusion was between Carter and Anderson: 28 voters were misallocated between this pair compared to 12 between Carter and Reagan and 11 between Anderson and Reagan.



Although there are some problems in the direct interpretation of hit rates (Morrison, 1969), in particular that we are fitting and testing the model with the same set of data, the 86% hit rate is much higher than would be expected by chance alone and is reassuringly high across all three voter groups.
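A classification matrix like Table 4 and the associated hit rate can be computed directly from the fitted model; again, this continues the hypothetical sketch above.

```python
# Classification matrix ("hits table") and overall hit rate for decided voters.
from sklearn.metrics import confusion_matrix

predicted = lda.predict(decided[predictors])
actual = decided["intention"].to_numpy()
print(confusion_matrix(actual, predicted, labels=list(lda.classes_)))  # rows: actual, cols: predicted
print("overall hit rate:", (predicted == actual).mean())
```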

Classifying Undecided Voters

The hit rate shows that the discriminant model fits well for decided voters; however, our major concern is the undecided voters. Undecided voters can be classified by following exactly the same procedure as outlined above. A pair of discriminant scores is calculated for each undecided voter and plotted in Figure 1; again, the voter is allocated to the candidate whose decided-voter centroid is closest. This procedure assigned 20% of undecided voters to Anderson, 37% to Carter and 43% to Reagan. [Three undecided voters had missing values on one or more of the discriminating variables and were consequently dropped from the analysis.]


Although the discriminant model fits the data well for decided voters, there is no immediate way to test its classification of the crucial undecideds. Thus, it was decided to conduct short follow-up interviews with a sample of the original respondents. This follow-up sample included 47 undecided voters and 52 decided voters. Field work took place on November 18-20, two weeks after the election.

Amongst other questions, respondents were asked how they voted in the 1980 Presidential election. Of those undecided voters who claimed to have cast a vote for one of the three major candidates, 61% voted as classified by the model. Accuracy was particularly high for Reagan voters (85%) and poorer for Carter (43%) and Anderson (46%).

Some part of the deviation between classification and voting behavior could be explained by changes in voters or candidates occurring between the initial survey (October 17-19) and the November 4th election, or by response bias in the follow-up survey. These non-model sources of error cannot be quantified directly for undecided voters. However, it is possible to estimate them for decided voters: we can compare decided voters' intentions as stated on October 17-19 with their reports of voting behavior gathered on November 18-20. To the extent that decided voters are influenced by the same events and response biases as undecided voters, they may be used to calibrate predictions for undecided voters.

A comparison of pre- and post-election responses revealed that 24% of decided voters did not vote for the candidate that they had planned to vote for during the period October 17-19. The greatest switching occurred amongst intending Anderson voters, of whom a third did not vote for Anderson. As voters claiming to be "undecided" on October 17-19 were presumably more susceptible to pre-election events than their decided counterparts, an error rate of at least 24% in classifying undecided voters could be attributed to voter and candidate changes between survey and election date. Consequently, our 61% correct classification is very good: 80% of the suggested maximum of 76%.
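The calibration arithmetic behind that last figure is simple enough to show explicitly, using only the percentages reported above.

```python
# Calibration arithmetic for the undecideds (figures from the text above).
switch_rate_decided = 0.24                    # decided voters who did not vote as stated
suggested_maximum = 1 - switch_rate_decided   # ~0.76 attainable classification accuracy
observed_accuracy = 0.61                      # undecideds who voted as classified
print(observed_accuracy / suggested_maximum)  # ~0.80 of the suggested maximum
```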


Almost all opinion research produces some non-substantive responses. The way these responses are treated can have important implications for the conclusions of the analysis. The 1980 Presidential election polls provided a useful environment for studying this problem. Undecided voters were particularly prevalent and in many states held the balance between the leading candidates. However, the Presidential election was a temporally fixed event, such that the undecideds were forced to reach a final decision by polling day. Consequently, it was possible to arrange follow-up interviews to check the undecideds' final actions. Although response error is a possibility (respondents may not report their true voting behavior), validation is more convenient than in many opinion research settings.

The model presented in this paper uses the characteristics of decided voters to estimate a discriminant function which then predicts voting patterns for undecided voters. Follow-up interviews show the model to be fairly accurate, correctly classifying over 60% of undecided voters. This is considerably better than could be achieved by chance and close to the accuracy achieved for decided voters (76%).

The key variables in this study were those related to candidate favorability. Whether such variables would be the major discriminators in future studies remains a matter for speculation.

Clearly, the method needs further testing and validation. A successful application in one poll cannot justify widespread adoption. In particular it will be useful to compare discriminant classifications with alternative methods of allocating undecided voters.

The appeal of the method outlined in this paper is that it can be applied under any mode of data collection. Existing methods, the leaning approach and the secret ballot, each have a major limitation that restricts their usefulness.


Anderson, T. W. (1958), An Introduction to Multivariate Statistical Analysis, New York: John Wiley & Sons.

Bogart, L. (1967), "No Opinion, Don't Know and Maybe No Answer," Public Opinion Quarterly, 31, 331-345.

Ferber, R. (1966), "Item Nonresponse in a Consumer Survey," Public Opinion Quarterly, 30, 399-415.

Francis, J. D. and Busch, L. (1975), "What We Now Know About 'I Don't Knows'," Public Opinion Quarterly, 39, 207-218.

Morrison, D. G. (1969), "On the Interpretation of Discriminant Analysis," Journal of Marketing Research, 6, 156-163.

Perry, P. (1960), "Election Survey Procedures of the Gallup Poll," Public Opinion Quarterly, 24, 531-562.

Perry, P. (1979), "Problems in Election Survey Methodology," Public Opinion Quarterly, 43, 312-325.

Sicinski, A. (1970), "'Don't Know' Answers in Cross-National Surveys," Public Opinion Quarterly, 34, 126-129.