The Use of Characteristic Rules For Identifying Latently Dissatisfied Customers

ABSTRACT - Data mining is the automated search for hidden, previously unknown, interesting and ultimately understandable knowledge in large databases. This paper describes the use of data mining, more specifically characteristic rules, in the domain of a customer satisfaction study. In particular, it tackles the problem of how to identify latently dissatisfied customers. These customers still report satisfaction but have approximately the same characteristics as dissatisfied customers, and might therefore have a high probability of becoming dissatisfied. Identifying these customers at an early stage provides the opportunity to take corrective action.



Citation:

Josee Bloemer, Tom Brijs, Gilbert Swinnen, and Koen Vanhoof (2001) ,"The Use of Characteristic Rules For Identifying Latently Dissatisfied Customers", in AP - Asia Pacific Advances in Consumer Research Volume 4, eds. Paula M. Tidwell and Thomas E. Muller, Provo, UT : Association for Consumer Research, Pages: 328-334.

Asia Pacific Advances in Consumer Research Volume 4, 2001      Pages 328-334

THE USE OF CHARACTERISTIC RULES FOR IDENTIFYING LATENTLY DISSATISFIED CUSTOMERS

Josee Bloemer, Limburg University Centre, Belgium

Tom Brijs, Limburg University Centre, Belgium

Gilbert Swinnen, Limburg University Centre, Belgium

Koen Vanhoof, Limburg University Centre, Belgium


INTRODUCTION

This paper deals with the issue of customer (dis)satisfaction and proposes a data mining technique, more specifically characteristic rules, to identify latently dissatisfied customers. Briefly, latently dissatisfied customers can be described as customers who, when asked, report overall satisfaction but who show characteristics indicating a level of dissatisfaction. We demonstrate the effectiveness of the adopted approach by using real-world data obtained from a large-scale customer satisfaction survey carried out by a leading Belgian bank.

In the second section of this article, the concept of latently dissatisfied customers will be introduced and described in more detail. In the third section, we introduce the 3-step methodology of partial classification. In the fourth section, the proposed technique will be applied to empirical data from the financial services sector. The fifth section discusses the validity of the model and finally, the sixth section provides conclusions, limitations and directions for future research.

LATENTLY DISSATISFIED CUSTOMERS

Research on the determinants of customer satisfaction generally assumes that there is no difference between the causes of satisfaction or dissatisfaction. Several authors, however, suggest that there are some determinants which tend to be primarily a source of satisfaction and others that tend to be primarily a source of dissatisfaction (Cadotte and Turgeon, 1988; Hausknecht, 1988; Swan and Combs, 1976). In a previous study (Vanhoof and Swinnen, 1996), the authors introduced a method that enables the drawing of a distinction between the impact of criteria on satisfaction and dissatisfaction. The results indicated that there is indeed a difference in the direction and magnitude of the impact.

In a study of the bank sector, Johnston (1995) used the critical-incident technique (see also Bitner, 1990, for the use of this method in a service setting) to classify customer perceptions (anecdotes) into satisfying and dissatisfying factors. While most determinants were found to be a source of both satisfaction and dissatisfaction (but ranking differently with respect to impact) there were a few (4 out of 18) that exclusively determined either satisfaction or dissatisfaction with the bank.

Most studies on customer satisfaction and dissatisfaction typically assume a 0-1 situation in which a customer is either satisfied or dissatisfied. This is not entirely accurate. The level of customer satisfaction or dissatisfaction can be represented on a continuum with manifest satisfaction at one end and manifest dissatisfaction at the other (Oliver, 1997). At the midpoint of this continuum, there is an area in which it is no longer clear whether the customer is really satisfied or dissatisfied. When asked, the customer may report satisfaction but he or she may also report dissatisfaction. These customers are not absolutely sure about their service evaluation and, more importantly, about how they should act upon that evaluation. From the literature it has become clear that those customers who are manifestly satisfied with a product show a greater tendency to act in accordance with that evaluation than those customers who are only latently satisfied (Bloemer and Kasper, 1995). This means that customers who are manifestly satisfied are more often truly brand loyal than those who are only latently satisfied.

The ability to clearly identify satisfied and dissatisfied customers in terms of other characteristics, such as their satisfaction with particular service items and their socio-demographic characteristics, entails the ability to identify those customers who report that they are satisfied but who have the typical characteristics of dissatisfied customers. These customers can be defined as latently dissatisfied, even though they report satisfaction. Figure 1 illustrates the idea of identifying latently dissatisfied customers.

Curve 1 in Figure 1 represents the cumulative relative number of dissatisfied customers, which is characterized by a diminishing marginal rate of increase when we move to the area that is more characteristic of satisfaction (to the right). In contrast, curve 2 represents the cumulative relative number of satisfied customers, which is characterized by a rising marginal rate of increase when we move to the area that is more characteristic of satisfaction. When we are able to capture characteristics that are typical of (dis)satisfaction and we can also define an ordering on the basis of these characteristics, we expect that the cumulative relative number of dissatisfied (satisfied) customers described by these characteristics increases with a diminishing (respectively increasing) marginal rate when we move to the area that is more characteristic of satisfaction.

Furthermore, it can be observed that, for some customers, their overall service satisfaction evaluation does not correspond with their characteristics. This means that some customers, for instance those situated in the hatched area, report satisfaction although they are typified by characteristics of dissatisfaction. Therefore, we define the customers in this area as latently dissatisfied because they may be highly vulnerable to becoming manifestly dissatisfied customers in the near future.

METHODOLOGY

Identifying latently dissatisfied customers cannot be considered a straightforward classification task. Therefore, we use a partial classification procedure.

A partial classification procedure describes the discovery of models that show characteristics of the data classes, but may not cover all examples of any given class. The aim is to learn rules that are individually valid instead of constructing a model which covers all instances of a given class. In this study, we will use characteristic rules as a partial classification technique to infer strong characteristics (in the form of rules) from the group of dissatisfied customers. These characteristic rules will then be used to identify latently dissatisfied customers. The proposed methodology involves three steps:

1. Discover characteristic rules for dissatisfied customers (Agrawal et al., 1996; Mannila, 1997).

2. Remove non-interesting characteristic rules from step 1 by means of a filtering procedure (Anand et al., 1997; Kamber and Shinghal, 1996).

3. Match satisfied customers covered by the rules retained from step 2.

FIGURE 1

IDENTIFYING LATENTLY DISSATISFIED CUSTOMERS

AN EMPIRICAL STUDY

Data Collection

The dataset contains the results of a survey that a leading Belgian bank carried out among its private clients in 1996. The objective of the survey was to assess the level of customer satisfaction with the bank’s services.

Questionnaire

The questionnaire (see Appendix 1) includes questions related to specific service aspects of the bank, questions on socio-demographic characteristics of the respondents and a question probing for the overall level of satisfaction. Results were obtained for a random sample of 7264 clients.

Subjects were asked to indicate to what extent they agreed with the statements presented in the questionnaire. All statements related to the bank’s service aspects were measured on a 5-level ordinal scale with responses ranging from always, most often, sometimes and rarely to never, plus a no opinion option, the latter indicating a missing value.

A total of 7264 instances were obtained, of which only 445 (6.1%) were classified in the group of dissatisfied customers, illustrating the skewness of the class frequency distribution.

Data Analysis and Results

The identification of latently dissatisfied customers follows the 3-step methodology introduced in the previous section.

Step 1: An algorithm for discovering frequent sets was used to discover opinions that frequently occur together in the group of dissatisfied customers. 'Frequent' is defined by the user in the form of a support (frequency of occurrence) threshold, which in this study was set at 20%. The threshold specifies that no opinion or set of opinions will be considered for the discovery of characteristic rules if it is not mentioned by at least 20% of the dissatisfied respondents. This is to prevent the discovery of random phenomena in the data. Indeed, the specification of the support threshold allows the discovery process to be directed so that only structural patterns in the data are discovered. In total, 97 characteristic rules for dissatisfaction were discovered, of which the following sample provides a brief illustration:

RULE 1: Question 1=dissatisfied AND Question 15=dissatisfied <- Dissatisfaction=TRUE [21.35%]

RULE i: Question 4=dissatisfied AND Question 7=dissatisfied <- Dissatisfaction=TRUE [37.53%]

For instance, rule 1 should be interpreted as follows: 21.35% of the respondents who indicated that they were dissatisfied on the target question (general level of satisfaction) also reported dissatisfaction on question 1 (Q1=In my bank office, leaflets are available with all necessary information) and question 15 (Q15=I get enough information from my bank by means of correspondence).
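The frequent-set search of step 1 can be sketched as follows. This is a minimal level-wise (Apriori-style) illustration in Python, not the implementation used in the study. The function name `frequent_sets`, the item encoding (`"Q1=dissatisfied"`) and the five toy respondents are assumptions made for the example, and the threshold is raised to 40% only because the toy sample is so small.

```python
from itertools import combinations

def frequent_sets(transactions, min_support):
    """Level-wise search for itemsets whose support within `transactions`
    (a list of sets of items) is at least `min_support` (a fraction)."""
    n = len(transactions)
    # Level 1: count how often each single opinion occurs.
    counts = {}
    for answers in transactions:
        for item in answers:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c / n for s, c in counts.items() if c / n >= min_support}
    result = dict(current)
    k = 2
    while current:
        # Candidates of size k: unions of two frequent (k-1)-sets whose
        # (k-1)-subsets are all frequent themselves.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in current for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
        current = {}
        for cand in candidates:
            support = sum(1 for answers in transactions if cand <= answers) / n
            if support >= min_support:
                current[cand] = support
        result.update(current)
        k += 1
    return result

# Hypothetical answers of five dissatisfied respondents (not survey data).
dissatisfied = [
    {"Q1=dissatisfied", "Q15=dissatisfied"},
    {"Q1=dissatisfied", "Q15=dissatisfied", "Q4=dissatisfied"},
    {"Q4=dissatisfied", "Q7=dissatisfied"},
    {"Q4=dissatisfied", "Q7=dissatisfied", "Q1=dissatisfied"},
    {"Q15=dissatisfied"},
]
found = frequent_sets(dissatisfied, min_support=0.4)
for itemset, support in sorted(found.items(), key=lambda kv: -kv[1]):
    print(sorted(itemset), f"[{support:.0%}]")
```

Because the search is restricted to the dissatisfied group, every frequent set found can be read as a characteristic rule of the kind shown above, with its support as the bracketed percentage.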

This information is very useful because it describes the typical characteristics of a dissatisfied customer. However, as already indicated in the methodological section, one must be careful with the interpretation of these results. The discovered rules for dissatisfaction may be characteristic for the whole dataset too.
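The caveat can be made concrete by comparing the support of a rule in the dissatisfied group with its support among the remaining respondents. The sketch below does this with a simple normalized difference, `contrast`, which is a stand-in chosen for illustration and is not the interestingness measure of Anand et al. (1997). The function names and the toy respondents are likewise assumptions.

```python
def support(itemset, group):
    """Fraction of respondents in `group` whose answers contain `itemset`."""
    return sum(1 for answers in group if itemset <= answers) / len(group)

def contrast(itemset, target, rest):
    """Stand-in score in [-1, 1] (NOT the measure used in the study):
    positive when the itemset is more frequent in `target` than in `rest`."""
    s_target, s_rest = support(itemset, target), support(itemset, rest)
    largest = max(s_target, s_rest)
    return 0.0 if largest == 0 else (s_target - s_rest) / largest

# Hypothetical respondents (not survey data).
dissatisfied = [{"Q1=dis", "Q15=dis"}, {"Q1=dis"}, {"Q15=dis"}, {"Q4=dis"}]
satisfied = [{"Q1=dis", "Q15=dis"}, {"Q15=dis"}, {"Q15=dis"}, {"Q4=dis"}]

q1 = frozenset({"Q1=dis"})    # more typical of dissatisfied customers
q15 = frozenset({"Q15=dis"})  # more typical of satisfied customers
q4 = frozenset({"Q4=dis"})    # equally frequent in both groups

for rule in (q1, q15, q4):
    print(sorted(rule), round(contrast(rule, dissatisfied, satisfied), 3))
```

A rule such as `q4`, which passes the support threshold among dissatisfied customers but is just as frequent elsewhere, scores zero and says nothing specific about dissatisfaction.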

Step 2: Therefore, in the second step, an interestingness measure (Anand et al., 1997) is introduced to construct a rank-order of the rules from interesting to non-interesting for dissatisfaction. Ranking the characteristic rules according to their interest underlines the importance of focusing on the most interesting rules for dissatisfaction. Interest values between 0.9195 (highly interesting) and -0.68 (less interesting) were obtained. The question remains, however, where to draw the borderline between strong and weak rules for dissatisfaction. This problem can be resolved as follows:

If we define an ordering of the characteristic rules according to their interestingness measure, then for each rule $R_i$ the additional number of dissatisfied instances $\Delta|R_i|_d$ not already covered by previous rules with a higher interest can be computed as follows (see appendix 2 for more details):

$$\Delta|R_i|_d = |R_i|_d - |(R_0 \cup \dots \cup R_{i-1}) \cap R_i|_d$$

The same procedure can be applied to the group of satisfied instances. As a result, for the discovered characteristic rules under the given support settings, the cumulative relative number of satisfied and dissatisfied customers can be plotted as illustrated in figure 2.
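The incremental coverage computation can be sketched as below. It is a minimal illustration under assumed names (`incremental_coverage`) and toy data. The rules are taken as already sorted from most to least interesting, and each rule is represented as a set of opinions that a respondent must hold to be covered.

```python
def incremental_coverage(ordered_rules, group):
    """For rules sorted by decreasing interest, return a list of
    (additional instances, cumulative relative number) per rule."""
    covered = set()  # indices of respondents covered by earlier rules
    curve = []
    for rule in ordered_rules:
        matched = {i for i, answers in enumerate(group) if rule <= answers}
        # |R_i| minus the part of R_i already covered by earlier rules.
        additional = len(matched) - len(covered & matched)
        covered |= matched
        curve.append((additional, len(covered) / len(group)))
    return curve

# Hypothetical rules and respondents (not survey data).
rules = [frozenset({"a"}), frozenset({"b"}), frozenset({"c"})]
dissatisfied = [{"a"}, {"a", "b"}, {"b"}, {"a", "c"}, {"d"}]
satisfied = [{"d"}, {"d"}, {"b"}, {"c"}, {"d"},
             {"d"}, {"a", "d"}, {"d"}, {"d"}, {"d"}]

curve_dissatisfied = incremental_coverage(rules, dissatisfied)
curve_satisfied = incremental_coverage(rules, satisfied)
print(curve_dissatisfied)
print(curve_satisfied)
```

In the toy data the third rule adds nothing for the dissatisfied group while still adding satisfied respondents, which is the pattern the two curves are meant to expose.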

FIGURE 2

OBSERVED IDENTIFICATION OF SATISFIED AND DISSATISFIED CUSTOMERS BASED ON EMPIRICAL DATA

Note that we used a moving average procedure to smooth both curves, so that the major breakpoints in both lines become more clearly visible. Otherwise, one could be misled by small local differences in the data, which suggest breakpoints at positions that would not be advisable from an overall perspective.
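A moving average of this kind can be sketched as follows. The window width and the treatment of the curve ends are not reported in the text, so the centred window of three points that shrinks at the boundaries is an assumption, as is the function name `moving_average`.

```python
def moving_average(values, window=3):
    """Centred moving average; the window shrinks at both ends of the series."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed

# Hypothetical jagged series of additional instances per rule.
raw = [0, 3, 0, 3, 0]
print(moving_average(raw))  # local spikes are flattened
```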

In figure 2, two points of interest can be recognized. A first point of interest is situated on the upper curve (1), representing the cumulative relative number of dissatisfied customers. A simple visual inspection of this curve shows that from the 40th rule onwards (see black arrow in figure 2), the additional number of instances covered, not already described by the 39 previous rules, becomes very small. This is in fact the law of diminishing returns: the marginal gain in the number of dissatisfied instances becomes smaller with each additional rule considered in the analysis. As a result, one can decide to consider only the 40 most interesting rules and discard the others as non-interesting, based on the low additional number of instances that will be covered by rules 41 to 97.

A second point of interest in figure 2 is situated on the dashed curve (2), representing the cumulative relative number of satisfied customers. A visual inspection again reveals a breakpoint in the curve (see white arrow in figure 2) where suddenly the additional number of satisfied instances covered, by introducing additional characteristic rules, increases rapidly.

The above results strengthen our belief that the best 29 characteristic rules for dissatisfaction only describe a small proportion (10.7%) of the population of satisfied respondents and that these individuals must be considered latently dissatisfied. By introducing lower quality rules, i.e. by moving into the direction of the area that is characterized by satisfaction, the probability that an instance will belong to the group of latently dissatisfied customers will decline.

Step 3: The third and final step in the methodology involves matching the interesting characteristic rules for dissatisfaction with 'satisfied' instances in the database. Figure 2 leaves open the choice between the 29 and the 40 most interesting rules. When the minimum description length (MDL) principle is used to determine the optimal cutpoint (for details, see Brijs et al., 1999), this cutpoint is situated at the 29th rule. This set of 29 rules covers 73.9% of the dissatisfied customers and 10.7% of the satisfied customers; the latter group, identified by the hatched area in figure 2, constitutes the latently dissatisfied customers.
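The matching step can be sketched as below: a respondent who reports overall satisfaction is flagged as latently dissatisfied when his or her answers satisfy at least one retained rule. The function name, the two example rules and the four respondents are hypothetical and serve only to illustrate the mechanism.

```python
def latently_dissatisfied(satisfied, retained_rules):
    """Indices of satisfied respondents covered by at least one retained
    characteristic rule for dissatisfaction."""
    return [i for i, answers in enumerate(satisfied)
            if any(rule <= answers for rule in retained_rules)]

# Hypothetical retained rules and satisfied respondents (not survey data).
retained_rules = [
    frozenset({"Q1=dissatisfied", "Q15=dissatisfied"}),
    frozenset({"Q4=dissatisfied", "Q7=dissatisfied"}),
]
satisfied = [
    {"Q1=satisfied", "Q15=satisfied"},
    {"Q1=dissatisfied", "Q15=dissatisfied", "Q4=satisfied"},
    {"Q4=dissatisfied", "Q7=satisfied"},
    {"Q4=dissatisfied", "Q7=dissatisfied"},
]
flagged = latently_dissatisfied(satisfied, retained_rules)
share = len(flagged) / len(satisfied)
print(flagged, f"{share:.0%} of satisfied respondents flagged")
```

Note that a respondent must match every opinion in a rule to be covered, so the third respondent, who is dissatisfied on question 4 only, is not flagged.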

FIGURE 3

COMPLAINT BEHAVIOUR FOR DIFFERENT CUSTOMER GROUPS

MODEL VALIDITY

Internal Validity

To evaluate the stability of the selected rules, the results of a second, identical questionnaire carried out in 1997 were used. It was observed that under the same support setting (20%), 27 of the 29 interesting rules discovered in 1996 were still valid in 1997, indicating a high stability of the selected ruleset. For the two rules that were not validated, support was slightly insufficient. Altogether, however, it is worth noting that for the 29 selected rules the correlation between the interestingness measures for 1996 and 1997 was 95%. Therefore, one can conclude that 27 of the 29 rules for dissatisfaction discovered in the 1996 analysis are both interesting and consistent over time. Slight variations over time may be caused by concept drift.
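The two stability checks described here, re-testing support in the second year and correlating the interest values of both years, can be sketched as follows. The text does not state which correlation coefficient was used, so Pearson's is an assumption. All rule names and numbers below are hypothetical.

```python
from math import sqrt

MIN_SUPPORT = 0.20  # same support setting as in the 1996 analysis

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical values for four rules (not taken from the study).
support_1997 = {"rule_a": 0.31, "rule_b": 0.24, "rule_c": 0.19, "rule_d": 0.22}
interest_1996 = [0.90, 0.75, 0.60, 0.40]
interest_1997 = [0.88, 0.70, 0.65, 0.35]

# A rule remains valid if it still meets the support threshold a year later.
still_valid = [name for name, s in support_1997.items() if s >= MIN_SUPPORT]
r = pearson(interest_1996, interest_1997)
print(still_valid, round(r, 3))
```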

External Validity

To assess the external validity of the model, two tests are carried out. From the literature it is known that the number of complaints formulated by the customer is a valid indicator for the level of dissatisfaction of that customer (Day, 1984; TARP, 1986; Fornell and Wernerfelt, 1987; Heskett, Sasser and Schlesinger, 1997). Therefore, as a first test of validity, we use complaint behaviour as a measure of criterion validity of the discovered model. The second validity test concerns the analysis of the defection rate of each group of customers (dissatisfied, satisfied and latently dissatisfied).

Test 1: Analysis of Complaints Behaviour

Specifically, for each customer in the survey the number of complaints he or she submitted in 1997 was obtained. Then, for each group of customers, i.e. dissatisfied, satisfied and latently dissatisfied, the number of complaints (in percentages) is plotted as can be seen in figure 3.

The first bar represents dissatisfied customers who are covered by our model, i.e. they are considered as prototypical examples of dissatisfied customers. The second bar represents the group of latently dissatisfied customers, i.e. satisfied customers that are covered by the proposed model for dissatisfaction. Finally, the last bar represents the group of satisfied customers, who are not covered by our model, i.e. they are considered as manifestly satisfied customers.

Two important observations can be made with regard to this figure. Firstly, complaints behaviour for the two groups of customers who are covered by the model (bars 1 and 2) differs markedly from the complaints behaviour of the group of customers not covered by the model (bar 3). Indeed, the percentage of complaints for the groups of dissatisfied and latently dissatisfied customers is considerably higher than the percentage observed for the group of satisfied customers. A statistical test on the differences between the proportions of the different customer groups could be carried out. However, the value of significance testing in this situation is debatable: because of the large number of observations in each customer group, a significance test on the proportions would be misleading, as most differences would turn out to be significant. We therefore consider the observed difference in complaints behaviour as evidence for the effectiveness and validity of our model.
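The remark about sample size can be illustrated with a standard two-proportion z-test. The sketch below uses invented complaint counts, not the study's figures, and compares the same pair of proportions (5.5% versus 5.0%) at two group sizes. The function name `two_proportion_z` is an assumption.

```python
from math import sqrt, erfc

def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = erfc(abs(z) / sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_value

# Hypothetical counts: identical proportions, group sizes 200 versus 20000.
z_small, p_small = two_proportion_z(11, 200, 10, 200)
z_large, p_large = two_proportion_z(1100, 20000, 1000, 20000)
print(round(z_small, 2), round(p_small, 3))
print(round(z_large, 2), round(p_large, 3))
```

With the proportions held fixed, the test statistic grows with the square root of the group size, so a difference of half a percentage point that is far from significant with 200 respondents per group falls below the 5% level with 20000.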

Secondly, although not directly related to validity, it can be seen that complaints behaviour differs with regard to the type of complaint that was formulated. In general, all customers seem to have problems with the opening hours of the bank, while flexibility of the staff is far less often a cause for complaints. In the light of dissatisfaction management, these results can be used to set priorities for corrective actions.

Test 2: Analysis of Defection Rate

As we have mentioned before, latently dissatisfied customers have a high probability of becoming manifestly dissatisfied in the near future. Consequently, an important managerial implication of the presented method would be to enable managers to take corrective actions to prevent latently dissatisfied customers from becoming manifestly dissatisfied and eventually defecting. Therefore, we analyse the defection rate for the different customer groups (dissatisfied, satisfied and latently dissatisfied). Unfortunately, because the survey was carried out anonymously, we have no exact data concerning the defection rate of the customers in our study. Instead, we use a proxy variable in the survey which assesses whether, in the future, the customer intends to concentrate more of his or her activities with other banks instead of with the current bank. Although the answers to this question may not be a fully reliable indicator for customer defection, figure 4 shows some remarkable differences between the different customer groups.

FIGURE 4

DOES THE CUSTOMER HAVE THE INTENTION TO GO ELSEWHERE?

Figure 4 illustrates that, for the different groups of customers, different proportions of customers intend to concentrate their activities more with other banking institutions in the near future. For instance, it can be seen that the relative number of customers in the groups of dissatisfied and latently dissatisfied customers who intend to go elsewhere (first and second bar with answer 'yes') is much higher than in the group of satisfied customers (third bar with answer 'yes'). Also, the relative number of customers in the group of satisfied customers who do not intend to go elsewhere (third bar with answer 'no') is much higher than in the other two groups (first and second bar with answer 'no'). To conclude, both observations indicate that the probability that a customer will go elsewhere is much higher in the groups of dissatisfied and latently dissatisfied customers than in the group of satisfied customers.

Finally, it is remarkable that within the group of customers who are undecided, the largest proportion consists of latently dissatisfied customers. In other words, customers who tend to be somewhat ambiguous, i.e. expressing overall satisfaction but possessing characteristics of dissatisfied customers, also tend to be undecided with regard to their future involvement with the bank.

CONCLUSIONS AND DIRECTIONS FOR FUTURE RESEARCH

Theoretical and Managerial Implications

From a theoretical point of view, we tackled the problem of identifying latently dissatisfied customers, i.e. identifying customers who report overall satisfaction but who possess characteristics of dissatisfied customers. We argued that complete classification techniques are not appropriate when unbalanced datasets are used and target groups cannot be clearly separated. Therefore, a partial classification technique, i.e. characteristic rules, in combination with an interestingness measure was used to discover interesting rules for dissatisfaction. To pick out the set of most relevant rules with regard to the problem presented, we calculated breakpoints in the curves of the cumulative number of satisfied and dissatisfied instances. The model was validated internally by using test data and externally by using data on complaints behaviour and data concerning a proxy variable for defection rate. Results indicated that remarkable differences with regard to complaints behaviour and defection rate could be identified when comparing the figures for dissatisfied, latently dissatisfied and manifestly satisfied customers.

From a managerial point of view, this study shows that the identification of latently dissatisfied customers can indeed be considered as an early warning signal, providing the opportunity to correct a problem before real damage is done. We have shown that after having identified latently dissatisfied customers, standard profiling and classification techniques should be used to target the right customer segments with the right corrective actions to deal with latent dissatisfaction most effectively. This means that latently dissatisfied customers should be turned into manifestly satisfied customers who show true loyalty to the company (Bloemer and Kasper, 1995; Jones and Sasser, 1995; Reichheld and Sasser, 1990; Reichheld, 1993, 1996; Waterhouse and Morgan, 1994).

Limitations of the Study

Although the results of this study indicate that characteristic rules provide an efficient and effective instrument to identify latently dissatisfied customers, one should consider the following limitations. First of all, there are limitations with regard to response tendency, i.e. the survey reflects the intentions of customers instead of their actual behaviour. Indeed, because the questionnaire was carried out anonymously, actual behaviour (for example defection rate) could not be observed. Instead, customers were asked about their intentions to leave the bank (proxy variable). As a result, it is not certain whether intentional behaviour can be considered a valid indicator for actual behaviour. Secondly, there are limitations with regard to the dataset that has been used: the current results are based on a single (analysis) data set. A longitudinal instead of an ad-hoc research project would increase the reliability of the results of this study.

Future Research

The results of the study suggest some interesting topics for future research. First of all, a more rigorous approach to determining the borderline between rules for dissatisfaction and satisfaction should be based on a cost-benefit analysis. Indeed, when we are able to determine the costs of managing and monitoring more (less interesting) rules for dissatisfaction by moving the borderline more to the right, and we are also able to determine the benefits of this action in terms of being able to grasp and retain more latently dissatisfied customers (cf. lifetime value), then it is possible to determine an optimal positioning of this borderline. Secondly, different recoding schemas could be used to discretize the possible range of answers to the questions. This will also affect the selection of threshold settings, which in turn will influence the structure and volume of characteristic rules that will be discovered in the data. With regard to this topic, new characteristic rule discovery techniques are currently being developed that are able to find optimal recoding schemas automatically.

APPENDIX 1

QUESTIONS ON SPECIFIC SERVICE ASPECTS OF THE BANK

REFERENCES

Agrawal, R., Imielinski, T., and Swami, A., (1993), "Mining association rules between sets of items in large databases". In: Buneman, P., Jajodia, S., (Eds.), Proceedings of ACM SIGMOD Conference on Management of Data (SIGMOD’93), 207-216.

Anand, S., Hughes, J., Bell, D., and Patrick, A., (1997), "Tackling the cross-sales problem using data mining". In: Lu, H., Motoda, H., Liu, H., (Eds.), Proceedings of the First Pacific-Asia Conference on Knowledge Discovery and Data Mining, 331-343.

Bloemer, J.M.M., and Kasper, J.D.P. (1995), "The complex relationship between consumer satisfaction and brand loyalty". Journal of Economic Psychology 16, 311-329.

Cadotte, E.R., and Turgeon, N. (1988), "Dissatisfiers and satisfiers: suggestions for consumer complaints and compliments". Journal of Consumer Satisfaction, Dissatisfaction and Complaining Behavior 1, 74-79.

Day, R.L. (1984), "Modelling choices among alternative responses to dissatisfaction". Advances in Consumer Research 11, 496-499.

Fornell, C., and Wernerfelt, B. (1987), "Defensive Marketing strategy by complaint management: a theoretical analysis". Journal of Marketing Research 24, 337-346.

Hausknecht, D. (1988), "Emotional measures of satisfaction/dissatisfaction". Journal of Consumer Satisfaction, Dissatisfaction and Complaining Behavior 1, 25-33.

Heskett, J.L., Sasser, W.E., and Schlesinger, L.A. (1997), The service profit chain: How leading companies link profit and growth to loyalty, satisfaction and value. The Free Press, New York.

Johnston, R. (1995). "Determinants of Service Quality: Satisfiers and Dissatisfiers". International Journal of Service Industry Management 6 (5), 53-71.

Jones, T., and Sasser, W. (1995), "Why Satisfied Customers Defect". Harvard Business Review, November-December, 88-99.

Kamber, M., and Shinghal, R., (1996). "Evaluating the interestingness of characteristic rules". In: Simoudis, E., Han, J., Fayyad, U.M., (Eds.), Proceedings of the Second International Conference on Knowledge Discovery & Data Mining, 263-266.

Mannila, H. (1997), "Methods and problems in data mining", Proceedings of the International Conference on Database Theory, pp. 41-55.

Oliver, R.L. (1997). Satisfaction: A behavioral perspective on the consumer. McGraw-Hill, New York, 98-131.

Reichheld, F., and Sasser, E. Jr. (1990), "Zero defections: Quality comes to Services". Harvard Business Review, September-October 1990, 106-111.

Reichheld, F. (1993), "Loyalty-based management". Harvard Business Review, March-April, 64-73.

Reichheld, F. (1996). The Loyalty Effect. The Hidden Force Behind Growth, Profits and Lasting Value, Bain Company, Inc., Harvard Business School Press, Boston.

Swan, J.E., and Combs, L.J. (1976). "Product performance and consumer satisfaction: a new concept". Journal of Marketing 40, 25-33.

Vanhoof, K., and Swinnen, G. (1996), "Attribute Importance. Assessing nonlinear patterns of factors contributing to Customer Satisfaction". In: ESOMAR Publication Series Volume 204: Research Methodologies for the New Marketing, November, 160-171.

Waterhouse, K., and Morgan, A. (1994), "Using research to help to keep good customers: understanding the process of customer defection and developing a strategy for customer retention". Marketing & Research Today 22 (3), 181-194.
