A Simultaneous Approach to Constrained Multiple Correspondence Analysis and Cluster Analysis For Market Segmentation


Citation:

Heungsun Hwang, Byunghwa Yang, and Yoshio Takane (2005), "A Simultaneous Approach to Constrained Multiple Correspondence Analysis and Cluster Analysis For Market Segmentation", in AP - Asia Pacific Advances in Consumer Research Volume 6, eds. Yong-Uon Ha and Youjae Yi, Duluth, MN: Association for Consumer Research, Pages: 197-199.

Asia Pacific Advances in Consumer Research Volume 6, 2005, Pages 197-199

A SIMULTANEOUS APPROACH TO CONSTRAINED MULTIPLE CORRESPONDENCE ANALYSIS AND CLUSTER ANALYSIS FOR MARKET SEGMENTATION

Heungsun Hwang, HEC Montreal, Canada

Byunghwa Yang, University of Michigan, U.S.A.

Yoshio Takane, McGill University, Canada

EXTENDED ABSTRACT

A common practice in cluster-based market segmentation is first to uncover a low-dimensional representation of the variables (e.g., the first few principal components) using a data reduction technique such as principal components analysis, factor analysis, or multidimensional scaling, and then to apply cluster analysis to the low-dimensional data to identify a set of segments (Arimond & Elfessi, 2001; Furse, Punj, & Stewart, 1984; Green, Schaffer, & Patterson, 1988; Sheppard, 1996). This two-step sequential, or tandem, approach (Arabie & Hubert, 1994) has been advocated for substantive reasons (see Green & Krieger, 1995).

Despite its popularity, many authors have warned about a critical problem inherent in the tandem approach. Specifically, there is no guarantee that the low-dimensional representation of the data obtained in step one is optimal for subsequently identifying segmentation structures, because data reduction is carried out with no reference to cluster analysis (Arabie & Hubert, 1994; Chang, 1983; DeSarbo, Jedidi, Cool, & Schendel, 1990; De Soete & Carroll, 1994). Preliminary data reduction may therefore mask or distort the true segmentation structures in the original data. Green and Krieger (1995) offer empirical examples that support the legitimacy of this concern. Similarly, Vichi and Kiers (2001) present a simulation-based example in which tandem analysis failed to identify the correct segments in the context of principal components analysis. Technically, the problem stems from the fact that each step of the tandem approach involves a different optimization criterion (i.e., one criterion for data reduction and another for cluster analysis) and that these criteria are addressed separately.
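The concern can be illustrated with a minimal sketch of the tandem approach on synthetic data. Everything here is an assumption made for illustration and is not taken from the paper: the two-variable data set, the helper names (pca_scores, kmeans, agreement), and the deterministic farthest-first start for k-means. Two segments differ only on a low-variance variable, while an unrelated variable carries most of the variance, so the first principal component retains the noise and drops the segment signal.

```python
import numpy as np

def pca_scores(X, n_components):
    """Project column-centered data onto its leading principal components."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are principal axes, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def kmeans(X, k, n_iter=100):
    """Plain Lloyd iterations with a deterministic farthest-first start."""
    centers = [X[np.argmax(((X - X.mean(axis=0)) ** 2).sum(axis=1))]]
    while len(centers) < k:
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        # Keep the old centroid if a cluster happens to become empty.
        new = np.array([X[labels == c].mean(axis=0) if np.any(labels == c)
                        else centers[c] for c in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels

def agreement(a, b):
    """Share of cases on which two 2-cluster labelings agree, up to relabeling."""
    acc = np.mean(a == b)
    return max(acc, 1.0 - acc)

rng = np.random.default_rng(0)
truth = np.repeat([0, 1], 200)                          # two hypothetical segments
x = rng.normal(0.0, 5.0, size=400)                      # high-variance noise variable
y = (2 * truth - 1) + rng.normal(0.0, 0.2, size=400)    # low-variance segment signal
X = np.column_stack([x, y])

tandem = kmeans(pca_scores(X, 1), 2)                # step 1: PCA, step 2: k-means
discarded = kmeans(pca_scores(X, 2)[:, [1]], 2)     # k-means on the component PCA dropped
print(agreement(truth, tandem), agreement(truth, discarded))
```

Under these assumptions, clustering on the first component should agree with the true segments at roughly chance level, whereas the component discarded by the data reduction step should separate them almost perfectly.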

As a solution to this problem, the combined use of data reduction and cluster analysis in a single framework has been recommended (Bock, 1987; DeSarbo, Howard, & Jedidi, 1991; De Soete & Carroll, 1994; Heiser, 1993; van Buuren & Heiser, 1989; Vichi & Kiers, 2001). In essence, this amounts to obtaining a low-dimensional representation of the variables and classifying cases into a set of segments simultaneously. More technically, it involves combining the two different optimization criteria into a single one. This simultaneous approach ensures that the low-dimensional data are chosen so as to facilitate the identification of segments.

Nevertheless, in the simultaneous approach the low-dimensional data are often difficult to interpret, so that the resulting segments become difficult to characterize. To enhance the interpretability of the low-dimensional data, one may use additional information or prior knowledge about the data. Such information can be incorporated in the form of linear constraints (Bockenholt & Bockenholt, 1990; Nishisato, 1984; Takane & Shibayama, 1991; Takane, Yanai, & Mayekawa, 1991; ter Braak, 1988; van Buuren & de Leeuw, 1992; Yanai, 1986, 1998). Imposing constraints on the data may simplify the interpretation of the solutions obtained, because the data to be analyzed are already structured by the constraints (Bockenholt & Bockenholt, 1990). From a more technical perspective, one may obtain more reliable parameter estimates if the imposed constraints are consistent with the data (Hwang & Takane, 2002).
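As a rough illustration of what a linear constraint can look like for categorical data, the sketch below dummy-codes one variable and restricts its category weights W to the form W = H A, where H is a known constraint matrix and A holds the free parameters. The reparametrization, the toy data, and the names (indicator_matrix, Z, H) are assumptions chosen for illustration; the cited papers cover a wider range of constraint types.

```python
import numpy as np

def indicator_matrix(codes, n_categories):
    """Dummy-code one categorical variable: one row per case, one column per category."""
    Z = np.zeros((len(codes), n_categories))
    Z[np.arange(len(codes)), codes] = 1.0
    return Z

# Hypothetical variable with four categories (0..3) observed on six cases.
codes = np.array([0, 1, 2, 3, 1, 2])
Z = indicator_matrix(codes, 4)

# Hypothetical constraint matrix H: categories 1 and 2 must share one weight,
# so the four category weights W are restricted to W = H @ A with three free rows in A.
H = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])

# Constrained design matrix: the columns for categories 1 and 2 are merged.
ZH = Z @ H
print(ZH)
```

Because Z W = (Z H) A, fitting the free parameters A against the merged design matrix is equivalent to fitting W under the equality constraint, which is why such a constraint structures the data before the analysis begins.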

In this paper, a new tool for market segmentation is proposed. The method is designed to provide a low-dimensional representation of categorical variables and, simultaneously, to classify cases into a set of segments. It also allows linear constraints to be imposed on the variables so as to facilitate the interpretation of solutions. More specifically, it combines (1) constrained multiple correspondence analysis (Hwang & Takane, 2002; van Buuren & de Leeuw, 1992) for obtaining constrained low-dimensional data with (2) the k-means algorithm (MacQueen, 1967) for identifying segments.

In this paper, an optimization criterion that combines the criterion for constrained multiple correspondence analysis and that for the k-means algorithm in a single framework is presented. An alternating least squares algorithm (de Leeuw, Young, & Takane, 1976) is developed to minimize the optimization criterion for parameter estimation. By analyzing data collected on clothing brands and attributes, the authors empirically demonstrate that the method affords a flexible and parsimonious graphical display of segmentation structures inherent in multivariate categorical data. Although the contribution of the proposed method to the segmentation literature is largely technical, its important implications for consumer researchers and practitioners are also discussed.
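The abstract does not state the criterion itself, so the following is only a sketch under an assumed form, not the authors' formulation: alpha * sum_j ||F - X_j A_j||^2 + (1 - alpha) * ||F - U R||^2, minimized subject to F'F = I and centered F. Here F holds the low-dimensional object scores, each X_j is an indicator matrix (possibly post-multiplied by a constraint matrix), A_j the category weights, U the binary segment memberships, and R the segment centroids. The weight alpha, the function name als_sketch, and the synthetic three-item data are all hypothetical.

```python
import numpy as np

def _center(M):
    """Remove column means."""
    return M - M.mean(axis=0)

def als_sketch(designs, n_dims=2, n_clusters=2, alpha=0.5, n_iter=30, seed=0):
    """Alternating least squares for the ASSUMED combined criterion described above."""
    rng = np.random.default_rng(seed)
    n = designs[0].shape[0]
    # Random start: centered, orthonormal scores and arbitrary memberships.
    F, _ = np.linalg.qr(_center(rng.normal(size=(n, n_dims))))
    labels = rng.integers(n_clusters, size=n)
    losses = []
    for _ in range(n_iter):
        # (1) Category weights: least-squares fit of F on each design matrix.
        fits = [X @ np.linalg.lstsq(X, F, rcond=None)[0] for X in designs]
        # (2) Centroids: segment means of F (zero row for an empty segment).
        U = np.eye(n_clusters)[labels]
        R = np.linalg.pinv(U) @ F
        # (3) Memberships: assign each case to its nearest centroid (k-means step).
        dist = ((F[:, None, :] - R[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        U = np.eye(n_clusters)[labels]
        # (4) Scores: orthogonal Procrustes update of F given everything else.
        M = alpha * sum(fits) + (1 - alpha) * (U @ R)
        P, _, Qt = np.linalg.svd(_center(M), full_matrices=False)
        F = P @ Qt
        losses.append(alpha * sum(((F - G) ** 2).sum() for G in fits)
                      + (1 - alpha) * ((F - U @ R) ** 2).sum())
    return F, labels, losses

# Synthetic example: three hypothetical 3-category items, two segments of 30 cases,
# each segment favoring a different response category.
rng = np.random.default_rng(1)
segment = np.repeat([0, 1], 30)
probs = np.array([[0.8, 0.1, 0.1], [0.1, 0.1, 0.8]])
designs = [np.eye(3)[[rng.choice(3, p=probs[s]) for s in segment]] for _ in range(3)]

F, labels, losses = als_sketch(designs)
print(losses[0], losses[-1])
```

Each of the four steps minimizes the assumed criterion over one block of parameters with the others held fixed, so the recorded loss cannot increase from one iteration to the next. As with k-means in general, the result may depend on the random start.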

REFERENCES

Arabie, P., and L. Hubert (1994), "Cluster Analysis in Marketing Research," in Advanced Methods of Marketing Research, R. P. Bagozzi, ed. Oxford: Blackwell, 160-189.

Arimond, G., and A. Elfessi (2001), "A Clustering Method for Categorical Data in Tourism Market Segmentation Research," Journal of Travel Research, 39, 391-397.

Bezdek, J. C. (1974), "Numerical Taxonomy with Fuzzy Sets," Journal of Mathematical Biology, 1, 57-71.

Bock, H. H. (1987), "On the Interface between Cluster Analysis, Principal Component Analysis, and Multidimensional Scaling," in Multivariate Statistical Modeling and Data Analysis, H. Bozdogan and A. K. Gupta, eds. New York: Reidel, 17-34.

Benzecri, J. P. (1973), L’Analyse des Données. Vol. 2. L’Analyse des Correspondances. Paris: Dunod.

Bockenholt, U., and I. Bockenholt (1990), "Canonical Analysis of Contingency Tables with Linear Constraints," Psychometrika, 55, 633-639.

Bockenholt, U., and I. Bockenholt (1991), "Constrained Latent Class Analysis: Simultaneous Classification and Scaling of Discrete Choice Data," Psychometrika, 56, 699-716.

Bockenholt, U., and Y. Takane (1994), "Linear Constraints in Correspondence Analysis," in Correspondence Analysis in Social Sciences, M. J. Greenacre and J. Blasius, eds. London: Academic Press, 112-127.

Botschen, G., E. M. Thelen, and R. Pieters (1999), "Using Means-End Structures for Benefit Segmentation: An Application to Services," European Journal of Marketing, 33, 38-58.

Chang, W. (1983), "On Using Principal Components before Separating a Mixture of Two Multivariate Normal Distributions," Applied Statistics, 32, 267-275.

de Leeuw, J., F. W. Young, and Y. Takane (1976), "Additive Structure in Qualitative Data: An Alternating Least Squares Method with Optimal Scaling Features," Psychometrika, 41, 471-503.

DeSarbo, W. S., K. Jedidi, K. Cool, and D. Schendel (1990), "Simultaneous Multidimensional Unfolding and Cluster Analysis: An Investigation of Strategic Groups," Marketing Letters, 2, 129-146.

DeSarbo, W. S., D. J. Howard, and K. Jedidi (1991), "MULTICLUS: A New Method for Simultaneously Performing Multidimensional Scaling and Clustering," Psychometrika, 56, 121-136.

De Soete, G., and J. D. Carroll (1994), "K-Means Clustering in a Low-Dimensional Euclidean Space," in New Approaches in Classification and Data Analysis, E. Diday et al., eds. Heidelberg: Springer, 212-219.

Dolničar, S., and F. Leisch (2001), "Behavioral Market Segmentation of Binary Guest Survey Data with Bagged Clustering," in ICANN 2001, G. Dorffner, H. Bischof, and K. Hornik, eds. Berlin: Springer-Verlag, 111-118.

Furse, D., G. N. Punj, and D. W. Stewart (1984), "A Typology of Individual Search Strategies among Purchasers of New Automobiles," Journal of Consumer Research, 10, 417-431.

Gifi, A. (1990), Nonlinear Multivariate Analysis, Chichester: Wiley.

Goodman, L. A. (1974), "Exploratory Latent Structure Analysis Using Both Identifiable and Unidentifiable Models," Biometrika, 61, 215-231.

Green, P. E., F. J. Carmone, and D. P. Wachspress (1976), "Consumer Segmentation via Latent Class Analysis," Journal of Consumer Research, 3, 170-174.

Green, P. E., and A. M. Krieger (1995), "Alternative Approaches to Cluster-Based Market Segmentation," Journal of the Market Research Society, 37, 221-239.

Green, P. E., C. M. Schaffer, and K. M. Patterson (1988), "A Reduced-Space Approach to the Clustering of Categorical Data in Market Segmentation," Journal of the Market Research Society, 30, 267-288.

Greenacre, M. J. (1984), Theory and Applications of Correspondence Analysis. London: Academic Press.

Gutman, J. (1982), "A Means-End Model Based on Consumer Categorization Processes," Journal of Marketing, 46, 60-72.

Heiser, W. J. (1993), "Clustering in Low-Dimensional Space," in Information and Classification: Concepts, Methods, and Applications, Opitz, O., Lausen, B., and Klar, R., eds. Heidelberg: Springer-Verlag, 162-173.

Hoffman, D. L., and G. R. Franke (1986), "Correspondence Analysis: Graphical Representation of Categorical Data in Marketing Research," Journal of Marketing Research, 23, 213-227.

Hoffman, D. L., J. de Leeuw, and R. V. Arjunji (1994), "Multiple Correspondence Analysis," in Advanced Methods of Marketing Research, R. P. Bagozzi, ed. Oxford: Blackwell.

Hwang, H., and Y. Takane (2002), "Generalized Constrained Multiple Correspondence Analysis," Psychometrika, 67, 211-224.

Javalgi, R., T. Whipple, M. McManamon, and V. Edick (1992), "Hospital Image: A Correspondence Analysis Approach," Journal of Health Care Marketing, 12, 34-41.

Kruskal, J. B. (1964), "Multidimensional Scaling by Optimizing Goodness of Fit to a Nonmetric Hypothesis," Psychometrika, 29, 1-27.

Lazarsfeld, P. F. (1950), "Logical and Mathematical Foundations of Latent Structure Analysis," in Studies in Social Psychology in World War II, Vol. IV, S. A. Stouffer et al., eds. Princeton: Princeton University Press, 362-412.

Lebart, L., A. Morineau, and K. M. Warwick (1984), Multivariate Descriptive Statistical Analysis. New York: Wiley.

Lloyd, S. P. (1982), "Least Squares Quantization in PCM," IEEE Transactions on Information Theory, 28, 129-137.

Manton, K. G., M. A. Woodbury, and H. D. Tolley (1994), Statistical Applications Using Fuzzy Sets, New York: John Wiley & Sons.

Nishisato, S. (1980), Analysis of Categorical Data: Dual Scaling and Its Applications. Toronto: University of Toronto Press.

Nishisato, S. (1984), "Forced Classification: A Simple Application of a Quantitative Technique," Psychometrika, 49, 25-36.

Punj, G., and D. W. Stewart (1983), "Cluster Analysis in Marketing Research: Review and Suggestions for Application," Journal of Marketing Research, 20, 134-148.

Ramsay, J. O. (1988), "Monotone Regression Splines in Action (with Discussion)," Statistical Science, 3, 425-461.

Ramsay, J. O. (1998), "Estimating Smooth Monotone Functions," Journal of the Royal Statistical Society, Series B, 60, 365-375.

Reynolds, T. J., and J. C. Olson (1998), The Means-End Approach to Understanding Consumer Decision Making: Applications to Marketing and Advertising Strategy. NJ: Lawrence Erlbaum Associates.

Sheppard, A. (1996), "The Sequence of Factor Analysis and Cluster Analysis: Differences in Segmentation and Dimensionality through the Use of Raw and Factor Scores," Tourism Analysis, 1, 49-57.

Takane, Y., and T. Shibayama (1991), "Principal Component Analysis with External Information on Both Subjects and Variables," Psychometrika, 56, 97-120.

Takane, Y., H. Yanai, and S. Mayekawa (1991), "Relationships Among Several Methods of Linearly Constrained Correspondence Analysis," Psychometrika, 56, 667-684.

ter Braak, C. J. F. (1986), "Canonical Correspondence Analysis: A New Eigenvalue Technique for Multivariate Direct Gradient Analysis," Ecology, 67, 1167-1179.

ter Hofstede, F., J. B. E. M. Steenkamp, and M. Wedel (1999), "International Market Segmentation Based on Consumer-Product Relations," Journal of Marketing Research, 36, 1-17.

van Buuren, S., and J. de Leeuw (1992), "Equality Constraints in Multiple Correspondence Analysis," Multivariate Behavioral Research, 27, 567-583.

van Buuren, S., and W. J. Heiser (1989), "Clustering N Objects into K Groups under Optimal Scaling of Variables," Psychometrika, 54, 699-706.

Vichi, M., and H. A. L. Kiers (2001), "Factorial K-Means Analysis for Two-Way Data," Computational Statistics and Data Analysis, 37, 49-64.

Wedel, M., and W. A. Kamakura (1998), Market Segmentation: Conceptual and Methodological Foundations. Boston: Kluwer Academic Publishers.

Wedel, M., and J. B. E. M. Steenkamp (1989), "Fuzzy Clusterwise Regression Approach to Benefit Segmentation," International Journal of Research in Marketing, 6, 45-49.

Yanai, H. (1986), "Some Generalizations of Correspondence Analysis in Terms of Projection Operators," in Data Analysis and Informatics IV, E. Diday, Y. Escoufier, L. Lebart, J. P. Pagès, Y. Schektman, and R. Thomassone, eds. Amsterdam: North Holland, 193-207.

Yanai, H. (1998), "Generalized Canonical Correlation Analysis with Linear Constraints," in Data Science, Classification, and Related Methods, C. Hayashi, N. Ohsumi, K. Yajima, Y. Tanaka, H.-H. Bock, and Y. Baba, eds. Tokyo: Springer-Verlag, 539-546.

Yang, B. (1997), "A Consumer Research on Underwear in South Korea," Unpublished report, Chung-Ang University, Seoul, South Korea.
