A Preliminary Study on Common Variable Selection Strategy in Data Fusion

Jonathan S. Kim, Hanyang University
Seung Baek, Hanyang University
Sungbin Cho, Konkuk University (Corresponding author)
ABSTRACT - Data fusion is a major approach for estimating missing values in large databases. Although the selection of common variables is an important factor in data fusion, few studies have systematically investigated the available methods. In this study, three strategies for selecting a set of common variables are considered, and their results are compared using a Monte Carlo simulation. Selection by variance and selection by weighted importance both perform better than random selection. The results also show that, in locating a donor, Euclidean distance-based selection outperforms inter-respondent correlation-based selection. Directions for future research are also discussed.
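To make the ideas in the abstract concrete, the following is a minimal illustrative sketch, not the authors' implementation. It shows one plausible reading of variance-based common variable selection and of the two donor-location rules (Euclidean distance and inter-respondent correlation). All function names and the toy data are hypothetical. The weighted-importance strategy is omitted because the abstract does not define how the weights are obtained.

```python
import math


def select_by_variance(rows, k):
    """Return the (sorted) indices of the k columns with the largest sample variance.

    Assumption: "selection by variance" means keeping the highest-variance
    candidate variables; the abstract does not spell out the exact rule.
    """
    n = len(rows)
    n_cols = len(rows[0])
    variances = []
    for j in range(n_cols):
        col = [row[j] for row in rows]
        mean = sum(col) / n
        variances.append(sum((x - mean) ** 2 for x in col) / (n - 1))
    ranked = sorted(range(n_cols), key=lambda j: variances[j], reverse=True)
    return sorted(ranked[:k])


def pearson(x, y):
    """Pearson correlation between two respondents' common-variable profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile has no defined correlation
    return sxy / math.sqrt(sxx * syy)


def euclidean_donor(recipient, donors):
    """Index of the donor closest to the recipient in Euclidean distance."""
    return min(range(len(donors)), key=lambda i: math.dist(recipient, donors[i]))


def correlation_donor(recipient, donors):
    """Index of the donor whose profile correlates most strongly with the recipient."""
    return max(range(len(donors)), key=lambda i: pearson(recipient, donors[i]))


# Toy data (hypothetical): each list is one respondent's common-variable values.
donors = [[1, 2, 4], [10, 20, 30], [3, 2, 1]]
recipient = [2, 4, 6]
nearest = euclidean_donor(recipient, donors)
most_correlated = correlation_donor(recipient, donors)
```

On this toy data the two rules pick different donors: the Euclidean rule favors the respondent with similar response levels, while the correlation rule favors the respondent with the same response pattern regardless of scale. This is one way the two donor-location rules can diverge.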
[ to cite ]:
Jonathan S. Kim, Seung Baek, and Sungbin Cho (2004), "A Preliminary Study on Common Variable Selection Strategy in Data Fusion", in NA - Advances in Consumer Research Volume 31, eds. Barbara E. Kahn and Mary Frances Luce, Valdosta, GA: Association for Consumer Research, Pages: 716-720.