Data quality is an important but often ignored issue in data mining. In this paper, we focus on the missing data problem, one of the factors that affect data quality. We first propose an association rule mining based method for imputing missing nominal data, together with a corresponding association rule ranking approach. We then evaluate the method on three publicly available data sets, using k-NN imputation as a benchmark. The results suggest that the proposed method outperforms k-NN imputation.
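
To make the general idea concrete, the following is a minimal sketch of rule-based imputation of a nominal attribute. It is not the method of the paper: the abstract does not specify the mining procedure or the ranking approach, so the sketch assumes single-antecedent rules ranked by confidence and then by support, with the most frequent observed value as a fallback. The function names (`mine_rules`, `impute`), the thresholds, and the toy data are all hypothetical.

```python
from collections import Counter


def mine_rules(rows, target, min_support=2, min_conf=0.5):
    """Mine one-antecedent rules (attr=value) -> (target=tval).

    Assumption: rules are ranked by confidence, then support. The paper's
    actual ranking approach is not given in the abstract.
    """
    ante_counts = Counter()  # (attr, value) -> count
    pair_counts = Counter()  # (attr, value, target value) -> count
    for row in rows:
        if row.get(target) is None:
            continue  # rows with a missing target cannot support a rule
        for attr, value in row.items():
            if attr == target or value is None:
                continue
            ante_counts[(attr, value)] += 1
            pair_counts[(attr, value, row[target])] += 1

    rules = []
    for (attr, value, tval), support in pair_counts.items():
        conf = support / ante_counts[(attr, value)]
        if support >= min_support and conf >= min_conf:
            rules.append((conf, support, attr, value, tval))
    # Highest confidence first, then highest support; the remaining keys
    # only make the order deterministic.
    rules.sort(key=lambda r: (-r[0], -r[1], r[2], str(r[3]), str(r[4])))
    return rules


def impute(rows, target, **kwargs):
    """Return copies of rows with missing target values filled in."""
    rules = mine_rules(rows, target, **kwargs)
    observed = [r[target] for r in rows if r.get(target) is not None]
    fallback = Counter(observed).most_common(1)[0][0] if observed else None
    filled = []
    for row in rows:
        new_row = dict(row)
        if new_row.get(target) is None:
            new_row[target] = fallback  # used when no rule matches
            for _conf, _support, attr, value, tval in rules:
                if row.get(attr) == value:
                    new_row[target] = tval  # best-ranked matching rule
                    break
        filled.append(new_row)
    return filled


# Toy data (hypothetical); None marks a missing value.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "no", "play": None},
]
filled = impute(rows, "play")
print(filled[-1]["play"])  # yes
```

In this toy data the most frequent value of `play` is "no", but the best-ranked matching rule (`outlook=sunny -> play=yes`, confidence 1.0) fills the gap with "yes". That difference from plain mode imputation is what a rule-based approach is meant to capture.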
Citation:
Jianhua Wu, Qinbao Song, Junyi Shen, "An Novel Association Rule Mining Based Missing Nominal Data Imputation Method," snpd, vol. 3, pp.244-249, Eighth ACIS International Conference on Software Engineering, Artificial Intelligence, Networking, and Parallel/Distributed Computing (SNPD 2007), 2007