Vital statistics data offer fertile ground for data mining. In this paper, we discuss the results of a data mining project on the cause-of-death component of the vital statistics data for the state of California. A data mining tool called Cubist is used to build predictive models from two million cases spanning a nine-year period. The objective of our study is to discover knowledge that can be used to gain insight into various aspects of mortality in California, to predict health issues related to the causes of death, to aid decision- or policy-making processes, and to provide useful information services to customers. The results obtained in our study contain valuable new information.
Index Terms:
Vital statistics data, causes of death, data mining, predictive models, Cubist.
Citation:
Du Zhang, Quoc Ha, Meiliu Lu, "Mining California Vital Statistics Data," First IEEE International Conference on Data Mining (ICDM'01), p. 671, 2001.