We developed a novel method, Analogy-X, that provides statistical inference procedures for analogy-based software effort estimation. Analogy-X statistically evaluates the relationship between project features and target features, such as the effort to be estimated. This ensures that the dataset is relevant to the prediction problem and that project features are selected based on their statistical contribution to the target variables. We hypothesize that the method (1) can be easily applied to a much larger dataset and (2) can incorporate joint effort and duration estimation into analogy, which was not previously possible with conventional analogy estimation. To test these two hypotheses, we conducted two experiments using different datasets. Our results show that Analogy-X deals effectively with very large datasets and provides useful statistics for assessing the quality of the dataset. In addition, our results show that feature selection for duration estimation differs from feature selection for joint effort-duration estimation. We conclude that Analogy-X allows users to assess the best procedure for estimating duration given their specific requirements and dataset.
Index Terms:
Software effort prediction, duration prediction, case-based reasoning, analogy, Mantel's correlation, Analogy-X, ISBSG
Citation:
Jacky Keung, Barbara Kitchenham, "Experiments with Analogy-X for Software Cost Estimation," 19th Australian Conference on Software Engineering (ASWEC 2008), pp. 229-238, 2008