Data-driven scientific applications use workflow frameworks to execute complex dataflows, producing derived data products of unknown quality. We discuss our ongoing research on a quality model that gives users an integrated estimate of data quality tuned to their application needs. The estimate is expressed as a numerical quality score, which enables uniform comparison of datasets and gives the community a basis for trusting derived data.
Citation:
Yogesh L. Simmhan, Beth Plale, Dennis Gannon, "Towards a Quality Model for Effective Data Selection in Collaboratories," 22nd International Conference on Data Engineering Workshops (ICDEW'06), p. 72, 2006.