Quality of Descriptive Information on Cultural Heritage Objects: Definition and Empirical Evaluation

arXiv:2602.21249v1 Announce Type: new
Abstract: Effective data processing depends on the quality of the underlying data. However, quality issues such as inconsistencies and uncertainties, can significantly impede the processing and subsequent use of data. Despite the centrality of data quality to a wide range of computational tasks, there is currently no broadly accepted, domain-independent consensus on the definition of data quality. Existing frameworks primarily define data quality in ways that are tailored to specific domains, data types, or contexts of use. Although quality assessment frameworks exist for specific domains, such as electronic health record data and linked data, corresponding approaches for descriptive information about cultural heritage objects remain underdeveloped. Moreover, existing quality definitions are often theoretical in nature and lack empirical validation based on real-world data problems. In this paper, we address these limitations by first defining a set of quality dimensions specifically designed to capture the characteristics of descriptive information about cultural heritage objects. Our definition is based on an in-depth analysis of existing dimensions and is illustrated through domain-specific examples. We then evaluate the practical applicability of our proposed quality definition using a curated set of real-world data quality problems from the cultural heritage domain. This empirical evaluation substantiates our definition of data quality, resulting in a comprehensive definition of data quality in this domain.

Liked Liked