The issue of metadata quality in Cultural Heritage Platforms


In recent years, the Cultural Heritage (CH) sector has undergone a remarkable transformation, driven by a surge of large-scale digitisation and annotation activities and by efforts to generate multimodal cultural content from all possible sources. This has resulted in vast amounts of digital content being made available by a variety of cultural institutions, such as galleries, libraries, archives and museums (GLAMs).


Initiatives to aggregate this cultural content on an international level have resulted in digital platforms such as the Europeana portal and the Digital Public Library of America. These platforms operate as cross-domain hubs that make content accessible to users, either directly for search and study or through creative applications and web services that reuse and repurpose it. While their main strength lies in the vast number of items they contain, their usability and accessibility are limited by insufficient data and metadata quality.


Many factors affect the quality of metadata. One of the main causes of poor content discoverability and reusability is the lack of structured, rich descriptive metadata. Because the aggregation workflow is complex, heterogeneous and multi-channel, shortcomings can appear in the data provision process at a scale that exceeds what manual quality control of automatically generated metadata in digital repositories can handle. This drawback strongly affects the accessibility, visibility and dissemination range of the available digital content.


Metadata quality improvement usually faces a problem of scale: improving the metadata of hundreds of thousands or even millions of records coming from different sources often requires an amount of time, effort and resources that aggregators and cultural heritage institutions cannot afford. In this context, metadata enrichment services based on automated metadata processing and feature extraction, along with crowdsourcing annotation services, offer a remarkable opportunity to improve the metadata quality of digital content stored in CH platforms. At the same time, they help engage users and raise awareness of cultural heritage assets. A Cultural Heritage platform may address various types of users from the cultural heritage domain and the creative industries, with different levels of expertise, and offer enriched services based on published resources through the platform’s programmatic interfaces (APIs).

The main motivation of a content aggregation platform is to use Cultural Heritage repositories in unison and to promote digital cultural content by enhancing its accessibility and discoverability.


A content aggregation platform like the one described above has been applied in the WE-Hope project. For the specific needs of the project, however, some important adjustments were necessary in order to build and curate collections with content generated and aggregated for the project. To accommodate the testimony videos collected by the project partners, several specific metadata fields had to be added on top of the platform’s existing metadata model, which is based on the Europeana Data Model (EDM). The required information concerns both the interview video itself (such as title, description, tags, language, category, subcategory) and the interviewee (such as name, description, year of birth, nationality). NTUA will include all these adaptations in the WE-Hope platform that will be released during the project.
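
To make the additional fields more concrete, the following is a minimal sketch of how a testimony record could be represented and checked for completeness. The field names follow the list given in this article; the class names, the types, the choice of required fields and the `missing_fields` helper are illustrative assumptions and do not reflect the actual WE-Hope schema or any EDM serialisation.

```python
from dataclasses import dataclass, field, asdict
from typing import List, Optional


@dataclass
class Interviewee:
    # Fields named in the article; types and defaults are assumptions.
    name: str
    description: str = ""
    year_of_birth: Optional[int] = None
    nationality: Optional[str] = None


@dataclass
class VideoTestimony:
    # Fields named in the article; types and defaults are assumptions.
    title: str
    description: str
    language: str
    category: str
    subcategory: Optional[str] = None
    tags: List[str] = field(default_factory=list)
    interviewee: Optional[Interviewee] = None


def missing_fields(record: VideoTestimony) -> List[str]:
    """Return the names of empty fields, as a crude completeness check."""
    flat = asdict(record)
    person = flat.pop("interviewee")
    empty = (None, "", [])
    missing = [name for name, value in flat.items() if value in empty]
    if person is None:
        missing.append("interviewee")
    else:
        missing += [
            f"interviewee.{name}"
            for name, value in person.items()
            if value in empty
        ]
    return missing


# Hypothetical example record, not taken from the project data.
record = VideoTestimony(
    title="Interview 1",
    description="",
    language="en",
    category="testimony",
    interviewee=Interviewee(name="Jane Doe"),
)
print(missing_fields(record))
```

A check of this kind could flag incomplete records before they are published, which ties back to the metadata quality concerns discussed earlier.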


This article was written by George Marandianos, from the National Technical University of Athens.