diff --git a/_site/feed.json b/_site/feed.json index 236011c..58d4665 100644 --- a/_site/feed.json +++ b/_site/feed.json @@ -51,7 +51,7 @@ "id": "https://phd.julsraemy.ch/thesis.html", "url": "https://phd.julsraemy.ch/thesis.html", "title": "Linked Open Usable Data for Cultural Heritage: Perspectives on Community Practices and Semantic Interoperability", - "content_html": "Linked Open Usable Data for Cultural Heritage: Perspectives on Community Practices and Semantic Interoperability PhD Thesis in Digital Humanities, completed as part of the Graduate School of Social Sciences’ (G3S) doctoral programme. It was successfully defended on 18 November 2024 (slides). This page will host a lightweight HTML version of my thesis, optimised for easy access and readability. The PDF version (e-dissertation) is available on the University of Basel’s repository: https://doi.org/10.5451/unibas-ep96807. Page in construction (please be patient ⌛) Author Dr. Julien A. Raemy (University of Basel) https://orcid.org/0000-0002-4711-5759 Supervisors Prof. Dr. Peter Fornaro (University of Basel) https://orcid.org/0000-0003-1485-4923 Prof. Dr. Walter Leimgruber (University of Basel) Dr. Robert Sanderson (Yale University) https://orcid.org/0000-0003-4441-6852 Abstract Digital technologies have fundamentally transformed how Cultural Heritage (CH) collections are accessed and engaged with. Linked Open Usable Data (LOUD) specifications, including the International Image Interoperability Framework (IIIF) Presentation API 3.0, Linked Art, and the W3C Web Annotation Data Model, have emerged as web standards to facilitate the description and dissemination of these valuable resources. Despite the widespread adoption of IIIF, implementing LOUD specifications, particularly in combination, remains challenging. This is especially evident in the development and assessment of infrastructures, or sites of assemblage, that support these standards. 
This research is guided by two perspectives: community practices and semantic interoperability. The first perspective assesses how organizations, individuals, and apparatuses engage with and contribute to the consensus-making processes surrounding LOUD. By examining these practices, the social fabrics of the LOUD ecosystem can be better understood. The second perspective focuses on making data meaningful to machines in a standardized, interoperable manner that promotes the exchange of well-formed information. This research is grounded in the SNSF-funded project, Participatory Knowledge Practices in Analogue and Digital Image Archives (PIA) (2021–2025), which aims to develop a citizen science platform for three photographic collections from the Cultural Anthropology Switzerland (CAS) archives. Actor-Network Theory (ANT) forms the theoretical foundation, aiming to describe the collaborative structures of the LOUD ecosystem and emphasize the role of non-human actors. Beyond its implementation within the PIA project, this research includes an analysis of the social dynamics within the IIIF and Linked Art communities and an investigation of Yale’s Collections Discovery platform, LUX. The research identifies socio-technical requirements for developing specifications aligned with LOUD principles. It also examines how the implementation of LOUD standards in PIA highlights their potential benefits and limitations in facilitating data reuse and broader participation. Additionally, it explores Yale University’s large-scale deployment of LOUD standards, emphasizing the importance of ensuring consistency between Linked Art and IIIF resources within the LUX platform for the CH domain. The core methodology of this thesis is an actor- and practice-centered inquiry, focusing on a detailed examination of specific cosmologies within LOUD-driven communities, PIA, and LUX. 
This micro-perspective approach provides rich empirical evidence to unravel the intricate web of cultural processes and constellations in these contexts. Key empirical findings indicate that LOUD enhances the discoverability and integration of data in CH, requiring community-driven consensus on model interoperability. However, significant challenges include engaging marginalized groups, sustaining long-term participation, and balancing technological and social factors. Strategic use of technology and the capture of digital materiality are critical, but LOUD also poses challenges related to resource investment, data consistency, and the broader implementation of complex patterns. LOUD should lead efforts to improve the accessibility and usability of CH data. The community-driven methodologies of IIIF and Linked Art inherently foster collaboration and transparency, making these standards essential tools in evolving data management practices. Even for institutions and projects that do not adopt these specifications, the socio-technical practices of LOUD offer vital insights into effective digital stewardship and strategies for community engagement. Keywords: Actor-Network Theory; Community of Practice; Cultural Anthropology Switzerland; Cultural Heritage; Digital Infrastructure; International Image Interoperability Framework; Knowledge Practices; Linked Art; Linked Data; LUX; Participatory Archives; Photographic Archives; Semantic Interoperability; Web Annotation Data Model Table of Contents Introduction Context Interlinking Cultural Heritage Data Exploring Relationships through an Actor-Network Theory Lens Research Scope and Methodology The Social Fabrics of IIIF and Linked Art PIA as a Laboratory Yale’s LUX and LOUD Consistency Discussion Conclusion 1. Introduction Since its inception in 2011, the IIIF has revolutionised[1] the accessibility of image-based resources. 
Initially driven by the needs of manuscript scholars, IIIF focused on two-dimensional images, but has since expanded to encompass a wide range of image-based resources, including audiovisual materials and, in the near future, 3D images. Similarly, Linked Art, formally established in 2017, initially concentrated on art museum objects but has since broadened its scope to model a variety of CH entities, leveraging CIDOC-CRM, a renowned ontology in the museum and DH space. Both initiatives aim to break down silos: IIIF focuses on improving the presentation of digital objects, while both initiatives enhance their dissemination. Together, they make CH data more accessible through IIIF and more meaningful through Linked Art for machines. These efforts have primarily benefited the CH domain. A key commonality is that the main APIs these communities create align with the LOUD design principles, either intentionally or empirically demonstrated through use cases. These principles enable software developers to develop compliant tools and services without needing to fully understand RDF, a syntax for representing information on the web. Additionally, they may not need to grasp all LOD principles, which promote the interlinking of data from diverse datasets using tools like KOS such as thesauri. WADM, a W3C standard, is also recognised as a LOUD specification. It provides a framework for creating interoperable annotations on web resources, facilitating the linking and sharing of data across different platforms and applications. These LOUD design principles include the right abstraction for the audience, few barriers to entry, comprehensibility by introspection, documentation with working examples, and the use of many consistent patterns rather than few exceptions. Additionally, both IIIF and Linked Art are driven by vibrant communities, mainly comprising GLAM and higher education institutions. 
While the standards and principles discussed have broad applications, it is important to clarify the scope of this dissertation. This work does not focus on KGs by assessing triplestores – databases specifically designed to store and retrieve triples, which are the fundamental data structures in RDF. Similarly, it does not deal with evaluating SPARQL engines, which are specifically designed to query KGs. Additionally, this dissertation does not address the intersection of ML and IIIF, or the ontological reasoning of Linked Art. Instead, this dissertation concentrates on LOUD, the consistency of its standards, design principles and the vibrant communities behind it. It examines JSON-LD serialisation efforts and the crucial intersection required to establish robust semantic interoperability baselines between presentation and semantic layers. It also presents real-world use case implementations, both on a small scale in a laboratory and flexible space within the PIA research project, and on a large scale at Yale, exemplified by the LUX platform that provides access to (meta)data from YUL, YCBA, YUAG, and YPM. The focus is therefore on digital infrastructures capable of delivering JSON-LD files from the above specifications, which are primarily, though not exclusively, CH resources. It is more about the different actors – both human and non-human – that create and maintain these interconnected systems and the dynamic interactions that sustain them. The deployment of various LOUD specifications addresses the need for semantic interoperability between CH resources and disparate datasets by establishing a standardised approach to representing and linking data, ensuring that information can be seamlessly shared and understood across different platforms and contexts. This dissertation seeks to carve out a distinct niche by addressing an often-overlooked aspect of IIIF and Linked Art. 
IIIF is sometimes perceived and studied merely as a service or an appendix, with the content it delivers taking precedence. However, this PhD thesis positions IIIF as a first-class citizen worthy of in-depth study. Similarly, Linked Art, despite its potential and its relatively recent establishment, has been the subject of very few scholarly papers. This gap underscores the significance of LOUD in this context. Furthermore, this thesis elevates Linked Art to a position of primary importance, recognising its significance and advocating for its thorough examination. To thoroughly study LOUD and its adherence to design principles, it is essential to immerse ourselves actively in both communities – an approach I have embraced for years. The thesis also emphasises the importance of participatory efforts and collaboration between research projects, which typically have shorter lifespans, and memory institutions, which need to implement technical standards as a lingua franca. In doing so, it reveals the mediating role of LOUD in advancing the heritage sphere. To truly understand IIIF, Linked Art, and to a lesser extent WADM, it is crucial to examine the social fabrics and consensus decision-making of each community. Among these considerations are how the specifications can be implemented pragmatically, and how the standards can support the implementation and maintenance of more extensive semantic interoperability efforts. The significance of this research lies in highlighting the commitment and diligence of the individuals and organisations that make up both the IIIF and Linked Art communities. It aims to demonstrate that community-driven practices, such as those exemplified by IIIF and Linked Art, have a potential that goes beyond the mere sharing of digital objects and their associated metadata. The more people who embrace these approaches and implement the associated specifications, the more society as a whole will benefit. 
Furthermore, this research illustrates that IIIF is no longer limited to two-dimensional images, that Linked Art is not restricted to artworks, and that WADM is a simple, content-agnostic standard that can be easily integrated into a range of systems. This adaptability is a strength of LOUD standards, which are designed to be simple yet effective. LOUD can serve a variety of purposes, primarily rooted in CH, but with the potential to extend its benefits to other sectors. The true beauty of LOUD lies in its ability to foster networking opportunities and transparent socio-technical practices, demonstrating its value beyond mere technical implementation. By emphasising these aspects, this dissertation highlights the wider impact of LOUD in promoting semantic interoperability and enhancing collaborative efforts within the heritage field and beyond. In addition, the implementation of standards through PIA underlines the potential for similar participatory or citizen science projects, while the LUX initiative serves as an illustrative example of robust infrastructure and cross-unit engagement. These examples demonstrate the practical applications and far-reaching implications of adopting LOUD standards in different contexts. This dissertation is structured across ten chapters, each building upon the previous ones up to Chapter 5 to provide a comprehensive understanding of the research. These initial chapters lay the foundation of the study, establishing the context, theoretical framework, and methodological approaches. After this foundational section, Chapters 6, 7, and 8 present empirical studies that, while interconnected, can be read independently if desired. These chapters offer detailed insights into specific aspects of the research and can be appreciated on their own or as part of the broader narrative. The thesis continues with Chapter 2, which extends this introduction by providing more information about the research setting, specifically PIA. 
Chapter 3 follows with an extensive literature review, offering a comprehensive overview of methods to interlink CH data. Next, Chapter 4 presents the theoretical framework, conceptualised as a toolbox and firmly rooted in ANT, guiding the analysis and discussion throughout the dissertation. Following this, Chapter 5 details the research scope and methodology, explaining the approaches and methods employed in the study. Moving on to the empirical work, Chapter 6 sheds light on the social fabrics of IIIF and Linked Art, exploring the communities and practices that underpin these initiatives. Chapter 7 then examines the implementation of LOUD standards within PIA, highlighting the practical aspects and challenges encountered. This is followed by Chapter 8, which focuses on the LUX initiative at Yale, examining the underlying governance and interdepartmental ownership of the Yale Collections Discovery platform. The discussion of findings is presented in Chapter 9, where the results from the empirical chapters are synthesised and analysed in relation to the theoretical framework. Finally, Chapter 10 concludes the thesis, summarising the key insights and contributions of the research while outlining potential directions for future study. 2. Context In this chapter, I will set the stage for my PhD thesis by providing important background information. First, in Section 2.1, I will explain why I chose the title for my thesis. This will give you an understanding of the main focus and the direction of my research. Next, in Section 2.2, I will describe the PIA research project, which is central to my work. This section will cover the project’s goals, significance, and overall framework. In Section 2.3, I will detail my specific contributions to the PIA project. I will emphasise how my work fits into the larger project and its importance to my thesis. Finally, in Section 2.4, I will talk about my active participation in the IIIF and Linked Art communities. 
This section will highlight how my involvement in these communities has influenced my research and its broader implications. 2.1 PhD Title I chose the title ‘Linked Open Usable Data for Cultural Heritage: Perspectives on Community Practices and Semantic Interoperability’ as it encapsulates the essence of my research focus, although I could well have chosen another. During the initial stages of my research, multiple working titles were explored to capture the diverse facets of my interests and objectives. While I was quite sure about having in the title after the third iteration, I was quite unsure of what should follow and whether a subtitle was needed at all. Amidst this dynamic progression, the underlying theme of my research remained steadfast: to delve into the transformative potential of LOUD for CH. I also opted to maintain in the title of my thesis subsection. While holds its appeal, my choice reflects a broader narrative that acknowledges the crucial role of CHIs and spotlights the multifaceted nature of heritage preservation, encapsulating both its digital facets and the essential contribution of individuals and institutions in curating, interpreting, and making heritage accessible. As for the subtitle, while I do explore CoP as defined by @lave_situated_1991 and @wenger_communities_2011 through investigating the social fabrics of the IIIF and Linked Art communities, my main interest lies in the broader application of LOUD for describing and interlinking CH resources. Thus, I decided to opt for the more generic ‘Community Practices’ as the first axis or perspective. For the second perspective, I wanted to see how semantic interoperability can be achieved through standards adhering to the LOUD design principles, as they seem to be key enablers for seamless collaboration and knowledge exchange among practitioners. There was a time in my research when I envisaged decoupling community practices and semantic interoperability, perceiving them as two distinct dimensions. 
However, what really captivates me is the unification of these factors to facilitate collective reasoning for both humans and machines. In summary, this title reflects my enthusiasm for using web-based and community-driven technologies to transform the way we understand, share and value CH. 2.2 The PIA Research Project I undertook my doctoral studies within the scope of the PIA research project financed by the SNSF under its Sinergia funding scheme from February 2021 to January 2025[2]. The project aimed to analyse the interplay of participants, epistemological orders and the graphical representation of information and knowledge in relation to three photographic collections from CAS. It sought to bring together the world of data and things in an interdisciplinary manner, exploring the phases of the analogue and digital archive from a cultural anthropological, technical and design research perspective [@felsing_community_2023 p. 42]. As part of this endeavour, interfaces were developed to enable the collaborative indexing and use of photographic archival records [@chiquet_participatory_2023 p. 110]. I discuss the interdisciplinary components in more detail and briefly introduce the people involved in the project in Subsection 2.2.1, then talk about the photographic collections that were the overarching narrative of the research in Subsection 2.2.2, and lastly, in Subsection 2.2.3, present the vision that we had put together. The project, divided into three interdisciplinary teams, was led by the University of Basel through the Institute for Cultural Anthropology and European Ethnology[3] (Team A) and the DHLab[4] in collaboration with the DBIS group (Team B) as well as by the HKB[5], an art school and department of the Bern University of Applied Sciences (Team C) [@felsing_community_2023 p. 43]. Table 2.1 lists the people who contributed to the project, broken down by the three teams and their particular perspectives. 
Table 2.1: PIA Team Core Members
Perspective: People
A) Anthropological: Prof. Dr. Walter Leimgruber, Team Leader and Dissertation Supervisor; Dr. Nicole Peduzzi, Photographic Restoration and Digitisation Supervisor; Regula Anklin, Conservation and Restoration Specialist (project partner at Anklin & Assen); Murielle Cornut, PhD Candidate in Cultural Anthropology; Birgit Huber, PhD Candidate in Cultural Anthropology; Fabienne Lüthi, PhD Candidate in Cultural Anthropology
B) Technical: Prof. Dr. Peter Fornaro, Team Leader and Dissertation Supervisor; Prof. Dr. Heiko Schuldt, Dissertation Supervisor (project partner at the University of Basel); Dr. Vera Chiquet, Postdoctoral Researcher; Adrian Demleitner, Software Developer (2021-2023); Fabian Frei, Software Developer (2023-2025); Christoph Rohrer, Software Developer (2023-2025); Julien A. Raemy, PhD Candidate in Digital Humanities; Florian Spiess, PhD Candidate in Computer Science
C) Communicative: Dr. Ulrike Felsing, Team Leader and Dissertation Supervisor; Prof. Dr. Tobias Hodel, Dissertation Supervisor (project partner at the University of Bern); Daniel Schoeneck, Research Fellow; Lukas Zimmer, Designer (project partner at A/Z&T); Max Frischknecht, PhD Candidate in Digital Humanities
2.2.2 Photographic Collections/Archives as Anchors CAS has historically been engaged in active collaborations that bridge academic research and the public sphere, primarily through traditional analogue methods. The PIA project was created with the intention of exploring the complexities inherent in both analogue and digital approaches, and of encouraging and investigating these collaborative endeavours between academia and the wider public. As such, PIA represents a paradigm shift within the scope of projects associated with or supported by CAS, facilitating the seamless integration of digital tools to explore multiple facets of participation and engagement. 
This transformative endeavour embodies a profound exploration of new intersections where scholarly endeavours intertwine with the active involvement of citizens. PIA drew on three collections: one focusing on scientific cartography and titled (Atlas der Schweizerischen Volkskunde), a second from the estate of the photojournalist Ernst Brunner (1901–1979), and a third collection consisting of vernacular photography which was owned by the Kreis Family (1860–1970). SGV_05 ASV consists of 292 maps and 1000 pages of commentary published from 1950 to 1995 (an example of such a map is shown in Figure 2.1). CAS commissioned this collection as an extensive survey of the Swiss population in the 1930s and 1940s on many issues pertaining, for instance, to everyday life, local laws, superstitions, celebrations or labour [@weiss_atlas_1940]. The contents were compiled by researchers and by people who were described as [6]. Questions were asked about everyday habits, community rights, work, trade, superstitions, and many other topics [@schmoll_richard_2009; @schmoll_vermessung_2009]. This collection offers a snapshot of everyday life in Switzerland right before the beginning of a modernisation process that fundamentally changed lifestyles in all areas during the postwar period. A digitised version of the ASV would not only allow the results of that time to be enriched with further findings [@schranz_critical_2021], but would also make transparent how knowledge was generated in cartographic form through a complex process along different types of media and actors. The restoration, digitisation, cataloguing and indexing efforts all took place during PIA under the supervision of Birgit Huber, who based her doctoral research extensively on this particular collection [see @huber_entdeckung_2023]. Figure 2.1: Map from the SGV_05 Collection Relating to Question 93 Showing Walks and Excursions at Pentecost. ASV. CAS. 
CC BY-NC 4.0 SGV_10 Kreis Family comprises approximately 20,000 photographic objects from a wealthy Basel-based family, spanning from the 1850s to the 1980s. About a quarter of them are organised and kept in 93 photo albums, as illustrated by Figure 2.2, and the rest are loose. This private collection was acquired by CAS in 1991. The collection, which originally arrived in banana cases and was enigmatic due to the lack of clear organisation or accompanying information from the family, posed significant challenges. Despite these initial hurdles, CAS undertook meticulous efforts to catalogue and preserve its contents [@felsing_re-imagining_2024 p. 42]. The pictures were taken by studio photographers as well as by family members themselves. The Kreis Family collection represents a typical example of urban bourgeois culture and gives a comprehensive insight into the development of private photography over the course of a century [@pagenstecher_private_2009]. The photographic materials and formats are very diverse, ranging from prints to negatives, small, medium or large format photographs, black and white or colour. The collection also encompasses many photographic techniques, from the one-off daguerreotypes and ferrotypes, to the glass-based negatives that could be reproduced en masse, to the modern paper prints. While some of the albums and loose images were restored and digitised during the 2014 project, much of this work was completed during PIA and overseen by Murielle Cornut, whose doctoral investigation was centred on the study of photo albums [see @cornut_open_2023]. Figure 2.2: A Photo Album Page from the SGV_10 Collection, Bearing the Following Inscription: Botanische Excursion ins Wallis, Pfingster 1928. SGV_10A_00031_015. Kreis Family. CAS. 
CC BY-NC 4.0 SGV_12 Ernst Brunner is a donation of about 48,000 negatives and 20,000 prints to the CAS archives from Ernst Brunner, a self-taught photojournalist who lived from 1901 to 1979 and who, mainly in the 1930s and 1940s, documented a wide range of folkloristic themes, as shown by Figure 2.3. He is one of the most important photographers of the era and one of the most outstanding visual chroniclers of Swiss society [@pfrunder_ernst_1995]. His photographs show rural lifestyles, but also urban motifs. In his late work, he led the documentation and research on farmhouses in a specific Swiss district, a project initiated by CAS. Before Ernst Brunner became an independent photojournalist in the mid-1930s, he worked as a carpenter, influenced by the ideas of the Bauhaus and Neues Bauen movements. This can also be seen in the aesthetics and formal language of his photography. While all the black and white negatives were digitised and recorded between 2014 and 2018, the digitisation of the prints, a selection made by Ernst Brunner himself, was conducted at the end of the PIA research project. The latter was supervised by Fabienne Lüthi, whose PhD was about organisational systems and knowledge practices in the Ernst Brunner Collection. Figure 2.3: Picture from the SGV_12 Collection Showing Walkers Looking at the Train Timetable. [Wanderer studieren den Fahrplan in der Bahnhofhalle]. Lucerne, 1938. Ernst Brunner. SGV_12N_00716. CAS. CC BY-NC 4.0 Whereas each of the PhD Candidates in Cultural Anthropology was assigned a particular collection, whose content was to varying degrees part of their subject of study, this was not exactly the case for the PhD Candidates in DH, including myself, and in Computer Science. Put differently, we had relative leeway in terms of what interested us in each or all of these three photographic collections. 
In my case, I briefly explain my contribution to the project in Section 2.3 and return to it in Chapter 7 as part of the empirical portion of my thesis, which focuses on the deployment of LOUD specifications using the three CAS photographic collections. Florian Spiess focused on the use of VR through vitrivr, a multimedia retrieval system developed by the DBIS research group at the Department of Mathematics and Computer Science [@spiess_multimodal_2022; @spiess_forschung_2023; @spiess_exploring_2024]. His work included experiments with PIA-related collections, such as the creation of virtual galleries clustered according to content-based similarity [see @peterhans_automatic_2022]. Max Frischknecht’s doctoral research centred on generative design[7], a methodology to visualise dynamic cultural archives. He mostly worked on the ASV collection and on a mapping tool, a cartographic visualisation designed to explore the CAS photographic archives [see @frischknecht_generating_2022; @eggmann_digitalisierung_2024]. It should also be mentioned that we not only used the three collections of the CAS photographic archives within the project, but also held both formal and informal meetings most commonly within the photographic archives at the Spalenvorstadt premises in the old Gewerbemuseum and later either at the on Allschwilerstrasse, though less frequently, or at Rheinsprung where the Institute for Cultural Anthropology and European Ethnology is located. This meant that there was a strong and sometimes blurred entanglement between those involved in the archives and the PIA core team members. 2.2.3 Project Vision Between December 2021 and March 2022, we worked together to develop and finalise a vision for the project[8]. It includes seven key priorities, or pillars, which were meant to strengthen the interdisciplinary perspectives of PIA. Although ambitious, these elements were of paramount importance to us and served as a guiding blueprint for all PIA activities. 
Hereafter is a modified version of the vision[9] taken from @cornut_annotations_2023 [p. 4]. Accessibility by developing open interfaces and offering the possibility of expanding the archive and turning it into an instrument of current research that collects and evaluates knowledge with the participation of other users (Citizen Science). Heterogeneity by making visible where, why and under what circumstances the objects were created, how they were handled and what path they have taken to get to and in the archive. We work on visualisations that take into account the heterogeneous character of archival materials and make their respective biographies visible. Materiality by conveying the material properties of the objects: they have front and back sides, inscriptions, traces, development errors, they are transparent, multi-layered or fabric-covered. They tell of their origin, use, and peculiarities. We want to make this knowledge accessible and understandable in digital form. To this end, we also consider the necessary infrastructure involved in the creation as part of their narrative: the restoration, the relocation, the indexing, the storage devices, the research tools, the display medium, as well as the process of repro-photography. Interoperability as a crucial component and which will be done by supporting digital means that allow different stakeholders to freely access and interact with the project’s data. Both humans and machines can use, contribute to, correct and annotate the existing data in an open and interoperable manner, thus encouraging exchange and the creation of new knowledge. To do this, we use web-based standards that are widely adopted in the cultural heritage field. Affinities by leveraging data models and pattern recognition which can uncover semantic relationships between entities that were previously incomplete or difficult for users to access. 
Using specific interfaces and visualisations, we make it possible to explore digital assets and discover forms of relationships and similarities between images. AI that facilitates automated searches for simple image attributes such as colour, shapes, and localisation of image components. It should also become possible to recognise texts and object types for extracting metadata. Bias Management by taking into account that associated metadata was human-made[10] and thus is never objective. Collections and their metadata reflect biases or focus narrowly on selected areas and perceptions. Machines working on the basis of such data automatically reproduce the implicit biases in decision-making due to so-called biased algorithms. Therefore, understanding the data used for training and the algorithms applied for decision making is crucial to ensure the integrity of the application of these technologies in archives. We take ethical issues into account when using AI and visualisations, because the higher the awareness of a possible bias, the faster it can be detected or brought up for consideration with users. As my thesis is notably concerned with semantic interoperability, Interoperability and Affinities are of particular importance to my PhD thesis, although I recognise the importance of all pillars. Each of these resonated with me and my fellow PhD Candidates. As we immersed ourselves in the vision of the PIA research project, it became a unifying thread that brought us together in our research ambitions. We found that all these priorities within the project spoke to us at different points and provided a strong point of communication and practice in the development of processes, prototypes or interfaces. 2.3 Contribution to PIA and its Relevance to the Thesis To develop a participatory platform, an open and sustainable technological foundation for facilitating the reuse of CH resources was needed [@raemy_applying_2021]. 
Throughout the PIA project, I was mainly involved in the extension of the data infrastructure, the uptake of IIIF as well as designing the data model, leveraging Linked Art and WADM [@raemy_interlinking_2024]. As a member of Team B, I undertook this PhD as a bridge between the different teams, mostly participating in discussions with the three doctoral candidates from Team A to further develop and agree on the CAS data model and with the software developers from my team to discuss the impact of the data model on our evolving — yet transitory — infrastructure as well as helping in implementing the APIs adhering to the LOUD design principles. It was necessary to redesign the data model within the context of a database migration, from Salsah to the DSP, that happened between November 2021 and March 2024. This updated version, based on the Knora Base Ontology[11], corresponded to the needs of the CAS archives and to some extent to those of PIA, in particular to enable the PhD Candidates in Cultural Anthropology to make more precise assertions, whether in terms of descriptive metadata, or in the ability to link one object to another or to provide comments on these objects in several narrative forms. Moreover, an assessment of the appropriate technical standards for improved usability of the objects by both humans and machines was carried out, as a basis for extending the capabilities provided by DaSCH, such as helping the software developers to implement SIPI[12], a C++ image server compatible with the IIIF Image API and build services that create IIIF Presentation API 3.0 resources. While the theoretical framework of the thesis extends across the scope of PIA, the empirical part focuses on a specific set of findings derived from the research project outlined in , under the title . 
In this chapter, I discuss the data model and its refinement as well as the generation of custom IIIF Manifests during the specific digitisation, cataloguing and indexing efforts that took place throughout the project for the three CAS collections (SGV_05, SGV_10 and SGV_12) under investigation, the implementation of LOUD standards, and the overall design of the technological underpinnings. 2.4 Involvement within the IIIF and Linked Art communities I must acknowledge the invaluable role that my involvement within the IIIF and Linked Art communities has played in shaping my journey as a trained information specialist and an aspiring DH practitioner. Being an active participant in both communities has not only broadened my understanding of the latest developments in the field but has also profoundly influenced the trajectory of this dissertation. I have been involved within the IIIF community since October 2016 and the Working Groups Meeting that happened in The Hague[13]. This significant journey was, in fact, initiated by a recommendation from my first supervisor, Peter Fornaro, during my time as an undergraduate doing an internship at the DHLab. Little did I know that this recommendation would lead me to be carrying out a PhD and looking at IIIF not only as community-driven standards but as an object of study. Engaging with the IIIF community exposed me to cutting-edge advances in image interoperability and standards, and fostered a deeper appreciation for the importance of digital representations of cultural heritage. Through collaborative discussions with experts from diverse backgrounds, I gained new perspectives on the potential of technology to advance humanities research and preserve our collective cultural memory. Similarly, my involvement in the Linked Art community introduced me to the opportunities offered by LOUD and its transformative impact on research discourse. 
Exposure to Linked Data methodologies and the CIDOC-CRM has significantly influenced the way I have structured and interpreted the data in this dissertation, thereby enriching its scholarly breadth and rigour. I started to be actively involved in Linked Art at the beginning of my PhD in 2021, but I was already a by 2020, driven by the efforts of Rob Sanderson, my third supervisor. By mid-2023, I had become a member of the Editorial Board. The individuals I have met and the knowledge shared in these vibrant communities have deeply informed my approach as a scholar. The invaluable connections and collaborations I have made have expanded my network of fellow researchers, educators, and experts, leading to fruitful discussions that have significantly shaped the research questions addressed in this thesis. The events and workshops organised by these communities have also provided immersive learning experiences, giving me first-hand insights into the tools, technologies and methodologies used in the context of describing and disseminating CH data. The dynamic ecosystem of these communities has served as an inspiring backdrop, fostering innovative thinking and encouraging a more holistic approach to my research. 3. Interlinking Cultural Heritage Data Interlinking CH data is an important aspect of publishing heritage collections over the web, in particular by using LOD technologies to make assertions more easily readable and meaningful to machines [@marcondes_integrated_2021]. Due to the complexity of CH data and their intrinsic inter-relationships, it is necessary to define its nature and introduce controlled vocabularies and ontologies that can be integrated with existing web standards and interoperable with relevant platforms [@bruseker_cultural_2017; @hyvonen_using_2020]. Efforts to interlink CH data have brought about significant advancements, but challenges remain. 
One such challenge is finding a balance between completeness and precision of expression to ensure that CH data remain accessible and usable to a wider audience. Addressing this challenge, the Linked Open Usable Data (LOUD) design principles and the specifications that adhere to those, such as the IIIF Presentation API 3.0 and Linked Art, offer a promising approach [@raemy_enabling_2023]. By focusing on usability aspects from the perspective of software developers and data scientists involved in designing visualisation tools and data aggregation approaches, LOUD strives to enhance the overall user experience [@sanderson_keynote_2019]. Finding this equilibrium becomes crucial as CH data continue to grow in complexity and size, necessitating the seamless integration of native web technologies. The LOUD concept cultivates an environment that encourages the formation of vibrant CoP, wherein an essential principle is the availability of comprehensive documentation supplemented with practical examples [@raemy_ameliorer_2022]. Moreover, the emphasis on leveraging widely adopted technologies enhances the interoperability of data and promotes its wider dissemination. With LOUD principles guiding the linking of CH data, the resulting web of knowledge becomes more than just a machine-readable resource; it transforms into a user-centric ecosystem where the accessibility of Linked Data and usability intersect to enable scholars and a wider audience to engage in the exploration and appreciation of CH [@newbury_loud_2018]. Finally, by fostering a collaborative, knowledge-sharing mindset, LOUD empowers software developers to implement data in a robust way, drawing insights from shared experiences [see @page_linked_2020]. 
In this chapter, which serves as the literature review of the PhD thesis, I build on this brief introduction by dividing the insights into seven sections in order to provide an overview of the key concepts related to interlinking data in the CH domain. The literature review primarily encompasses works published up until December 2023, providing a comprehensive snapshot of the field’s current state and its evolution. Section 3.1 discusses what makes CH data stand out and Section 3.2 is about CH metadata standards, while ??? explores the technological trends, scientific movements and guiding principles that have shaped the field. ??? provides an overview of the web as an open platform, which is essential to understanding the current landscape of interlinking CH data. ??? focuses on LOUD, while ??? looks at characterising the community practices and semantic interoperability dimensions for CH. Finally, in ???, I summarise key elements from each section, give some initial thoughts on each with respect to LOUD, and then conclude the chapter with some considerations on why we as a society need to care about CH data. 3.1 What Makes Cultural Heritage Data Stand Out? Here, I aim to establish the indirect territory of my study, as I am situated on a distinct plane that focuses on web technologies and standards, as well as the software and services that enable them, as the subjects of investigation. However, it is crucial to acknowledge that LOUD specifications owe their existence to the available data that have served as case studies. 
Thus, their significance can be best understood through the lens of data, and I recognise here the pivotal role played by CH practitioners, encompassing individuals from research and memory institutions, who have had a significant impact on specifying a series of web-based standards and who have helped to move forward the open discovery of CH data and beyond, in particular data belonging to the public domain. In Subsection 3.1.1, I provide an introduction to CH as recognised by UNESCO. I explore the tangible, intangible, and natural dimensions of CH, laying the foundation for further discussions on its representation and preservation, notably by giving a first definition of CH data. Next, in 3.1.2, I look at the challenges of representation and embodiment of CH data. This subsection examines the challenges in describing and preserving its materiality or embodied aspects. Thirdly, to understand the significance of collective efforts, communities, and the interplay of technologies, I discuss what I call ‘Collectives and Apparatuses’ in 3.1.3, where I highlight how actors, in terms of collaborative actions and apparatuses, play a pivotal role in CH. 3.1.1 Cultural Heritage The legacy of CH encompasses physical artefacts and intangible aspects inherited from past generations, reflecting the history and traditions of societies. Meanwhile, CH constantly evolves due to complex historical processes, necessitating preservation and protection efforts to prevent its loss over time [@loulanski_revising_2006]. The dynamic nature of CH demands collaborative actions, including documentation and the use of a range of technologies. The concept of CH is also characterised by perpetual evolution, mirroring the historical processes that shape societies over time. Social, political, economic, and technological shifts invariably influence the definition and perception of CH, prompting continuous reinterpretations and reevaluations of its significance. 
Over the years, the enthusiasm for the protection of cultural property has enriched the term with new shades of meaning. As societies undergo transformations, new layers of meaning and relevance are superimposed on existing CH, perpetually enriching its essence. As articulated by [@ferrazzi_notion_2021 p. 765]: ‘Cultural heritage’, as an abstract legacy or as a merge of tangible and intangible values, is able to encompass the totality of culture(s); in so, assuming a symbolic value that brings a clear break with all other terminologies. In conclusion, ‘cultural heritage’ as a legal term has demonstrated more than any others to be a real ensemble of historical stratification and cultural diversity. The advent of globalisation and rapid advancements in technology have further accelerated the evolution of CH. Increased interconnectedness and cross-cultural interactions have led to the fusion of traditions and the emergence of novel cultural expressions. Moreover, the digital era has facilitated the dissemination of CH resources on a global scale, transcending geographical barriers and preserving cultural knowledge for future generations [@portales_digital_2018]. Thus, the intriguing nature of CH resources can be attributed to their multifaceted and diverse characteristics. The conservation and promotion of these resources demand a nuanced comprehension of the various types of heritage resources, culminating in effective preservation and promotion strategies that can account for their heterogeneity [@windhager_visualization_2019]. According to [@unesco_institute_for_statistics_unesco_2009], CH includes tangible and intangible heritage. Tangible CH refers to physical objects such as artworks, artefacts, monuments, and buildings, while intangible CH comprises practices, knowledge, folklore and traditions that hold cultural significance [@munjeri_tangible_2004]. 
The concept of heritage has evolved through a process of extension to include objects that were not traditionally considered part of the heritage. The criteria for selecting heritage have also changed, taking into account cultural value, identity, and the ability of the object to evoke memory. This shift has led to the recognition and protection of intangible CH, challenging a Eurocentric perspective and embracing cultural diversity as a valuable asset for humanity [@vecco_definition_2010]. Conservation guidelines have broadened the concept of heritage to include not only individual buildings and sites but also groups of buildings, historical areas, towns, environments, social factors, and intangible heritage [@ahmad_scope_2006]. In 2019, another instance of UNESCO defines CH in an even more comprehensive manner, taking into account natural heritage: Cultural heritage is, in its broadest sense, both a product and a process, which provides societies with a wealth of resources that are inherited from the past, created in the present and bestowed for the benefit of future generations. Most importantly, it includes not only tangible, but also natural and intangible heritage. [@unesco_culture_for_development_indicators_methodology_2014 p. 130] In thinking about the concept of CH, I find this last definition particularly resonant. This broader perspective is motivated by my interest in LOUD specifications as a research area, particularly because of their notable data agnosticism, and because it also resonates with @hyvonen_cultural_2012 [pp. 1-3]'s subdivision of CH. These services have the adaptability to process and use different types of data, transcending the boundaries of specific domains or disciplines. Although grounded in concrete CH cases, their potential to extend to any type of data, including those from STEM, is a compelling prospect that warrants further exploration, a point that I will explore later. 
The following sub-subsections briefly discuss tangible, intangible, and natural heritage, and provide a definition of CH data which can serve as a foundational reference for this thesis. 3.1.1.1 Tangible Heritage Tangible CH encompasses physical artefacts and sites of immense cultural significance that are passed through generations in a society [@vecco_definition_2010]. These objects are tangible manifestations of human creativity, representing artistic creations, architectural achievements, archaeological remains as well as collections held by CHIs. One aspect of tangible CH is artistic creations such as paintings, sculptures and traditional handicrafts. These artefacts embody cultural values and artistic expressions and serve as essential reflections of a society’s collective ethos. For example, Vincent van Gogh’s ‘Irises’[14] and Alberto Giacometti’s ‘L’Homme qui Marche I’[15] are revered works of art that have deep cultural significance in Europe and all over the world. The built heritage, including monuments, temples and historic buildings, is another important component of the tangible CH. These architectural marvels not only represent past civilisations, but also convey the social values and aspirations of their time. The Taj Mahal, an exemplary white marble structure in India, stands as a poignant testament to Mughal architecture. Closer to where I write this dissertation one can mention the Abbey of St Gall, a convent from the century which is inscribed on the UNESCO World Heritage List. In the context of urban heritage, conventional definitions of built heritage often focus narrowly on the architectural and historical value of individual buildings and monuments, which are well protected by existing legislation. 
However, the challenge is to preserve urban fragments, that is, areas within towns and cities that may not qualify as designated conservation areas, but are of significant cultural and morphological importance [@tweed_built_2007]. For instance, [@rautenberg_lemergence_1998] proposes two categories of built CH: heritage by designation and heritage by appropriation. Heritage by designation involves experts conferring heritage status on sites, buildings, and cultural objects through a top-down approach, often without public participation. This method can be predictable and uncontroversial, but can be criticised for being elitist and neglecting unconventional heritage. On the other hand, heritage by appropriation emphasises community and public involvement in identifying and preserving cultural expressions, leading to a more inclusive and dynamic understanding of heritage. Archaeological sites are also an integral part of the tangible CH, offering invaluable insights into past societies and ways of life. As of May 2024, UNESCO's long list of World Heritage Sites includes 1,199 cultural and natural sites in 168 States Parties, including 48 sites in transboundary regions[16]. Sites such as Machu Picchu, an impressive Inca citadel in the Peruvian Andes, bear witness to the architectural achievements and cultural practices of ancient civilisations. Although archaeological sites are invaluable, they face significant threats such as looting, destruction, exploitation, and extreme weather phenomena [@bowman_transnational_2008; @micle_archaeological_2014]. To safeguard them, conservation efforts must be case-specific and include documentation and assessment of experiences gained [@aslan_protective_1997]. The preservation of tangible CH extends beyond physical objects to include libraries, archives and museums that house collections of books, manuscripts, historical documents and artefacts. 
Incidentally, the term “cultural property” is also employed as a related concept to tangible CH, encompassing both movable and immovable properties as opposed to less tangible cultural expressions [@ahmad_scope_2006]. Cultural property is protected by a number of international conventions and national laws. For instance, the Blue Shield[17], an international organisation established in 1996 by four non-governmental organisations[18], aims to protect and preserve heritage in times of armed conflict and natural disasters [@van_der_auwera_unesco_2013]. Its mission was revised in 2016: The Blue Shield is committed to the protection of the world’s cultural property, and is concerned with the protection of cultural and natural heritage, tangible and intangible, in the event of armed conflict, natural- or human-made disaster. [@blue_shield_blue_2016 art. 2.1] Overall, tangible CH is a testament to human ingenuity and cultural diversity, and serves as a bridge between the past and the present. Its preservation is a collective responsibility, ensuring that the legacy of past generations endures and the wealth of cultural diversity continues to enrich the fabric of society. 3.1.1.2 Intangible Heritage The concept of intangible heritage emerged in the 1970s and was coined at the UNESCO Mexico Conference in 1982 [@leimgruber_switzerland_2010] with the aim of protecting cultural expressions that were previously excluded from preservation efforts [@hertz_politiques_2018]. UNESCO's previous focus had been on material objects, primarily from wealthier regions of the global North, leaving the intangible cultural heritage of the South overlooked. Attempts to protect intangible heritage through legal measures like copyright and patents were ineffective due to the collective nature of these cultural expressions and the anonymity of creators. The UNESCO Convention for the Safeguarding of the Intangible Cultural Heritage acknowledges that intangible CH is essential for cultural diversity and sustainable development. 
Below is the definition given by the Convention for the Safeguarding of the Intangible Cultural Heritage: ‘The Intangible Cultural Heritage’ means the practices, representations, expressions, knowledge, skills – as well as the instruments, objects, artefacts and cultural spaces associated therewith – that communities, groups and, in some cases, individuals recognize as part of their cultural heritage. This intangible cultural heritage, transmitted from generation to generation, is constantly recreated by communities and groups in response to their environment, their interaction with nature and their history, and provides them with a sense of identity and continuity, thus promoting respect for cultural diversity and human creativity. [@unesco_basic_2022] According to UNESCO, intangible CH can be manifested in the following domains: oral traditions and expressions, including language as a vehicle of the intangible CH; performing arts; social practices, rituals and festive events; knowledge and practices concerning nature and the universe; traditional craftsmanship. Overall, intangible CH is a multifaceted concept that encompasses both traditional practices inherited from the past and contemporary expressions in which diverse cultural groups actively participate [@munjeri_tangible_2004; @leimgruber_was_2008]. It includes inclusive elements shared by different communities, whether they are neighbouring villages, distant cities around the world, or practices adapted by migrant populations in new regions. These expressions have been passed down from generation to generation, evolving in response to their environment, and play a crucial role in shaping our collective identity and continuity. Intangible CH promotes social cohesion, strengthens a sense of belonging and responsibility, and enables individuals to connect with different communities and society at large. Central to the nature of intangible CH is its representation within communities. 
Its value goes beyond mere exclusivity or exceptional importance; rather, it thrives on its association with the people who preserve and transmit their knowledge of traditions, skills and customs to others within the community and across generations. The recognition and preservation of intangible CH depends on the communities, groups or individuals directly involved in its creation, maintenance and transmission. Without their recognition, no external entity can decide on their behalf whether a particular practice or expression constitutes their heritage. The community-based approach ensures that intangible CH remains authentic and deeply rooted in the living fabric of society, protected by those who care for and perpetuate it. In Switzerland, the Winegrower’s Festival in Vevey (La Fête des Vignerons), a plurisecular event celebrating the world of wine making [@vinckMetiersOmbreFete2019] and the Carnival of Basel (Basler Fasnacht) [@chiquet_how_2023] are examples of traditions that are listed among UNESCO's intangible CH. (In)tangibility is not always a straightforward concept and can indeed be blurred, i.e. it goes beyond the mere idea of materialisation. Many artefacts and elements of CH possess both tangible and intangible qualities that intertwine and complement each other, making the distinction less clear-cut. For instance, this Male Face Mask, also known as ‘Zamble’, from the Guro people in the Ivory Coast and held at the Art Institute of Chicago[19], holds dual significance as both tangible and intangible CH. As a tangible object, the mask is a physical artefact made from wood, pigment, fabric, and various adornments that combines animal and human features, representing the Guro people’s artistic skills. On the other hand, as an intangible cultural object, the Zamble mask carries profound spiritual and cultural meaning. It plays a significant role in commemorating the deceased during a man’s second funeral. 
These second funerals are organised months or even years after the actual burial as a way to honour and remember the departed [see @haxaire_power_2009]. Thus, the preservation and appreciation of both the tangible and intangible aspects of the mask are essential to its cultural relevance. Another example of the blurred line between tangible and intangible heritage is emphasised by @de_muynke_ears_2022 in recreating reported perceptions of the acoustics of Notre-Dame de Paris through a collaboration between sciences of acoustics and anthropology. The authors highlight the heritage value of how people subjectively perceive sound in a space, particularly in places of worship where sound and music are integral to the religious experience. The authors advocate integrating the study of both material and non-material aspects to understand the changing sonic environments of heritage buildings [@de_muynke_ears_2022 pp. 1-2]. @katz_digitally_2023 articulates that ‘acoustics is an intangible product of a tangible building’. This integrated perspective could lead to a more holistic understanding of the dynamics between physical spaces and the perceptual and experiential dimensions attached to them. 3.1.1.3 Natural Heritage Natural heritage, encompassing geological formations, biodiversity, and ecosystems of cultural, scientific, and aesthetic value, shares a significant overlap with CH. Many natural sites hold spiritual and symbolic importance for communities, becoming repositories of cultural memory and identity [@lowenthal_natural_2005]. Traditional ecological knowledge developed by various cultures also underscores the interconnectedness of cultural and natural heritage, as indigenous communities have accumulated wisdom on sustainable resource use and ecological balance [@azzopardi_what_2023]. 
Moreover, the conservation and sustainable management of natural heritage is often intertwined with efforts to protect CH, fostering a collective commitment to preserve these entangled legacies for future generations. The link between natural heritage and CH goes beyond their shared values; spatial overlaps further accentuate their interdependence. Natural sites may have cultural significance, while CH sites may be situated within natural landscapes. For example, a national park may include archaeological sites or culturally revered landscapes, thus intertwining the cultural and natural dimensions. This spatial intermingling highlights the inextricable relationship between human societies and the natural environment, as cultural practices and beliefs become intertwined with the landscapes they inhabit. In this way, the preservation of both natural and cultural heritage becomes essential not only for their intrinsic worth but also for sustaining the narrative of our shared human and environmental history. Additionally, the distinction between nature and culture is not only subjective and dependent on human appreciation [@vandenhende_management_2017]; it is also a concept intrinsically linked with the overarching framework of modernism, a perspective that has been critically examined and deconstructed by the influential sociologist and philosopher Bruno Latour, who has argued that ‘we have never been modern’ [@latour_we_1993]. Latour’s deconstruction of the modernist perspective extends to the recognition that ‘the proliferation of hybrids has saturated the constitutional framework of the moderns’ [@latour_we_1993 p. 51]. This assertion underscores the fundamental challenge posed by hybrid entities – those that blur the boundaries between nature and culture – to the traditional categories upon which modernist thinking has been predicated. 
In essence, the concept of hybrids disrupts the neat divisions between the natural and social worlds that have been a hallmark of modernist discourse and provides us with an opportunity to situate ourselves as ‘amodern’ as opposed to postmodern [@latour_postmodern_1990]. In addition to Latour’s critique of the modernist distinction between nature and culture, the concept of the ‘parasite’, as expounded by Michel Serres, one of the thinkers who significantly influenced Latour’s intellectual development [@berressem_deja_2015], offers a valuable lens through which to examine the intricacies of interconnectedness and interdependence within our world. In his view, everything is enmeshed in a complex web of relationships that negates the existence of self-contained entities. Rather than seeing discrete and isolated entities, Serres invites us to see everything as an integral part of a larger system in which each component is inextricably dependent on the others [@serres_parasite_2014]. Together, these complementary perspectives invite us to reevaluate our understanding of the intricate tapestry of existence, emphasising the complexities of our relationship with the world. Thus, the appreciation of nature and culture is not mutually exclusive, but rather forms a continuous and evolving relationship. The modern perspective has historically separated these realms, treating them as distinct and disconnected. However, a more inclusive approach dissolves this artificial boundary and recognises the interconnectedness of nature and culture [@haraway_encounters_2008; @haraway_staying_2016]. This paradigm shift challenges the traditional modern understanding and invites a more holistic view in which natural and cultural heritage are mutually constructed within a complex network of relationships. Recognition of this relationship is essential in the context of heritage conservation and understanding. 
The dynamic interplay between nature and culture is recognised, and the acknowledgement of their coexistence promotes a more holistic approach to heritage conservation, where cultural practices, traditions and ecological systems are seen as interdependent aspects of the wider heritage tapestry. This recognition encourages us to see heritage sites not as isolated entities, but as part of a larger web of interconnectedness, and urges us to conserve and value both cultural and natural heritage with a shared responsibility. Adopting this interconnected perspective enables us to appreciate the profound connections between human societies and the natural world, and inspires a collective commitment to safeguarding these precious legacies for future generations. 3.1.1.4 Cultural Heritage Data As I embark on the exploration of CH data, it is first necessary to establish a basic understanding of data in this context. At their core, data represent more than mere numbers and facts; they constitute a collection of discrete or continuous values that are assembled for reference or in-depth analysis. In essence, data are the rich tapestry upon which the narratives of CH are woven, making their comprehension a critical prerequisite for our expedition into this domain. Luciano Floridi, a prominent philosopher in the field of information and digital ethics, provides a thorough perspective on the term ‘data’ and offers valuable insights into its fundamental nature in his PI. He perceives ‘data at its most basic level as the absence of uniformity, whether in the real world or in some symbolic system. Only once such data have some recognisable structure and are given some meaning can they be considered information’ [@floridi_information_2010]. This initial definition sets the stage for a deeper exploration of Floridi’s understanding of data, as he further focuses on its transformative journey into a more meaningful and structured form, which we will explore next. 
Building upon Floridi’s foundational concept of data as the absence of uniformity, his subsequent definition provides a more comprehensive perspective. In a previous work, @floridi_is_2005 [p. 357] argues that ‘data are definable as constraining affordances, exploitable by a system as input of adequate queries that correctly semanticise them to produce information as output’. This definition highlights the dynamic role of data, not only as raw entities awaiting structure and meaning but also as elements imbued with the potential to constrain and guide systems towards the generation of meaningful information. Transitioning from Floridi’s concept of data, we progress to the view that data can notably be seen as interpretable texts within the DH perspective. According to @owens_defining_2011, there are four main perspectives on how Humanists can engage with data: Data as constructed artefacts: data are a product of human creation, not something inherently raw or neutral; Data as interpretable texts: Humanists can interpret data as authored works, considering the intentions of the creators and how different audiences understand and use the data; Data as processable information: data can be processed by computers, allowing various forms of visualisation, manipulation and analysis, which can lead to further perspectives and insights; Data as holding evidentiary value: data, as a form of human artefact and cultural object, can provide evidence to support claims and arguments. These considerations highlight the multifaceted nature of data within the field of DH. It is in this complex landscape that we recognise that data transcend their traditional role as passive entities. As @rodighiero_mapping_2021 [p. 26, citing [@akrich_sociologie_2006]] suggests, ‘there is no doubt that data are full-fledged actors that take part in the social network the actor-network theory describes, in which both human and non-human intertwine and overlap’. 
This notion – rooted in and borrowed from STS – reinforces the idea that data, as active and dynamic entities, play a significant role in shaping the interactions between human and non-human actors in any digital sphere. From these angles, I can look at the characteristics of CH data. @bruseker_cultural_2017 [p. 94] articulate that ‘data coming from the cultural heritage community comes in many shapes and sizes. Born from different disciplines, techniques, traditions, positions, and technologies, the data generated by the many different specializations that fall under this rubric come in an impressive array of forms’. In exploring CH data, it is important to recognise the inherent diversity stemming from diverse disciplines, techniques, and traditions. @bruseker_cultural_2017 [p. 94] aptly emphasise this, highlighting the extensive array of forms in which data manifest. This heterogeneity raises fundamental questions about the unity and identity of CH data, a crucial aspect deserving acknowledgement within this context. As the authors astutely ponder: It could be a natural problem to pose from the beginning: if the data of this community indeed presents itself in such a state of heterogeneity, does it not beg the question if there is truly an identity and unity to cultural heritage data in the first place? It could be argued that Cultural Heritage, as a term, offers a fairly useful means to describe the fuzzy and approximate togetherness of a wide array of disciplines and traditions that concern themselves with the human past. Expanding on these insights, CH data refer to digital or data-driven affordances of CH[20], embodying a rich and varied compilation of insights originating from a variety of disciplines, techniques, traditions, positions and technologies. They encompass both tangible and intangible aspects of a society’s culture as well as natural heritage. 
These data, derived from a wide range of disciplines, offer a latent capacity to support the generation of knowledge relating to historical time periods, geospatial areas, as well as current and past human and non-human activities. They are collected, curated and maintained by various entities such as libraries, archives, museums, higher education institutions, non-governmental organisations, indigenous communities and local groups as well as by the wider public. Building further on the mosaic of CH data, three primary dimensions come to the fore: heterogeneity, knowledge latency, and custodianship. Heterogeneity: As a fundamental characteristic, it signifies the diverse forms and origins that shape this invaluable reservoir of human heritage. Different techniques and varying viewpoints in approaching modelling also contribute to this heterogeneity [@guillem_faire_2023]. Knowledge latency: It highlights the temporal dimension, presenting CH data as a repository of latent knowledge awaiting discovery and interpretation. Notably, not all artefacts are – or should be – digitised, and even among those that are, (mis)representation and challenges in interconnecting them persist [@rossenova_iterative_2022]. Besides, the issue of structured data – or the lack of it – reinforces the aspect of knowledge latency [@haciguzeller_emerging_2021]. Custodianship: This dimension reinforces the essential role played by a variety of entities, predominantly CHIs, in safeguarding and managing resources, ensuring their preservation and accessibility for present and future generations. However, it is very important to acknowledge the great divide in terms of resources, with indigenous and local communities often facing challenges in fulfilling custodianship responsibilities. Taken together, these dimensions contribute to a comprehensive understanding of the nuanced fabric of CH data. 
They reveal the diversity of forms and origins, the temporal aspects and the responsible stewardship that are crucial to the sustainability of such data. By shifting our focus to the sphere of humanities data, we broaden our scope to extend beyond the peculiarities of CH data. Drawing parallels between these areas allows us to grasp the interconnectedness of our heritage. CH data usually refer to information about cultural artefacts, sites, and practices that hold historical or cultural significance. Humanities data encompass information about human culture, history, and society, including literature, philosophy, art, and language [@tasovac_cultural_2020]. Both often involve ethical considerations, such as ownership, access, and preservation, and require a comprehensive understanding of their various meanings and values [@ioannides_towards_2019]. Moreover, @schoch_big_2013 explains that data in the humanities, such as text and visual elements, have unique qualities. While these analogue forms could be considered data, they lack the ability to be analysed computationally as they are non-discrete. The semiotic nature of language, text and art introduces dimensions tied to meaning and context, making the term ‘data’ problematic. Critics question its use because it conflicts with humanistic principles such as contextual interpretation and the subjective position of the scholar. @schoch_big_2013 further distinguishes two core types of data in the humanities: smart and big data. The former tends to be small in volume and carefully curated, but harder to scale, as is the case with digital editions. The latter describes voluminous and varied data and loosely relies on the three Vs by @laney_3d_2001: volume, velocity and variety (see 3.3.1.2). Yet, big data in the humanities differs significantly from other fields as it rarely requires rapid real-time analysis, is less focused on handling massive volumes, and instead deals with diverse, unstructured data sources. 
@schoch_big_2013 concludes: ‘I believe the most interesting challenge for the next years when it comes to dealing with data in the humanities will be to actually transgress this opposition of smart and big data. What we need is bigger smart data or smarter big data, and to create and use it, we need to make use of new methods’. Data processing offers great potential for humanities research, as @owens_defining_2011 argues: ‘In the end, the kinds of questions humanists ask about texts and artifacts are just as relevant to ask of data. While the new and exciting prospects of processing data offer humanists a range of exciting possibilities for research, humanistic approaches to the textual and artifactual qualities of data also have a considerable amount to offer to the interpretation of data’. While the term ‘data’ in the context of the humanities may raise questions due to its semiotic and contextual complexities, it serves as a foundation for understanding both CH data and broader humanities data. The data originating from CH and the humanities are inherently intertwined, as they often share a similar nature and purpose for scholars. This strong interconnection leads to a collaborative relationship between the GLAM sector and the humanities or DH. Scholars in the humanities frequently rely on digitised cultural artefacts, historical records, linguistic resources, and literary works provided by GLAM institutions to gain valuable insights into human history, culture, and traditions. The digitisation efforts and research collaborations between these entities play a pivotal role in preserving CH data and advancing our understanding of diverse societies, fostering a deeper appreciation of our shared human heritage. CH data and humanities data are distinct from other scientific data due to their qualitative and subjective nature, which requires different methods of analysis than quantitative scientific data. 
They include archival and special collections, rare books, manuscripts, photographs, recordings, artefacts, and other primary sources that reflect the cultural beliefs, identity, and memory of a people [see @sabharwal_2_2015; @izu_sociocultural_2022]. In summary, while CH data and humanities data share some commonalities, they differ in terms of scope and subject matter. CH data focus specifically on the preservation and documentation of physical artefacts and intangible attributes, while humanities data encompass a broader range of disciplines within the humanities [@munster_digital_2019]. However, it is important to note that the distinction between CH data and humanities data can be blurred, as (meta)data should ideally be co-created and integrated across both domains. 3.1.2 Representation and Embodiment of Cultural Heritage Data Digitally representing CH data while preserving their context and complexity remains a significant challenge. Such representations of CH data, sometimes referred to as digital surrogates or digital twins [@conway_digital_2015; @shao_digital_2018; @semeraro_digital_2021], can potentially lead to a loss of context and a reduction in the richness of the CH represented. For instance, a digital image of a cultural artefact may not capture its materiality, such as its texture, weight, and feel, which are essential aspects of the artefact’s cultural significance [@force_context_2021]. Furthermore, digital representations may also exclude vital social, cultural, and historical contexts surrounding the object, which are crucial to understanding its full cultural value [@cameron_beyond_2007]. This subsection is structured around two key dimensions. Firstly, it explores materiality, highlighting how digital representations may fail to capture important aspects that are integral to understanding the significance of CH resources. Secondly, it navigates the convergence and divergence between digitised CH and digital heritage. 
3.1.2.1 Materiality Briefly, materiality refers to the physical qualities of an object or artefact, such as its colour, texture, and composition. As part of built heritage, the emphasis for materiality relates primarily to architecture, its associated techniques and the range of materials used in the construction or renovation of a building. More specifically, materiality acts as a pivotal factor in the transformation of disparate fragments of material culture into heritage, providing a vital link to the intangible facets of heritage. It contributes significantly to an individual’s social position and ability to navigate specific social milieus, thereby determining their ability to transmit cultural knowledge and values to future generations. The transformative potential of materiality in this regard underscores its fundamental role in perpetuating heritage and the transmission of cultural legacies [@carman_where_2009]. The physical attributes of objects, including texture, colour and shape, can evoke different emotions and associations, shaping people’s perceptions and memories of these events. Beyond retrospective influences, the potential of materiality extends to the creation of new memories and meanings, as exemplified by the use of materials such as glass in contemporary art. In such cases, materials evoke not only their inherent properties but also symbolic connotations, adding new layers of meaning and memory to the artistic narrative [@fiorentino_persistence_2023]. @edwards_photographs_2004 [p. 3] argue that materiality is not just concerned with physical objects in a positivist sense, but also involves complex and fluid relationships between people, images, and things. This relationship is influenced by social, cultural, and historical contexts, and plays a crucial role in shaping our perceptions and experiences of the world. 
Moreover, materiality is central to giving meaning to non-human entities [see @latour_actor-network_1996; @haraway_companion_2003; @star_institutional_1989], which emphasises the role of both humans and non-humans in shaping social and cultural phenomena. For CH data, diversity is at its core, as it allows for the exploration of different ways of knowing, experiencing, and expressing the world. Therefore, it is important to approach materiality not as a static and fixed concept, but as a dynamic and evolving phenomenon that is shaped by multiple forces [@hahn_digitale_2018 pp. 62-63]. When discussing materiality, there is also its negation, i.e. the notion of space or emptiness, such as how people interact with it through built heritage, which is regarded as a primordial medium of material culture, as expounded by @guillem_rcc8_2023 [p. 2]: The most intuitive and foundational definition of architecture is the built thing, that is the architecture qua building or built work. Human beings continuously interact with the built materiality through the non-materiality of space. Space as emptiness is formed and defined by the materiality that affects its existence. That relation between fullness and emptiness is what makes possible architecture as lived and experienced space. Materiality also offers a means of challenging dominant narratives and power structures, particularly the Western-centric perspective on CH. It gives greater recognition to the importance of intangible CH, which often takes a back seat to tangible objects in dominant narratives [@lenzerini_intangible_2011]. By highlighting the materiality of marginalised or forgotten elements, individuals can reclaim their heritage and challenge dominant narratives that marginalise certain groups, contributing to a more inclusive and accurate representation of CH. 
The primary focus in terms of digitisation is also on preserving material-based knowledge, often overlooking the dynamic and living nature of intangibility. @hou_digitizing_2022 stress the crucial role of computational heritage and advances in information technologies in preserving and improving access to intangible CH. Effectively documenting the ephemeral aspects of intangible heritage and communicating the knowledge that is deeply linked to individuals are pressing challenges. Recent initiatives seek to capture the dynamic facets of cultural practices, using visualisation, augmentation, participation and immersive experiences to enhance experiential narratives. There is a strong call for a strategic re-evaluation of the intangible CH digitisation process, emphasising the human body as a vessel for traditions and memories. One example is traditional Southern Chinese martial arts, which have been passed down colloquially through generations and require a methodological approach to capture such embodied knowledge [see @adamou_facets_2023; @hou_ontology-based_2024]. Even in cases where considerable efforts have been devoted to the digitisation of physical objects such as medieval manuscripts and rare books over the past few decades [@nielsen_digitisation_2008], a lingering concern persists regarding the authentic encounter with the original artefact, despite its enhanced accessibility through digital surrogates [@van_lit_digital_2020]. Material attributes present a persistent challenge to achieving full replication. 
Despite advances facilitated by techniques such as RTI, 3D digitisation, or VR and AR, which offer better experiential immersion and are more effective than two-dimensional representations in addressing certain materiality concerns, the ability to replicate the multifaceted sensory experience associated with the original object, including the palpable emotions and spatial sensation, remains an ongoing endeavour and a complex challenge that may never be fully met [see @endres_digitizing_2019]. 3.1.2.2 Digitised Cultural Heritage and Digital Heritage The concepts of digitised CH and digital heritage intersect through the use of digital technology for the preservation, access, and dissemination of CH resources. Digitised CH focuses on converting physical artefacts into digital forms, ensuring their long-term preservation and accessibility through digital means. Conversely, digital heritage includes a broader range of digital tools and resources ‘to preserve, research and communicate cultural heritage’ (@munster_digital_2021 p. 2, citing [@georgopoulos_cipas_2018]). Digitised CH acts as a critical bridge, facilitating a transition from traditional or analogue GLAM practices to a digital environment. This shift is pivotal in unlocking the potential of digitised CH, whose value extends beyond scholarly pursuits, even though the majority of digitisation efforts are driven by research funding. It also becomes evident that the creative reuse and data-driven innovation stemming from digitised CH necessitate substantial and sustained investment in the GLAM sector. This investment is fundamental, especially amidst reduced funding due to years of austerity. @terras_value_2021 underscore this need, shedding light on the delicate balance required with commercial outcomes. 
They emphasise that leveraging CH datasets offers vast opportunities for technological innovation and economic benefits, urging professionals from various domains to collaborate and experiment in a low-risk environment. Digital heritage[21] encompasses a wide range of human knowledge and expression in cultural, educational, scientific and various other domains. In today’s rapidly evolving technological landscape, an increasing amount of this knowledge is either digitally created or in the process of being converted from analogue to digital formats [@he_digital_2017]. These digital resources cover a wide range, including text, multimedia, software and more, and require deliberate and strategic management to ensure their long-term preservation. This valuable heritage is spread across the globe and expressed in multiple languages [@unesco_charter_2009]. In summary, digitised CH not only forges the path to digital heritage but also embodies an ever-evolving cultural landscape. Recognising the transformative potency of digital heritage is essential to enriching our understanding of and engagement with our cultural roots. Both concepts are intimately embedded in CH and play a vital role as conduits. 3.1.3 Collectives and Apparatuses The collaborative efforts of collectives and the operation of various apparatuses play a fundamental part in shaping the preservation, interpretation and dissemination of cultural artefacts and practices. This subsection is concerned with the central contributions of human and non-human actors engaged in cooperative action and the modus operandi of various apparatuses, such as building (digital) infrastructures. Some of these considerations are drawn from STS and are more fully captured in the theoretical framework of the thesis. Bruno Latour’s concept of the importance of collectives and apparatuses [see @latour_habiter_2022 p. 15] can be extrapolated to CHIs. 
Every institution’s or project’s ultimate success hinges on the collaboration and support of individuals, as well as the tools, systems and technologies they use. Indeed, paralleling CHIs with wider contexts suggests that collective efforts and apparatuses play a critical role in shaping the effectiveness of any institution. This highlights the importance of recognising the influence of both human and non-human entities in institutional functioning and underlines the need for a more comprehensive understanding of the dynamics involved therein. ANT can be a useful lens to analyse the creation, use, and dissemination of CH data. ANT posits that actors are not independent entities but are instead part of a network that consists of both human and non-human entities. According to ANT, every actor, be it a person or a technology, is a node in the network and contributes to the overall functioning of the network [@latour_reassembling_2005; @callon_actor_2001]. When we apply this framework to CHIs, we can identify the different actors involved in the creation, use, and dissemination of CH data. These actors can include individuals, such as curators, conservators, and historians, as well as non-human entities, such as databases, digitisation equipment, and software. Moreover, this approach can help us understand the interactions between these actors and how they shape the overall functioning of CHIs. For instance, digitisation equipment can enable the creation of high-quality digital images of artefacts, which can then be disseminated globally through online platforms. Examining Notre-Dame de Paris, one can discern the keystones at the summit of its arches as indispensable actors within its architectural narrative. These keystones, imbued with historical narratives and a non-human facet, played a central role in the (digital) rescue and subsequent restoration efforts following the tragic roof fire in April 2019. 
@guillem_faire_2023’s study further elucidates this restoration journey, emphasising how the keystones, with their individual narratives and structural significance, contributed to the (digital) reassembly. Building on this perspective, we can explore the importance of community involvement in the preservation and management of CH data, thereby increasing the potential for sustainable practices and inclusive engagement. Local communities have an integral part to play in the management and preservation of CH data, especially in the digital age where resources are often scarce for GLAM institutions. Community involvement has several benefits, including increased engagement and participation, access to local knowledge and expertise, and more sustainable and inclusive management and preservation practices [@ridge_12_2021]. For instance, geophysical technologies such as ground-penetrating radar have been used with great success in identifying and evaluating the depth, extent, and composition of CH resources for research and management purposes, easing tensions when working with sensitive ancestral places [@nelson_role_2021]. Collaborative environments can also help with CH information sharing and communication tasks because they provide a visual context to users, making it easier to find and relate CH content [@respaldiza_hidalgo_metadata_2011]. Drawing on @brown_communities_2023 [pp. 6-7]'s insightful analysis, a prominent illustration of exemplary community practice can be found in the sphere of community museums in Latin America: Museos Comunitarios de América[22]. The author highlights the role of community engagement and leadership in the creation and operation of these museums. Such engagement ensures that these museums are not imposed from outside, but rather emerge organically as museums of the community, resonating with its unique CH and identity. 
This approach is consistent with the ethos of ‘telling a story’, building a future, which embodies a deep commitment to community empowerment and cultural preservation. This community-centric approach amplifies the museum’s resonance with the community’s lived experiences and historical narratives. At the same time, institutions can also benefit from collaborating with peer communities like IIIF to promote greater access to their collections. IIIF provides a set of open standards for delivering high-quality digital objects online at scale, which can help memory and academic institutions share their collections with each other and with the wider public [@snydman_international_2015; @weinthal_iiif_2019]. By adopting IIIF standards, organisations can make their collections more discoverable and accessible to researchers, developers, and other CH professionals [@padfield_joseph_practical_2022]. Involvement in communities such as IIIF also helps to mitigate costs as they develop shared or adaptable resources and services [@raemy_international_2017]. Participation of communities in the management and preservation of CH resources is essential to ensure that CH is protected and accessible for future generations. By involving and participating in communities, GLAMs can tap into local as well as peer knowledge and expertise, making management and preservation practices more sustainable and inclusive. This approach also increases engagement and participation, ensuring that CH is valued and appreciated by the wider community. Thus, memory institutions need to collaborate closely with communities to ensure that CH data, and their underlying infrastructures and services, are being effectively curated [@delmas-glass_fostering_2020]. Closely related to this context, @star_ethnography_1999 points out the often unacknowledged role of infrastructure within society. 
She argues that infrastructures are necessary but often invisible and taken for granted: People commonly envision infrastructure as a system of substrates – railroad lines, pipes and plumbing, electrical power plants, and wires. It is by definition invisible, part of the background for other kinds of work. It is ready-to-hand. This image holds up well enough for many purposes – turn on the faucet for a drink of water and you use a vast infrastructure of plumbing and water regulation without usually thinking much about it. [@star_ethnography_1999 p. 380] @star_ethnography_1999 [pp. 381-382, citing [@star_steps_1994]] identifies nine dimensions to define infrastructure. They provide a comprehensive framework to comprehend the nuanced nature of infrastructure and its pervasive impact on diverse societal facets. The following dimensions are vital for analysing the often imperceptible, yet deeply embedded structures that constitute the foundational framework of both daily life and broader societal operations[23]: Embeddedness: Infrastructure is sunk into and inside of other structures, social arrangements, and technologies. People do not necessarily distinguish the several coordinated aspects of infrastructure. Transparency: Infrastructure is transparent to use, in the sense that it does not have to be reinvented each time or assembled for each task, but invisibly supports those tasks. Reach or scope: This may be either spatial or temporal – infrastructure has reach beyond a single event or one-site practice. Learned as part of membership: Strangers and outsiders encounter infrastructure as a target object to be learned about. New participants acquire a naturalised familiarity with its objects, as they become members. Links with conventions of practice: Infrastructure both shapes and is shaped by the conventions of a community of practice. 
Embodiment of standards: Modified by scope and often by conflicting conventions, infrastructure takes on transparency by plugging into other infrastructures and tools in a standardised fashion. Built on an installed base: Infrastructure does not grow de novo; it wrestles with the inertia of the installed base and inherits strengths and limitations from that base. Becomes visible upon breakdown: The normally invisible quality of working infrastructure becomes visible when it breaks: the server is down, the bridge washes out, there is a power blackout. Is fixed in modular increments, not all at once or globally: Because infrastructure is big, layered, and complex, and because it means different things locally, it is never changed from above. Changes take time and negotiations, and adjustment with other aspects of the systems are involved. An appreciation of these dimensions is crucial to the analysis of the network of infrastructural systems that underpin contemporary society, and is necessary for the analysis of any digital infrastructure that manages CH data. Digital infrastructures – also known as e-infrastructures or cyberinfrastructures – are forms of infrastructure that are essential for the functioning of today’s society [see @jackson_understanding_2007; @ribes_sociotechnical_2010]. These kinds of infrastructure need to be understood as socio-technical systems, showcasing the interplay between technological components (such as hardware, software, and networks) and the social and organisational contexts in which they operate [@star_steps_1994]. According to @fresa_data_2013 [p. 33], digital CH infrastructures should be able to serve the research needs of humanities scholars as well as having dedicated services for education, learning, and general public access. In terms of requirements, @fresa_data_2013 [pp. 
36-39] identifies three different layers of services: for content providers, for managing and adding value to the content, and for the research communities. For the latter, several sub-services tailored to research communities are listed. These encompass long-term preservation, PIDs[24], interoperability and aggregation, advanced search, data resource set-up, user authentication and access control, as well as rights management. Overall, (digital) infrastructures are imperative apparatuses in preserving and sharing CH data. First, they support preservation by archiving digital artefacts and their metadata, protecting them from deterioration and loss. Secondly, these infrastructures facilitate accessibility, allowing a global audience to explore and appreciate cultural heritage online. Finally, they encourage interpretation and engagement, promoting cross-cultural understanding and knowledge sharing. Moreover, infrastructure is a fundamental component that demands extensive investment, particularly in the creation of streamlined integration layers capable of interacting seamlessly with different systems. This can be exemplified by institutions such as the Rijksmuseum[25], where a well-constructed infrastructure allows for efficient integration and interaction with various technological and organisational systems [@dijkshoorn_building_2023]. This investment serves as the foundation for an institution’s functionality, allowing for the smooth flow of data, the coordination of processes and the optimal use of resources. In a similar vein, @canning_power_2022 argue that the often invisible structures of metadata, particularly in Linked Data ontologies, play a crucial role in shaping the interpretation of data. These structures, while not immediately apparent, are imbued with value judgements and ideological implications, extending the impact of metadata beyond mere technicalities to encompass diverse and intersectional perspectives. 
This multidimensional ontological approach addresses the complexity and diversity of data sources, paralleling the need for sophisticated infrastructures in institutions like the Rijksmuseum. It underscores the importance of integrating inter-sectional feminist principles in information systems, reflecting a commitment to diverse ways of knowing and nuanced storytelling. Furthermore, as all (meta)data requires storage, it raises an important concern in terms of the entrenched power dynamics governing knowledge representation within information systems, as pointed out by @canning_what_2023. This perspective, initially centred around museum objects, holds broader implications for all CH resources [see @simandiraki-grimshaw_what_2023]. Canning strongly advocates for the essential adaptation of databases to embrace a diverse array of epistemological approaches by introducing new types of affordances. Databases, despite their role in information preservation, wield significant influence that can inadvertently stifle diverse modes of knowledge interpretation and ‘can constrain ways of knowing’. Furthermore, she compellingly argues that modifications to databases extend beyond technical adjustments; they are inextricably linked to shifts in institutional power dynamics and the enduring, often inequitable, power dynamics governing the world of museums – or any CHIs – and their curation. In understanding the interplay of collectives and apparatuses, it is clear that key actors, including individuals, institutions, local and global communities, as well as the sophisticated fabric of (digital) infrastructures and their components, are deeply entangled and interconnected. These entities, both human and non-human, collectively shape and navigate the rich networks of human interactions and technologies that underpin the foundations of contemporary society. 
3.2 Cultural Heritage Metadata This subsection offers insights into the importance of metadata in CH, underlining its role in enhancing the understanding and accessibility of cultural artefacts. It is structured into four[26] essential parts. I start with an introductory segment in 3.2.1, then I explore the types and functions of metadata in 3.2.2, thirdly in 3.2.3, I outline some of the most important CH metadata standards, and finally in 3.2.4, I explore the use of KOS, such as generic classification systems and controlled vocabularies. 3.2.1 Data about Data For curating CH resources, metadata[27], ‘data about data’, is probably one of the key concepts that needs to be introduced here. Metadata permeate our digital and physical landscapes, playing a vital role in organising, describing and managing a vast array of information. Rather than being confined to a specific domain, they are ubiquitous and pervade many aspects of our everyday lives [@riley_understanding_2017 pp. 2-3]. From websites and databases to social media platforms and online marketplaces, metadata add meaning to data, enabling users to understand their context, relevance and provenance. As an example, Figure 3.1 shows the metadata of a book[28]. Figure 3.1: Snapshot from the Swisscovery Platform Showing the Bibliographic Record of @zeng_metadata_2022 Metadata are central to the management and preservation of CH data, providing essential information to ensure that data can be properly organised, discovered and retrieved. For example, they facilitate the understanding and interpretation of data, enabling scholars and the public to access and use them effectively [@constantopoulos_aspects_2008]. Metadata also help to ensure the long-term preservation and accessibility of CH data [@zeng_metadata_2022 pp. 490-491]. Providing metadata in a structured manner facilitates forms of aggregation, i.e.
individuals and institutions being able to harvest and organise metadata from multiple sources or repositories into a centralised location [see @freire_survey_2017; @freire_metadata_2021]. In addition, the importance of metadata as a gateway to information is particularly compelling when the primary embodiment of a record is either unavailable or lost. In cases where resources, time constraints, sensitive content or strategic decisions prevent the digitisation of an item, metadata becomes the principal means of representation and access. If a physical record is lost or damaged, the metadata associated with that record acts as a proxy for the record. @riley_understanding_2017 [p. 5] discusses the transformation of libraries over time. Initially, libraries moved from search terminals to the modern web-based resource discovery systems we use today. This shift was driven by advances in computerisation. Libraries’ basic approach to metadata is ‘bibliographic’, deeply rooted in their traditional expertise in describing books. This approach involves providing detailed descriptions of individual items so that users can easily locate them within the library’s collection. On the other hand, archives use ‘finding aids’, which are descriptive inventories of their collections, coupled with historical context. These aids are essential for users to understand the material and to find groups of related items within the archive. The metadata used in archives allows for the contextualisation of materials, particularly papers of individuals or records of organisations, providing a richer understanding of the content. Similarly, museums actively manage and track their acquisitions, exhibitions and loans through metadata. Museum curators use metadata to interpret collections for visitors, explaining the historical and social significance of artefacts and describing the relationships and connections between different objects. 
This helps to enhance the overall visitor experience and understanding of the artefacts on display or the digital resources on a particular website. 3.2.2 Types and Functions CHIs share common objectives and concerns related to information management, as highlighted by @lim_metadata_2011 [pp. 484-485]. These goals typically include facilitating access to knowledge and ensuring the integrity of CH data. However, it is important to note that CHIs also differ widely in how they deal with metadata. Different domains have unique approaches and standards for describing the materials they collect, preserve and disseminate, and even within a single domain there are significant differences. There have been different attempts to categorise the metadata landscape. For instance, @baca_setting_2016 identified the following five categories of metadata and their respective functions: Administrative: Metadata used in managing and administering collections and information resources, such as acquisition and appraisal information or documentation related to repatriation. Descriptive: Metadata used to identify, authenticate, and describe collections and related trusted information resources. Finding aids, cataloguing records, annotations by practitioners and end users, as well as metadata generated by or through a given DAM system can often be classified as descriptive metadata. Preservation: Metadata related to the preservation management of collections and information resources. Common examples of preservation metadata are documentation of physical condition of resources or of any actions taken to preserve resources, whether physical restoration or data migration. Technical: Metadata related to how a system functions or metadata behaves. Examples include software documentation and digitisation information. Use: Metadata related to the level and type of use of collections and information resources, such as circulation records, search logs, or rights metadata. 
Meanwhile @riley_seeing_2009, as illustrated in a comprehensive visualisation graph[29], suggested seven functions, i.e. the role a standard plays in the creation and storage of metadata, and seven purposes referring to the general type of metadata. Functions: Conceptual Model, Content Standard, Controlled Vocabulary, Framework/Technology, Markup Language, Record Format, and Structure Standard. Purposes: Data, Descriptive Metadata, Metadata Wrappers, Preservation Metadata, Rights Metadata, Structural Metadata, and Technical Metadata. Almost a decade later, @riley_understanding_2017 [pp. 6-7] summarised metadata types into four groupings instead of the seven purposes previously mentioned. is removed from the list and technical, preservation and rights metadata are now grouped into a newly created administrative metadata category. Descriptive metadata: For finding or understanding a resource Administrative metadata: Umbrella term referring to the information needed to manage a resource or that relates to its creation 2.1 Technical metadata: For decoding and rendering files 2.2 Preservation metadata: Long-term management of files 2.3 Rights metadata: Intellectual property rights attached to content Structural metadata: Relationships of parts of resources to one another Markup Language: Integrates metadata and flags for other structural or semantic features within content[30]. This classification of metadata types and functions differs from the categories identified by @baca_setting_2016 mostly due to the addition of structural metadata and markup language as their own categories [@zeng_metadata_2022 p. 19]. Table 3.1 lists the major types of metadata according to @riley_understanding_2017 [p. 7] and includes example properties and their primary uses. Table 3.1: Types of Metadata According to @riley_understanding_2017 [p. 7] Metadata (Sub)type Example properties Primary uses 1.
Descriptive metadata Title, Author, Subject, Genre, Publication date Discovery, Display, Interoperability 2.1 Technical metadata File type, File size, Creation date, Compression scheme Interoperability, Digital object management, Preservation 2.2 Preservation metadata Checksum, Preservation event Interoperability, Digital object management, Preservation 2.3 Rights metadata Copyright status, Licence terms, Rights holder Interoperability, Digital object management 3. Structural metadata Sequence, Place in hierarchy Navigation 4. Markup languages Paragraph, Heading, List, Name, Date Navigation, Interoperability Ultimately, metadata can also be leveraged to create more inclusive and diverse representations of CH. For instance, metadata can be used to document and promote underrepresented communities and their heritage, providing greater visibility and recognition. This approach aligns with the principles of decolonising CH, promoting equity and social justice by recognising and valuing diverse cultural perspectives, especially in the prevailing anglophone and Western-centric standpoint in DH [@mullaney_internet_2021; @mahony_cultural_2018]. Moreover, the distinction between data and metadata, as discussed in the work of @alter_view_2023, is not always distinct, leading to the concept of ‘semantic transposition’. This complexity reflects in CH where what is considered metadata in one context might be primary data in another, underscoring the necessity for adaptable frameworks in data management. This understanding is crucial for fostering inclusive and diverse representations in CH, ensuring that all cultural narratives are appropriately documented and acknowledged. 3.2.3 Standards Metadata standards play a crucial role in ensuring that data are organised and consistent, facilitating mutual understanding between different stakeholders [@raemy_enabling_2020]. CHIs such as GLAMs typically follow established conventions or standards when organising their resources. 
Current methods of cataloguing have historical roots dating back to the nineteenth century, particularly with the development of cataloguing systems such as Antonio Panizzi’s at the British Museum and Charles Coffin Jewett’s efforts to mechanically duplicate entries at the library of the Smithsonian Institution [@zeng_metadata_2022 pp. 14-15]. Unique metadata standards, rules and models have been established and maintained within specific sub-fields. In addition, certain standards for information resources have been endorsed by authoritative bodies [@greenberg_understanding_2005], and some are used exclusively within specific domain communities [@hillmann_metadata_2008]. @riley_understanding_2017 [p. 5] underscores the predilection of CH metadata – whether these standards emanate from libraries, archives, or museums – toward accentuating descriptive attributes. The foundational CH metadata standards, primarily conceived to [@zeng_metadata_2022 p. 11], manifest this thematic focus. Within the CH domain, metadata standards vary widely in scope, and a number of different standards have been developed to meet different needs and priorities[31] [@freire_availability_2018]. The following quoted passage sheds some light on the different approaches and levels of collaboration in metadata standardisation, namely among the library and museum sectors. Despite the striving for homogeneity, in practice, the production of metadata among information specialists and the use of metadata standards is already marked by considerable diversity. This has come about for very pragmatic reasons. Different types of objects and collections require different types of metadata. The curatorial interest for particular information differs for example between images held in an art gallery and a library, as does the information specialists’ domain expertise. Accordingly, diversity in metadata practice seems to be greatest in museums as they are the institutions that govern the most diverse collections.
While the library sector has ‘systematically and cooperatively created and shared’ metadata standards since the 1960s, the museum sector, mostly handling images and objects, has been slower to establish such collaboration and consensus. [@dahlgren_diversity_2020 p. 244] In this context, I want to focus on some metadata standards that have proved vital across libraries, archives, museums and galleries. These standards, which I will briefly describe, serve as the foundation for organising, describing, and enabling efficient access to vast and diverse collections. In particular, I will take a closer look at CIDOC-CRM, as it serves as the cornerstone of Linked Art, a fundamental LOUD standard. 3.2.3.1 Library Metadata Standards In libraries, several metadata standards have played crucial roles in organising and accessing collections over the years. The most prevalent historical standard, MARC[32], was a pilot project from the 1960s funded by the CLIR and led by the LoC to structure cataloguing data and distribute them through magnetic tapes [@avram_marc_1968 p. 3]. The standard evolved into MARC21 in 1999 [@zeng_metadata_2022 p. 418] – as exemplified by Code Snippet 3.1 – providing a structured format for bibliographic records and related information in machine-readable form. It uses codes, fields, and sub-fields to structure data. Another significant historical standard is the AACR, published in 1967 and revised in 1978, which provides sets of rules for descriptive cataloguing of various types of information resources.
Code Snippet 3.1: MARC21 Record of @zeng_metadata_2022 in the Swisscovery Platform leader 01424nam a2200397 c 4500 001 991170746542405501 005 20220427104002.0 008 210818s2022 xxu b 001 0 eng 010 ##$a 2021031231 020 ##$a9780838948750 $qBroschur 020 ##$a0838948758 035 ##$a(OCoLC)1264724191 040 ##$aDLC $bger $erda $cDLC $dCH-ZuSLS UZB ZB 042 ##$apcc 050 00$aZ666.7 $b.Z46 2022 082 00$a025.3 $223 082 74$a020 $223sdnb 100 1#$aZeng, Marcia Lei $d1956- $4aut $0(DE-588)136417035 245 10$aMetadata $cMarcia Lei Zeng and Jian Qin 250 ##$aThird edition 264 #1$aChicago $bALA Neal-Schuman $c2022 300 ##$axxvi, 613 Seiten $bIllustrationen 336 ##$btxt $2rdacontent 337 ##$bn $2rdamedia 338 ##$bnc $2rdacarrier 504 ##$aIncludes bibliographical references and index 650 #0$aMetadata 650 #7$aMetadata $2fast $0(OCoLC)fst01017519 650 #7$aMetadaten $2gnd $0(DE-588)4410512-5 776 08$iErscheint auch als $nOnline-Ausgabe $tMetadata $z9780838937969 776 08$iErscheint auch als $nOnline-Ausgabe $tMetadata $z9780838937952 700 1#$aQin, Jian $d1956- $4aut $0(DE-588)1056085541 856 42$3Inhaltsverzeichnis $qPDF $uhttps://urn.ub.unibe.ch/urn:ch:slsp:0838948758:ihv:pdf 900 ##$aOK_GND $xUZB/Z01/202203/klei 900 ##$aStoppsignal FRED $xUZB/Z01/202203 949 ##$ahttps://urn.ub.unibe.ch/urn:ch:slsp:0838948758:ihv:pdf AACR is no longer maintained and was replaced by RDA[33] around 2010 to be a more adaptive standard to contemporary needs. RDA, while not a markup language like MARC, serves as a content standard that guides the description and discovery of resources, focusing on user needs and facilitating improved navigation of library collections. Its goal is to provide a flexible and extensible framework for the description of all types of resources, ensuring discoverability, accessibility, and relevance for users[34] [@sprochi_where_2016 p. 130]. Libraries often leverage other standards to enrich their metadata practices. 
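Returning briefly to Code Snippet 3.1, the way MARC21 combines codes, fields, and sub-fields can be illustrated with a short sketch. The Python below is only a minimal illustration under stated assumptions: it reads the textual display layout shown in the snippet (a three-digit tag, two indicator characters, then sub-fields introduced by `$` and a one-character code) rather than the binary ISO 2709 exchange format, it covers data fields only (not the leader or control fields such as 001 to 008), and the names `MarcField` and `parse_field` are hypothetical and do not belong to any existing library.

```python
from dataclasses import dataclass


@dataclass
class MarcField:
    """One MARC21 data field as displayed in Code Snippet 3.1 (hypothetical type)."""
    tag: str                          # three-digit field code, e.g. "245"
    indicators: str                   # two characters, "#" meaning undefined
    subfields: list[tuple[str, str]]  # (sub-field code, value) pairs in order


def parse_field(line: str) -> MarcField:
    """Split a displayed data field into tag, indicators and sub-fields.

    Assumes the layout "TAG II$a value $b value"; values containing a
    literal "$" are not handled by this sketch.
    """
    tag = line[:3]
    indicators = line[4:6]
    body = line[6:]
    subfields = []
    # Everything before the first "$" is empty; each later chunk starts
    # with the one-character sub-field code followed by its value.
    for chunk in body.split("$")[1:]:
        if chunk:
            subfields.append((chunk[0], chunk[1:].strip()))
    return MarcField(tag, indicators, subfields)


# Example taken from the title statement of the record in Code Snippet 3.1.
title = parse_field("245 10$aMetadata $cMarcia Lei Zeng and Jian Qin")
```

In this sketch, `title.tag` holds the field code, `title.indicators` the two indicator positions, and `title.subfields` the ordered sub-field pairs, which mirrors the three structural levels mentioned above.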
MODS[35], introduced in 2002, offers a more flexible XML-based schema for bibliographic description, allowing for better integration with other standards and systems. It was initially developed to carry [@zeng_metadata_2022 p. 423]. MODS provides a balance between human readability and machine processing, making it suitable for a wide range of resources and use cases [@guenther_mods_2003 p. 139]. METS[36], on the other hand, is a standard for encoding descriptive, administrative, and structural metadata regarding objects within a digital library. METS, developed as an initiative of the DLF, provides a flexible and extensible framework for structuring metadata, allowing for the packaging of complex digital objects [@cantara_mets_2005 pp. 238-239]. While MODS is primarily concerned with bibliographic information, METS focuses on structuring metadata for digital objects, making it particularly useful for digital libraries and repositories. A further important standard is FRBR, a conceptual framework for understanding and structuring bibliographic data and access points. Originally developed by IFLA in 1997 as part of its functional requirements family of models, FRBR describes three main groups of entities, relationships, and attributes as illustrated by Figure 3.2. The first group of entities forms the foundation of the model, which characterises four levels of abstraction: WEMI [@denton_functional_2006 p. 231]. FRBR has had a significant impact on the development of RDA, which is loosely aligned with the principles and structures defined by the conceptual framework. However, as FRBR is not a data model per se, it does not prescribe how to record bibliographic information in day-to-day practice, and it focuses heavily on textual resources[37] [@sprochi_where_2016 pp. 130-131]. Furthermore, @cossham_models_2017 [p.
11] asserts that FRBR and RDA, ‘don’t align well with the ways that users use, understand, and experience library catalogues nor with the ways that they understand and experience the wider information environment’. Figure 3.2: The FRBR Conceptual Framework. Adapted from @zou_constructing_2018 [p. 36] A further important standard in the field of library science is the LRM, which was introduced as a comprehensive conceptual framework. It provides a broad understanding of bibliographic data and user-centric design principles, aligning with FRBR. LRM defines key entities, attributes, and relationships important for bibliographic searches, interpretation, and navigation – as shown in Figure 3.3. It operates at the conceptual level and does not dictate data storage methods. Attributes in LRM can be represented as literals or URIs. The model is presented in a structured document format to support LOD applications and reduce ambiguity. During its development, a parallel process created FRBRoo (see 3.2.3.3), a model that extends the original FRBR model by incorporating it into CIDOC-CRM. FRBRoo focuses on CH data and is more detailed than LRM, which is designed specifically for library data and follows a high-level, user-centric approach [@riva_ifla_2017 pp. 9-13]. The LRM model, known as LRMer[38], was released in 2020 by IFLA [@zeng_metadata_2022 p. 163]. Figure 3.3: Overview of Relationships in LRM [@riva_ifla_2017 p. 86] BibFrame[39] is another metadata standard in the library domain. It was initiated around 2011 by the LoC to be a successor of MARC, which had become obsolete [see @tennant_marc_2002] as well as being invisible to web crawlers and search engines preventing adequate discoverability of bibliographic resources [@sprochi_where_2016 p. 132]. BibFrame is a loosely RDF-based model [@sanderson_linked_2015], intending to improve the interoperability and discoverability of library resources. 
While the BibFrame model may not perfectly correspond with the WEMI entities outlined in FRBR, it is possible to effectively link BibFrame resources to FRBR entities, ensuring their compatibility [@sprochi_where_2016 p. 133]. BibFrame aims to transition from MARC by providing a more web-friendly framework, focusing on the relationships between entities, improving data sharing, and accommodating the digital environment. Conversely, @edmunds_bibframe_2023 argues that BibFrame is unaffordable and leads to elitism within libraries, with the main beneficiaries being well-funded institutions, particularly in North America, while placing a financial burden on others. This approach, endorsed by bodies such as the LoC, is criticised for its high cost, impracticality, inequity and limited benefits for cataloguers, libraries, vendors and the public they serve. In addition, the author highlights BibFrame's lack of user friendliness, regardless of the intended users, and criticises the notion of adopting Linked Data for its own sake without substantial practical benefits. 3.2.3.2 Archival Metadata Standards For archives, metadata standards like EAD[40] and ISAD(G)[41] have been pivotal. EAD, introduced in the mid-1990s – it originated in 1993 and the first version of EAD was released in 1998 – provides a hierarchical structure for representing information about archival collections, offering comprehensive descriptions that aid researchers, archivists, and institutions in managing and providing access to archival records. Its goal is to create a standard for encoding finding aids to improve accessibility and understanding of archival collections [@pitti_encoded_1999 pp. 61-62]. On the other hand, ISAD(G), released in its first version in 1994 by ICA, offers a more general international standard for archival description, providing a framework for describing all types of archival materials, including fonds, sub-fonds, series, files, and items [@shepherd_application_2000 p. 57].
ISAD(G) aims to establish consistent and standardised archival description practices on a global scale, facilitating the sharing and exchange of archival information. PREMIS[42] is another metadata standard that was initially released in 2005 – version 3.0 is the latest specification, published in 2016 – and focuses on the preservation of digital objects, consisting of four interrelated entities: Object, Event, Agent, and Rights [@caplan_practical_2005 p. 111]. The main objective of PREMIS is to help institutions ensure the long-term accessibility of data by capturing key details about their creation, format, provenance, and preservation events. It is seen as an elaboration of OAIS, which categorises information required for preservation in several functional entities and types of information package [see @bates_open_2009 pp. 425-426] – as illustrated by Figure 3.4 – expressed through the mapping of preservation metadata onto the conceptual model [@zeng_metadata_2022 pp. 493-494]. Figure 3.4: OAIS Functional Model Diagram by @mathieualexhache_oais_2021 The latest development in metadata standards for archives is the creation of RiC, which has been developed since 2012 by ICA [@clavaud_ica_2021 pp. 79-80]. RiC is structured into four complementary parts [@ica_expert_group_on_archival_description_records_2023 p. 1] intended to cover and replace existing archival standards such as ISAD(G): RiC Foundations of Archival Description: A brief description of the foundational principles and purposes of archival description. RiC Conceptual Model: A high-level framework for archival description[43], as shown in Figure 3.5. RiC-O: The ontology[44], which embodies a specific implementation of the conceptual model. It is formally expressed in OWL to make archival description available using LOD techniques – which facilitates extensions [see @mikhaylova_extending_2023] – and adheres to a conceptual vocabulary specific to archival description.
It provides the ability to navigate and interpret complex archival holdings and foster meaningful research and discovery. The ontology includes seven main groups of entities: Record, Agent, Rule, Event, Date, Place, and Instantiation. RiC Application Guidelines: A part in development at the time of writing which will provide practitioners and software developers with guidance and examples for implementing the conceptual model and the ontology in records and archival management systems. Figure 3.5: Global Overview of the Core Entities Defined by the RiC Conceptual Model. Slightly Adapted from https://github.com/ICA-EGAD/RiC-O 3.2.3.3 Museum and Gallery Metadata Standards In the museum and gallery domain, various metadata standards and conceptual models have significantly contributed to the management, organisation, and accessibility of CH objects and artworks. Notable among these are CDWA, CCO, LIDO, CIDOC-CRM, as well as Linked Art. CDWA[45], developed in the mid-1990s and maintained by the Getty Vocabulary Program, and CCO[46], created by the VRA[47] and introduced in the early 2000s, primarily focus on describing art and cultural artefacts, providing a framework for recording essential details like artist, title, medium, date, and provenance. CDWA is a comprehensive set of guidelines for cataloguing and describing various cultural objects, including artworks, architectural elements, material culture items, collections of works, and associated images. While not a data model itself, it offers a conceptual framework for designing data models and databases, as well as for information retrieval. It then evolved into CDWA Lite, an XML schema for data harvesting purposes [@baca_categories_2017 pp. 1-2]. CCO comprises both rules and examples of the CDWA categories and the VRA Core 4.0 for describing, documenting, and cataloguing cultural works and their visual surrogates[48] [@coburn_cataloging_2010 pp. 17-18].
Both CCO and CDWA are standards that the CIDOC[49] recommends and supports for museum documentation. LIDO[50] is a CIDOC standard introduced in the early 2000s which offers a lightweight XML-based serialisation used for describing museum-related information – as shown in Code Snippet 3.2. It provides a format for the interchange of data about art and CH objects, complementing CDWA and CCO as it integrates and extends CDWA Lite with elements of CIDOC-CRM [@stein_using_2019 p. 1025]. Ultimately, LIDO's goal is to enhance interoperability, accessibility, and the sharing of collection information, enabling institutions to connect and showcase their collections in diverse contexts [@coburn_lido_2010 p. 3]. LIDO is also a CIDOC Working Group, a type of group created to tackle particular issues or areas of interest[51]. Code Snippet 3.2: Example of a LIDO Object in XML from @lindenthal_lido_2023 <lido:lido> <lido:lidoRecID lido:source="ld.zdb-services.de/resource/organisations/DE-Mb112" lido:type="http://terminology.lido-schema.org/lido00099"> ld.zdb-services.de/resource/organisations/DE-Mb112/lido/obj/00076417 </lido:lidoRecID> <lido:descriptiveMetadata xml:lang="en"> <lido:objectClassificationWrap> <lido:objectWorkTypeWrap> <lido:objectWorkType> <skos:Concept rdf:about="http://vocab.getty.edu/aat/300033799"> <skos:prefLabel xml:lang="en"> oil paintings (visual works) </skos:prefLabel> </skos:Concept> </lido:objectWorkType> </lido:objectWorkTypeWrap> </lido:objectClassificationWrap> <lido:objectIdentificationWrap> <lido:titleWrap> <lido:titleSet> <lido:appellationValue lido:pref="http://terminology.lido-schema.org/lido00169" xml:lang="en"> Mona Lisa </lido:appellationValue> </lido:titleSet> </lido:titleWrap> </lido:objectIdentificationWrap> </lido:descriptiveMetadata> </lido:lido> CIDOC-CRM[52], developed since 1996 by the CIDOC and more specifically maintained by the CRM-SIG – which convenes quarterly[53] – is a formal and top-level ontology that offers a comprehensive conceptual
framework for describing CH resources, allowing for a deep understanding of relationships between different entities, events, and concepts for museums [@doerr_cidoc_2003 pp. 75-76]. It aims to provide a common semantic framework for information integration, supporting robust knowledge representation and fostering collaboration and interoperability within the CH sector as it can also mediate different resources from libraries and archives. The latest stable version of the conceptual model is version 7.1.2[54], published in June 2022, and comprises 81 classes and 160 properties[55] [see @bekiari_cidoc_2021]. Within the base ontology of CIDOC-CRM – or CRMBase – and despite the emergence of new developments and gradual changes, there is a fundamental and stable core that can be succinctly outlined. This fundamental structure acts as a basic orientation for understanding the way in which data is structured within CIDOC-CRM. Examining the hierarchical structure of CIDOC-CRM, one can identify the main top-level branches, namely: E18 Physical Thing: This class comprises all persistent physical items with a relatively stable form, human-made or natural. E28 Conceptual Object: This class comprises non-material products of our minds and other human produced data that have become objects of a discourse about their identity, circumstances of creation or historical implication. The production of such information may have been supported by the use of technical devices such as cameras or computers. E39 Actor: This class comprises people, either individually or in groups, who have the potential to perform intentional actions of kinds for which someone may be held responsible. E53 Place: This class comprises extents in the natural space we live in, in particular on the surface of the Earth, in the pure sense of physics: independent from temporal phenomena and matter. They may serve describing the physical location of things or phenomena or other areas of interest.
E2 Temporal Entity: This class comprises all phenomena, such as the instances of E4 Periods and E5 Events, which happen over a limited extent in time. Complemented by entities tailored for the documentation of E41 Appellation and E55 Type, the structure – as shown in Figure 3.6 – provides a potent set of means to capture a broad range of general-level CH reasoning in a holistic manner [@bruseker_cultural_2017 pp. 111-112]. Figure 3.6: CIDOC-CRM Top-Level Categories by @bruseker_cultural_2017 [p. 112] CRMBase is supplemented by a series of extensions – sometimes referred to as the CIDOC-CRM family of models – intended to support various types of specialised research questions and documentation, such as bibliographic records or geographical data. These compatible models[56], ordered alphabetically, include both works in progress and models to be reviewed by CRM-SIG[57]. They are as follows: CRMact[58]: An extension that defines classes and properties for integrating documentation records about plans for future activities and future events. CRMarchaeo[59]: An extension of CIDOC-CRM created to support the archaeological excavation process and all the various entities and activities related to it. CRMba[60]: An ontology for documenting archaeological buildings. Its primary purpose is to facilitate the recording of evidence and material changes in archaeological structures. CRMdig[61]: An ontology to encode metadata about the steps and methods of production (‘provenance’) of digitisation products and synthetic digital representations such as 2D, 3D or even animated models created by various technologies. CRMgeo[62]: An ontology intended to be used as a global schema for integrating spatio-temporal properties of temporal entities and persistent items. Its primary purpose is to provide a schema consistent with the CIDOC-CRM to integrate geoinformation using the conceptualisations, formal definitions, encoding standards and topological relations.
CRMinf[63]: An extension of CIDOC-CRM that facilitates argumentation and inference in descriptive and historical fields. It serves as a universal schema for merging metadata related to argumentation and inference, primarily focusing on these disciplines. CRMsci[64]: The Scientific Observation Model is an ontology that extends CIDOC-CRM for scientific observation, distinguishing the process from results and providing a formal ontology for scientific data integration and research modelling. CRMsoc[65]: An ontology for integrating data about social phenomena and constructs that are of interest in the humanities and social sciences based on analysis of documentary evidence. CRMtex[66]: An extension of CIDOC-CRM created to support the study of ancient documents by identifying relevant textual entities and by modelling the scientific process related to the investigation of ancient texts and their features. FRBRoo[67]: An ontology intended to capture and represent the underlying semantics of bibliographic information which interprets the conceptualisations of the FRBR framework. PRESSoo[68]: An ontology intended to capture and represent the underlying semantics of bibliographic information about continuing resources, and more specifically about periodicals (journals, newspapers, magazines, etc.). PRESSoo is also an extension of FRBRoo. Figure 3.7 shows CRMBase and eight of the extensions previously outlined in a pyramid shape, where the lower you go in the pyramid, the more specialised the concepts. Figure 3.7: CIDOC-CRM Family of Models. Diagram created and provided by Maria Theodoridou (Institute of Computer Science, FORTH) Linked Art[69], a recent addition to this landscape, is a community-driven initiative and a metadata application profile that has been in existence since the end of 2016 [@raemy_ameliorer_2022 pp. 136-137]. 
This community – recognised as a CIDOC Working Group – has created a common Linked Data model based on CIDOC-CRM for describing artworks, their relationships, and the activities around them (see 3.5.5). 3.2.3.4 Cross-domain Metadata Standards There are a few cross-domain standards that have been used to describe CH resources. For instance, the Dublin Core Elements, containing the original core set of fifteen basic elements, and Dublin Core Metadata Terms[70], its extension, are widely used metadata standards for describing CH resources. They provide metadata properties and classes that are applicable to a wide range of resources [@weibel_dublin_2000]. Another good example is the EDM that has been specified so that national, regional and thematic aggregators in Europe can deliver resources of content providers to Europeana [see @charles_enhancing_2015; @freire_technical_2019]. Despite the presence of cross-domain standards and efforts to map between standards, whether from one version to another or across different domains, reconciling metadata from various sources remains a significant challenge in the CH sector. Institutions may collect metadata in different ways, using different standards and schemas, making it difficult to merge and compare metadata from different sources. Additionally, metadata may be incomplete, inconsistent, or contain errors, further complicating data reconciliation. To address these challenges, standardised, interoperable metadata are necessary to enable data sharing and reuse. While the use of different metadata standards can present challenges for data reconciliation, the adoption of standardised, interoperable metadata can facilitate data sharing and reuse, promoting the long-term preservation and accessibility of CH resources. Controlled vocabularies – included in what @zeng_metadata_2022 [pp. 
24-25] called ‘standards for data value’ – also contribute to this goal, such as those maintained by the Getty Research Institute[71] (the AAT, the TGN, and the ULAN), as well as various kinds of KOS (see 3.2.4). These vocabularies provide a common language for describing CH objects and can improve the interoperability of metadata across different institutions and communities. Alongside metadata reconciliation also comes the question of aggregation. Apart from LIDO in museums, the general and current operating model for aggregating CH (meta)data is still the OAI-PMH [see @raemy_enabling_2020], which is an XML-based standard that was initially specified in 1999 and updated in 2002 [@lagoze_open_2002]. Alas, OAI-PMH does not align with contemporary needs [@van_de_sompel_reminiscing_2015], and there are now some alternative and web-based technologies for harvesting resources that are slowly being leveraged, such as AS [@snell_activity_2017], a W3C syntax and vocabulary for representing activities and events in social media and other web applications. It can also be easily extended and used in different contexts, as is the case with the IIIF Change Discovery API (see 3.5.3.3) or with ActivityPub [@lemmer-webber_activitypub_2018], a decentralised W3C protocol being leveraged by Mastodon[72], a federated and open-source social network. Overall, the evolution of metadata standards in the CH domain paves the way for a more interconnected and accessible digital environment, thereby providing better access to disparate collections and facilitating cross-domain reconciliation. This transformation is complemented by a growing emphasis on web-based metadata aggregation technologies that are more suited to today’s needs. 3.2.4 Knowledge Organisation Systems KOS, also known as concept systems or concept schemes, encompass a wide range of instruments in the area of knowledge organisation. They are distinguished by their specific structures and functions [@mazzocchi_knowledge_2018 p. 54]. 
KOS include authority files, classification schemes, thesauri, topic maps, ontologies, and other related structures. Despite their differences in nature, scope and application, all share a common goal: to facilitate the structured organisation of knowledge and classification of information. According to @zeng_metadata_2022 [p. 284], ‘KOS have a more important function: to model the underlying semantic structure of a domain and to provide semantics, navigation, and translation through labels, definitions, typing relationships, and properties for concepts’. This overarching intent underpins the practice of information management and retrieval. The term KOS ‘became even more popular after the encoding standard Simple Knowledge Organization System (SKOS) was recommended by W3C’, although the use of some such systems can be traced back over 100 years, whereas others have been created with the advent of the web [@zeng_metadata_2022 p. 188]. According to @hill_integration_2002 [pp. 46-47, citing [@hodge_systems_2000]], KOS can be divided into four main groups: term lists, metadata-like models, classification and categorisation, as well as relationship models. Term lists encompass authority files, dictionaries, and glossaries, serving as controlled sources for managing terms, definitions, and variant names within a knowledge organisation framework. Metadata-like models encompass directories and gazetteers, offering lists of names and associated contact information as well as geospatial dictionaries for named places, which can be extended for representing events and time periods. In the classification and categorisation domain, one finds categorisation schemes and classification schemes that organise content, subject headings that represent controlled terms for collection items, and taxonomies that group items based on specific characteristics. 
Finally, relationship models feature ontologies, semantic networks, and thesauri, each capturing complex relationships between concepts and terms [@hill_integration_2002; @zeng_knowledge_2008]. Figure 3.8 represents an overview of the structure and functions of these four main groups, also showcasing the subcategories of KOS previously mentioned. In this figure, the x characters indicate the extent to which each type of KOS embodies five key functions identified by @zeng_knowledge_2008, such as eliminating ambiguity or controlling synonyms. In this subsection, I will explore four subcategories of KOS, each representing a point on a continuum from a more linear to a more structured network. These include folksonomy, taxonomy, thesaurus, and ontology. These KOS have been selected due to their significant impact on the organisation and interlinking of data within the contexts of CHI practices and LOD. Furthermore, the intent of these systems is to help bridge the gap between human understanding and machine processing. Figure 3.8: Overview of the Structures and Functions of KOS [@zeng_knowledge_2008 p. 161] 3.2.4.1 Folksonomy Positioned at one end of the organisational spectrum, folksonomies, also known as community tagging or social bookmarking, are characterised by their user-generated nature. These systems rely on individual users’ tagging of content with keywords or tags that reflect their personal perspectives and preferences. With folksonomies, integration or reconciliation is often hard to achieve [@zeng_metadata_2022 p. 401]. However, they do provide a wealth of source material for studying social semantics [@zeng_metadata_2022 p. 403] and can be used in parallel with more structured KOS. 3.2.4.2 Taxonomy Moving towards the centre of the spectrum, taxonomies present a more structured approach to knowledge organisation [@zeng_knowledge_2008 p. 169]. 
Taxonomies employ hierarchical classifications to systematically categorise information into distinct classes and sub-classes, or in a parent/child relationship [@saa_dictionary_taxonomy_2023] – as shown by Code Snippet 3.3 [@niso_guidelines_2010 p. 18]. Taxonomy, in this context, extends beyond mere categorisation; it also establishes relationships. Code Snippet 3.3: Taxonomy Hierarchy: Chemistry > Physical Chemistry > Electrochemistry > Magnetohydrodynamics. 3.2.4.3 Thesaurus Moving further along the spectrum, thesauri offer a more detailed and formalised method of organisation. They include not only hierarchical relationships but also explicit semantic connections between terms, making them valuable tools for information retrieval. As defined by @niso_guidelines_2010 [p. 9]: A thesaurus is a controlled vocabulary arranged in a known order and structured so that the various relationships among terms are displayed clearly and identified by standardized relationship indicators. For instance, consider a thesaurus related to photography, which encompasses categories for various aspects of photography, including photographic techniques, equipment, and materials. Within this thesaurus, ‘Kodachrome’ could be categorised not only as a specific type of colour film but also as a distinct photographic process. As a type, it could fall under the sub-category of ‘colour film photography’, and as a process, it would fit within the broader framework of ‘photographic techniques’. The AAT, commonly employed in the CH domain, stands as a significant example of a thesaurus [@harpring_development_2010 p. 67]. Homosaurus[73] is another example of a thesaurus with a distinct focus on enhancing the accessibility and discoverability of LGBTQ+ resources and related information. Leveraging Homosaurus in metadata can effectively contribute to diminishing biases present in such data, an essential step in promoting inclusivity and equity within information systems [see @hardesty_mitigating_2021]. 
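The Kodachrome example can be made concrete with a minimal sketch of a thesaurus in which one term has two broader terms. The dictionary below is purely illustrative: the top term ‘photography’, the function names, and the way the two categories roll up to it are assumptions made for this sketch, not entries taken from the AAT or any other published vocabulary.

```python
# Hypothetical mini-thesaurus: each term maps to its broader terms (BT).
# A term with several broader terms is what separates this structure
# from a strict single-parent taxonomy.
BROADER = {
    'Kodachrome': ['colour film photography', 'photographic techniques'],
    'colour film photography': ['photography'],
    'photographic techniques': ['photography'],
    'photography': [],
}


def ancestors(term, broader=BROADER):
    # Collect every broader term reachable from `term`, breadth-first,
    # listing each one only once.
    seen = []
    queue = list(broader.get(term, []))
    while queue:
        current = queue.pop(0)
        if current not in seen:
            seen.append(current)
            queue.extend(broader.get(current, []))
    return seen


def narrower(term, broader=BROADER):
    # Invert the BT relationship to obtain the narrower terms (NT).
    return sorted(t for t, parents in broader.items() if term in parents)


print(ancestors('Kodachrome'))
print(narrower('photography'))
```

Storing only the broader-term links and deriving the narrower terms from them keeps the two directions consistent, which is one common way of handling reciprocal relationships in a controlled vocabulary.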
3.2.4.4 Ontology At the structured end of the spectrum, ontologies define complex relationships and attributes between concepts, whereby a series of concepts have been chosen to express what we understand, so that a computer can start making sense of our world. Ontologies are formalised KOS, enabling advanced data integration and KR for more sophisticated applications. The term is drawn from philosophy, where ontology is the discipline concerned with studying the nature of existence, as articulated by @gruber_translation_1993 [pp. 199-200]: An ontology is an explicit specification of a conceptualization. The term is borrowed from philosophy, where an ontology is a systematic account of Existence. For knowledge-based systems, what “exists” is exactly that which can be represented. There are different kinds of ontologies, including axiomatic formal ontologies, foundational ontologies, and domain-specific ontologies [@beretta_interoperabilite_2022]. These different types of ontologies cater to various knowledge representation needs. Foundational ontologies, such as DOLCE[74], provide a high-level framework for modelling knowledge and offer a comprehensive system for representing entities, qualities, and relationships [see @masolo_wonder_2003; @borgo_dolce_2022]. DLs, a family of formal KR languages, also play a key role in developing ontologies and serve as the foundation for OWL (see 3.4.2), notably by providing a logical formalism. DLs are characterised by their ability to provide substantial expressive power that goes well beyond propositional logic, while maintaining decidable reasoning [@chang_abox_2014]. In computer science, the concepts of ABox and TBox, both sets of statements in KBs, are relevant to the structuring and enrichment of KGs [@giacomo_tbox_1996][75]. The ABox, representing the ‘assertion’ or ‘instance’ level, encapsulates concrete data instances and their relationships, contributing to the factual knowledge of a given system. 
Conversely, the TBox, representing the ‘terminology’ or ‘schema’ level, defines the conceptual framework and hierarchies that govern the relationships and attributes of the instances. These two complementary components work in harmony to improve data interoperability, reasoning and knowledge sharing. Figure 3.9 depicts a high-level overview of a KB representation system. Figure 3.9: Knowledge Base Representation System Based on @patron_embedded_2011 [p. 205] Consider a scenario around artwork provenance held in a museum. The ABox strives to encapsulate the rich narratives of individual artworks, tracing their journey through time, ownership transitions and exhibition travels. At the same time, the TBox creates a conceptual scaffolding, imbued with classes such as Artwork, Creator, and Exhibition, painting an abstract portrait that contextualises each artefact within a broader cultural tapestry. It is here that the DL comes in, harmonising the symphony with its logical relationships and axioms, i.e. rules or principles widely accepted as obviously true [@baader_13_2007]. The DL KB is represented as 𝒦 = (𝒯, ℛ, 𝒜), where: 𝒯: represents the TBox, defining the conceptual framework, which encompasses the hierarchical relationships, classes, and concepts within the KB. ℛ: represents the set of binary roles, delineating the relationships and connections between individuals or instances in the domain. These roles facilitate the understanding of how entities relate to one another within the KB. 𝒜: represents the ABox, encompassing the specific assertions or instances in the KB. This symbiotic interplay ensures that the provenance of each artwork is not just a static account, but a dynamic, interconnected narrative. The ABox-TBox relationship thrives in the realm of reasoning. 
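To make the triple 𝒦 = (𝒯, ℛ, 𝒜) more tangible, the following is a minimal sketch of a KB for the museum scenario, written in plain Python and not tied to any DL reasoner or OWL library. The class and method names, the role names, the individuals, and the subsumption Photograph ⊑ Artwork are all hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeBase:
    # T (TBox): subsumption axioms as (sub, super) pairs, read as sub ⊑ super.
    tbox: set = field(default_factory=set)
    # R: names of the binary roles available in the domain.
    roles: set = field(default_factory=set)
    # A (ABox): concept assertions (individual, concept) and
    # role assertions (subject, role, object).
    concept_assertions: set = field(default_factory=set)
    role_assertions: set = field(default_factory=set)

    def subsumers(self, concept):
        # All concepts that subsume `concept`, following the TBox transitively.
        found = {concept}
        frontier = [concept]
        while frontier:
            current = frontier.pop()
            for sub, sup in self.tbox:
                if sub == current and sup not in found:
                    found.add(sup)
                    frontier.append(sup)
        return found

    def is_instance(self, individual, concept):
        # Instance checking: an ABox assertion combined with TBox subsumptions.
        return any(
            ind == individual and concept in self.subsumers(asserted)
            for ind, asserted in self.concept_assertions
        )


# Hypothetical museum data; none of these names come from a real dataset.
kb = KnowledgeBase(
    tbox={('Photograph', 'Artwork')},
    roles={'shown_in', 'created_by'},
    concept_assertions={('photo_1', 'Photograph'), ('expo_1', 'Exhibition')},
    role_assertions={('photo_1', 'shown_in', 'expo_1')},
)

# 'photo_1' is only asserted to be a Photograph; the TBox lets us infer Artwork.
print(kb.is_instance('photo_1', 'Artwork'))
```

The sketch only covers atomic subsumption and instance checking; a full DL reasoner would additionally handle constructors such as existential restrictions over roles.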
Imagine an axiom embedded in the TBox: ‘A work of art presented in an exhibition curated by a distinguished patron is of heightened cultural significance’, or, phrased in DL terms with illustrative concept and role names: Artwork ⊓ ∃presentedIn.(Exhibition ⊓ ∃curatedBy.DistinguishedPatron) ⊑ CulturallySignificant. This axiom serves as a beacon to guide the system’s reasoning. When an ABox instance of an artwork is woven into an exhibition curated by a prominent authority, the DL-informed engine responds by inferring an enriched cultural value that resonates beyond the artefact itself. This is where the TBox takes data and gives it life, producing insights that transcend the boundaries of individual instances. The KB, 𝒦, captures this orchestration, encapsulating the logical relationships for meaningful interpretation and knowledge discovery. Overall, the relationship between ABox and TBox in DL is vital for achieving semantic clarity, enabling meaningful data integration, and facilitating advanced reasoning mechanisms. The museum provenance scenario showcases a precisely orchestrated convergence of assertion, terminology, and rigorous logical reasoning. This engenders a computational landscape where historical artefacts intricately mesh within the complex network of human history’s data structures, seamlessly aligning with the underlying framework of algorithmic representation. These components enable software developers to harmonise disparate datasets, extract insightful knowledge, and support decision-making processes across a wide range of domains. In essence, the use of DL, ABox, and TBox in ontological KR enhances interoperability between different systems and allows for sophisticated reasoning and decision support. Moving beyond these foundational concepts, it is noteworthy to consider the work of @ehrlinger_towards_2016, who address the need for a clear and standardised definition of KGs. 
They highlight the term’s varied interpretations since its popularisation by Google in 2012 and propose a definitive, unambiguous definition to foster a common understanding and wider adoption in both academic and commercial realms. They define a KG as follows: ‘A knowledge graph acquires and integrates information into an ontology and applies a reasoner to derive new knowledge’. This definition crystallises the essence of KGs as dynamic and integrative systems that not only store but also process and enrich data through advanced reasoning. This conceptualisation underlines the transformative potential of KGs in various domains, bridging the gap between raw data and actionable insights. Finally, it is important to recognise that the importance of ontologies extends beyond individual systems. Shared ontologies are a cornerstone of semantic interoperability, thus facilitating a paradigm shift in the way systems and applications communicate. As @sanderson_rdf_2013 argues: ‘shared ontologies increases semantic interoperability’ and ‘shared identity makes it possible for graph to merge serendipitously’. This shared understanding ensures that various entities can seamlessly connect and engage in meaningful interactions. 3.3 Trends, Movements, and Principles Technological trends, scientific movements, and guiding principles have played a crucial role in shaping the landscape of contemporary research. In recent years, there has been an increased emphasis on the need for academic and CH practices to be more transparent, inclusive, and accountable. This shift reflects a broader trend towards integrating advanced technological solutions and open-science principles in heritage management. As such, understanding the evolution of CH becomes imperative to comprehend how these practices have adapted and transformed in response to these guiding trends. The evolution of CH has been characterised by a series of technological and methodological shifts. 
Initially, the primary focus was on digitising physical artefacts to preserve information from degrading originals. This phase was crucial for transitioning tangible CH into a digital format, mitigating the risk of loss due to physical degradation. Following this, efforts shifted towards ensuring the persistence of digitised resources. This stage involved addressing challenges related to digital preservation, including data degradation and format obsolescence, to ensure the longevity of digital cultural assets. The advent of open data principles marked the next phase in CH development. This approach facilitated broader access to information, aligning with contemporary values of transparency and inclusivity in governmental, academic, and cultural contexts. Subsequently, the focus expanded to enhancing the utility of this data. This stage involved contextualising and enriching CH data, thereby increasing their applicability and relevance across various domains. The current frontier in CH involves developing applications that leverage rich CH data. These applications serve not only as tools for engagement and education but also as justifications for the ongoing costs associated with data storage and archival. They illustrate the tangible benefits derived from preserving heritage resources, encompassing both cultural and economic returns. In summary, the trajectory of CH development mirrors broader technological and societal trends, transitioning from preservation to active utilisation. This progression underscores the dynamic nature of research and CH processes, highlighting the evolving requirements for transparency, inclusivity, and accountability in CH management. While automation has significantly enhanced the efficiency of digitisation processes in CH, cataloguing and indexing remain complex challenges. 
The intricacies involved in accurately understanding and categorising resources necessitate more than just technological solutions; they require context-aware and culturally sensitive approaches. Here, ML offers promising perspectives. ML, particularly in its advanced forms like deep learning, can assist in cataloguing and indexing by analysing large datasets to identify patterns, categorise content, and even suggest metadata. This can be particularly useful in handling large volumes of CH data, where manual processing is time-consuming and prone to human error. Typical applications of ML in this field include image recognition for identifying and classifying visual elements in artefacts, NLP for analysing textual content, and pattern recognition for sorting and organising data based on specific characteristics. Furthermore, prospective developments may entail the refinement of metadata mapping and the enhancement of quality control mechanisms. Moreover, ML algorithms can be trained to recognise stylistic elements, historical contexts, and other nuances that are essential for accurate cataloguing in CH. However, it is crucial to note that the effectiveness of ML depends heavily on the quality and diversity of the training data. Biases in this data can lead to inaccuracies in cataloguing and indexing. Thus, a collaborative approach, where ML is supplemented by expert human oversight, is often the most effective strategy. Overall, this section provides a comprehensive overview of six technological trends as well as five key scientific movements and guiding principles that are shaping research and how universities and GLAMs should provide environments, services, and tools with a view to collecting and disseminating content. 
By exploring each of these trends, movements, and principles, we can gain a deeper understanding of how research and CH processes are permeated by dynamic movements and how resources can be made more transparent, inclusive and accountable, as well as how data can be made available to human and non-human users. 3.3.1 Current and Emerging Technological Trends in Cultural Heritage (…) 4. Exploring Relationships through an Actor-Network Theory Lens As Jim Clifford taught me, we need stories (and theories) that are just big enough to gather up the complexities and keep the edges open and greedy for surprising new and old connections. [@haraway_staying_2016 p. 101] This chapter serves as the theoretical framework of the dissertation, and its primary goals are to elucidate the theoretical underpinnings and provide a comprehensive toolbox for addressing the identified problem. In the preceding literature review chapter, I highlighted the issue that necessitates attention around interlinking CH. The theoretical framework, sometimes referred to as the ‘toolbox’, can be likened to the ‘tools’ that will be employed to understand and address this problem. Here, the primary purpose of this chapter is to offer an in-depth exploration of the tools – which comprise various theories, propositions, and concepts – delineating their characteristics, behaviours, historical applications, interrelationships, relevance to the study’s objectives, and potential limitations. Subsequently, the next chapter will elucidate how these tools will be operationalised in the research process. (…) 5. Research Scope and Methodology This chapter delineates the Research Scope and Methodology, laying the groundwork for the empirical exploration within this thesis. (…) 6. The Social Fabrics of IIIF and Linked Art (…) 7. PIA as a Laboratory (…) 8. Yale’s LUX and LOUD Consistency (…) 9. 
Discussion [Il] faut renoncer à l’idée d’une interopérabilité syntaxique ou structurelle par l’utilisation d’un modèle unique, qu’il s’agisse de la production, de stockage ou de l’exploitation au sein même d’un [système d’information]. [@poupeau_reflexions_2018] [76] This chapter presents a comprehensive discussion where I interpret, analyse and critically examine my findings in relation to the thesis and the wider application of LOUD. Through an in-depth analysis of the design principles of LOUD and their implications for CH, this discussion aims to demonstrate the many challenges and opportunities inherent in this framework. The focus is on achieving community-driven consensus, rather than simply pursuing technological breakthroughs. The following sections are organised to provide a comprehensive review of the empirical findings, an evaluation abstracting LOUD, and a retrospective analysis of the research journey. Firstly, in Section 9.1, I will present a summary of the empirical findings from my research. This will include key themes and insights, structured to reflect the different areas of study and practice within LOUD. Secondly, in Section 9.2, I will provide an evaluation of LOUD using the LoA approach. This evaluation will focus on the impact of LOUD on the perception of Linked Data within the CH domain and the wider DH field. This will include the key themes and insights that have emerged, structured in a way that reflects four levels of abstraction. I will also explore the dual nature of LOUD implementation, involving both simplicity and complexity, and discuss the various factors that influence such dynamics. Finally, in Section 9.3, I will offer a retrospective analysis of the research journey. This section will interpret the findings to situate LOUD as fully-fledged actors. 
It will reflect on the challenges, achievements, and lessons learned throughout the research process, providing a holistic view of the project’s trajectory and its implications for the future of LOUD. 9.1 Empirical Findings This section summarises the empirical findings of my research and already offers some suggestions. The structure does not follow the exact order of the three empirical chapters but is organised around overarching topics that emerged throughout the study. The seven topics include Community Practices and Standards, Inclusion and Marginalised Groups, Maintenance and Community Engagement, Interoperability and Usability, Future Directions and Sustainability, Digital Materiality and Representation, as well as Challenges of Scaling and Implementation. Community Practices and Standards GitHub serves as a vital hub for community involvement, with a core group of active contributors often attending meetings regularly. This platform simplifies decision-making within the community, although it also reflects biases similar to those in FLOSS communities. Behind visible activities like meetings, there is substantial preparatory work managed by co-chairs, editorial boards, or driven by community-generated use cases. This foundational work often determines the direction and outcomes of formal gatherings. The LUX project at Yale, as seen in , has successfully fostered collaboration across various units, bringing together libraries and museums on a unified platform. The technological foundation of LUX, based on open standards, facilitates data integration and cross-collections discovery. Not only does the deployment of FLOSS tools contribute to these achievements, but it also emphasises the social advantages of working collaboratively. The concept of the Tragedy of the Commons, as described by @hardin_tragedy_1968, highlights the potential for individual self-interest to deplete shared resources. 
However, @ostrom_governing_1990 offers a counterpoint by demonstrating how communities can successfully manage common resources through collective action and shared norms. In this context, initiatives like the CHAOSS initiative[77] play a significant role by providing metrics that help evaluate the health and sustainability of open source communities. These metrics include contributions, issue resolution times, and community growth, offering valuable insights into how collaborative efforts can be maintained and improved. Reaching consensus is another critical aspect of community practices and standards. While the minutes of meetings are valuable artefacts, they often reflect an Anglo-Saxon approach to decision-making characterised by few substantive points and critical turning points. The formal aspects of conversations captured in minutes do not fully encompass the decision-making process, which frequently involves informal conversations, consensus-building through open dialogue, and subtle cues that influence outcomes. These elements are integral to the English and American approach and hold valuable lessons for an international community. IIIF and Linked Art are international communities, but decisions are made in English and the majority of participants are based in North America and the UK, significantly imprinting this approach. Understanding these nuances can help us improve our collaborative efforts within the IIIF and Linked Art communities. By recognising and appreciating these different facets of decision-making, we can learn from each other and enhance our collective ability to make effective and inclusive decisions. Some of the challenges associated with these practices include the major demand on resources for community building, the slowness inherent in distributed development, and the difficulty in achieving consensus. 
Additionally, the concept of social sustainability can be seen as an imaginary construct that papers over differences, as discussed by @fitzpatrick_generous_2019. Addressing these challenges is crucial for the long-term success and effectiveness of the IIIF and Linked Art communities. Inclusion and Marginalised Groups The demographic homogeneity in these communities can perpetuate biases and neglect issues relevant to underrepresented or marginalised groups, as seen in . Participation in these standardisation processes is itself a privilege. The assumption that internet access and digital devices are universally available is critically examined, revealing key actors in the digital landscape. This mirrors issues within the IIIF community, where generating IIIF resources presupposes means that may not be accessible to all. We need clear terms of inclusion, as highlighted by @hoffmann_terms_2021. She argues that effective inclusion requires a critical examination of the frameworks and conditions under which inclusion is offered. The framework should ensure that inclusion initiatives do not merely add diversity to existing power structures but work to transform these structures fundamentally. This involves questioning who defines the terms of inclusion, who benefits from them, and who may be inadvertently excluded. @hoffmann_terms_2021 suggests a participatory approach, where marginalised communities are actively involved in shaping inclusion policies and practices, thus making inclusion an ongoing, reflective process rather than a static goal. The inclusion of marginalised groups is a necessary step, but it is not sufficient. To truly make a difference, there must be a strategic and concentrated effort to appropriate technologies, as emphasised by [@morales_apropiacion_2009; @morales_imaginacion_2017; @morales_apropiacion_2018] and further articulated by [@martinez_demarco_empowering_2019; @martinez_demarco_digital_2023]. 
This strategic approach highlights the political significance of challenging dominant neoliberal and consumerist perspectives on technology and individual engagement. @martinez_demarco_digital_2023 underscores the critical importance of focusing on practices that go beyond mere inclusion. Instead, it requires a deep understanding and critical assessment of how technology is intertwined with social, economic, and ideological contexts. It implies a reflective and deliberate process of technology adoption in which individuals creatively tailor technology to their specific needs, beliefs, and interests. Moreover, a key aspect highlighted by @martinez_demarco_digital_2023 is the implicit and explicit critique of a universalist approach to inclusion, which often lends itself to all too easy instrumentalisation. Understanding and studying resistance to inclusion in an oppressive digital transformation context is paramount, particularly given the highly unequal conditions that prevail. In this light, a comprehensive study of socio-material and symbolic processes, practices, and involved in embedding technologies into individuals’ lives is needed. This approach also recognises technology as a catalyst for change. It envisions the use of technology to drive meaningful change at multiple dimensions and realities—national, societal, or personal. By focusing on these practices, empowering individuals to navigate and use technology thoughtfully and purposefully becomes a reality, bridging the gap between technological advances and societal progress [@martinez_demarco_empowering_2019]. Maintenance and Community Engagement The tension between creating advanced specifications and their practical implementation by platforms is evident in the IIIF Cookbook recipes and Linked Art patterns, as discussed in Chapter 6. This ongoing development shows that the community is still finding the best ways to achieve broad adoption and interoperability. 
The deployment of the Change Discovery API, as illustrated in Chapter 7, demonstrates that establishing such a protocol on top of the IIIF Presentation API is feasible and straightforward. High-level support from leadership, particularly Susan Gibbons as Vice Provost, has been crucial in building trust and ensuring the project’s success as a valuable discovery layer at Yale. This integration of diverse collections through a unified platform, based on open standards, highlights the potential for transforming teaching, learning, and research by leveraging collaborative efforts. The topic modelling exercise in LUX reveals the intricate actor-networks composed of organisations, individuals, and non-human actors. This analysis underscores the importance of ongoing processes and relationships in maintaining and evolving infrastructure, akin to the concept of ‘infrastructuring’. As detailed in Chapter 8, following best practices and guidelines such as the SHARED Principles is essential for better involvement, but it is also crucial to uphold these commitments consistently over the long term to ensure meaningful participation. Between the PIA team members, there were sometimes ‘disconnects between different communities who undertake collaborative research’ [@vienni-baptista_foundations_2023]. This was something we had to navigate and learn from, which was manageable within the context of a laboratory setting. However, for any follow-up projects or whatever forms the digital infrastructure we built may take, it is imperative that these disconnects are addressed and solidified to ensure cohesive and sustained community engagement. Interoperability and Usability Within PIA, different APIs have been progressively deployed to meet various requirements while allowing parallel exploration of data modelling. Each API offers unique advantages, but their collective integration promotes semantic interoperability. 
For example, the IIIF Image API has been instrumental in rationalising image distribution across prototypes, providing efficient access to high-quality digital surrogates and the ability to resize them for different uses. Adherence to LOUD standards and schemas within LUX has generally been positive, although transitioning between versions of a specification can present challenges, highlighting the need to improve the consistency of compliant resources. Linked Art, for instance, has the capacity to generate various insights and sources of truth around different entities. However, additional or entirely new vocabularies from sources like the Getty may need to be used – such as Homosaurus. Complementary to Linked Art, using WADM allows for assertions that go beyond purely descriptive narratives, though it may sacrifice some semantic richness. This complexity in managing vocabularies and maintaining semantic richness directly ties into broader usability considerations within the community. Addressing these usability concerns, Robert Sanderson has suggested focusing on the use of full URIs in Linked Art to ensure computational usability, in contrast to IIIF’s approach of minimising URIs to enhance readability. This difference highlights a fundamental question in usability: balancing readability and computational usability. Understanding developers’ perspectives on these approaches is critical. As a way forward, I would suggest that the IIIF and Linked Art communities focus on further improving the usability of the specifications. This includes conducting comprehensive usability assessments of APIs to evaluate the experiences of new developers versus existing ones, understanding the steepness of the learning curve associated with each API, and guiding improvements in documentation, on-boarding processes, and overall developer support.
Efforts should be made to lower the barriers to entry for new developers by developing more intuitive and user-friendly tutorials, providing example projects, and creating a robust support community. Ensuring that developers can quickly and effectively leverage APIs will foster greater adoption. Addressing the challenges of transitioning between different versions of specifications is critical, and developing tools and guidelines that help maintain consistency across versions will reduce friction and ensure smoother updates. Future Directions and Sustainability Survey findings, as discussed in , underscore the need for ongoing efforts to develop LOUD standards that foster an inclusive, dynamic digital ecosystem. Future strategies should include creating educational resources and frameworks that support interdisciplinary collaboration and reduce barriers to participation. While the Manifest serves as the fundamental unit within IIIF, the Linked Art protocol can play a similar central role as semantic gateways in broader contexts, allowing round-tripping across the APIs. The topic modelling exercise in LUX, detailed in , reveals complex actor-networks of organisations, individuals, and non-human actors, providing insights into the relationships sustaining the LUX initiative. The next steps for Linked Art might involve forming a new consortium independent of a CIDOC Working Group, which could provide the necessary support to sustain the initiative. Alternatively, integrating Linked Art into IIIF as a new TSG and specification could address the discovery challenges within IIIF, as discussed during the birds of a feather session led by Robert Sanderson [see @raemy_notes_2024] at the 2024 IIIF Conference in Los Angeles[78]. Design principles that act as bridges across different disciplines, as proposed by @roke_pragmatic_2022, are crucial. IIIF has demonstrated that this collaborative approach is feasible, and Linked Art could follow in its footsteps. 
However, achieving this requires increased dedication from passive members and broader adoption of the model and the API ecosystem in the near future. Digital Materiality and Representation As explored in Chapter 7, the detailed digital representation of photographic albums, such as the Kreis Family Collection, demonstrates the need to comprehensively capture the materiality of digital objects. This includes the structure and context of images, which are crucial for maintaining their historical and social significance. The implementation of the IIIF Presentation API in creating a detailed digital replica of the Getty’s Bayard Album shows how digital materiality can be enhanced through thoughtful use of technology, but also highlights the scalability challenges for such detailed representations. Creating these detailed digital representations can be seen as a ‘boutique’ approach, which, while labour-intensive and resource-demanding, is necessary for preserving the integrity and contextual significance of cultural heritage objects. The challenge lies in developing the appropriate means and methodologies to achieve this level of detail consistently. Future endeavours, whether through research projects or collaborative efforts between GLAM institutions and DH practitioners, should aim to address these challenges and create sustainable practices for digital materiality and representation. As Edwards aptly notes: ‘Presentational forms equally reflect specific intent in the use and value of the photographs they embed, to the extent that the objects that embed photographs are in many cases meaningless without their photographs; for instance, empty frames or albums. These objects are only invigorated when they are again in conjunction with the images with which they have a symbiotic relationship, for display functions not only make the thing itself visible but make it more visible in certain ways’. [@edwards_photographs_2004 p.
11] Challenges of Scaling and Implementation As seen in Chapter 6, the IIIF Cookbook recipes and Linked Art patterns reflect the tension between creating advanced specifications and their practical implementation. This gap between ideation and real-world application underscores the challenges faced by the community in achieving broad adoption and interoperability. In Chapter 7, the exploration of APIs like the IIIF Change Discovery API illustrates the practical challenges and potential of scaling these technologies for wider adoption. The successful implementation in PIA demonstrates viability, but also points to the need for continued development and community engagement to fully realise the benefits. Furthermore, assessing the scalability of IIIF image servers, as discussed by [@duin_webassembly_2022] and exemplified by the firm Q42 with their Edge-based service Micrio[79], highlights the importance of optimising data performance. Erwin Verbruggen aptly noted that ‘optimising data performance in my opinion means sending as little data over as needed’[80], emphasising the need for efficient data handling to enhance scalability. This insight reinforces the necessity of continual refinement in scaling digital infrastructure to support broader use and integration. Reflecting on these findings, I would like to assert that continuous participation, particularly for institutions that can afford to be part of initiatives like IIIF-C, is essential. Active members should not only focus on their own use cases but also consider the needs and perspectives of other, perhaps marginalised, groups. Achieving the dual goals of making progress within one community, whether it be IIIF or Linked Art, while also engaging in effective outreach and creating a solid baseline, will benefit everyone in the CH sector and beyond. Addressing where LOUD fits in, how people perceive this new concept or paradigm, and understanding how LOUD differs from Linked Data in general are essential.
These questions help to clarify the stages at which themes related to one of the LOUD design principles emerge, crystallise, and potentially disappear. My thesis does not fully resolve these queries but offers insights and hints for further exploration. In conclusion, the empirical findings reveal the richness of the implementation and maintenance of LOUD standards in the CH domain. From the critical role of community practices and standards to the challenges of achieving interoperability and inclusivity, each theme underlines the complex interplay of social, technical and organisational factors. will look at the evaluation of LOUD and explore its overall impact, delving into the delta of what to do with it, particularly in terms of Linked Data versus LOUD, where my thesis provides pointers but does not provide definitive answers. 9.2 Evaluation: Abstracting LOUD In this section I will assess the impact of LOUD within the CH domain and the wider DH field, examining its implications for community practices and semantic interoperability, and secondarily whether LOUD has affected the perception of Linked Data. (…) 9.3 Retrospective: Trudging like an Ant (…) 10. Conclusion For a better understanding of the past, Our images have to be enhanced, A new dialogue in three dimensions, Must have openness at its heart, For somewhere within the archive Of our aggregated minds Are a multitude of questions And a multitude of answers, Simply awaiting to be found. [@mr_gee_day_2023] This chapter brings to a close the journey undertaken since February 2021, aiming to clearly articulate the answers to the research questions, discuss how the research aligns with the objectives, elucidate the significance of the work, outline its shortcomings, and suggest avenues for future research. I had the privilege of hearing the above poem at EuropeanaTech in The Hague in October 2023.
What struck me most, and what I have tried to convey in this thesis, was the powerful dialogue and collective spirit striving to harness the potential of our (digital) heritage. With a sense of conviction after this conference, I approached the next one in Geneva in February 2024 with confidence, believing that I had made a compelling case for the concept of LOUD. When a participant asked how LOUD differed from Linked Data, however, I found myself explaining the socio-technical ethos of IIIF and Linked Art, the richness of the individuals who make them up, the ability to combine these different standards, and the common use cases that emerge from these collaborations. Whether my answer was convincing remains uncertain, but I knew it was too brief. Perhaps it is here, in this conclusion, that my thoughts can find their full expression. I believe that LOUD should be at the forefront of efforts to improve the accessibility and usability of CH data, an endeavour that is increasingly relevant in a web-centric environment. This paradigm has gained considerable traction, particularly with the advent of Linked Art and the recognition that the IIIF Presentation API has been an inspiration for the LOUD design principles. The development and maintenance of LOUD standards by dedicated communities are characterised by collaboration, consensus building, and transparency. In the interstices of the IIIF and Linked Art communities, frameworks for interoperability are not only exposed, but revealed as profound testaments to the power of transparent collaboration across institutional boundaries. Both communities, it is true, are still very much Anglo-Saxon efforts, where the specifications have mainly been implemented in GLAM and/or DH research projects, or at least when we have been aware of them. It has clear guidelines on how to propose use cases, mostly using GitHub, and hides the sometimes unnecessary RDF complexity behind a set of JSON-LD @context.
IIIF is at the presentation layer and can really play its role as a mediator, with the Manifest as its central unit connecting other specifications, including semantic metadata, and preferably with simpatico specifications such as Linked Art. An important hypothesis arises from the observation that adherence to the LOUD design principles makes specifications more likely to be adopted. The primary benefit of adopting LOUD standards lies in their grassroots nature. This grassroots approach not only aligns with the core values of openness and collaboration within the DH community but also serves as a common denominator between DH practitioners and CHIs. This unique alignment fosters a sense of shared purpose and common ground. However, it’s essential to acknowledge that while LOUD and its associated standards, including IIIF, hold immense promise, their limited recognition in the wider socio-technical ecosystem may currently hinder their full potential impact beyond the CH domain. Consideration of socio-technical requirements and the promotion of digital equity are essential to the development of specifications in line with the LOUD design principles. In the context of the IIIF and Linked Art communities, this means both recognising current challenges and building on existing practices. This includes forming alliances that support diverse forms of inclusion at both project and individual levels. For example, organisations should be encouraged to send representatives from diverse professional and personal backgrounds, such as underrepresented groups or non-technical fields. This can be facilitated by initiatives that lower the barriers to participation, such as financial support for travel and participation, flexible participation formats, and targeted outreach efforts. Furthermore, as these standards often align with open government data initiatives, they present opportunities for broader public engagement and institutional transparency. 
In the broader context of DH, understanding LOUD involves tracing the historical development of the field and its evolving relationship with technology. The interdisciplinary nature of DH has always integrated diverse scholarly and technical practices. In recent years, DH has seen a notable increase in interest in the use of Linked Data and semantic technologies to improve the discoverability and accessibility of CH collections. LOUD's emphasis on user-centred design and usability aligns well with these goals. Consequently, the principles of LOUD hold great promise for advancing the integration and use of community-driven APIs and/or Linked Data within DH. This can be seen within PIA, where the benefits of implementing IIIF helped us to streamline machine-generated annotations, integrate different thumbnails into GUI prototypes, model photo albums with different layers from the Kreis Family collection, and enable project members and students to engage in digital storytelling, an important participatory facet that can be seamlessly explored by DH efforts and CHIs with the help of the IIIF Image and Presentation APIs. Data reuse is definitely a key LOUD driver, which could have been done more extensively with a productive instance of Linked Art. As for widening participation, this is definitely a strategic and political decision, rather than a technical one. That said, LOUD specifications can definitely be embedded through strategic citizen science initiatives. A recent example that highlights the comprehensive value of Linked Data was presented by @newbury_linked_2024 at the CNI Spring 2024 Meeting. He delineated its significance as extending well beyond single entities, such as the Getty Research Institute, to enrich a vast ecosystem. Specifically, he identified three principal areas of value: Firstly, within the ecosystem itself, where the utility of information is amplified through its application in diverse contexts. 
Secondly, for the audience, by directly addressing user needs and facilitating various conceptual frameworks. And finally, within the community, by enabling wider use and adaptation of data and code. This approach to Linked Data, as articulated by Newbury, not only enhances its utility across these dimensions, but also aligns seamlessly with the LOUD proposition, underscoring a shared vision for a digital space where the interconnectedness and accessibility of (meta)data serve as foundational principles for progress and community engagement. LUX, as a catalyst for LOUD, exemplifies a practical approach to implementing Linked Data that has garnered significant local engagement and support at Yale. This initiative demonstrates how sound socio-technical practices can be effectively applied within a supportive institutional environment. The consistency of the data within LUX aligns well with IIIF and Linked Art standards, with only a few minor adjustments required for full compliance. These quick fixes are manageable and do not detract from the overall robustness of the initiative. While it may be too early to fully assess the wider impact of using LOUD specifications on the LUX platform within the CH domain, the initiative has already attracted considerable interest in recent months. This growing attention suggests that the LUX approach is resonating with other organisations, suggesting the potential for wider adoption and impact. The enthusiastic local engagement at Yale provides a strong foundation for LUX and highlights its potential to serve as a model for similar projects aimed at enriching digital heritage through effective collaboration and agreed-upon standards. In carrying out this thesis, I have adhered to the five main objectives set out at the beginning of the PhD. These objectives have been accomplished to a high degree, reflecting a substantial and well-executed project. 
Furthermore, most of the outputs – such as data models and scripts – from this work are available on GitHub, providing open access to the wider community. In addition, I have published several papers, both individually and collaboratively, further disseminating the findings and contributions of this research. Additionally, this thesis is relevant because it sheds light on communities and implementations that can be celebrated not only for their standards but also for their operating ethos; IIIF and Linked Art present models ripe for emulation beyond their immediate digital confines. Here, agency and authority are most typically granted to the collective over the isolated, with each actor - be it an individual, an institution or an interface – intricately interconnected. Yale’s LUX initiative also embodies this ethos, demonstrating how collaborative efforts can lead to innovative solutions and wider impact. It is to be hoped, then, that these practices of openness and multiple partnerships will not be seen as limited to their origins in digital representation. At the very least, I hope that these socio-technical approaches can serve as exemplars or sources of inspiration in broader arenas, where the principles of mutual visibility and concerted action can point the way towards cohesive and adaptive collaborative architectures. Despite its contribution, this thesis is far from perfect and certainly contains several shortcomings. I will name here three significant ones. First, the visualisations included and the use of FOL are primarily designed to support my own self-reflection and may be more beneficial to me than to the broader academic community. While they provide insights into my research process and findings, their applicability and usefulness to others might be limited. Second, the theoretical framework I employed, while instrumental to my research, may not serve as a universally applicable toolbox. 
Nevertheless, I urge readers to pay close attention to STS methodologies and practices. The works of Bruno Latour, Donna Haraway, and Susan Leigh Star have been invaluable companions throughout this dissertation. Additionally, for those involved in conceptualising semantic information, I recommend exploring Floridi’s PI, which offers profound insights into the nature and dynamics of information. These readings have greatly influenced my approach and understanding, and I believe they can offer valuable perspectives to others as well. Third, while the thesis aims to address both community practices and semantic interoperability, it leans more heavily towards the former. This emphasis on community practices may overshadow the broader discussion of semantic interoperability, potentially limiting the appeal of the thesis to those primarily interested in the technical aspects. Other shortcomings include the broad scope of the thesis, with three empirical chapters exploring different avenues. While this comprehensive approach provides a broad understanding of the research topic, it has also resulted in a rather lengthy thesis. This may be a challenge for the reader, as a topic of interest in one chapter may not be as compelling in another. The diversity of empirical focus, while enriching the research, may dilute the coherence for some readers, making it more difficult to maintain a consistent engagement throughout the dissertation. Despite these limitations, I hope that the different perspectives and findings contribute to a richer, more nuanced understanding of LOUD for CH. Avenues for future research are numerous and promising. One interesting area to explore is the comparative benefits experienced by early adopters of IIIF and Linked Art specifications versus those who implemented these standards later. 
Early adopters have the advantage of having their use cases discussed and resolved within the community, and it would be insightful to analyse the long-term impacts on their projects. Such a study is already feasible for early adopters of IIIF, and comparing further implementations of Linked Art will become possible within a few years. Furthermore, future exploration could focus on the full implementation of Linked Art within PIA or similar efforts, as well as more performance-oriented testing with the deployed LOUD APIs. These efforts should further validate the robustness and scalability of these specifications. Another important area for future investigation is the participation of institutions and individuals from the Global South in both the IIIF and Linked Art communities. It is crucial to explore how we can better support their uptake of these specifications and encourage their active involvement in these initiatives to ensure a more inclusive and globally representative environment. As I reflect on the journey of this thesis, I am reminded of the powerful dialogue and collective effort that has been at its heart. Mr Gee’s poem resonates deeply with my own aspirations for this work: to enhance our understanding of the past through openness and collaboration, as can be seen in IIIF and Linked Art. As I bring this dissertation to a close, I am filled with a sense of accomplishment and a renewed commitment to promoting sound socio-technical practices. It is my hope that the insights and methodologies presented here will inspire others to engage in this ongoing dialogue, continually asking and answering the many questions that arise as we collectively explore our cultural heritage landscapes. Throughout this dissertation, British English spelling conventions are predominantly observed. However, there are instances of American English spelling where direct quotations from sources are used as well as referring to names of institutions, standards, or concepts.
↩︎ SNSF Data Portal - Grant number 193788: https://data.snf.ch/grants/grant/193788 ↩︎ Seminar für Kulturwissenschaft und Europäische Ethnologie: https://kulturwissenschaft.philhist.unibas.ch/ ↩︎ DHLab: https://dhlab.philhist.unibas.ch/ ↩︎ HKB: https://www.hkb.bfh.ch/ ↩︎ The considerable size of the ASV collection, which includes over 90,000 analogue objects, reflects not just the work of the main authors but also the contributions from numerous explorers and additional material beyond the maps and primary publications. ↩︎ Max Frischknecht’s PhD: https://phd.maxfrischknecht.ch/ ↩︎ PIA project website: https://about.participatory-archives.ch/ ↩︎ The vision of the PIA project was first written in German and then translated into English and French. ↩︎ In our joint paper, we wrote ‘man-made’, corrected here, which makes me think of the transition within the CIDOC-CRM for the Entity E22 Human-Made Object from version 6.2.7 onward. ↩︎ Knora Base Ontology: https://docs.dasch.swiss/2023.07.01/DSP-API/02-dsp-ontologies/knora-base/ ↩︎ SIPI documentation: https://sipi.io/ ↩︎ IIIF Working Groups Meeting, The Hague, 2016: https://iiif.io/event/2016/thehague/ ↩︎ Van Gogh, Vincent. (1889). Irises [Oil on canvas]. Getty Museum, Los Angeles, CA, USA. https://www.getty.edu/art/collection/object/103JNH ↩︎ Giacometti, Alberto. (1956). L’homme qui marche I [Sculpture]. Carnegie Museum of Art, Pittsburgh, PA, USA. https://www.wikidata.org/entity/Q706964 ↩︎ UNESCO World Heritage List: https://whc.unesco.org/en/list/ ↩︎ Blue Shield International: https://theblueshield.org/ ↩︎ The ICBS was founded by the ICA, ICOM, ICOMOS, and IFLA. ↩︎ Guro. (1900-1950). Male Face Mask (Zamble) [Wood and pigment]. Art Institute of Chicago, Chicago, IL, USA. https://www.artic.edu/artworks/239464 ↩︎ I have opted for the term ‘affordance’ and not ‘representation’ as my intention is to maintain a comprehensive scope that encompasses various modalities such as modelling endeavours.
↩︎ To some degree, parallels can be drawn between the distinctions of cultural and digital heritage with those drawn between the humanities and DH. ↩︎ Inicio - Museos Comunitarios de América: https://www.museoscomunitarios.org/ ↩︎ The descriptions of each of these nine dimensions are selected excerpts from @star_ethnography_1999. ↩︎ A PID is a long-lasting reference to a digital resource. It usually has two components: a unique identifier and a service that locates the resource over time, even if its location changes. The first helps to ensure the provenance of a digital resource (that it is what it purports to be), whilst the second will ensure that the identifier resolves to the correct current location [@digital_preservation_coalition_persistent_2017] ↩︎ Rijksmuseum: https://www.rijksmuseum.nl/ ↩︎ In the original version, these instances contained typographical or factual errors. They have been struck through and corrected here. ↩︎ @zeng_metadata_2022 [p. 11] articulate that ‘as with “data”, metadata can be either singular or plural. It is used as singular in the sense of a kind of data; however, in plural form, the term refers to things one can count’. In the context of this thesis, I have chosen to favour the plural form of (meta)data. However, I acknowledge that I may occasionally use the singular form when referring to the overarching concepts or when quoting references verbatim. ↩︎ The snapshot of this bibliographic record was taken from https://swisscovery.slsp.ch/permalink/41SLSP_UBS/11jfr6m/alma991170746542405501. ↩︎ Seeing Standards: A Visualization of the Metadata Universe. 2009-2010. Jenn Riley. https://jennriley.com/metadatamap/seeingstandards.pdf ↩︎ A widespread example in the CH domain is the serialisation of metadata in XML, a W3C standard. ↩︎ It is noteworthy that the diversity of metadata standards in the heritage domain, characterised primarily by a common emphasis on descriptive attributes, is not counter-intuitive. 
This variation reflects the diverse nature of CH resources and the nuanced needs of GLAMs. ↩︎ MARC Standards: https://www.loc.gov/marc/ ↩︎ RDA: https://www.loc.gov/aba/rda/ ↩︎ If RDA was initially envisioned as the third edition of AACR, it faces the challenge of maintaining a delicate balance between preserving the AACR tradition while embracing the necessary shifts required for a successful and relevant future for library catalogues that can easily be interconnected with standards from archives, museums, and other communities [see @coyle_resource_2007]. ↩︎ MODS: https://www.loc.gov/standards/mods/ ↩︎ METS: https://www.loc.gov/standards/mets/ ↩︎ People might even argue that FRBR is only interesting as an ‘intellectual exercise’ [@zumer_functional_2007 p. 27]. ↩︎ LRMer: https://www.iflastandards.info/lrm/lrmer ↩︎ BibFrame: https://www.loc.gov/bibframe/ ↩︎ EAD: https://www.loc.gov/ead/ ↩︎ ISAD(G): General International Standard Archival Description - Second edition https://www.ica.org/en/isadg-general-international-standard-archival-description-second-edition ↩︎ PREMIS: https://www.loc.gov/standards/premis/ ↩︎ RiC Conceptual Model: https://www.ica.org/en/records-in-contexts-conceptual-model ↩︎ RiC-O: https://www.ica.org/standards/RiC/ontology ↩︎ CDWA: https://www.getty.edu/research/publications/electronic_publications/cdwa/ ↩︎ CCO: https://www.vraweb.org/cco ↩︎ VRA: https://www.vraweb.org/ ↩︎ VRA Core 4.0 and CCO have a symbiotic relationship, with CCO providing data content guidelines and incorporating the VRA Core 4.0 methodology. The latter has also been leveraged in other contexts to form the basis for more granular Linked Data vocabularies [see @mixter_using_2014]. ↩︎ In French, the original language used for this acronym, CIDOC stands for Comité international pour la documentation du Conseil international des musées.
↩︎ LIDO: https://cidoc.mini.icom.museum/working-groups/lido/ ↩︎ CIDOC Working Groups: https://cidoc.mini.icom.museum/working-groups/ ↩︎ CIDOC-CRM: https://cidoc-crm.org/ ↩︎ CRM-SIG Meetings: https://www.cidoc-crm.org/meetings_all ↩︎ CIDOC-CRM V7.1.2: https://www.cidoc-crm.org/html/cidoc_crm_v7.1.2.html ↩︎ For a quick overview of the classes and properties of CIDOC-CRM, I recommend visiting the dynamic periodic table created by Remo Grillo (Digital Humanities Research Associate at I Tatti, Harvard University Center for Italian Renaissance Studies): https://remogrillo.github.io/cidoc-crm_periodic_table/ ↩︎ CIDOC-CRM compatible models and collaborations: https://www.cidoc-crm.org/collaborations ↩︎ At the time of writing none of these CIDOC-CRM extensions have been formally approved by CRM-SIG. It is also worth mentioning that other extensions based on CIDOC-CRM have been developed by the wider community, such as Bio CRM, a data model for representing biographical data for prosopographical research [see @tuominen_bio_2017] or ArchOnto, which is a model created for archives [see @hall_archonto_2020]. 
↩︎ CRMact: https://www.cidoc-crm.org/crmact/ ↩︎ CRMarchaeo: https://cidoc-crm.org/crmarchaeo/ ↩︎ CRMba: https://www.cidoc-crm.org/crmba/ ↩︎ CRMdig: https://www.cidoc-crm.org/crmdig/ ↩︎ CRMgeo: https://www.cidoc-crm.org/crmgeo/ ↩︎ CRMinf: https://www.cidoc-crm.org/crminf/ ↩︎ CRMsci: https://www.cidoc-crm.org/crmsci/ ↩︎ CRMsoc: https://www.cidoc-crm.org/crmsoc/ ↩︎ CRMtex: https://www.cidoc-crm.org/crmtex/ ↩︎ FRBRoo: https://www.cidoc-crm.org/frbroo/ ↩︎ PRESSoo: https://www.cidoc-crm.org/pressoo/ ↩︎ Linked Art: https://linked.art ↩︎ DCMI Metadata Terms: https://www.dublincore.org/specifications/dublin-core/dcmi-terms/ ↩︎ Getty Vocabularies: https://www.getty.edu/research/tools/vocabularies/ ↩︎ Mastodon: https://joinmastodon.org/ ↩︎ Homosaurus: https://homosaurus.org/ ↩︎ DOLCE: www.loa.istc.cnr.it/dolce/overview.html ↩︎ It must be noted though that the use of DLs in KR predates the emergence of ontological modelling in the context of the Web, with its origins going back to the creation of the first DL modelling languages in the mid-1980s [@krotzsch_description_2013]. ↩︎ Author’s translation: ‘We need to give up on the idea of syntactic or structural interoperability through the use of a single model, whether for producing, storing or managing data within an information system’. ↩︎ CHAOSS: https://chaoss.community/ ↩︎ IIIF Annual Conference and Showcase - Los Angeles, CA, USA - June 4-7, 2024: https://iiif.io/event/2024/los-angeles/ ↩︎ Micrio: https://micr.io/ ↩︎ Message written on the IIIF Slack Workspace on 28 October 2022. ↩︎", + "content_html": "Linked Open Usable Data for Cultural Heritage: Perspectives on Community Practices and Semantic Interoperability PhD Thesis in Digital Humanities, completed as part of the Graduate School of Social Sciences’ (G3S) doctoral programme. It was successfully defended on 18 November 2024 (slides). This page will host a lightweight HTML version of my thesis, optimised for easy access and readability. 
The PDF version (e-dissertation) is available on the University of Basel’s repository: https://doi.org/10.5451/unibas-ep96807. Page in construction (please be patient ⌛) Author Dr. Julien A. Raemy (University of Basel) https://orcid.org/0000-0002-4711-5759 Supervisors Prof. Dr. Peter Fornaro (University of Basel) https://orcid.org/0000-0003-1485-4923 Prof. Dr. Walter Leimgruber (University of Basel) Dr. Robert Sanderson (Yale University) https://orcid.org/0000-0003-4441-6852 Abstract Digital technologies have fundamentally transformed how Cultural Heritage (CH) collections are accessed and engaged with. Linked Open Usable Data (LOUD) specifications, including the International Image Interoperability Framework (IIIF) Presentation API 3.0, Linked Art, and the W3C Web Annotation Data Model, have emerged as web standards to facilitate the description and dissemination of these valuable resources. Despite the widespread adoption of IIIF, implementing LOUD specifications, particularly in combination, remains challenging. This is especially evident in the development and assessment of infrastructures, or sites of assemblage, that support these standards. This research is guided by two perspectives: community practices and semantic interoperability. The first perspective assesses how organizations, individuals, and apparatuses engage with and contribute to the consensus-making processes surrounding LOUD. By examining these practices, the social fabrics of the LOUD ecosystem can be better understood. The second perspective focuses on making data meaningful to machines in a standardized, interoperable manner that promotes the exchange of well-formed information. This research is grounded in the SNSF-funded project, Participatory Knowledge Practices in Analogue and Digital Image Archives (PIA) (2021–2025), which aims to develop a citizen science platform for three photographic collections from the Cultural Anthropology Switzerland (CAS) archives. 
Actor-Network Theory (ANT) forms the theoretical foundation, aiming to describe the collaborative structures of the LOUD ecosystem and emphasize the role of non-human actors. Beyond its implementation within the PIA project, this research includes an analysis of the social dynamics within the IIIF and Linked Art communities and an investigation of Yale’s Collections Discovery platform, LUX. The research identifies socio-technical requirements for developing specifications aligned with LOUD principles. It also examines how the implementation of LOUD standards in PIA highlights their potential benefits and limitations in facilitating data reuse and broader participation. Additionally, it explores Yale University’s large-scale deployment of LOUD standards, emphasizing the importance of ensuring consistency between Linked Art and IIIF resources within the LUX platform for the CH domain. The core methodology of this thesis is an actor- and practice-centered inquiry, focusing on a detailed examination of specific cosmologies within LOUD-driven communities, PIA, and LUX. This micro-perspective approach provides rich empirical evidence to unravel the intricate web of cultural processes and constellations in these contexts. Key empirical findings indicate that LOUD enhances the discoverability and integration of data in CH, requiring community-driven consensus on model interoperability. However, significant challenges include engaging marginalized groups, sustaining long-term participation, and balancing technological and social factors. Strategic use of technology and the capture of digital materiality are critical, but LOUD also poses challenges related to resource investment, data consistency, and the broader implementation of complex patterns. LOUD should lead efforts to improve the accessibility and usability of CH data. 
The community-driven methodologies of IIIF and Linked Art inherently foster collaboration and transparency, making these standards essential tools in evolving data management practices. Even for institutions and projects that do not adopt these specifications, the socio-technical practices of LOUD offer vital insights into effective digital stewardship and strategies for community engagement. Keywords: Actor-Network Theory; Community of Practice; Cultural Anthropology Switzerland; Cultural Heritage; Digital Infrastructure; International Image Interoperability Framework; Knowledge Practices; Linked Art; Linked Data; LUX; Participatory Archives; Photographic Archives; Semantic Interoperability; Web Annotation Data Model Table of Contents Introduction Context Interlinking Cultural Heritage Data Exploring Relationships through an Actor-Network Theory Lens Research Scope and Methodology The Social Fabrics of IIIF and Linked Art PIA as a Laboratory Yale’s LUX and LOUD Consistency Discussion Conclusion 1. Introduction Since its inception in 2011, the IIIF has revolutionised[1] the accessibility of image-based resources. Initially driven by the needs of manuscript scholars, IIIF focused on two-dimensional images, but has since expanded to encompass a wide range of image-based resources, including audiovisual materials and, in the near future, 3D images. Similarly, Linked Art, formally established in 2017, initially concentrated on art museum objects but has since broadened its scope to model a variety of CH entities, leveraging CIDOC-CRM, a renowned ontology in the museum and DH space. Both initiatives aim to break down silos: IIIF focuses on improving the presentation of digital objects, while both initiatives enhance their dissemination. Together, they make CH data more accessible through IIIF and more meaningful through Linked Art for machines. These efforts have primarily benefited the CH domain. 
A key commonality is that the main APIs these communities create align with the LOUD design principles, either intentionally or empirically demonstrated through use cases. These principles enable software developers to develop compliant tools and services without needing to fully understand RDF, a syntax for representing information on the web. Additionally, they may not need to grasp all LOD principles, which promote the interlinking of data from diverse datasets using tools like KOS such as thesauri. WADM, a W3C standard, is also recognised as a LOUD specification. It provides a framework for creating interoperable annotations on web resources, facilitating the linking and sharing of data across different platforms and applications. These LOUD design principles include the right abstraction for the audience, few barriers to entry, comprehensibility by introspection, documentation with working examples, and the use of many consistent patterns rather than few exceptions. Additionally, both IIIF and Linked Art are driven by vibrant communities, mainly comprising GLAM and higher education institutions. While the standards and principles discussed have broad applications, it is important to clarify the scope of this dissertation. This work does not focus on KGs by assessing triplestores – databases specifically designed to store and retrieve triples, which are the fundamental data structures in RDF. Similarly, it does not deal with evaluating SPARQL engines, which are specifically designed to query KGs. Additionally, this dissertation does not address the intersection of ML and IIIF, or the ontological reasoning of Linked Art. Instead, this dissertation concentrates on LOUD, the consistency of its standards, design principles and the vibrant communities behind it. It examines JSON-LD serialisation efforts and the crucial intersection required to establish robust semantic interoperability baselines between presentation and semantic layers. 
It also presents real-world use case implementations, both on a small scale in a laboratory and flexible space within the PIA research project, and on a large scale at Yale, exemplified by the LUX platform that provides access to (meta)data from YUL, YCBA, YUAG, and YPM. The focus is therefore on digital infrastructures capable of delivering JSON-LD files from the above specifications, which are primarily, though not exclusively, CH resources. It is more about the different actors – both human and non-human – that create and maintain these interconnected systems and the dynamic interactions that sustain them. The deployment of various LOUD specifications addresses the need for semantic interoperability between CH resources and disparate datasets by establishing a standardised approach to representing and linking data, ensuring that information can be seamlessly shared and understood across different platforms and contexts. This dissertation seeks to carve out a distinct niche by addressing an often-overlooked aspect of IIIF and Linked Art. IIIF is sometimes perceived and studied merely as a service or an appendix, with the content it delivers taking precedence. However, this PhD thesis positions IIIF as a first-class citizen worthy of in-depth study. Similarly, Linked Art, despite its potential and its relatively recent establishment, has been the subject of very few scholarly papers. This gap underscores the significance of LOUD in this context. Furthermore, this thesis elevates Linked Art to a position of primary importance, recognising its significance and advocating for its thorough examination. To thoroughly study LOUD and its adherence to design principles, it is essential to immerse ourselves actively in both communities – an approach I have embraced for years. 
The thesis also emphasises the importance of participatory efforts and collaboration between research projects, which typically have shorter lifespans, and memory institutions, which need to implement technical standards as a lingua franca. In doing so, it reveals the mediating role of LOUD in advancing the heritage sphere. To truly understand IIIF, Linked Art, and to a lesser extent WADM, it is crucial to examine the social fabrics and consensus decision-making of each community. Among these considerations are how the specifications can be implemented pragmatically, and how the standards can support the implementation and maintenance of more extensive semantic interoperability efforts. The significance of this research lies in highlighting the commitment and diligence of the individuals and organisations that make up both the IIIF and Linked Art communities. It aims to demonstrate that community-driven practices, such as those exemplified by IIIF and Linked Art, have a potential that goes beyond the mere sharing of digital objects and their associated metadata. The more people who embrace these approaches and implement the associated specifications, the more society as a whole will benefit. Furthermore, this research illustrates that IIIF is no longer limited to two-dimensional images, that Linked Art is not restricted to artworks, and that WADM is a simple, content-agnostic standard that can be easily integrated into a range of systems. This adaptability is a strength of LOUD standards, which are designed to be simple yet effective. LOUD can serve a variety of purposes, primarily rooted in CH, but with the potential to extend its benefits to other sectors. The true beauty of LOUD lies in its ability to foster networking opportunities and transparent socio-technical practices, demonstrating its value beyond mere technical implementation. 
By emphasising these aspects, this dissertation highlights the wider impact of LOUD in promoting semantic interoperability and enhancing collaborative efforts within the heritage field and beyond. In addition, the implementation of standards through PIA underlines the potential for similar participatory or citizen science projects, while the LUX initiative serves as an illustrative example of robust infrastructure and cross-unit engagement. These examples demonstrate the practical applications and far-reaching implications of adopting LOUD standards in different contexts. This dissertation is structured across ten chapters, each building upon the previous ones up to Chapter 5 to provide a comprehensive understanding of the research. These initial chapters lay the foundation of the study, establishing the context, theoretical framework, and methodological approaches. After this foundational section, Chapters 6, 7, and 8 present empirical studies that, while interconnected, can be read independently if desired. These chapters offer detailed insights into specific aspects of the research and can be appreciated on their own or as part of the broader narrative. The thesis continues with Chapter 2, which extends this introduction by providing more information about the research setting, specifically PIA. Chapter 3 follows with an extensive literature review, offering a comprehensive overview of methods to interlink CH data. Next, Chapter 4 presents the theoretical framework, conceptualised as a toolbox and firmly rooted in ANT, guiding the analysis and discussion throughout the dissertation. Following this, Chapter 5 details the research scope and methodology, explaining the approaches and methods employed in the study. 
Moving on to the empirical work, Chapter 6 sheds light on the social fabrics of IIIF and Linked Art, exploring the communities and practices that underpin these initiatives. Chapter 7 then examines the implementation of LOUD standards within PIA, highlighting the practical aspects and challenges encountered. This is followed by Chapter 8, which focuses on the LUX initiative at Yale, examining the underlying governance and interdepartmental ownership of the Yale Collections Discovery platform. The discussion of findings is presented in Chapter 9, where the results from the empirical chapters are synthesised and analysed in relation to the theoretical framework. Finally, Chapter 10 concludes the thesis, summarising the key insights and contributions of the research while outlining potential directions for future study. 2. Context In this chapter, I will set the stage for my PhD thesis by providing important background information. First, in Section 2.1, I will explain why I chose the title for my thesis. This will give you an understanding of the main focus and the direction of my research. Next, in , I will describe the PIA research project, which is central to my work. This section will cover the project’s goals, significance, and overall framework. In , I will detail my specific contributions to the PIA project. I will emphasise how my work fits into the larger project and its importance to my thesis. Finally, in , I will talk about my active participation in the IIIF and Linked Art communities. This section will highlight how my involvement in these communities has influenced my research and its broader implications. 2.1 PhD Title I chose the title ‘Linked Open Usable Data for Cultural Heritage: Perspectives on Community Practices and Semantic Interoperability’ as it encapsulates the essence of my research focus, although I could indeed have chosen others. 
During the initial stages of my research, multiple working titles were explored to capture the diverse facets of my interests and objectives. If I was quite sure about having in the title after the third iteration, I was quite unsure of what should follow and if a subtitle was actually needed at all. Amidst this dynamic progression, the underlying theme of my research remained steadfast – to delve into the transformative potential of LOUD for CH. I also opted to maintain in the title of my thesis subsection. While holds its appeal, my choice reflects a broader narrative that acknowledges the crucial role of CHIs and spotlighting the multifaceted nature of heritage preservation, encapsulating both its digital facets and the essential contribution of individuals and institutions in curating, interpreting, and making heritage accessible. As for the subtitle, while I do explore CoP as defined by @lave_situated_1991 and @wenger_communities_2011 through investigating the social fabrics of the IIIF and Linked Art communities, my main interest lies in the broader application of LOUD for describing and interlinking CH resources. Thus, I decided to opt for the more generic as the first axis or perspective. For the second perspective, I wanted to see how semantic interoperability can be achieved through standards adhering to the LOUD design principles, as they seem to be key enablers for seamless collaboration and knowledge exchange among practitioners. There was a time in my research when I envisaged decoupling and , perceiving them as two distinct dimensions. However, what really captivates me is the unification of these factors to facilitate collective reasoning for both humans and machines. In summary, this title reflects my enthusiasm for using web-based and community-driven technologies to transform the way we understand, share and value CH. 
2.2 The PIA Research Project I undertook my doctoral studies within the scope of the PIA research project financed by the SNSF under their Sinergia funding scheme from February 2021 to January 2025[2]. The project aimed to analyse the interplay of participants, epistemological orders and the graphical representation of information and knowledge in relation to three photographic collections from CAS. It sought to bring together the world of data and things in an interdisciplinary manner, exploring the phases of the analogue and digital archive from a cultural anthropological, technical and design research perspective [@felsing_community_2023 p. 42]. As part of this endeavour, interfaces were developed to enable the collaborative indexing and use of photographic archival records [@chiquet_participatory_2023 p. 110]. I discuss in more detail the interdisciplinary components and briefly introduce the people involved in the project in Subsection 2.2.1, then talk about the photographic collections that were the overarching narrative of the research in Subsection 2.2.2, and lastly in Subsection 2.2.3, the vision that we had put together. The project, divided in three interdisciplinary teams, was led by the University of Basel through the Institute for Cultural Anthropology and European Ethnology[3] (Team A) and the DHLab[4] in collaboration with the DBIS group (Team B) as well as by the HKB[5], an art school and department of the Bern University of Applied Sciences (Team C) [@felsing_community_2023 p. 43]. Table 2.1 lists the people who contributed to the project, broken down by the three teams and their particular perspectives. Table 2.1: PIA Team Core Members Perspective People A) Anthropological Prof. Dr. Walter Leimgruber, Team Leader and Dissertation Supervisor Dr. 
Nicole Peduzzi, Photographic Restoration and Digitisation Supervisor Regula Anklin, Conservation and Restoration Specialist (project partner at Anklin & Assen) Murielle Cornut, PhD Candidate in Cultural Anthropology Birgit Huber, PhD Candidate in Cultural Anthropology Fabienne Lüthi, PhD Candidate in Cultural Anthropology B) Technical Prof. Dr. Peter Fornaro, Team Leader and Dissertation Supervisor Prof. Dr. Heiko Schuldt, Dissertation Supervisor (project partner at the University of Basel) Dr. Vera Chiquet, Postdoctoral Researcher Adrian Demleitner, Software Developer (2021-2023) Fabian Frei, Software Developer (2023-2025) Christoph Rohrer, Software Developer (2023-2025) Julien A. Raemy, PhD Candidate in Digital Humanities Florian Spiess, PhD Candidate in Computer Science C) Communicative Dr. Ulrike Felsing, Team Leader and Dissertation Supervisor Prof. Dr. Tobias Hodel, Dissertation Supervisor (project partner at the University of Bern) Daniel Schoeneck, Research Fellow Lukas Zimmer, Designer (project partner at A/Z&T) Max Frischknecht, PhD Candidate in Digital Humanities 2.2.2 Photographic Collections/Archives as Anchors CAS has historically been engaged in active collaborations that bridge the academic research and the public sphere, primarily through traditional analogue methods. The PIA project was created with the intention of exploring the complexities inherent in both analogue and digital approaches, and to encourage and investigate these collaborative endeavours between academia and the wider public. As such, PIA represents a paradigm shift within the scope of projects associated with or supported by CAS, facilitating the seamless integration of digital tools to explore multiple facets of participation and engagement. This transformative endeavour embodies a profound exploration of new intersections where scholarly endeavours intertwine with the active involvement of citizens. 
PIA drew on three collections: one focusing on scientific cartography and titled (Atlas der Schweizerischen Volkskunde), a second from the estate of the photojournalist Ernst Brunner (1901–1979), and a third collection consisting of vernacular photography which was owned by the Kreis Family (1860–1970). SGV_05 ASV consists of 292 maps and 1000 pages of commentary published from 1950 to 1995 (an example of such a map is shown in Figure 2.1). This collection was commissioned by the CAS to do an extensive survey of the Swiss population in the 1930s and 1940s on many issues pertaining, for instance, to everyday life, local laws, superstitions, celebrations or labour [@weiss_atlas_1940]. The contents were compiled by researchers and by people who were described as [6]. Questions were asked about everyday habits, community rights, work, trade, superstitions, and many other topics [@schmoll_richard_2009; @schmoll_vermessung_2009]. This collection offers a snapshot of everyday life in Switzerland right before the beginning of a modernisation process that fundamentally changed lifestyles in all areas during the postwar period. A digitised version of the ASV would not only allow the results of that time to be enriched with further findings [@schranz_critical_2021], but would also make transparent how knowledge was generated in cartographic form through a complex process along different types of media and actors. The restoration, digitisation, cataloguing and indexing efforts all took place throughout PIA under the supervision of Birgit Huber, who based her doctoral research extensively on this particular collection [see @huber_entdeckung_2023]. Figure 2.1: Map from the SGV_05 Collection Relating to Question 93 Showing Walks and Excursions at Pentecost. ASV. CAS. 
CC BY-NC 4.0 SGV_10 Kreis Family comprises approximately 20,000 loose photographic objects, where a quarter of them are organised and kept in 93 photo albums — as illustrated by Figure 2.2, from a wealthy Basel-based family and spanning from the 1850s to the 1980s. This private collection was acquired by CAS in 1991. The collection, which originally arrived in banana cases and was enigmatic due to the lack of clear organisation or accompanying information from the family, posed significant challenges. Despite these initial hurdles, CAS undertook meticulous efforts to catalogue and preserve its contents [@felsing_re-imagining_2024 p. 42]. The pictures were taken by studio photographers as well as by family members themselves. The Kreis Family collection represents a typical example of urban bourgeois culture and gives a comprehensive insight into the development of private photography over the course of a century [@pagenstecher_private_2009]. The photographic materials and formats are very diverse, ranging from prints to negatives, small, medium or large format photographs, black and white or colour. The collection also encompasses many photographic techniques, from the one-off daguerreotypes and ferrotypes, to the glass-based negatives that could be reproduced en masse, to the modern paper prints. While some of the albums and loose images were restored and digitised during the 2014 project, much of this work was completed during PIA and overseen by Murielle Cornut, whose doctoral investigation was centred on the study of photo albums [see @cornut_open_2023]. Figure 2.2: A photo Album Page from the SGV_10 Collection, Bearing the Following Inscription: Botanische Excursion ins Wallis, Pfingster 1928. SGV_10A_00031_015. Kreis Family. CAS. 
CC BY-NC 4.0 SGV_12 Ernst Brunner is a donation of about 48,000 negatives and 20,000 prints to the CAS archives from Ernst Brunner, a self-taught photojournalist who lived from 1901 to 1979 and who, mainly in the 1930s and 1940s, documented a wide range of folkloristic themes, as shown in Figure 2.3. He is one of the most important photographers of the era and one of the most outstanding visual chroniclers of Swiss society [@pfrunder_ernst_1995]. His photographs show rural lifestyles, but also urban motifs. In his late work, he led the documentation and research on farmhouses in a specific Swiss district, a project initiated by CAS. Before Ernst Brunner became an independent photojournalist in the mid-1930s, he worked as a carpenter, influenced by the ideas of the Bauhaus and Neues Bauen movements. This can also be seen in the aesthetics and formal language of his photography. While all the black and white negatives were digitised and recorded between 2014 and 2018, the digitisation of the prints, a selection made by Ernst Brunner himself, was conducted at the end of the PIA research project. The latter was supervised by Fabienne Lüthi, whose PhD was about organisational systems and knowledge practices in the Ernst Brunner Collection. Figure 2.3: Picture from the SGV_12 Collection Showing Walkers Looking at the Train Timetable. [Wanderer studieren den Fahrplan in der Bahnhofhalle]. Lucerne, 1938. Ernst Brunner. SGV_12N_00716. CAS. CC BY-NC 4.0 Whereas each of the PhD Candidates in Cultural Anthropology was assigned a particular collection whose content was, to varying degrees, part of their subject of study, this was not exactly the case for the PhD Candidates in DH, including myself, and in Computer Science. Put differently, we had relative leeway in terms of what interested us in each or all of these three photographic collections. 
In my case, I briefly explain my contribution to the project more in and then in as part of the empirical portion of my thesis focusing on the deployment of LOUD specifications using the three CAS photographic collections. Florian Spiess focused on the use of VR through vitrivr, a multimedia retrieval system developed by the DBIS research group at the Department of Mathematics and Computer Science [@spiess_multimodal_2022; @spiess_forschung_2023; @spiess_exploring_2024]. His work included experiments with PIA-related collections, such as the creation of virtual galleries clustered according to content-based similarity [see @peterhans_automatic_2022]. In the case of Max Frischknecht, his doctoral research centred on generative design[7], a methodology to visualise dynamic cultural archives. He mostly worked on the ASV collection and on a mapping tool which is a cartographic visualisation designed to explore the CAS photographic archives [see @frischknecht_generating_2022; @eggmann_digitalisierung_2024]. It should also be mentioned that not only did we use the three collections of the CAS photographic archives within the project, but that both formal and informal meetings took place most commonly within the photographic archives at the Spalenvorstadt premises in the old Gewerbemuseum and later either at the on Allschwilerstrasse, though less frequently, or at Rheinsprung where the Institute for Cultural Anthropology and European Ethnology is located. This meant that there was a strong and sometimes blurred entanglement between those involved in the archives and the PIA core team members. 2.2.3 Project Vision Between December 2021 and March 2022, we worked together to develop and finalise a vision for the project[8]. It includes seven key priorities, or pillars, which were meant to strengthen the interdisciplinary perspectives of PIA. Although ambitious, these elements were of paramount importance to us and served as a guiding blueprint for all PIA activities. 
Hereafter is a modified version of the vision[9] taken from @cornut_annotations_2023 [p. 4]. Accessibility by developing open interfaces and offering the possibility of expanding the archive and turning it into an instrument of current research that collects and evaluates knowledge with the participation of other users (Citizen Science). Heterogeneity by making visible where, why and under what circumstances the objects were created, how they were handled and what path they have taken to get to and in the archive. We work on visualisations that take into account the heterogeneous character of archival materials and make their respective biographies visible. Materiality by conveying the material properties of the objects: they have front and back sides, inscriptions, traces, development errors, they are transparent, multi-layered or fabric-covered. They tell of their origin, use, and peculiarities. We want to make this knowledge accessible and understandable in digital form. To this end, we also consider the necessary infrastructure involved in the creation as part of their narrative: the restoration, the relocation, the indexing, the storage devices, the research tools, the display medium, as well as the process of repro-photography. Interoperability as a crucial component and which will be done by supporting digital means that allow different stakeholders to freely access and interact with the project’s data. Both humans and machines can use, contribute to, correct and annotate the existing data in an open and interoperable manner, thus encouraging exchange and the creation of new knowledge. To do this, we use web-based standards that are widely adopted in the cultural heritage field. Affinities by leveraging data models and pattern recognition which can uncover semantic relationships between entities that were previously incomplete or difficult for users to access. 
Using specific interfaces and visualisations, we make it possible to explore digital assets and discover forms of relationships and similarities between images. AI that facilitates automated searches for simple image attributes such as colour, shapes, and localisation of image components. It should also become possible to recognise texts and object types for extracting metadata. Bias Management by taking into account that associated metadata was human-made[10] and thus is never objective. Collections and their metadata reflect biases or focus narrowly on selected areas and perceptions. Machines working on the basis of such data automatically reproduce the implicit biases in decision-making due to so-called biased algorithms. Therefore, understanding the data used for training and the algorithms applied for decision making is crucial to ensure the integrity of the application of these technologies in archives. We take ethical issues into account when using AI and visualisations, because the higher the awareness of a possible bias, the faster it can be detected or brought up for consideration with users. As my thesis is notably concerned with semantic interoperability, Interoperability and Affinities are of particular importance to my PhD thesis, although I recognise the importance of all pillars. Each of these resonated with me and my fellow PhD Candidates. As we immersed ourselves in the vision of the PIA research project, it became a unifying thread that brought us together in our research ambitions. We found that all these priorities within the project spoke to us at different points and provided a strong point of communication and practice in the development of processes, prototypes or interfaces. 2.3 Contribution to PIA and its Relevance to the Thesis To develop a participatory platform, an open and sustainable technological foundation for facilitating the reuse of CH resources was needed [@raemy_applying_2021]. 
Throughout the PIA project, I was mainly involved in the extension of the data infrastructure, the uptake of IIIF as well as designing the data model, leveraging Linked Art and WADM [@raemy_interlinking_2024]. As a member of Team B, I undertook this PhD as a bridge between the different teams, mostly participating in discussions with the three doctoral candidates from Team A to further develop and agree on the CAS data model and with the software developers from my team to discuss the impact of the data model on our evolving — yet transitory — infrastructure as well as helping in implementing the APIs adhering to the LOUD design principles. It was necessary to redesign the data model within the context of a database migration, from Salsah to the DSP, that happened between November 2021 and March 2024. This updated version, based on the Knora Base Ontology[11], corresponded to the needs of the CAS archives and to some extent to those of PIA, in particular to enable the PhD Candidates in Cultural Anthropology to make more precise assertions, whether in terms of descriptive metadata, or in the ability to link one object to another or to provide comments on these objects in several narrative forms. Moreover, an assessment of the appropriate technical standards for improved usability of the objects by both humans and machines was carried out, as a basis for extending the capabilities provided by DaSCH, such as helping the software developers to implement SIPI[12], a C++ image server compatible with the IIIF Image API and build services that create IIIF Presentation API 3.0 resources. While the theoretical framework of the thesis extends across the scope of PIA, the empirical part focuses on a specific set of findings derived from the research project outlined in , under the title . 
In this chapter, I discuss the data model and its refinement as well as the generation of custom IIIF Manifests during the specific digitisation, cataloguing and indexing efforts that took place throughout the project for the three CAS collections (SGV_05, SGV_10 and SGV_12) under investigation, the implementation of LOUD standards, and the overall design of the technological underpinnings. 2.4 Involvement within the IIIF and Linked Art communities I must acknowledge the invaluable role that my involvement within the IIIF and Linked Art communities has played in shaping my journey as a trained information specialist and an aspiring DH practitioner. Being an active participant in both communities has not only broadened my understanding of the latest developments in the field but has also profoundly influenced the trajectory of this dissertation. I have been involved within the IIIF community since October 2016 and the Working Groups Meeting that happened in The Hague[13]. This significant journey was, in fact, initiated by a recommendation from my first supervisor, Peter Fornaro, during my time as an undergraduate doing an internship at the DHLab. Little did I know that this recommendation would lead me to be carrying out a PhD and looking at IIIF not only as community-driven standards but as an object of study. Engaging with the IIIF community exposed me to cutting-edge advances in image interoperability and standards, and fostered a deeper appreciation for the importance of digital representations of cultural heritage. Through collaborative discussions with experts from diverse backgrounds, I gained new perspectives on the potential of technology to advance humanities research and preserve our collective cultural memory. Similarly, my involvement in the Linked Art community introduced me to the opportunities offered by LOUD and its transformative impact on research discourse. 
Exposure to Linked Data methodologies and the CIDOC-CRM has significantly influenced the way I have structured and interpreted the data in this dissertation, thereby enriching its scholarly breadth and rigour. I started to be actively involved in Linked Art at the beginning of my PhD in 2021, but I was already a by 2020, driven by the efforts of Rob Sanderson, my third supervisor. By mid-2023, I had become a member of the Editorial Board. The individuals I have met and the knowledge shared in these vibrant communities have deeply informed my approach as a scholar. The invaluable connections and collaborations I have made have expanded my network of fellow researchers, educators, and experts, leading to fruitful discussions that have significantly shaped the research questions addressed in this thesis. The events and workshops organised by these communities have also provided immersive learning experiences, giving me first-hand insights into the tools, technologies and methodologies used in the context of describing and disseminating CH data. The dynamic ecosystem of these communities has served as an inspiring backdrop, fostering innovative thinking and encouraging a more holistic approach to my research. 3. Interlinking Cultural Heritage Data Interlinking CH data is an important aspect of publishing heritage collections over the web, in particular by using LOD technologies to make assertions more easily readable and meaningful to machines [@marcondes_integrated_2021]. Due to the complexity of CH data and their intrinsic inter-relationships, it is necessary to define its nature and introduce controlled vocabularies and ontologies that can be integrated with existing web standards and interoperable with relevant platforms [@bruseker_cultural_2017; @hyvonen_using_2020]. Efforts to interlink CH data have brought about significant advancements, but challenges remain. 
One such challenge is finding a balance between completeness and precision of expression to ensure that CH data remain accessible and usable to a wider audience. Addressing this challenge, the Linked Open Usable Data (LOUD) design principles and the specifications that adhere to them, such as the IIIF Presentation API 3.0 and Linked Art, offer a promising approach [@raemy_enabling_2023]. By focusing on usability aspects from the perspective of software developers and data scientists involved in designing visualisation tools and data aggregation approaches, LOUD strives to enhance the overall user experience [@sanderson_keynote_2019]. Finding this equilibrium becomes crucial as CH data continue to grow in complexity and size, necessitating the seamless integration of native web technologies. The LOUD concept cultivates an environment that encourages the formation of vibrant CoP and the seamless integration of native web technologies, wherein an essential principle is the availability of comprehensive documentation supplemented with practical examples [@raemy_ameliorer_2022]. Moreover, the emphasis on leveraging widely adopted technologies enhances the interoperability of data and promotes its wider dissemination. With LOUD principles guiding the linking of CH data, the resulting web of knowledge becomes more than just a machine-readable resource; it transforms into a user-centric ecosystem where both accessibility of Linked Data and usability intersect to enable scholars and a wider audience to engage in the exploration and appreciation of CH [@newbury_loud_2018]. Finally, by fostering a collaborative, knowledge-sharing mindset, LOUD empowers software developers to implement data in a robust way, drawing insights from shared experiences [see @page_linked_2020]. 
In this chapter, which serves as the literature review of the PhD thesis, I attempt to build on this brief introduction by dividing the insights into seven sections in order to provide an overview of the key concepts related to interlinking data in the CH domain. The literature review primarily encompasses works published up until December 2023, providing a comprehensive snapshot of the field’s current state and its evolution. Section 3.1 discusses what makes CH data stand out and Section 3.2 is about CH metadata standards, while ??? explores the technological trends, scientific movements and guiding principles that have shaped the field. ??? provides an overview of the web as an open platform, which is essential to understanding the current landscape of interlinking CH data. ??? focuses on LOUD, while ??? looks at characterising the community practices and semantic interoperability dimensions for CH. Finally, in ???, I summarise key elements from each section and within each of these I give some initial thoughts with respect to LOUD, and then conclude the chapter with some considerations on why we as a society need to care about CH data. 3.1 What Makes Cultural Heritage Data Stand Out? Here, I aim to establish the indirect territory of my study, as I am situated on a distinct plane that focuses on web technologies and standards (as well as software and services that enable them) as the subjects of investigation. However, it is crucial to acknowledge that LOUD specifications owe their existence to the available data that have served as case studies. 
Thus, their significance can be best understood through the lens of data and I recognise here the pivotal role played by CH practitioners, encompassing individuals from research and memory institutions, who have had a significant impact on specifying a series of web-based standards and who have helped to move forward the discovery of CH data and beyond, in particular those belonging to the public domain, in an open manner. In Subsection 3.1.1, I provide an introduction to CH as recognised by UNESCO. I explore the tangible, intangible, and natural dimensions of CH, laying the foundation for further discussions on its representation and preservation, notably by giving a first definition of CH data. Next, in 3.1.2, I look at the challenges of representation and embodiment of CH data, examining the difficulties in describing and preserving its materiality or embodied aspects. Thirdly, to understand the significance of collective efforts, communities, and the interplay of technologies, I discuss what I call ‘Collectives and Apparatuses’ in 3.1.3, where I highlight how actors in terms of collaborative actions and apparatuses play a pivotal role in CH. 3.1.1 Cultural Heritage The legacy of CH encompasses physical artefacts and intangible aspects inherited from past generations, reflecting the history and traditions of societies. Meanwhile, CH constantly evolves due to complex historical processes, necessitating preservation and protection efforts to prevent its loss over time [@loulanski_revising_2006]. The dynamic nature of CH demands collaborative actions, including documentation and the use of a range of technologies. The concept of CH is also characterised by perpetual evolution, mirroring the historical processes that shape societies over time. Social, political, economic, and technological shifts invariably influence the definition and perception of CH, prompting continuous reinterpretations and reevaluations of its significance. 
Over the years, the enthusiasm for the protection of cultural property has enriched the term with new shades of meaning. As societies undergo transformations, new layers of meaning and relevance are superimposed on existing CH, perpetually enriching its essence. As articulated by [@ferrazzi_notion_2021 p. 765]: ‘Cultural heritage’, as an abstract legacy or as a merge of tangible and intangible values, is able to encompass the totality of culture(s); in so, assuming a symbolic value that brings a clear break with all other terminologies. In conclusion, ‘cultural heritage’ as a legal term has demonstrated more than any others to be a real ensemble of historical stratification and cultural diversity. The advent of globalisation and rapid advancements in technology have further accelerated the evolution of CH. Increased interconnectedness and cross-cultural interactions have led to the fusion of traditions and the emergence of novel cultural expressions. Moreover, the digital era has facilitated the dissemination of CH resources on a global scale, transcending geographical barriers and preserving cultural knowledge for future generations [@portales_digital_2018]. Thus, the intriguing nature of CH resources can be attributed to their multifaceted and diverse characteristics. The conservation and promotion of these resources demand a nuanced comprehension of the various types of heritage resources, culminating in effective preservation and promotion strategies that can account for their heterogeneity [@windhager_visualization_2019]. According to [@unesco_institute_for_statistics_unesco_2009], CH includes tangible and intangible heritage. Tangible CH refers to physical objects such as artworks, artefacts, monuments, and buildings, while intangible CH comprises practices, knowledge, folklore and traditions that hold cultural significance [@munjeri_tangible_2004]. 
The concept of heritage has evolved through a process of extension to include objects that were not traditionally considered part of the heritage. The criteria for selecting heritage have also changed, taking into account cultural value, identity, and the ability of the object to evoke memory. This shift has led to the recognition and protection of intangible CH, challenging a Eurocentric perspective and embracing cultural diversity as a valuable asset for humanity [@vecco_definition_2010]. Conservation guidelines have broadened the concept of heritage to include not only individual buildings and sites but also groups of buildings, historical areas, towns, environments, social factors, and intangible heritage [@ahmad_scope_2006]. In 2019, another instance of UNESCO defines CH in an even more comprehensive manner, taking into account natural heritage: Cultural heritage is, in its broadest sense, both a product and a process, which provides societies with a wealth of resources that are inherited from the past, created in the present and bestowed for the benefit of future generations. Most importantly, it includes not only tangible, but also natural and intangible heritage. [@unesco_culture_for_development_indicators_methodology_2014 p. 130] In thinking about the concept of CH, I find this last definition particularly resonant. This broader perspective is motivated by my interest in LOUD specifications as a research area, particularly because of their notable data agnosticism, and because it also resonates with the subdivision of CH proposed by @hyvonen_cultural_2012 [pp. 1-3]. These services have the adaptability to process and use different types of data, transcending the boundaries of specific domains or disciplines. Although grounded in concrete CH cases, their potential to extend to any type of data, including those from STEM, is a compelling prospect, a point that I will return to later. 
The following sub-subsections aim to briefly discuss tangible, intangible, and natural heritage, as well as providing a definition of CH data which can serve as a foundational reference for this thesis. 3.1.1.1 Tangible Heritage Tangible CH encompasses physical artefacts and sites of immense cultural significance that are passed through generations in a society [@vecco_definition_2010]. These objects are tangible manifestations of human creativity, representing artistic creations, architectural achievements, archaeological remains as well as collections held by CHIs. One aspect of tangible CH is artistic creations such as paintings, sculptures and traditional handicrafts. These artefacts embody cultural values and artistic expressions and serve as essential reflections of a society’s collective ethos. For example, artworks such as ‘Irises’ from Vincent van Gogh[14] and Alberto Giacometti’s ‘L’Homme qui Marche I’ [15] are revered works of art that have deep cultural significance in Europe and all over the world. The built heritage, including monuments, temples and historic buildings, is another important component of the tangible CH. These architectural marvels not only represent past civilisations, but also convey the social values and aspirations of their time. The Taj Mahal, an exemplary white marble structure in India, stands as a poignant testament to Mughal architecture. Closer to where I write this dissertation one can mention the Abbey of St Gall, a convent from the century which is inscribed on the UNESCO World Heritage List. In the context of urban heritage, conventional definitions of built heritage often focus narrowly on the architectural and historical value of individual buildings and monuments, which are well protected by existing legislation. 
However, the challenge is to preserve urban fragments, that is, areas within towns and cities that may not qualify as designated conservation areas, but are of significant cultural and morphological importance [@tweed_built_2007]. For instance, [@rautenberg_lemergence_1998] proposes two categories of built CH: heritage by designation and heritage by appropriation. Heritage by designation involves experts conferring heritage status on sites, buildings, and cultural objects through a top-down approach, often without public participation. This method can be predictable and uncontroversial, but can be criticised for being elitist and neglecting unconventional heritage. On the other hand, heritage by appropriation emphasises community and public involvement in identifying and preserving cultural expressions, leading to a more inclusive and dynamic understanding of heritage. Archaeological sites are also an integral part of the tangible CH, offering invaluable insights into past societies and ways of life. As of May 2024, UNESCO's long list of World Heritage Sites includes 1,199 cultural and natural sites in 168 different state parties, including 48 sites in transboundary regions[16]. Sites such as Machu Picchu, an impressive Inca citadel in the Peruvian Andes, bear witness to the architectural achievements and cultural practices of ancient civilisations. Although archaeological sites are invaluable, they face significant threats such as looting, destruction, exploitation, and extreme weather phenomena [@bowman_transnational_2008; @micle_archaeological_2014]. To safeguard them, conservation efforts must be case-specific and include documentation and assessment of experiences gained [@aslan_protective_1997]. The preservation of tangible CH extends beyond physical objects to include libraries, archives and museums that house collections of books, manuscripts, historical documents and artefacts. 
Incidentally, the term “cultural property” is also employed as a related concept to tangible CH, encompassing both movable and immovable properties as opposed to less tangible cultural expressions [@ahmad_scope_2006]. Cultural property is protected by a number of international conventions and national laws. For instance, the Blue Shield[17], an international organisation established in 1996 by four non-governmental organisations[18], aims to protect and preserve heritage in times of armed conflict and natural disasters [@van_der_auwera_unesco_2013]. Its mission was revised in 2016: The Blue Shield is committed to the protection of the world’s cultural property, and is concerned with the protection of cultural and natural heritage, tangible and intangible, in the event of armed conflict, natural- or human-made disaster. [@blue_shield_blue_2016 art. 2.1] Overall, tangible CH is a testament to human ingenuity and cultural diversity, and serves as a bridge between the past and the present. Its preservation is a collective responsibility, ensuring that the legacy of past generations endures and the wealth of cultural diversity continues to enrich the fabric of society. 3.1.1.2 Intangible Heritage The concept of intangible heritage emerged in the 1970s and was coined at the UNESCO Mexico Conference in 1982 [@leimgruber_switzerland_2010] with the aim of protecting cultural expressions that were previously excluded from preservation efforts [@hertz_politiques_2018]. UNESCO's previous focus had been on material objects, primarily from wealthier regions of the global North, leaving the intangible cultural heritage of the South overlooked. Attempts to protect intangible heritage through legal measures like copyright and patents were ineffective due to the collective nature of these cultural expressions and the anonymity of creators. UNESCO's Convention for the Safeguarding of the Intangible Cultural Heritage acknowledges that intangible CH is essential for cultural diversity and sustainable development. 
Below is the definition given by the Convention for the Safeguarding of the Intangible Cultural Heritage: ‘The Intangible Cultural Heritage’ means the practices, representations, expressions, knowledge, skills – as well as the instruments, objects, artefacts and cultural spaces associated therewith – that communities, groups and, in some cases, individuals recognize as part of their cultural heritage. This intangible cultural heritage, transmitted from generation to generation, is constantly recreated by communities and groups in response to their environment, their interaction with nature and their history, and provides them with a sense of identity and continuity, thus promoting respect for cultural diversity and human creativity. [@unesco_basic_2022] According to UNESCO, intangible CH can be manifested in the following domains: oral traditions and expressions, including language as a vehicle of the intangible CH; performing arts; social practices, rituals and festive events; knowledge and practices concerning nature and the universe; traditional craftsmanship. Overall, intangible CH is a multifaceted concept that encompasses both traditional practices inherited from the past and contemporary expressions in which diverse cultural groups actively participate [@munjeri_tangible_2004; @leimgruber_was_2008]. It includes inclusive elements shared by different communities, whether they are neighbouring villages, distant cities around the world, or practices adapted by migrant populations in new regions. These expressions have been passed down from generation to generation, evolving in response to their environment, and play a crucial role in shaping our collective identity and continuity. Intangible CH promotes social cohesion, strengthens a sense of belonging and responsibility, and enables individuals to connect with different communities and society at large. Central to the nature of intangible CH is its representation within communities. 
Its value goes beyond mere exclusivity or exceptional importance; rather, it thrives on its association with the people who preserve and transmit their knowledge of traditions, skills and customs to others within the community and across generations. The recognition and preservation of intangible CH depends on the communities, groups or individuals directly involved in its creation, maintenance and transmission. Without their recognition, no external entity can decide on their behalf whether a particular practice or expression constitutes their heritage. The community-based approach ensures that intangible CH remains authentic and deeply rooted in the living fabric of society, protected by those who care for and perpetuate it. In Switzerland, the Winegrowers’ Festival in Vevey (La Fête des Vignerons), a plurisecular event celebrating the world of wine making [@vinckMetiersOmbreFete2019], and the Carnival of Basel (Basler Fasnacht) [@chiquet_how_2023] are examples of traditions that are listed among UNESCO's intangible CH. (In)tangibility is not always a straightforward concept and can indeed be blurred, i.e. it goes beyond the mere idea of materialisation. Many artefacts and elements of CH possess both tangible and intangible qualities that intertwine and complement each other, making the distinction less clear-cut. For instance, the Male Face Mask held at the Art Institute of Chicago[19], also known as ‘Zamble’, from the Guro people in the Ivory Coast, holds dual significance as both tangible and intangible CH. As a tangible object, the mask is a physical artefact made from wood, pigment, fabric, and various adornments, that combines animal and human features representing the Guro people’s artistic skills. On the other hand, as an intangible cultural object, the Zamble mask carries profound spiritual and cultural meaning. It plays a significant role in commemorating the deceased during a man’s second funeral. 
These second funerals are organised months or even years after the actual burial as a way to honour and remember the departed [see @haxaire_power_2009]. Thus, the preservation and appreciation of both the tangible and intangible aspects of the mask are essential to its cultural relevance. Another example of the blurred line between tangible and intangible heritage is emphasised by @de_muynke_ears_2022 in recreating reported perceptions of the acoustics of Notre-Dame de Paris through a collaboration between sciences of acoustics and anthropology. The authors highlight the heritage value of how people subjectively perceive sound in a space, particularly in places of worship where sound and music are integral to the religious experience. The authors advocate integrating the study of both material and non-material aspects to understand the changing sonic environments of heritage buildings [@de_muynke_ears_2022 pp. 1-2]. @katz_digitally_2023 articulates that ‘acoustics is an intangible product of a tangible building’. This integrated perspective could lead to a more holistic understanding of the dynamics between physical spaces and the perceptual and experiential dimensions attached to them. 3.1.1.3 Natural Heritage Natural heritage, encompassing geological formations, biodiversity, and ecosystems of cultural, scientific, and aesthetic value, shares a significant overlap with CH. Many natural sites hold spiritual and symbolic importance for communities, becoming repositories of cultural memory and identity [@lowenthal_natural_2005]. Traditional ecological knowledge developed by various cultures also underscores the interconnectedness of cultural and natural heritage, as indigenous communities have accumulated wisdom on sustainable resource use and ecological balance [@azzopardi_what_2023]. 
Moreover, the conservation and sustainable management of natural heritage is often intertwined with efforts to protect CH, fostering a collective commitment to preserve these entangled legacies for future generations. The link between natural heritage and CH goes beyond their shared values; spatial overlaps further accentuate their interdependence. Natural sites may have cultural significance, while CH sites may be situated within natural landscapes. For example, a national park may include archaeological sites or culturally revered landscapes, thus intertwining the cultural and natural dimensions. This spatial intermingling highlights the inextricable relationship between human societies and the natural environment, as cultural practices and beliefs become intertwined with the landscapes they inhabit. In this way, the preservation of both natural and cultural heritage becomes essential not only for their intrinsic worth but also for sustaining the narrative of our shared human and environmental history. Additionally, the distinction between nature and culture is not merely subjective and dependent on human appreciation [@vandenhende_management_2017]. Rather, it is a concept intrinsically linked with the overarching framework of modernism, a perspective that has been critically examined and deconstructed by the influential sociologist and philosopher Bruno Latour, who has argued that ‘we have never been modern’ [@latour_we_1993]. Latour’s deconstruction of the modernist perspective extends to the recognition that ‘the proliferation of hybrids has saturated the constitutional framework of the moderns’ [@latour_we_1993 p. 51]. This assertion underscores the fundamental challenge posed by hybrid entities – those that blur the boundaries between nature and culture – to the traditional categories upon which modernist thinking has been predicated. 
In essence, the concept of hybrids disrupts the neat divisions between the natural and social worlds that have been a hallmark of modernist discourse and provides us with an opportunity to situate ourselves as ‘amodern’ as opposed to postmodern [@latour_postmodern_1990]. In addition to Latour’s critique of the modernist distinction between nature and culture, the concept of the ‘parasite’, as expounded by Michel Serres, one of the thinkers who significantly influenced Latour’s intellectual development [@berressem_deja_2015], offers a valuable lens through which to examine the intricacies of interconnectedness and interdependence within our world. In his view, everything is enmeshed in a complex web of relationships that negates the existence of self-contained entities. Rather than seeing discrete and isolated entities, Serres invites us to see everything as an integral part of a larger system in which each component is inextricably dependent on the others [@serres_parasite_2014]. Together, these complementary perspectives invite us to reevaluate our understanding of the intricate tapestry of existence, emphasising the complexities of our relationship with the world. Thus, the appreciation of nature and culture is not mutually exclusive, but rather forms a continuous and evolving relationship. The modern perspective has historically separated these realms, treating them as distinct and disconnected. However, a more inclusive approach dissolves this artificial boundary and recognises the interconnectedness of nature and culture [@haraway_encounters_2008; @haraway_staying_2016]. This paradigm shift challenges the traditional modern understanding and invites a more holistic view in which natural and cultural heritage are mutually constructed within a complex network of relationships. Recognition of this relationship is essential in the context of heritage conservation and understanding. 
The dynamic interplay between nature and culture is recognised, and the acknowledgement of their coexistence promotes a more holistic approach to heritage conservation, where cultural practices, traditions and ecological systems are seen as interdependent aspects of the wider heritage tapestry. This recognition encourages us to see heritage sites not as isolated entities, but as part of a larger web of interconnectedness, and urges us to conserve and value both cultural and natural heritage with a shared responsibility. Adopting this interconnected perspective enables us to appreciate the profound connections between human societies and the natural world, and inspires a collective commitment to safeguarding these precious legacies for future generations. 3.1.1.4 Cultural Heritage Data As I embark on the exploration of CH data, it is first necessary to establish a basic understanding of data in this context. At its core, data represents more than mere numbers and facts; it constitutes a collection of discrete or continuous values that are assembled for reference or in-depth analysis. In essence, data are the rich tapestry upon which the narratives of CH are woven, making their comprehension a critical prerequisite for our expedition into this domain. Luciano Floridi, a prominent philosopher in the field of information and digital ethics, provides a thorough perspective on the term ‘data’ and offers valuable insights into its fundamental nature in his PI. He perceives ‘data at its most basic level as the absence of uniformity, whether in the real world or in some symbolic system. Only once such data have some recognisable structure and are given some meaning can they be considered information’ [@floridi_information_2010]. This initial definition sets the stage for a deeper exploration of Floridi’s understanding of data, as he further focuses on its transformative journey into a more meaningful and structured form, which we will explore next. 
Building upon Floridi’s foundational concept of data as the absence of uniformity, his subsequent definition provides a more comprehensive perspective. In a previous work, @floridi_is_2005 [p. 357] argues that ‘data are definable as constraining affordances, exploitable by a system as input of adequate queries that correctly semanticise them to produce information as output’. This definition highlights the dynamic role of data, not only as raw entities awaiting structure and meaning but also as elements imbued with the potential to constrain and guide systems towards the generation of meaningful information. Transitioning from Floridi’s concept of data, we progress to the view that data can notably be seen as interpretable texts within the DH perspective. According to @owens_defining_2011, there are four main perspectives on how Humanists can engage with data: Data as constructed artefacts: data are a product of human creation, not something inherently raw or neutral; Data as interpretable texts: Humanists can interpret data as authored works, considering the intentions of the creators and how different audiences understand and use the data; Data as processable information: data can be processed by computers, allowing various forms of visualisation, manipulation and analysis, which can lead to further perspectives and insights; Data can hold evidentiary value: data, as a form of human artefact and cultural object, can provide evidence to support claims and arguments. These considerations highlight the multifaceted nature of data within the field of DH. It is in this complex landscape that we recognise that data transcends its traditional role as a passive entity. As @rodighiero_mapping_2021 [p. 26, citing [@akrich_sociologie_2006]] suggests, ‘there is no doubt that data are full-fledged actors that take part in the social network the actor-network theory describes, in which both human and non-human intertwine and overlap’.
This notion – rooted in and borrowed from STS – reinforces the idea that data, as an active and dynamic entity, plays a significant role in shaping the interactions between human and non-human actors in any digital sphere. From these angles, I can look at the characteristics of CH data. @bruseker_cultural_2017 [p. 94] articulate that ‘data coming from the cultural heritage community comes in many shapes and sizes. Born from different disciplines, techniques, traditions, positions, and technologies, the data generated by the many different specializations that fall under this rubric come in an impressive array of forms’. In exploring CH data, it is important to recognise the inherent diversity stemming from these disciplines, techniques, and traditions. This heterogeneity raises fundamental questions about the unity and identity of CH data, a crucial aspect deserving acknowledgement within this context. As the authors astutely ponder: It could be a natural problem to pose from the beginning: if the data of this community indeed presents itself in such a state of heterogeneity, does it not beg the question if there is truly an identity and unity to cultural heritage data in the first place? It could be argued that Cultural Heritage, as a term, offers a fairly useful means to describe the fuzzy and approximate togetherness of a wide array of disciplines and traditions that concern themselves with the human past. Expanding on these insights, CH data refer to digital or data-driven affordances of CH[20], embodying a rich and varied compilation of insights originating from a variety of disciplines, techniques, traditions, positions and technologies. They encompass both tangible and intangible aspects of a society’s culture as well as natural heritage.
These data, derived from a wide range of disciplines, offer a latent capacity to support the generation of knowledge relating to historical time periods, geospatial areas, as well as current and past human and non-human activities. They are collected, curated and maintained by various entities such as libraries, archives, museums, higher education institutions, non-governmental organisations, indigenous communities and local groups as well as by the wider public. Building further on the mosaic of CH data, three primary dimensions come to the fore: heterogeneity, knowledge latency, and custodianship. Heterogeneity: As a fundamental characteristic, it signifies the diverse forms and origins that shape this invaluable reservoir of human heritage. Different techniques and varying viewpoints in treating modelling also contribute to this heterogeneity [@guillem_faire_2023]. Knowledge latency: It highlights the temporal dimension, presenting CH data as a repository of latent knowledge awaiting discovery and interpretation. Notably, not all artefacts are – or should be – digitised, and even among those that are, (mis)representation and challenges in interconnecting them persist [@rossenova_iterative_2022]. Besides, the issue of structured data – or the lack of it – reinforces the aspect of knowledge latency [@haciguzeller_emerging_2021]. Custodianship: This dimension reinforces the essential role played by a variety of entities, predominantly CHIs, in safeguarding and managing resources, ensuring their preservation and accessibility for present and future generations. However, it is very important to acknowledge the great divide in terms of resources, with indigenous and local communities often facing challenges in custodianship responsibilities. Taken together, these dimensions contribute to a comprehensive understanding of the nuanced fabric of CH data.
They reveal the diversity of forms and origins, the temporal aspects and the responsible stewardship that are crucial to the sustainability of such data. By shifting our focus to the sphere of humanities data, we broaden our scope to extend beyond the peculiarities of CH data. Drawing parallels between these areas allows us to grasp the interconnectedness of our heritage. CH data usually refers to information about cultural artefacts, sites, and practices that hold historical or cultural significance. Humanities data encompasses information about human culture, history, and society, including literature, philosophy, art, and language [@tasovac_cultural_2020]. Both often involve ethical considerations, such as ownership, access, and preservation, and require a comprehensive understanding of their various meanings and values [@ioannides_towards_2019]. Moreover, @schoch_big_2013 explains that data in the humanities, such as text and visual elements, have unique qualities. While these analogue forms could be considered data, they cannot be analysed computationally as they are non-discrete. The semiotic nature of language, text and art introduces dimensions tied to meaning and context, making the term ‘data’ problematic. Critics question its use because it conflicts with humanistic principles such as contextual interpretation and the subjective position of the scholar. @schoch_big_2013 further distinguishes data in the humanities into two core types: smart and big data. The former tends to be small in volume and carefully curated, but harder to scale, as is the case with digital editions. As for the latter, it describes voluminous and varied data and it loosely relies on the three Vs of @laney_3d_2001: volume, velocity and variety (see 3.3.1.2). Yet, big data in the humanities differs significantly from other fields as it rarely requires rapid real-time analysis, is less focused on handling massive volumes, and instead deals with diverse, unstructured data sources.
@schoch_big_2013 concludes: ‘I believe the most interesting challenge for the next years when it comes to dealing with data in the humanities will be to actually transgress this opposition of smart and big data. What we need is bigger smart data or smarter big data, and to create and use it, we need to make use of new methods’. Data processing offers great potential for humanities research, as @owens_defining_2011 argues: ‘In the end, the kinds of questions humanists ask about texts and artifacts are just as relevant to ask of data. While the new and exciting prospects of processing data offer humanists a range of exciting possibilities for research, humanistic approaches to the textual and artifactual qualities of data also have a considerable amount to offer to the interpretation of data’. While the term ‘data’ in the context of the humanities may raise questions due to its semiotic and contextual complexities, it serves as a foundation for understanding both CH data and broader humanities data. The data originating from CH and the humanities are inherently intertwined, as they often share a similar nature and purpose for scholars. This strong interconnection leads to a collaborative relationship between the GLAM sector and the humanities or DH. Scholars in the humanities frequently rely on digitised cultural artefacts, historical records, linguistic resources, and literary works provided by GLAM institutions to gain valuable insights into human history, culture, and traditions. The digitisation efforts and research collaborations between these entities play a pivotal role in preserving CH data and advancing our understanding of diverse societies, fostering a deeper appreciation of our shared human heritage. CH data and humanities data are distinct from other scientific data due to their qualitative and subjective nature, which requires different methods of analysis than quantitative scientific data.
They include archival and special collections, rare books, manuscripts, photographs, recordings, artefacts, and other primary sources that reflect the cultural beliefs, identity, and memory of a people [see @sabharwal_2_2015; @izu_sociocultural_2022]. In summary, while CH data and humanities data share some commonalities, they differ in terms of scope and subject matter. CH data focuses specifically on the preservation and documentation of physical artefacts and intangible attributes, while humanities data encompasses a broader range of disciplines within the humanities [@munster_digital_2019]. However, it is important to note that the distinction between CH data and humanities data can be blurred, as (meta)data should ideally be co-created and integrated across both domains. 3.1.2 Representation and Embodiment of Cultural Heritage Data Representing CH data digitally while preserving their context and complexity remains a significant challenge. Digital representations of CH data, sometimes referred to as digital surrogates or digital twins [@conway_digital_2015; @shao_digital_2018; @semeraro_digital_2021], can potentially lead to a loss of context and a reduction in the richness of the CH represented. For instance, a digital image of a cultural artefact may not capture its materiality, such as its texture, weight, and feel, which are essential aspects of the artefact’s cultural significance [@force_context_2021]. Furthermore, digital representations may also exclude vital social, cultural, and historical contexts surrounding the object, which are crucial to understanding its full cultural value [@cameron_beyond_2007]. This subsection is structured around two key dimensions. Firstly, it explores materiality, highlighting how digital representations may fail to capture important aspects that are integral to understanding the significance of CH resources. Secondly, it navigates the convergence and divergence between digitised CH and digital heritage.
3.1.2.1 Materiality Briefly, materiality refers to the physical qualities of an object or artefact, such as its colour, texture, and composition. As part of built heritage, the emphasis for materiality relates primarily to architecture, its associated techniques and the range of materials used in the construction or renovation of a building. More specifically, materiality acts as a pivotal factor in the transformation of disparate fragments of material culture into heritage, providing a vital link to the intangible facets of heritage. It contributes significantly to an individual’s social position and ability to navigate specific social milieus, thereby determining their ability to transmit cultural knowledge and values to future generations. The transformative potential of materiality in this regard underscores its fundamental role in perpetuating heritage and the transmission of cultural legacies [@carman_where_2009]. The physical attributes of objects, including texture, colour and shape, can evoke different emotions and associations, shaping people’s perceptions and memories of these events. Beyond retrospective influences, the potential of materiality extends to the creation of new memories and meanings, as exemplified by the use of materials such as glass in contemporary art. In such cases, materials evoke not only their inherent properties but also symbolic connotations, adding new layers of meaning and memory to the artistic narrative [@fiorentino_persistence_2023]. @edwards_photographs_2004 [p. 3] argue that materiality is not just concerned with physical objects in a positivist sense, but also involves complex and fluid relationships between people, images, and things. This relationship is influenced by social, cultural, and historical contexts, and plays a crucial role in shaping our perceptions and experiences of the world. 
Moreover, materiality is central to giving meaning to non-human entities [see @latour_actor-network_1996; @haraway_companion_2003; @star_institutional_1989], which emphasises the role of both humans and non-humans in shaping social and cultural phenomena. For CH data, diversity is at its core, as it allows for the exploration of different ways of knowing, experiencing, and expressing the world. Therefore, it is important to approach materiality not as a static and fixed concept, but as a dynamic and evolving phenomenon that is shaped by multiple forces [@hahn_digitale_2018 pp. 62-63]. When discussing materiality, there is also its negation, i.e. the notion of space or emptiness, such as how people interact with it through built heritage, which is regarded as a primordial medium of material culture, as expounded by @guillem_rcc8_2023 [p. 2]: The most intuitive and foundational definition of architecture is the built thing, that is the architecture qua building or built work. Human beings continuously interact with the built materiality through the non-materiality of space. Space as emptiness is formed and defined by the materiality that affects its existence. That relation between fullness and emptiness is what makes possible architecture as lived and experienced space. Materiality also offers a means of challenging dominant narratives and power structures, particularly the Western-centric perspective on CH. It gives greater recognition to the importance of intangible CH, which often takes a back seat to tangible objects in dominant narratives [@lenzerini_intangible_2011]. By highlighting the materiality of marginalised or forgotten elements, individuals can reclaim their heritage and challenge dominant narratives that marginalise certain groups, contributing to a more inclusive and accurate representation of CH. 
The primary focus in terms of digitisation is also on preserving material-based knowledge, often overlooking the dynamic and living nature of intangibility. @hou_digitizing_2022 stress the crucial role of computational heritage and advances in information technologies in preserving and improving access to intangible CH. Effectively documenting the ephemeral aspects of intangible heritage and communicating the knowledge that is deeply linked to individuals are pressing challenges. Recent initiatives seek to capture the dynamic facets of cultural practices, using visualisation, augmentation, participation and immersive experiences to enhance experiential narratives. There is a strong call for a strategic re-evaluation of the intangible CH digitisation process, emphasising the human body as a vessel for traditions and memories. One example is traditional Southern Chinese martial arts, which have been passed down colloquially across generations and require a methodological approach to capture such embodied knowledge [see @adamou_facets_2023; @hou_ontology-based_2024]. Even in cases where considerable efforts have been devoted to the digitisation of physical objects such as medieval manuscripts and rare books over the past few decades [@nielsen_digitisation_2008], a lingering concern persists regarding the authentic encounter with the original artefact, despite its enhanced accessibility through digital surrogates [@van_lit_digital_2020]. Material attributes present a persistent challenge to achieving full replication.
Despite advances facilitated by techniques such as RTI, 3D digitisation, or VR and AR, which offer better experiential immersion and are more effective than two-dimensional representations in addressing certain materiality concerns, the ability to replicate the multifaceted sensory experience associated with the original object, including the palpable emotions and spatial sensation, remains an ongoing endeavour and a complex challenge: full replication is unlikely and may never be feasible [see @endres_digitizing_2019]. 3.1.2.2 Digitised Cultural Heritage and Digital Heritage The concepts of digitised CH and digital heritage intersect through the use of digital technology for the preservation, access, and dissemination of CH resources. Digitised CH focuses on converting physical artefacts into digital forms, ensuring their long-term preservation and accessibility through digital means. Conversely, digital heritage includes a broader range of digital tools and resources ‘to preserve, research and communicate cultural heritage’ (@munster_digital_2021 p. 2, citing [@georgopoulos_cipas_2018]). Digitised CH acts as a critical bridge, facilitating a transition from traditional or analogue GLAM practices to a digital environment. This shift is pivotal in unlocking the potential value of digitised CH, a value that extends beyond scholarly pursuits, despite the majority of digitisation efforts being driven by research funding. In doing so, it becomes evident that the creative reuse and data-driven innovation stemming from digitised CH necessitate substantial and sustained investment in the GLAM sector. This investment is fundamental, especially amidst reduced funding due to years of austerity. @terras_value_2021 underscore this need, shedding light on the delicate balance required with commercial outcomes.
They emphasise that leveraging CH datasets offers vast opportunities for technological innovation and economic benefits, urging professionals from various domains to collaborate and experiment in a low-risk environment. Digital heritage[21] encompasses a wide range of human knowledge and expression in cultural, educational, scientific and various other domains. In today’s rapidly evolving technological landscape, an increasing amount of this knowledge is either digitally created or in the process of being converted from analogue to digital formats [@he_digital_2017]. These digital resources cover a wide range, including text, multimedia, software and more, and require deliberate and strategic management to ensure their long-term preservation. This valuable heritage is spread across the globe and expressed in multiple languages [@unesco_charter_2009]. In summary, digitised CH not only forges the path to digital heritage but also embodies an ever-evolving cultural landscape. Recognising the transformative potency of digital heritage is essential to enriching our understanding of and engagement with our cultural roots. Both concepts are intimately embedded in CH and play a vital role as conduits. 3.1.3 Collectives and Apparatuses The collaborative efforts of collectives and the operation of various apparatuses play a fundamental part in shaping the preservation, interpretation and dissemination of cultural artefacts and practices. This subsection is concerned with the central contributions of human and non-human actors engaged in cooperative action and the modus operandi of various apparatuses, such as building (digital) infrastructures. Some of these considerations are drawn from STS, which are more fully captured in , serving as the theoretical framework for the thesis. Bruno Latour’s concept of the importance of collectives and apparatuses [see @latour_habiter_2022 p. 15] can be extrapolated to CHIs.
Every institution’s or project’s ultimate success hinges on the collaboration and support of individuals, as well as the tools, systems and technologies they use. Indeed, paralleling CHIs with wider contexts suggests that collective efforts and apparatuses play a critical role in shaping the effectiveness of any institution. This highlights the importance of recognising the influence of both human and non-human entities in institutional functioning and underlines the need for a more comprehensive understanding of the dynamics involved therein. ANT can be a useful lens to analyse the creation, use, and dissemination of CH data. ANT posits that actors are not independent entities but are instead part of a network that consists of both human and non-human entities. According to ANT, every actor, be it a person or a technology, is a node in the network and contributes to the overall functioning of the network [@latour_reassembling_2005; @callon_actor_2001]. When we apply this framework to CHIs, we can identify the different actors involved in the creation, use, and dissemination of CH data. These actors can include individuals, such as curators, conservators, and historians, as well as non-human entities, such as databases, digitisation equipment, and software. Moreover, this approach can help us understand the interactions between these actors and how they shape the overall functioning of CHIs. For instance, digitisation equipment can enable the creation of high-quality digital images of artefacts, which can then be disseminated globally through online platforms. Examining the Notre-Dame de Paris, one can discern the keystones at the summit of its arches as indispensable actors within its architectural narrative. These keystones, imbued with historical narratives and a non-human facet, played a central role in the (digital) rescue and subsequent restoration efforts following the tragic roof fire in April 2019. 
@guillem_faire_2023’s study further elucidates this restoration journey, emphasising how the keystones, with their individual narratives and structural significance, contributed to the (digital) reassembly. Building on this perspective, we can explore the importance of community involvement in the preservation and management of CH data, thereby increasing the potential for sustainable practices and inclusive engagement. Local communities have an integral part to play in the management and preservation of CH data, especially in the digital age where resources are often scarce for GLAM institutions. Community involvement has several benefits, including increased engagement and participation, access to local knowledge and expertise, and more sustainable and inclusive management and preservation practices [@ridge_12_2021]. For instance, geophysical technologies such as ground-penetrating radar have been used with great success in identifying and evaluating the depth, extent, and composition of CH resources for research and management purposes, easing tensions when working with sensitive ancestral places [@nelson_role_2021]. Collaborative environments can also help with CH information sharing and communication tasks because they provide a visual context to users, making it easier to find and relate CH content [@respaldiza_hidalgo_metadata_2011]. Drawing on the analysis of @brown_communities_2023 [pp. 6-7], a prominent illustration of exemplary community practice can be found in the sphere of community museums in Latin America: Museos Comunitarios de América[22]. The author highlights the role of community engagement and leadership in the creation and operation of these museums. Such engagement ensures that these museums are not imposed from outside, but rather emerge organically as museums of the community, resonating with its unique CH and identity.
This approach is consistent with the ethos of ‘telling a story’, building a future, which embodies a deep commitment to community empowerment and cultural preservation. This community-centric approach amplifies the museum’s resonance with the community’s lived experiences and historical narratives. At the same time, institutions can also benefit from collaborating with peer communities like IIIF to promote greater access to their collections. IIIF provides a set of open standards for delivering high-quality digital objects online at scale, which can help memory and academic institutions share their collections with each other and with the wider public [@snydman_international_2015; @weinthal_iiif_2019]. By adopting IIIF standards, organisations can make their collections more discoverable and accessible to researchers, developers, and other CH professionals [@padfield_joseph_practical_2022]. Involvement in communities such as IIIF also helps to mitigate costs as they develop shared or adaptable resources and services [@raemy_international_2017]. Participation of communities in the management and preservation of CH resources is essential to ensure that CH is protected and accessible for future generations. By involving and participating in communities, GLAMs can tap into local as well as peer knowledge and expertise, making management and preservation practices more sustainable and inclusive. This approach also increases engagement and participation, ensuring that CH is valued and appreciated by the wider community. Thus, memory institutions need to collaborate closely with communities to ensure that CH data, and their underlying infrastructures and services, are being effectively curated [@delmas-glass_fostering_2020]. Closely related to this context, @star_ethnography_1999 points out the often unacknowledged role of infrastructure within society.
She argues that infrastructures are necessary but often invisible and taken for granted: People commonly envision infrastructure as a system of substrates – railroad lines, pipes and plumbing, electrical power plants, and wires. It is by definition invisible, part of the background for other kinds of work. It is ready-to-hand. This image holds up well enough for many purposes – turn on the faucet for a drink of water and you use a vast infrastructure of plumbing and water regulation without usually thinking much about it. [@star_ethnography_1999 p. 380] @star_ethnography_1999 [pp. 381-382, citing [@star_steps_1994]] identifies nine dimensions to define infrastructure. They provide a comprehensive framework to comprehend the nuanced nature of infrastructure and its pervasive impact on diverse societal facets. The following dimensions are vital for analysing the often imperceptible, yet deeply embedded structures that constitute the foundational framework of both daily life and broader societal operations[23]: Embeddedness: Infrastructure is sunk into and inside of other structures, social arrangements, and technologies. People do not necessarily distinguish the several coordinated aspects of infrastructure. Transparency: Infrastructure is transparent to use, in the sense that it does not have to be reinvented each time or assembled for each task, but invisibly supports those tasks. Reach or scope: This may be either spatial or temporal – infrastructure has reach beyond a single event or one-site practice. Learned as part of membership: Strangers and outsiders encounter infrastructure as a target object to be learned about. New participants acquire a naturalised familiarity with its objects, as they become members. Links with conventions of practice: Infrastructure both shapes and is shaped by the conventions of a community of practice.
Embodiment of standards: Modified by scope and often by conflicting conventions, infrastructure takes on transparency by plugging into other infrastructures and tools in a standardised fashion. Built on an installed base: Infrastructure does not grow de novo; it wrestles with the inertia of the installed base and inherits strengths and limitations from that base. Becomes visible upon breakdown: The normally invisible quality of working infrastructure becomes visible when it breaks: the server is down, the bridge washes out, there is a power blackout. Is fixed in modular increments, not all at once or globally: Because infrastructure is big, layered, and complex, and because it means different things locally, it is never changed from above. Changes take time and negotiations, and adjustment with other aspects of the systems are involved. An appreciation of these dimensions is crucial to the analysis of the network of infrastructural systems that underpin contemporary society, and is necessary for the analysis of any digital infrastructure that manages CH data. Digital infrastructures – also known as e-infrastructures or cyberinfrastructures – are forms of infrastructure that are essential for the functioning of today’s society [see @jackson_understanding_2007; @ribes_sociotechnical_2010]. These kinds of infrastructure need to be understood as socio-technical systems, showcasing the interplay between technological components (such as hardware, software, and networks) and the social and organisational contexts in which they operate [@star_steps_1994]. According to @fresa_data_2013 [p. 33], digital CH infrastructures should be able to serve the research needs of humanities scholars as well as having dedicated services for education, learning, and general public access. In terms of requirements, @fresa_data_2013 [pp.
36-39] identifies three different layers of services: for content providers, for managing and adding value to the content, and for the research communities. For the latter, several sub-services tailored to research communities are listed. These encompass long-term preservation, PIDs[24], interoperability and aggregation, advanced search, data resource set-up, user authentication and access control, as well as rights management. Overall, (digital) infrastructures are imperative apparatuses in preserving and sharing CH data. First, they support preservation by archiving digital artefacts and their metadata, protecting them from deterioration and loss. Secondly, these infrastructures facilitate accessibility, allowing a global audience to explore and appreciate cultural heritage online. Finally, they encourage interpretation and engagement, promoting cross-cultural understanding and knowledge sharing. Moreover, infrastructure is a fundamental component that demands extensive investment, particularly in the creation of streamlined integration layers capable of interacting seamlessly with different systems. This can be exemplified by such institutions as the Rijksmuseum[25] , where a well-constructed infrastructure allows for efficient integration and interaction with various technological and organisational systems [@dijkshoorn_building_2023]. This investment serves as the foundation for an institution’s functionality, allowing for the smooth flow of data, the coordination of processes and the optimal use of resources. In a similar vein, @canning_power_2022 argue that the often invisible structures of metadata, particularly in Linked Data ontologies, play a crucial role in shaping the interpretation of data. These structures, while not immediately apparent, are imbued with value judgements and ideological implications, extending the impact of metadata beyond mere technicalities to encompass diverse and inter-sectional perspectives. 
This multidimensional ontological approach addresses the complexity and diversity of data sources, paralleling the need for sophisticated infrastructures in institutions like the Rijksmuseum. It underscores the importance of integrating inter-sectional feminist principles in information systems, reflecting a commitment to diverse ways of knowing and nuanced storytelling. Furthermore, as all (meta)data requires storage, it raises an important concern in terms of the entrenched power dynamics governing knowledge representation within information systems, as pointed out by @canning_what_2023. This perspective, initially centred around museum objects, holds broader implications for all CH resources [see @simandiraki-grimshaw_what_2023]. Canning strongly advocates for the essential adaptation of databases to embrace a diverse array of epistemological approaches by introducing new types of affordances. Databases, despite their role in information preservation, wield significant influence that can inadvertently stifle diverse modes of knowledge interpretation and ‘can constrain ways of knowing’. Furthermore, she compellingly argues that modifications to databases extend beyond technical adjustments; they are inextricably linked to shifts in institutional power dynamics and the enduring, often inequitable, power dynamics governing the world of museums – or any CHIs – and their curation. In understanding the interplay of collectives and apparatuses, it is clear that key actors, including individuals, institutions, local and global communities, as well as the sophisticated fabric of (digital) infrastructures and their components, are deeply entangled and interconnected. These entities, both human and non-human, collectively shape and navigate the rich networks of human interactions and technologies that underpin the foundations of contemporary society. 
3.2 Cultural Heritage Metadata This subsection offers insights into the importance of metadata in CH, underlining its role in enhancing the understanding and accessibility of cultural artefacts. It is structured into four[26] essential parts. I start with an introductory segment in 3.2.1, then I explore the types and functions of metadata in 3.2.2, thirdly in 3.2.3, I outline some of the most important CH metadata standards, and finally in 3.2.4, I explore the use of KOS, such as generic classification systems and controlled vocabularies. 3.2.1 Data about Data For curating CH resources, metadata[27], ‘data about data’, is probably one of the key concepts that needs to be introduced here. Metadata permeate our digital and physical landscapes, playing a vital role in organising, describing and managing a vast array of information. Rather than being confined to a specific domain, they are ubiquitous and pervade many aspects of our everyday lives [@riley_understanding_2017 pp. 2-3]. From websites and databases to social media platforms and online marketplaces, metadata add meaning to data, enabling users to understand their context, relevance and provenance. As an example, Figure 3.1 shows the metadata of a book[28]. Figure 3.1: Snapshot from the Swisscovery Platform Showing the Bibliographic Record of @zeng_metadata_2022 Metadata are central to the management and preservation of CH data, providing essential information to ensure that data can be properly organised, discovered and retrieved. For example, they facilitate the understanding and interpretation of data, enabling scholars and the public to access and use them effectively [@constantopoulos_aspects_2008]. Metadata also help to ensure the long-term preservation and accessibility of CH data [@zeng_metadata_2022 pp. 490-491]. Providing metadata in a structured manner facilitates forms of aggregation, i.e. 
individuals and institutions being able to harvest and organise metadata from multiple sources or repositories into a centralised location [see @freire_survey_2017; @freire_metadata_2021]. In addition, the importance of metadata as a gateway to information is particularly compelling when the primary embodiment of a record is either unavailable or lost. In cases where resources, time constraints, sensitive content or strategic decisions prevent the digitisation of an item, metadata becomes the principal means of representation and access. If a physical record is lost or damaged, the metadata associated with that record acts as a proxy for the record. @riley_understanding_2017 [p. 5] discusses the transformation of libraries over time. Initially, libraries moved from search terminals to the modern web-based resource discovery systems we use today. This shift was driven by advances in computerisation. Libraries’ basic approach to metadata is ‘bibliographic’, deeply rooted in their traditional expertise in describing books. This approach involves providing detailed descriptions of individual items so that users can easily locate them within the library’s collection. On the other hand, archives use ‘finding aids’, which are descriptive inventories of their collections, coupled with historical context. These aids are essential for users to understand the material and to find groups of related items within the archive. The metadata used in archives allows for the contextualisation of materials, particularly papers of individuals or records of organisations, providing a richer understanding of the content. Similarly, museums actively manage and track their acquisitions, exhibitions and loans through metadata. Museum curators use metadata to interpret collections for visitors, explaining the historical and social significance of artefacts and describing the relationships and connections between different objects. 
This helps to enhance the overall visitor experience and understanding of the artefacts on display or the digital resources on a particular website. 3.2.2 Types and Functions CHIs share common objectives and concerns related to information management, as highlighted by @lim_metadata_2011 [pp. 484-485]. These goals typically include facilitating access to knowledge and ensuring the integrity of CH data. However, it is important to note that CHIs also differ widely in how they deal with metadata. Different domains have unique approaches and standards for describing the materials they collect, preserve and disseminate, and even within a single domain there are significant differences. There have been different attempts to categorise the metadata landscape. For instance, @baca_setting_2016 identified the following five categories of metadata and their respective functions: Administrative: Metadata used in managing and administering collections and information resources, such as acquisition and appraisal information or documentation related to repatriation. Descriptive: Metadata used to identify, authenticate, and describe collections and related trusted information resources. Finding aids, cataloguing records, annotations by practitioners and end users, as well as metadata generated by or through a given DAM system can often be classified as descriptive metadata. Preservation: Metadata related to the preservation management of collections and information resources. Common examples of preservation metadata are documentation of physical condition of resources or of any actions taken to preserve resources, whether physical restoration or data migration. Technical: Metadata related to how a system functions or metadata behaves. Examples include software documentation and digitisation information. Use: Metadata related to the level and type of use of collections and information resources, such as circulation records, search logs, or rights metadata. 
Meanwhile, @riley_seeing_2009, as illustrated in a comprehensive visualisation graph[29], suggested seven functions, i.e. the role a standard plays in the creation and storage of metadata, and seven purposes referring to the general type of metadata. Functions: Conceptual Model, Content Standard, Controlled Vocabulary, Framework/Technology, Markup Language, Record Format, and Structure Standard. Purposes: Data, Descriptive Metadata, Metadata Wrappers, Preservation Metadata, Rights Metadata, Structural Metadata, and Technical Metadata. Almost a decade later, @riley_understanding_2017 [pp. 6-7] summarised metadata types into four groupings instead of the seven purposes previously mentioned. is removed from the list and technical, preservation and rights metadata are now grouped into a newly created administrative metadata category. Descriptive metadata: For finding or understanding a resource Administrative metadata: Umbrella term referring to the information needed to manage a resource or that relates to its creation 2.1 Technical metadata: For decoding and rendering files 2.2 Preservation metadata: Long-term management of files 2.3 Rights metadata: Intellectual property rights attached to content Structural metadata: Relationships of parts of resources to one another Markup Language: Integrates metadata and flags for other structural or semantic features within content[30]. This classification of metadata types and functions differs from the categories identified by @baca_setting_2016 mostly due to the addition of structural metadata and markup language as their own categories [@zeng_metadata_2022 p. 19]. Table 3.1 lists the major types of metadata according to @riley_understanding_2017 [p. 7] and includes example properties and their primary uses. Table 3.1: Types of Metadata According to @riley_understanding_2017 [p. 7] Metadata (Sub)type Example properties Primary uses 1. 
Descriptive metadata Title, Author, Subject, Genre, Publication date Discovery, Display, Interoperability 2.1 Technical metadata File type, File size, Creation date, Compression scheme Interoperability, Digital object management, Preservation 2.2 Preservation metadata Checksum, Preservation event Interoperability, Digital object management, Preservation 2.3 Rights metadata Copyright status, Licence terms, Rights holder Interoperability, Digital object management 3. Structural metadata Sequence, Place in hierarchy Navigation 4. Markup languages Paragraph, Heading, List, Name, Date Navigation, Interoperability Ultimately, metadata can also be leveraged to create more inclusive and diverse representations of CH. For instance, metadata can be used to document and promote underrepresented communities and their heritage, providing greater visibility and recognition. This approach aligns with the principles of decolonising CH, promoting equity and social justice by recognising and valuing diverse cultural perspectives, especially in the prevailing anglophone and Western-centric standpoint in DH [@mullaney_internet_2021; @mahony_cultural_2018]. Moreover, the distinction between data and metadata, as discussed in the work of @alter_view_2023, is not always distinct, leading to the concept of ‘semantic transposition’. This complexity reflects in CH where what is considered metadata in one context might be primary data in another, underscoring the necessity for adaptable frameworks in data management. This understanding is crucial for fostering inclusive and diverse representations in CH, ensuring that all cultural narratives are appropriately documented and acknowledged. 3.2.3 Standards Metadata standards play a crucial role in ensuring that data are organised and consistent, facilitating mutual understanding between different stakeholders [@raemy_enabling_2020]. CHIs such as GLAMs typically follow established conventions or standards when organising their resources. 
Current methods of cataloguing have historical roots dating back to the century, particularly with the development of cataloguing systems such as Antonio Panizzi’s at the British Museum and Charles Coffin Jewett’s efforts to mechanically duplicate entries at the library of the Smithsonian Institution [@zeng_metadata_2022 pp. 14-15]. Unique metadata standards, rules and models have been established and maintained within specific sub-fields. In addition, certain standards for information resources have been endorsed by authoritative bodies [@greenberg_understanding_2005], and some are used exclusively within specific domain communities [@hillmann_metadata_2008]. @riley_understanding_2017 [p. 5] underscores the predilection of CH metadata – whether these standards emanate from libraries, archives, or museums – toward accentuating descriptive attributes. The foundational CH metadata standards, primarily conceived to [@zeng_metadata_2022 p. 11], manifest this thematic focus. Within the CH domain, metadata standards vary widely in scope, and a number of different standards have been developed to meet different needs and priorities[31] [@freire_availability_2018]. The following quoted passage sheds some light on the different approaches and levels of collaboration in metadata standardisation, namely among the library and museum sectors. Despite the striving for homogeneity, in practice, the production of metadata among information specialists and the use of metadata standards is already marked by considerable diversity. This has come about for very pragmatic reasons. Different types of objects and collections require different types of metadata. The curatorial interest for particular information differs for example between images held in an art gallery and a library, as does the information specialists’ domain expertise. Accordingly, diversity in metadata practice seems to be greatest in museums as they are the institutions that govern the most diverse collections. 
While the library sector has ‘systematically and cooperatively created and shared’ metadata standards since the 1960s, the museum sector, mostly handling images and objects, has been slower to establish such collaboration and consensus. [@dahlgren_diversity_2020 p. 244] In this context, I want to focus on some metadata standards that have proved vital across libraries, archives, museums and galleries. These standards, which I will briefly describe, serve as the foundation for organising, describing, and enabling efficient access to vast and diverse collections. Of particular interest I will be taking a closer look at CIDOC-CRM as it serves as the cornerstone of Linked Art, a fundamental LOUD standard. 3.2.3.1 Library Metadata Standards In libraries, several metadata standards have played crucial roles in organising and accessing collections over the years. The most prevalent historical standard, MARC[32], was a pilot project from the 1960s funded by the CLIR and led by the LoC to structure cataloguing data and distribute them through magnetic tapes [@avram_marc_1968 p. 3]. The standard evolved into MARC21 in 1999 [@zeng_metadata_2022 p. 418] – as exemplified by Code Snippet 3.1, providing a structured format for bibliographic records and related information in machine-readable form. It uses codes, fields, and sub-fields to structure data. Another significant historical standard is the AACR, published in 1967 and revised in 1978 that provides sets of rules for descriptive cataloguing of various types of information resources. 
Code Snippet 3.1: MARC21 Record of @zeng_metadata_2022 in the Swisscovery Platform leader 01424nam a2200397 c 4500 001 991170746542405501 005 20220427104002.0 008 210818s2022 xxu b 001 0 eng 010 ##$a 2021031231 020 ##$a9780838948750 $qBroschur 020 ##$a0838948758 035 ##$a(OCoLC)1264724191 040 ##$aDLC $bger $erda $cDLC $dCH-ZuSLS UZB ZB 042 ##$apcc 050 00$aZ666.7 $b.Z46 2022 082 00$a025.3 $223 082 74$a020 $223sdnb 100 1#$aZeng, Marcia Lei $d1956- $4aut $0(DE-588)136417035 245 10$aMetadata $cMarcia Lei Zeng and Jian Qin 250 ##$aThird edition 264 #1$aChicago $bALA Neal-Schuman $c2022 300 ##$axxvi, 613 Seiten $bIllustrationen 336 ##$btxt $2rdacontent 337 ##$bn $2rdamedia 338 ##$bnc $2rdacarrier 504 ##$aIncludes bibliographical references and index 650 #0$aMetadata 650 #7$aMetadata $2fast $0(OCoLC)fst01017519 650 #7$aMetadaten $2gnd $0(DE-588)4410512-5 776 08$iErscheint auch als $nOnline-Ausgabe $tMetadata $z9780838937969 776 08$iErscheint auch als $nOnline-Ausgabe $tMetadata $z9780838937952 700 1#$aQin, Jian $d1956- $4aut $0(DE-588)1056085541 856 42$3Inhaltsverzeichnis $qPDF $uhttps://urn.ub.unibe.ch/urn:ch:slsp:0838948758:ihv:pdf 900 ##$aOK_GND $xUZB/Z01/202203/klei 900 ##$aStoppsignal FRED $xUZB/Z01/202203 949 ##$ahttps://urn.ub.unibe.ch/urn:ch:slsp:0838948758:ihv:pdf AACR is no longer maintained and was replaced by RDA[33] around 2010 to be a more adaptive standard to contemporary needs. RDA, while not a markup language like MARC, serves as a content standard that guides the description and discovery of resources, focusing on user needs and facilitating improved navigation of library collections. Its goal is to provide a flexible and extensible framework for the description of all types of resources, ensuring discoverability, accessibility, and relevance for users[34] [@sprochi_where_2016 p. 130]. Libraries often leverage other standards to enrich their metadata practices. 
MODS[35], introduced in 2002, offers a more flexible XML-based schema for bibliographic description, allowing for better integration with other standards and systems. It was initially developed to carry [@zeng_metadata_2022 p. 423]. MODS provides a balance between human readability and machine processing, making it suitable for a wide range of resources and use cases [@guenther_mods_2003 p. 139]. METS[36], on the other hand, is a standard for encoding descriptive, administrative, and structural metadata regarding objects within a digital library. METS, developed as an initiative of the DLF, provides a flexible and extensible framework for structuring metadata, allowing for the packaging of complex digital objects [@cantara_mets_2005 pp. 238-239]. While MODS is primarily concerned with bibliographic information, METS focuses on structuring metadata for digital objects, making it particularly useful for digital libraries and repositories. A further important standard is FRBR, a conceptual framework for understanding and structuring bibliographic data and access points. Originally developed by IFLA in 1997 as part of its functional requirements family of models, FRBR describes three main groups of entities, relationships, and attributes as illustrated by Figure 3.2. The first group of entities is the foundation of the model, which characterises four levels of abstraction: WEMI [@denton_functional_2006 p. 231]. FRBR has had a significant impact on the development of RDA, which is loosely aligned with the principles and structures defined by the conceptual framework. However, as FRBR is not a data model per se, it does not inform how to record bibliographic information in day-to-day practice, and it focuses heavily on textual resources[37] [@sprochi_where_2016 pp. 130-131]. Furthermore, @cossham_models_2017 [p. 
11] asserts that FRBR and RDA, ‘don’t align well with the ways that users use, understand, and experience library catalogues nor with the ways that they understand and experience the wider information environment’. Figure 3.2: The FRBR Conceptual Framework. Adapted from @zou_constructing_2018 [p. 36] A further important standard in the field of library science is the LRM, which was introduced as a comprehensive conceptual framework. It provides a broad understanding of bibliographic data and user-centric design principles, aligning with FRBR. LRM defines key entities, attributes, and relationships important for bibliographic searches, interpretation, and navigation – as shown in Figure 3.3. It operates at the conceptual level and does not dictate data storage methods. Attributes in LRM can be represented as literals or URIs. The model is presented in a structured document format to support LOD applications and reduce ambiguity. During its development, a parallel process created FRBRoo (see 3.2.3.3), a model that extends the original FRBR model by incorporating it into CIDOC-CRM. FRBRoo focuses on CH data and is more detailed than LRM, which is designed specifically for library data and follows a high-level, user-centric approach [@riva_ifla_2017 pp. 9-13]. The LRM model, known as LRMer[38], was released in 2020 by IFLA [@zeng_metadata_2022 p. 163]. Figure 3.3: Overview of Relationships in LRM [@riva_ifla_2017 p. 86] BibFrame[39] is another metadata standard in the library domain. It was initiated around 2011 by the LoC to be a successor of MARC, which had become obsolete [see @tennant_marc_2002] as well as being invisible to web crawlers and search engines preventing adequate discoverability of bibliographic resources [@sprochi_where_2016 p. 132]. BibFrame is a loosely RDF-based model [@sanderson_linked_2015], intending to improve the interoperability and discoverability of library resources. 
While the BibFrame model may not perfectly correspond with the WEMI entities outlined in FRBR, it is possible to effectively link BibFrame resources to FRBR entities, ensuring their compatibility [@sprochi_where_2016 p. 133]. BibFrame aims to transition from MARC by providing a more web-friendly framework, focusing on the relationships between entities, improving data sharing, and accommodating the digital environment. Conversely, @edmunds_bibframe_2023 argues that BibFrame is unaffordable and leads to elitism within libraries, with the main beneficiaries being well-funded institutions, particularly in North America, while placing a financial burden on others. This approach, endorsed by bodies such as the LoC, is criticised for its high cost, impracticality, inequity and limited benefits for cataloguers, libraries, vendors and the public they serve. In addition, the author highlights BibFrame's lack of user friendliness, regardless of the intended users, and criticises the notion of adopting Linked Data for its own sake without substantial practical benefits. 3.2.3.2 Archival Metadata Standards For archives, metadata standards like EAD[40] and ISAD(G)[41] have been pivotal. EAD, introduced in the mid-1990s – it originated in 1993 and the first version of EAD was released in 1998, provides a hierarchical structure for representing information about archival collections, offering comprehensive descriptions that aid researchers, archivists, and institutions in managing and providing access to archival records. Its goal is to create a standard for encoding finding aids to improve accessibility and understanding of archival collections [@pitti_encoded_1999 pp. 61-62]. On the other hand, ISAD(G), released in its first version in 1994 by ICA, offers a more general international standard for archival description, providing a framework for describing all types of archival materials, including fonds, sub-fonds, series, files, and items [@shepherd_application_2000 p. 57]. 
ISAD(G) aims to establish consistent and standardised archival description practices on a global scale, facilitating the sharing and exchange of archival information. PREMIS[42] is another metadata standard that was initially released in 2005 – version 3.0 is the latest specification, published in 2016 – and focuses on the preservation of digital objects, consisting of four interrelated entities: Object, Event, Agent, and Rights [@caplan_practical_2005 p. 111]. The main objective of PREMIS is to help institutions ensure the long-term accessibility of data by capturing key details about their creation, format, provenance, and preservation events. It is seen as an elaboration of OAIS, which categorises information required for preservation in several functional entities and types of information package [see @bates_open_2009 pp. 425-426] – as illustrated by Figure 3.4, expressed through the mapping of preservation metadata onto the conceptual model [@zeng_metadata_2022 pp. 493-494]. Figure 3.4: OAIS Functional Model Diagram by @mathieualexhache_oais_2021 The latest development in metadata standards for archives is the creation of RiC, which has been developed since 2012 by ICA [@clavaud_ica_2021 pp. 79-80]. RiC is structured into four complementary parts [@ica_expert_group_on_archival_description_records_2023 p. 1] intended to cover and replace existing archival standards such as ISAD(G): RiC Foundations of Archival Description: A brief description of the foundational principles and purposes of archival description. RiC Conceptual Model: A high-level framework for archival description[43], as shown in Figure 3.5. RiC-O: The ontology[44], which embodies a specific implementation of the conceptual model. It is formally expressed in OWL to make archival description available using LOD techniques – which facilitates extensions [see @mikhaylova_extending_2023] – and adheres to a conceptual vocabulary specific to archival description. 
It provides the ability to navigate and interpret complex archival holdings and foster meaningful research and discovery. The ontology includes seven main groups of entities: Record, Agent, Rule, Event, Date, Place, and Instantiation. RiC Application Guidelines: A part in development at the time of writing which will provide practitioners and software developers with guidance and examples for implementing the conceptual model and the ontology in records and archival management systems. Figure 3.5: Global Overview of the Core Entities Defined by the RiC Conceptual Model. Slightly Adapted from https://github.com/ICA-EGAD/RiC-O 3.2.3.3 Museum and Gallery Metadata Standards In the museum and gallery domain, various metadata standards and conceptual models have significantly contributed to the management, organisation, and accessibility of CH objects and artworks. Notable among these are CDWA, CCO, LIDO, CIDOC-CRM, as well as Linked Art. CDWA[45], developed in the mid-1990s and maintained by the Getty Vocabulary Program, and CCO[46] created by the VRA[47], introduced in the early 2000s, primarily focus on describing art and cultural artefacts, providing a framework for recording essential details like artist, title, medium, date, and provenance. CDWA is a comprehensive set of guidelines for cataloguing and describing various cultural objects, including artworks, architectural elements, material culture items, collections of works, and associated images. While not a data model itself, it offers a conceptual framework for designing data models and databases, as well as for information retrieval. It then evolved into CDWA Lite, an XML schema for data harvesting purposes [@baca_categories_2017 pp. 1-2]. CCO comprises both rules and examples of the CDWA categories and the VRA Core 4.0 for describing, documenting, and cataloguing cultural works and their visual surrogates[48] [@coburn_cataloging_2010 pp. 17-18]. 
Both CCO and CDWA are standards that the CIDOC[49] recommends and supports for museum documentation. LIDO[50] is a CIDOC standard introduced in the early 2000s which offers a lightweight XML-based serialisation used for describing museum-related information – as shown in Code Snippet 3.2. It provides a format for the interchange of data about art and CH objects, complementing CDWA and CCO as it integrates and extends CDWA Lite with elements of CIDOC-CRM [@stein_using_2019 p. 1025]. Ultimately, LIDO's goal is to enhance interoperability, accessibility, and the sharing of collection information, enabling institutions to connect and showcase their collections in diverse contexts [@coburn_lido_2010 p. 3]. LIDO is also a CIDOC Working Group; such groups are created to tackle particular issues or areas of interest[51]. Code Snippet 3.2: Example of a LIDO Object in XML from @lindenthal_lido_2023 <lido:lido> <lido:lidoRecID lido:source="ld.zdb-services.de/resource/organisations/DE-Mb112" lido:type="http://terminology.lido-schema.org/lido00099"> ld.zdb-services.de/resource/organisations/DE-Mb112/lido/obj/00076417 </lido:lidoRecID> <lido:descriptiveMetadata xml:lang="en"> <lido:objectClassificationWrap> <lido:objectWorkTypeWrap> <lido:objectWorkType> <skos:Concept rdf:about="http://vocab.getty.edu/aat/300033799"> <skos:prefLabel xml:lang="en"> oil paintings (visual works) </skos:prefLabel> </skos:Concept> </lido:objectWorkType> </lido:objectWorkTypeWrap> </lido:objectClassificationWrap> <lido:objectIdentificationWrap> <lido:titleWrap> <lido:titleSet> <lido:appellationValue lido:pref="http://terminology.lido-schema.org/lido00169" xml:lang="en"> Mona Lisa </lido:appellationValue> </lido:titleSet> </lido:titleWrap> </lido:objectIdentificationWrap> </lido:descriptiveMetadata> </lido:lido> CIDOC-CRM[52], developed since 1996 by the CIDOC and more specifically maintained by the CRM-SIG – which convenes quarterly[53] – is a formal and top-level ontology that offers a comprehensive conceptual 
framework for describing CH resources, allowing for a deep understanding of relationships between different entities, events, and concepts for museums [@doerr_cidoc_2003 pp. 75-76]. It aims to provide a common semantic framework for information integration, supporting robust knowledge representation and fostering collaboration and interoperability within the CH sector as it can also mediate different resources from libraries and archives. The latest stable version of the conceptual model is version 7.1.2[54], published in June 2022, and comprises 81 classes and 160 properties[55] [see @bekiari_cidoc_2021]. Within the base ontology of CIDOC-CRM – or CRMBase – and despite the emergence of new developments and gradual changes, there is a fundamental and stable core that can be succinctly outlined. This fundamental structure acts as a basic orientation for understanding the way in which data is structured within CIDOC-CRM. Examining the hierarchical structure of CIDOC-CRM, one can identify the main top-level branches, namely: E18 Physical Thing: This class comprises all persistent physical items with a relatively stable form, human-made or natural. E28 Conceptual Object: This class comprises non-material products of our minds and other human produced data that have become objects of a discourse about their identity, circumstances of creation or historical implication. The production of such information may have been supported by the use of technical devices such as cameras or computers. E39 Actor: This class comprises people, either individually or in groups, who have the potential to perform intentional actions of kinds for which someone may be held responsible. E53 Place: This class comprises extents in the natural space we live in, in particular on the surface of the Earth, in the pure sense of physics: independent from temporal phenomena and matter. They may serve describing the physical location of things or phenomena or other areas of interest. 
E2 Temporal Entity: This class comprises all phenomena, such as the instances of E4 Periods and E5 Events, which happen over a limited extent in time. Complemented by entities tailored for the documentation of E41 Appellation and E55 Type, the structure – as shown in Figure 3.6 – provides a potent set of means to capture a broad range of general-level CH reasoning in a holistic manner [@bruseker_cultural_2017 pp. 111-112]. Figure 3.6: CIDOC-CRM Top-Level Categories by @bruseker_cultural_2017 [p. 112] CRMBase is supplemented by a series of extensions – sometimes referred to as the CIDOC-CRM family of models – intended to support various types of specialised research questions and documentation, such as bibliographic records or geographical data. These compatible models[56], ordered alphabetically, include both works in progress and models to be reviewed by CRM-SIG[57]. They are as follows: CRMact[58]: An extension that defines classes and properties for integrating documentation records about plans for future activities and future events. CRMarchaeo[59]: An extension of CIDOC-CRM created to support the archaeological excavation process and all the various entities and activities related to it. CRMba[60]: An ontology for documenting archaeological buildings. Its primary purpose is to facilitate the recording of evidence and material changes in archaeological structures. CRMdig[61]: An ontology to encode metadata about the steps and methods of production (‘provenance’) of digitisation products and synthetic digital representations such as 2D, 3D or even animated models created by various technologies. CRMgeo[62]: An ontology intended to be used as a global schema for integrating spatio-temporal properties of temporal entities and persistent items. Its primary purpose is to provide a schema consistent with the CIDOC-CRM to integrate geoinformation using the conceptualisations, formal definitions, encoding standards and topological relations. 
CRMinf[63]: An extension of CIDOC-CRM that facilitates argumentation and inference in descriptive and historical fields. It serves as a universal schema for merging metadata related to argumentation and inference, primarily focusing on these disciplines. CRMsci[64]: The Scientific Observation Model is an ontology that extends CIDOC-CRM for scientific observation, distinguishing the process from results and providing a formal ontology for scientific data integration and research modelling. CRMsoc[65]: An ontology for integrating data about social phenomena and constructs that are of interest in the humanities and social sciences based on analysis of documentary evidence. CRMtex[66]: An extension of CIDOC-CRM created to support the study of ancient documents by identifying relevant textual entities and by modelling the scientific process related to the investigation of ancient texts and their features. FRBRoo[67]: An ontology intended to capture and represent the underlying semantics of bibliographic information which interprets the conceptualisations of the FRBR framework. PRESSoo[68]: An ontology intended to capture and represent the underlying semantics of bibliographic information about continuing resources, and more specifically about periodicals (journals, newspapers, magazines, etc.). PRESSoo is also an extension of FRBRoo. Figure 3.7 shows CRMBase and eight of the extensions previously outlined in a pyramid shape, where the lower you go in the pyramid, the more specialised the concepts. Figure 3.7: CIDOC-CRM Family of Models. Diagram done and provided by Maria Theodoridou (Institute of Computer Science, FORTH) Linked Art[69], a recent addition to this landscape, is a community-driven initiative and a metadata application profile that has been in existence since the end of 2016 [@raemy_ameliorer_2022 pp. 136-137]. 
This community – recognised as a CIDOC Working Group – has created a common Linked Data model based on CIDOC-CRM for describing artworks, their relationships, and the activities around them (see 3.5.5). 3.2.3.4 Cross-domain Metadata Standards There are a few cross-domain standards that have been used to describe CH resources. For instance, the Dublin Core Elements, containing the original core set of fifteen basic elements, and Dublin Core Metadata Terms[70], its extension, are widely used metadata standards for describing CH resources. They provide metadata properties and classes that are applicable to a wide range of resources [@weibel_dublin_2000]. Another good example is the EDM, which has been specified so that national, regional and thematic aggregators in Europe can deliver resources of content providers to Europeana [see @charles_enhancing_2015; @freire_technical_2019]. Despite the presence of cross-domain standards and efforts to map between standards, whether from one version to another or across different domains, reconciling metadata from various sources remains a significant challenge in the CH sector. Institutions may collect metadata in different ways, using different standards and schemas, making it difficult to merge and compare metadata from different sources. Additionally, metadata may be incomplete, inconsistent, or contain errors, further complicating data reconciliation. To address these challenges, standardised, interoperable metadata are necessary to enable data sharing and reuse. While the use of different metadata standards can present challenges for data reconciliation, the adoption of standardised, interoperable metadata can facilitate data sharing and reuse, promoting the long-term preservation and accessibility of CH resources. Controlled vocabularies – included in what @zeng_metadata_2022 [pp. 
24-25] called ‘standards for data value’ – also play a role here, such as those maintained by the Getty Research Institute[71] (the AAT, the TGN, and the ULAN), as well as various kinds of KOS (see 3.2.4). These vocabularies provide a common language for describing CH objects and can improve the interoperability of metadata across different institutions and communities. Alongside metadata reconciliation comes the question of aggregation. Apart from LIDO in museums, the general and current operating model for aggregating CH (meta)data is still the OAI-PMH [see @raemy_enabling_2020], which is an XML-based standard that was initially specified in 1999 and updated in 2002 [@lagoze_open_2002]. Alas, OAI-PMH does not align with contemporary needs [@van_de_sompel_reminiscing_2015], and there are now some alternative and web-based technologies for harvesting resources that are slowly being leveraged such as AS [@snell_activity_2017], a W3C syntax and vocabulary for representing activities and events in social media and other web applications. It can also be easily extended and used in different contexts, as is the case with the IIIF Change Discovery API (see 3.5.3.3) or with ActivityPub [@lemmer-webber_activitypub_2018], a decentralised W3C protocol being leveraged by Mastodon[72], a federated and open-source social network. Overall, the evolution of metadata standards in the CH domain paves the way for a more interconnected and accessible digital environment, thereby providing better access to disparate collections and facilitating cross-domain reconciliation. This transformation is complemented by a growing emphasis on web-based metadata aggregation technologies that are more suited to today’s needs. 3.2.4 Knowledge Organisation Systems KOS, also known as concept systems or concept schemes, encompass a wide range of instruments in the area of knowledge organisation. They are distinguished by their specific structures and functions [@mazzocchi_knowledge_2018 p. 54]. 
KOS include authority files, classification schemes, thesauri, topic maps, ontologies, and other related structures. Despite their differences in nature, scope and application, all share a common goal: to facilitate the structured organisation of knowledge and classification of information. According to @zeng_metadata_2022 [p. 284], ‘KOS have a more important function: to model the underlying semantic structure of a domain and to provide semantics, navigation, and translation through labels, definitions, typing relationships, and properties for concepts’. This overarching intent underpins the practice of information management and retrieval. The term KOS ‘became even more popular after the encoding standard Simple Knowledge Organization System (SKOS) was recommended by W3C’, although the use of such systems can be traced back over 100 years, whereas others have been created with the advent of the web [@zeng_metadata_2022 p. 188]. According to @hill_integration_2002 [pp. 46-47, citing [@hodge_systems_2000]], KOS can be divided into four main groups: term lists, metadata-like models, classification and categorisation, as well as relationship models. Term lists encompass authority files, dictionaries, and glossaries, serving as controlled sources for managing terms, definitions, and variant names within a knowledge organisation framework. Metadata-like models encompass directories and gazetteers, offering lists of names and associated contact information as well as geospatial dictionaries for named places, which can be extended for representing events and time periods. In the classification and categorisation domain, you find categorisation schemes and classification schemes that organise content, subject headings that represent controlled terms for collection items, and taxonomies that group items based on specific characteristics. 
Finally, relationship models feature ontologies, semantic networks, and thesauri, each capturing complex relationships between concepts and terms [@hill_integration_2002; @zeng_knowledge_2008]. Figure 3.8 represents an overview of the structure and functions of these four main groups, showcasing as well the subcategories of KOS previously mentioned. In this figure, the x characters indicate the extent to which each type of KOS embodies five key functions identified by @zeng_knowledge_2008, such as eliminating ambiguity or controlling synonyms. In this subsection, I will explore four subcategories of KOS, each representing a continuum from a more linear to a more structured network. These include folksonomy, taxonomy, thesaurus, and ontology. These KOS have been selected due to their significant impact on the organisation and interlinking of data within the contexts of CHI practices and LOD. Furthermore, the intent of these systems is to help bridge the gap between human understanding and machine processing. Figure 3.8: Overview of the Structures and Functions of KOS [@zeng_knowledge_2008 p. 161] 3.2.4.1 Folksonomy Positioned at one end of the organisational spectrum, folksonomies, also known as community tagging or social bookmarking, are characterised by their user-generated nature. These systems rely on individual users’ tagging of content with keywords or tags that reflect their personal perspectives and preferences. With folksonomies, integration or reconciliation is often hard to achieve [@zeng_metadata_2022 p. 401]. However, they do provide a wealth of source material for studying social semantics [@zeng_metadata_2022 p. 403] and can be used in parallel to more structured KOS. 3.2.4.2 Taxonomy Moving towards the centre of the spectrum, taxonomies present a more structured approach to knowledge organisation [@zeng_knowledge_2008 p. 169]. 
Taxonomies employ hierarchical classifications to systematically categorise information into distinct classes and sub-classes, or in a parent/child relationship [@saa_dictionary_taxonomy_2023] – as shown by Code Snippet 3.3 [@niso_guidelines_2010 p. 18]. Taxonomy, in this context, extends beyond mere categorisation; it also establishes relationships. Code Snippet 3.3: Taxonomy Hierarchy Chemistry > Physical Chemistry > Electrochemistry > Magnetohydrodynamics 3.2.4.3 Thesaurus Moving further along the spectrum, thesauri offer a more detailed and formalised method of organisation. They include not only hierarchical relationships but also explicit semantic connections between terms, making them valuable tools for information retrieval. As defined by @niso_guidelines_2010 [p. 9]: A thesaurus is a controlled vocabulary arranged in a known order and structured so that the various relationships among terms are displayed clearly and identified by standardized relationship indicators. For instance, consider a thesaurus related to photography, which encompasses categories for various aspects of photography, including photographic techniques, equipment, and materials. Within this thesaurus, ‘Kodachrome’ could be categorised not only as a specific type of colour film but also as a distinct photographic process. As a type, it could fall under the sub-category of ‘colour film photography’, and as a process, it would fit within the broader framework of ‘photographic techniques’. The AAT, commonly employed in the CH domain, stands as a significant example of a thesaurus [@harpring_development_2010 p. 67]. Homosaurus[73] is another example of a thesaurus with a distinct focus on enhancing the accessibility and discoverability of LGBTQ+ resources and related information. Leveraging Homosaurus in metadata can effectively contribute to diminishing biases present in such data, an essential step in promoting inclusivity and equity within information systems [see @hardesty_mitigating_2021]. 
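The difference between the strict parent/child chain of Code Snippet 3.3 and the Kodachrome example, where one term sits under two broader terms, can be illustrated with a minimal Python sketch. It rests on several assumptions: the `Thesaurus` class and its method names are hypothetical and belong to no standard or library, the ‘broader/narrower’ wording is borrowed from SKOS, and Code Snippet 3.3 is read as a single nested chain.

```python
from collections import defaultdict


class Thesaurus:
    '''Toy controlled vocabulary with broader/narrower links (hypothetical, for illustration only).'''

    def __init__(self):
        self.broader = defaultdict(set)   # term -> its direct broader terms
        self.narrower = defaultdict(set)  # term -> its direct narrower terms

    def add_broader(self, term, parent):
        '''Record that `parent` is a broader term of `term`.'''
        self.broader[term].add(parent)
        self.narrower[parent].add(term)

    def ancestors(self, term):
        '''Return every broader term reachable from `term`.'''
        seen = set()
        stack = [term]
        while stack:
            for parent in self.broader.get(stack.pop(), ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


thesaurus = Thesaurus()

# Taxonomy: each term has exactly one parent (assumed reading of Code Snippet 3.3).
thesaurus.add_broader('Physical Chemistry', 'Chemistry')
thesaurus.add_broader('Electrochemistry', 'Physical Chemistry')
thesaurus.add_broader('Magnetohydrodynamics', 'Electrochemistry')

# Thesaurus: 'Kodachrome' has two broader terms, as a film type and as a process.
thesaurus.add_broader('Kodachrome', 'colour film photography')
thesaurus.add_broader('Kodachrome', 'photographic techniques')

print(sorted(thesaurus.ancestors('Magnetohydrodynamics')))
print(sorted(thesaurus.ancestors('Kodachrome')))
```

Storing the links in both directions keeps upward and downward navigation equally cheap; an actual thesaurus would also carry associative (‘related term’) and equivalence relationships, which this sketch leaves out.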
3.2.4.4 Ontology At the structured end of the spectrum, ontologies define complex relationships and attributes between concepts, whereby a series of concepts have been chosen to express what we understand, so that a computer can start making sense of our world. Ontologies are formalised KOS, enabling advanced data integration and KR for more sophisticated applications. The term is drawn from philosophy, where ontology is a discipline concerned with studying the nature of existence, as articulated by @gruber_translation_1993 [pp. 199-200]: An ontology is an explicit specification of a conceptualization. The term is borrowed from philosophy, where an ontology is a systematic account of Existence. For knowledge-based systems, what “exists” is exactly that which can be represented. There are different kinds of ontologies, including axiomatic formal ontologies, foundational ontologies, and domain-specific ontologies [@beretta_interoperabilite_2022]. These different types of ontologies cater to various knowledge representation needs. Foundational ontologies, such as DOLCE[74], provide a high-level framework for modelling knowledge and offer a comprehensive system for representing entities, qualities, and relationships [see @masolo_wonder_2003; @borgo_dolce_2022]. DLs, a family of formal KR languages, also play a key role in developing ontologies and serve as the foundation for OWL (see 3.4.2), notably by providing a logical formalism. DLs are characterised by their ability to provide substantial expressive power that goes well beyond propositional logic, while maintaining decidable reasoning [@chang_abox_2014]. In computer science, the concepts of ABox and TBox, both statements in KBs, are relevant to the structuring and enrichment of KGs [@giacomo_tbox_1996][75]. The ABox, representing the ‘assertion’ or ‘instance’ level, encapsulates concrete data instances and their relationships, contributing to the factual knowledge of a given system. 
Conversely, the TBox, representing the ‘terminology’ or ‘schema’ level, defines the conceptual framework and hierarchies that govern the relationships and attributes of the instances. These two complementary components work in harmony to improve data interoperability, reasoning and knowledge sharing. Figure 3.9 depicts a high-level overview of a KB representation system. Figure 3.9: Knowledge Base Representation System Based on @patron_embedded_2011 [p. 205] Consider a scenario around artwork provenance held in a museum. The ABox strives to encapsulate the rich narratives of individual artworks, tracing their journey through time, ownership transitions and exhibition travels. At the same time, the TBox creates a conceptual scaffolding, imbued with classes such as Artwork, Creator, and Exhibition, painting an abstract portrait that contextualises each artefact within a broader cultural tapestry. It is here that the DL comes in, harmonising the symphony with its logical relationships and axioms, i.e. rules or principles widely accepted as obviously true [@baader_13_2007]. The DL KB is represented as 𝒦 = (𝒯, ℛ, 𝒜), where: 𝒯: represents the TBox, defining the conceptual framework, which encompasses the hierarchical relationships, classes, and concepts within the KB. ℛ: represents the set of binary roles, delineating the relationships and connections between individuals or instances in the domain. These roles facilitate the understanding of how entities relate to one another within the KB. 𝒜: represents the ABox, encompassing the specific assertions or instances in the KB. This symbiotic interplay ensures that the provenance of each artwork is not just a static account, but a dynamic, interconnected narrative. The ABox-TBox relationship thrives in the realm of reasoning. 
Imagine an axiom embedded in the TBox: ‘A work of art presented in an exhibition curated by a distinguished patron is of heightened cultural significance’, or here phrased in DL terms: ∃ curates.Artwork.CulturalSignificance ⊑ true. This axiom serves as a beacon to guide the system’s reasoning. When an ABox instance of an artwork is woven into an exhibition curated by a prominent authority, the DL-informed engine responds by inferring an enriched cultural value that resonates beyond the artefact itself. This is where the TBox takes data and gives it life, producing insights that transcend the boundaries of individual instances. The KB, 𝒦, captures this orchestration, encapsulating the logical relationships for meaningful interpretation and knowledge discovery. Overall, the relationship between ABox and TBox in DL is vital for achieving semantic clarity, enabling meaningful data integration, and facilitating advanced reasoning mechanisms. The museum provenance scenario showcases a precisely orchestrated convergence of assertion, terminology, and rigorous logical reasoning. This engenders a computational landscape where historical artefacts intricately mesh within the complex network of human history’s data structures, seamlessly aligning with the underlying framework of algorithmic representation. These components enable software developers to harmonise disparate datasets, extract insightful knowledge, and support decision-making processes across a wide range of domains. In essence, the use of DL, ABox, and TBox in ontological KR enhances interoperability between different systems and allows for sophisticated reasoning and decision support. Moving beyond these foundational concepts, it is noteworthy to consider the work of @ehrlinger_towards_2016, who address the need for a clear and standardised definition of KGs. 
They highlight the term’s varied interpretations since its popularisation by Google in 2012 and propose a definitive, unambiguous definition to foster a common understanding and wider adoption in both academic and commercial realms. They define a KG as follows: ‘A knowledge graph acquires and integrates information into an ontology and applies a reasoner to derive new knowledge’. This definition crystallises the essence of KGs as dynamic and integrative systems that not only store but also process and enrich data through advanced reasoning. This conceptualisation underlines the transformative potential of KGs in various domains, bridging the gap between raw data and actionable insights. Finally, it is important to recognise that the importance of ontologies extends beyond individual systems. Shared ontologies are a cornerstone of semantic interoperability, thus facilitating a paradigm shift in the way systems and applications communicate. As @sanderson_rdf_2013 argues: ‘shared ontologies increases semantic interoperability’ and ‘shared identity makes it possible for graph to merge serendipitously’. This shared understanding ensures that various entities can seamlessly connect and engage in meaningful interactions. 3.3 Trends, Movements, and Principles Technological trends, scientific movements, and guiding principles have played a crucial role in shaping the landscape of contemporary research. In recent years, there has been an increased emphasis on the need for academic and CH practices to be more transparent, inclusive, and accountable. This shift reflects a broader trend towards integrating advanced technological solutions and open-science principles in heritage management. As such, understanding the evolution of CH becomes imperative to comprehend how these practices have adapted and transformed in response to these guiding trends. The evolution of CH has been characterised by a series of technological and methodological shifts. 
Initially, the primary focus was on digitising physical artefacts to preserve information from degrading originals. This phase was crucial for transitioning tangible CH into a digital format, mitigating the risk of loss due to physical degradation. Following this, efforts shifted towards ensuring the persistence of digitised resources. This stage involved addressing challenges related to digital preservation, including data degradation and format obsolescence, to ensure the longevity of digital cultural assets. The advent of open data principles marked the next phase in CH development. This approach facilitated broader access to information, aligning with contemporary values of transparency and inclusivity in governmental, academic, and cultural contexts. Subsequently, the focus expanded to enhancing the utility of this data. This stage involved contextualising and enriching CH data, thereby increasing their applicability and relevance across various domains. The current frontier in CH involves developing applications that leverage rich CH data. These applications serve not only as tools for engagement and education but also as justifications for the ongoing costs associated with data storage and archival. They illustrate the tangible benefits derived from preserving heritage resources, encompassing both cultural and economic returns. In summary, the trajectory of CH development mirrors broader technological and societal trends, transitioning from preservation to active utilisation. This progression underscores the dynamic nature of research and CH processes, highlighting the evolving requirements for transparency, inclusivity, and accountability in CH management. While automation has significantly enhanced the efficiency of digitisation processes in CH, cataloguing and indexing remain complex challenges. 
The intricacies involved in accurately understanding and categorising resources necessitate more than just technological solutions; they require context-aware and culturally sensitive approaches. Here, ML offers promising perspectives. ML, particularly in its advanced forms like deep learning, can assist in cataloguing and indexing by analysing large datasets to identify patterns, categorise content, and even suggest metadata. This can be particularly useful in handling large volumes of CH data, where manual processing is time-consuming and prone to human error. Typical applications of ML in this field include image recognition for identifying and classifying visual elements in artefacts, NLP for analysing textual content, and pattern recognition for sorting and organising data based on specific characteristics. Furthermore, prospective developments may entail the refinement of metadata mapping and the enhancement of quality control mechanisms. Moreover, ML algorithms can be trained to recognise stylistic elements, historical contexts, and other nuances that are essential for accurate cataloguing in CH. However, it is crucial to note that the effectiveness of ML depends heavily on the quality and diversity of the training data. Biases in this data can lead to inaccuracies in cataloguing and indexing. Thus, a collaborative approach, where ML is supplemented by expert human oversight, is often the most effective strategy. Overall, this section provides a comprehensive overview of three[26:1] technological trends as well as five key scientific movements and guiding principles that are shaping research and how universities and GLAMs should provide environments, services, and tools with a view to collecting and disseminating content. 
By exploring each of these trends, movements, and principles, we can gain a deeper understanding of how research and CH processes are permeated by dynamic movements and how resources can be made more transparent, inclusive and accountable, as well as how data can be made available to human and non-human users. 3.3.1 Current and Emerging Technological Trends in Cultural Heritage I will explore some current and emerging technological trends in CH, organised into three components: Linked Data, big data, and AI. Each represents a critical driver shaping the landscape and practices of heritage data. The three trends have been around for a few decades, with the ‘Linked Data’ principles and underlying standards coming from the late 1990s, ‘big data’ being coined in 1990 and AI in 1956. Before considering the trends discussed hereafter, note that current technological developments do not exist in isolation, but tend to intertwine and act synergistically. A vivid example of this interplay can be seen in AI and its latent impact on the semantic web, particularly in facilitating more efficient querying and crawling processes, such as the LinkedDataGPT proof-of-concept service[76] from Liip on the City of Zurich, which places ChatGPT – a generative AI solution – on top of a Linked Data portal to facilitate querying open datasets [@stocker_use_2023]. Conversely, AI can be fed by data on the web to learn and reason, as outlined by @gandonWebScienceArtificial2019. 3.3.1.1 Linked Data Linked Data, and more precisely LOD, is a set of design principles built on RDF and a significant approach to interconnecting data on the web in order to make semantic queries more useful [@berners-lee_semantic_2001]. In other words, this standardisation allows data to be not only linked, but also openly accessible and reusable. As noted by @gandonWebScienceArtificial2019 [p. 
115, citing [@gandon_pour_2017]]: The Web was initially perceived and used as a globally distributed hypertext space for humans. But from its inception, the Web has always been more: its hypermedia architecture is in fact linking programs world-wide through remote procedure calls. This deeper understanding of the web’s architecture as a conduit for linking programs on a global scale holds profound implications. It signifies that the web is not merely a medium for accessing information but a dynamic environment where data-driven programs interact, exchange data, and collaborate across geographical boundaries. In this context, Linked Data emerges as a powerful enabler, providing a structured and standardised approach for these programs to communicate and share meaningful data [@bizer_linked_2008]. In the context of CH, institutions such as museums, libraries and archives can publish their collections using Linked Data principles, enabling a web of linked information that is accessible to all. As this dissertation’s main topic revolves around Linked (Open) (Usable) Data, two dedicated sections have been written within this literature review in Section 3.4 and Section 3.5. Beyond formal LOD, CHIs may also link their databases or collections in more informal ways. This interconnection may take the form of shared metadata, common identifiers, or simply hyperlinks. These links can enhance the user experience by supporting a more seamless navigation between related items or pieces of information. For instance, a parallel strategy is the use of graph-based data representation, i.e. property graphs, which consist of a set of objects or vertices and a set of arrows or edges connecting the objects, and which are most likely not RDF-compliant [see @bermes_modelisons_2023]. 
Graph databases, such as Neo4j[77] which is quite prevalent in DH [see @webber_programmatic_2012; @drakopoulos_semantically_2019; @darmont_data_2020], allow for efficient storage and retrieval of interconnected data through nodes representing entities and relationships linking them. 3.3.1.2 Big Data Big Data refers to extremely large and complex datasets that exceed the capabilities of traditional data processing methods and tools. It encompasses a massive volume of structured, semi-structured and unstructured data that is currently flooding across a variety of sectors, companies and organisations [see @emmanuel_defining_2016]. The characteristics of big data are often described by the three Vs model [@laney_3d_2001]: Volume: Big data refers to a massive amount of data. This can encompass a spectrum of data sizes, extending from GB and TB, to PB[78] and beyond. The sheer size of the data is a key aspect of big data, making traditional database systems inadequate for storage and analysis. Velocity: Data is being generated and collected at an unprecedented rate. Social media posts, sensor data, online transactions and more are constantly being generated, requiring real-time or near real-time processing and analysis. Variety: Big data comes in a variety of formats, including structured data (e.g. databases), semi-structured data (e.g. XML, JSON) and unstructured data (e.g. text, images, video). The variety of data types requires flexible processing methods. In addition to the three Vs model, two more characteristics are often included [@saha_data_2014 p. 1294]: Veracity: It refers to the quality of the data, including its accuracy, reliability and trustworthiness. Big data sources can be inherently uncertain or inaccurate, and addressing data quality is a critical challenge. Value: Extracting value and actionable insights from big data is the ultimate goal. 
Analysing and interpreting big data should lead to better decision-making, improved business strategies, as well as enhanced UX[79]. Regarding the two latter dimensions, @debattista_linked_2015 argue that Linked Data is the most suitable technology to increase the value of data over conventional formats, thus contributing towards the value challenge in Big Data. As for veracity, they describe a semantic pipeline with eight key metrics to address the veracity dimension. Building on this technological foundation, the integration of Linked Data and Big Data analytics takes centre stage. Big data analytics can be employed on CH content to uncover insights and correlations that can be used in decision-making. @barrile_big_2022 [p. 2708] highlight the transformative potential of using big data by investigating how an analytical approach can enhance conservation strategies, aid resource allocation and optimise the management of CH resources. @poulopoulos_digital_2022 [pp. 188-189] emphasise that emerging technology trends, including big data, have a significant impact on related research areas such as CH. Big data primarily originates from sources such as social media, online gaming, data lakes[80], logs and frameworks that generate or use significant amounts of data. They stress that the incorporation of multi-faceted analytics in the CH domain is an area of active research, and present a data lake that provides essential user and data/knowledge management functionalities. However, they emphasise a crucial consideration – the need to bridge the theoretical foundations of disciplines such as cultural sociology with the technological advances of big data. 3.3.1.3 Artificial Intelligence The term AI was first coined by John McCarthy, an American computer scientist and cognitive scientist, during the 1956 Dartmouth Conference, which is often considered the birth of AI as an academic field [@andresen_john_2002 p. 84]. 
According to the @oxford_english_dictionary_artificial_2023, AI is described as follows: The capacity of computers or other machines to exhibit or simulate intelligent behaviour; the field of study concerned with this. In later use also: software used to perform tasks or produce output previously thought to require human intelligence, esp. by using machine learning to extrapolate from large collections of data. While AI is not the central focus of my PhD thesis, I acknowledge its impact in several instances. As a rapidly developing technology, AI has the potential to significantly transform various aspects of society, including the way we describe, analyse, and disseminate CH resources. It is worth mentioning that I endeavour to engage in a broader discourse concerning the domain of AI. In this context, I use the acronym AI to talk about the overarching domain or its ethics, and ML to discuss the specifics of methodologies and algorithmic approaches, while refraining from delving into the intricacies of Deep Learning, which is a distinct subdomain within ML. AI and ML offer great potential for digitising, curating and analysing CH, leveraging the vast digital datasets from CHIs. Some of the examples include text recognition mechanisms using OCR and HTR, NLP and NER for enriching unstructured text, as well as object detection methods for finding patterns within still and moving images [@neudecker_cultural_2022; @sporleder_natural_2010]. Textual works can also be analysed, for instance for sentiment analysis [see @susnjak_applying_2023], and generated using LLMs – a variety of NLP models, such as BERT or ChatGPT, which predict the likelihood of a word given the previous words present in recorded texts. However, challenges such as data quality and biases in AI persist [@neudecker_cultural_2022]. In addition, there are still uncertainties regarding the licensing and reuse of CH datasets by ML algorithms[81]. 
@neudecker_cultural_2022 emphasises the importance of well-curated digitised CH resources that are openly licensed, accompanied by relevant metadata, and accessible through APIs or download dumps in various formats. These curated resources have the potential to address the existing gap in this domain. Building on the theme of enhancing CH through digital technologies, @mcgillivray_digital_2020 explore the synergies and challenges found at the intersection of DH and NLP. DH is aptly described as ‘a nexus of fields within which scholars use computing technologies to investigate the kinds of questions that are traditional to the humanities […] or who ask traditional kinds of humanities-oriented questions about computing technologies’ [@fitzpatrick_reporting_2010]. This broad characterisation encapsulates the transformative potential of digital tools, including ML techniques, in enriching humanities research. @mcgillivray_digital_2020 highlight the critical need for bridging the communication gap between DH and NLP to drive progress in both fields. They propose increased interdisciplinary collaboration, encouraging DH researchers to actively utilise NLP tools to refine their research methodologies. A primary challenge in this convergence is the application of NLP to the complex, historical, or noisy texts often encountered in DH research. They conclude by advocating for stronger cooperation between practitioners in these fields. This collaborative effort is vital for harnessing the full potential of ML in analysing and interpreting CH. The use of ML scripts in the context of CH — and beyond — is inherently limited by their applicability, namely when dealing with historical photographs. In such cases, the use of algorithms that are mostly trained and grounded in contemporary image data becomes quite incongruous due to the dissimilarity in temporal contexts. 
This dilemma is exemplified by datasets such as Microsoft’s Common Object in Context (COCO)[82] [@fleet_microsoft_2014], where the available data are predominantly contemporary photographic content, which is misaligned with the historical nuances inherent in most of the digitised CH images. @coleman_managing_2020 corroborates this, arguing that a sound approach would be for ML practitioners to collaborate with libraries, as they can draw practical lessons from critical data studies and the thoughtful integration of AI into their collections, using guidelines from DH. She also argues that simply handing over datasets would be a disservice to library patrons and that ‘Librarians need to master the instruments of AI and employ them both to learn more about their own resources—to see and analyze them in new ways—and to help shape applications of AI with the expertise and ethos of libraries.’ Ethical concerns, particularly regarding social biases and racism, are prevalent in technologies like ImageNet, where facial recognition may yield AI statements with strong negative connotations [@neudecker_cultural_2022]. Addressing this, @gandonWebScienceArtificial2019 suggest the production of AI services that are ‘benevolent-by-design for the good of the Web and society’. Furthermore, @floridi_good_2023 introduces the double-charge thesis, asserting that all technology design is a moral act, challenging the neutrality thesis. He emphasises that technologies are not neutral and can be influenced by a dynamic equilibrium of values, predisposing them towards morally good or evil directions. As mentioned previously, ML training datasets are often not representative enough to be properly leveraged in the CH sector [@strien_introduction_2022]. 
Fine-tuning is now a topic, though, and new ground truth datasets have been created and tailored for the needs of CH, such as Viscounth[83], a large-scale VQA dataset (i.e. a dataset containing open-ended questions about images which requires an understanding of vision, language and commonsense knowledge to answer [@goyal_making_2017]) for CH in English and Italian [see @becattini_viscounth_2023]. @jaillant_unlocking_2022 argue that the governance of AI ought to be carried out in partnership with GLAM institutions. However, while this collaboration has been proposed as a promising way forward, it still requires further exploration and evaluation, particularly with regard to the specific challenges and opportunities that it presents. On the one hand, the involvement of GLAMs in AI governance could enhance the development of digital CH projects that promote social justice and equity. On the other hand, this collaboration raises several challenges, such as the need to address issues of privacy, data protection, and intellectual property rights, and to ensure that the values and perspectives of GLAM professionals are adequately represented in the development of AI algorithms and systems. Therefore, it is crucial to examine the specific challenges and opportunities of this collaboration and to develop appropriate frameworks and guidelines that enable effective and ethical governance of AI in the GLAM sector. One platform that addresses these issues is AI4LAM, an international and participatory community focused on advancing the use of AI in, for and by libraries, archives, and museums[84]. The initiative was launched by the National Library of Norway and Stanford University Libraries in 2018, inspired by the success of the IIIF community. 
Another agency is the AEOLIAN Network[85], AI for Cultural Organisations, which investigates the role that AI can play to make born-digital and digitised cultural records more accessible to users [@jaillant_applying_2023 p. 582]. As an illustrative case, the LoC's exploration into ML technologies, as highlighted by @allen_why_2023, demonstrates a strategic commitment to enhancing the accessibility and utility of its diverse collections. This initiative reflects the LoC's acknowledgement of the transformative potential of ML, balanced with a cautious approach due to the necessity for accurate and responsible information stewardship. The LoC faces several challenges in applying ML, particularly the limitations of commercial AI systems in handling its varied materials and the requirement for substantial human intervention. This cautious exploration into ML is indicative of a broader trend in CHIs, where maintaining a balance between embracing technological advancements and preserving authenticity and integrity is crucial. The specific experiments and projects undertaken by the LoC in the realm of ML are diverse and illustrative of the institution’s comprehensive approach to innovation. For instance, image recognition systems have been tested for identifying and classifying visual elements in artefacts, a task that requires a nuanced understanding of historical and cultural contexts. In another initiative, speech-to-text technology was employed to transcribe spoken word collections, confronting challenges such as accent recognition and audio quality variation. Additionally, the LoC explored the potential of ML in enhancing search and discovery capabilities through projects like Newspaper Navigator[86], which aimed to identify and extract images from digitised newspaper pages. 
These experiments not only highlight the potential of ML in transforming the way the LoC manages and disseminates its collections but also reveal the complexities and limitations inherent in these technologies. As @allen_why_2023 notes, the ongoing research and experimentation in ML at the LoC are critical in revolutionising access and discovery in the cultural heritage sector. These efforts, while facing challenges, represent a diligent integration of advanced technologies, upholding principles of responsible custodianship and setting a precedent for similar institutions globally in the adoption and adaptation of ML and AI in CHIs. The integration of LLM and KG presents a groundbreaking opportunity, particularly within the realm of CHIs, where there is already considerable expertise. This is aptly demonstrated in the work of @pan_large_2023, which elucidates the harmonisation between explicit knowledge and parametric knowledge, i.e. knowledge derived from patterns in data, as learned by models such as LLMs. The authors highlight three key areas for the advancement of KR and processing: Knowledge Extraction, where LLMs improve the extraction of knowledge from diverse sources for applications such as information retrieval and KG construction; Knowledge Graph Construction, which involves LLMs in tasks such as link prediction and triple extraction from data, albeit with challenges in precision and management of long tail entities; Training LLMs Using KGs, where KGs provide structured knowledge for LLMs, helping to build retrieval-augmented models on the fly, enriching LLMs with world knowledge and increasing their adaptability. In a report for the University of Leeds in the UK, @pirgova-morgan_looking_2023 explores the potential and practical implications of AI in libraries. The project, forming part of the university’s ambitious vision for digital transformation, aims to understand how AI can be effectively integrated into library services. 
This research looks at both the use of general AI for long term strategic planning and specific AI applications for improving UX, process optimisation and enhancing the discoverability of collections. The methodology used in this study involves a multi-faceted approach including desk-based assessments, a university-wide survey and expert interviews. Specifically, the study highlights the following key findings: AI for UX and Process Optimisation: The integration of AI technologies offers substantial opportunities for improving user experiences in libraries. This includes optimising library processes, enhancing collections descriptions, and improving their discoverability. Challenges and Opportunities of AI Application: While AI presents exciting possibilities, its practical application in library settings faces challenges. These include evaluating specific AI technologies in the unique context of the University of Leeds, ensuring they align with the institution’s needs and goals. Perceptions of AI in Libraries: The report reveals varying perceptions among librarians and users regarding AI. This includes views on how AI can contribute to resilience, awareness of climate change, and practices promoting equality, diversity, and inclusion. Role of AI in Strategic Library Development: General AI technologies are seen as instrumental in shaping long-term strategies for libraries, highlighting the need for ongoing adaptation and development in response to evolving AI capabilities. Expert Perspectives on AI in Libraries: Interviews with experts from around the world underscore the importance of understanding both general and specific applications of AI. These insights help in identifying priority areas where AI can significantly enhance library operations and services. 
These insights from the University of Leeds report illustrate the complex impact of AI on library services, from enhancing user interaction to influencing strategic decision-making, while also emphasising the importance of adapting AI applications to specific institutional needs. It must also be stated that AI lacks inherent intelligence and consciousness, and has ultimately been built by people. An important concern, particularly with LLMs, is the perceptual illusion of cognitive interaction, where the machine appears to be engaging in dialogue and reasoning, when in fact it is generating content through predictive algorithms [see @ridge_enriching_2023]. Furthermore, regarding the topic of data colonialism, poor people in underprivileged nations are often burdened with the responsibility of cleaning up the toxic repercussions of AI, shielding affluent individuals and prosperous countries from direct exposure to its harmful effects[87]. Concluding this segment, it is essential to perceive ML algorithms as uncertain ‘socio-material configurations’, which can be seen as both powerful and inscrutable, demanding an axiomatic and problem-oriented approach in their understanding and application. @jaton_we_2017 elaborates on this by examining how these algorithms, while technologically complex, are firmly rooted in and shaped by the social, material, and human contexts in which they are developed. Beyond their computational complexity, these algorithms are deeply embedded in the process of constructing ground truths. These ground truths are not inherent or fixed; instead, they emerge from collaborative efforts that reflect the varied inputs of actors. This process underscores the algorithms as socio-material constructs, influenced by the characteristics and contexts of their creators. 
Understanding algorithms in this light highlights their deep integration with human actions and societal norms, offering a more nuanced view of their design and implementation [see @jaton_assessing_2021; @jaton_groundwork_2023]. 3.3.2 Scientific Movements and Guiding Principles First, 3.3.2.1 examines the movement towards more open and transparent forms of research. Open scholarship is a broad concept that encompasses practices such as open access publishing, open data, open source software, and open educational resources. The subsection explores the benefits and challenges of open scholarship, and how it can help to increase the accessibility and impact of research data. Then, 3.3.2.2 explores the growing trend of involving members of the public in scientific research. Citizen science and citizen humanities involve collaborations between scientists and non-expert individuals, with the aim of generating new knowledge or solving complex problems. The subsubsection examines the benefits and challenges of citizen science and citizen humanities, and how they can help to democratise research. 3.3.2.3 examines the set of guiding principles designed to ensure that research outputs are FAIR. It explores the importance of each data principle for research integrity, reproducibility, and collaboration, and provides examples of how they can be implemented in practice. 3.3.2.4 explores the importance of ethical and culturally sensitive data governance practices for indigenous communities that are materialised through CARE. These principles provide a framework for managing data in a way that is consistent with the values and cultural traditions of indigenous communities. This part explores as well the challenges and opportunities of implementing the CARE Principles for Indigenous Data Governance. Finally, 3.3.2.5 explores the concept of ‘Collections as Data’, a perspective that has emerged from the practical need and desire to improve decades of digital collecting practice. 
This approach re-conceptualises collections as ordered digital information that is inherently amenable to computational processing. 3.3.2.1 Towards Open Scholarship According to FOSTER[88], Open Science can be described as ‘[…] the practice of science in such a way that others can collaborate and contribute, where research data, lab notes and other research processes are freely available, under terms that enable reuse, redistribution and reproduction of the research and its underlying data and methods.’ [@foster_open_2019]. In recent years, the principles of Open Science, which historically include Open methodology, Open source, Open data, OA, Open peer review, as well as open educational resources, have become increasingly important as they emphasise transparency, collaboration and accessibility in scientific research [@bezjak_open_2019]. Open methodology refers to the sharing of research processes and methods, allowing other researchers to reproduce and build on existing work [see @vicente-saez_open_2018]. Open source software and tools enable researchers to collaborate, while open data practices promote the sharing of research data in ways that are accessible, discoverable and reusable by others[89]. Open access seeks to remove financial and other barriers to accessing scientific knowledge, while open peer review provides greater transparency and accountability in the publication process. Finally, open educational resources encourage the sharing of teaching and learning materials, thereby facilitating the dissemination of knowledge and skills. @unesco_preliminary_2019 conducted a preliminary study of the technical, financial and legal considerations related to the promotion of Open Science. This research underscored the necessity for a holistic approach to Open Science and stressed the significance of tackling international legal matters, as well as the existing challenges stemming from unequal access to justice, which can hinder global scientific collaboration. 
This study laid the groundwork for a recommendation on making ‘[…] multilingual scientific knowledge openly available, accessible and reusable for everyone, to increase scientific collaborations and sharing of information for the benefits of science and society, and to open the processes of scientific knowledge creation’ [@unesco_implementation_2021 p. 7]. UNESCO identified five types of access related to Open Science: infrastructures, societal actors, as well as associated and diverse knowledge systems where dialogue is needed. This includes acknowledging the rights of indigenous peoples and local communities to govern and make decisions on the custodianship, ownership, and administration of data on traditional knowledge and on their lands and resources. Figure 3.10 provides a visual summary of this. Figure 3.10: Open Science Elements, Redrawn Slide from Presentation of Ana Persic [@morrison_redrawn_2021 citing [@persic_building_2021]] While Open Science offers numerous benefits, it also presents challenges and potential drawbacks that warrant careful consideration. One major concern is the risk of exacerbating inequities between researchers from well-resourced institutions and those from less privileged backgrounds. Open access publishing often entails significant costs in the form of article processing charges, which can disproportionately burden researchers without adequate funding support [@burchardt_researchers_2014]. Additionally, Open Science practices relying on open protocols may be vulnerable to misuse, such as automated bots excessively crawling open repositories or datasets. This can lead to overloading systems, unauthorised data extraction, or unintended uses of research outputs [see @irish_bots_2023; @li_good_2021]. These risks underscore the importance of balancing openness with safeguards that ensure equitable participation and secure, sustainable access to research materials. 
These challenges are particularly relevant in the context of DH, a field that harnesses the promise and impact of digital technologies and methodologies for the study and understanding of cultural phenomena. The adoption of Open Science principles has contributed to greater collaboration, transparency and accessibility in research practices in this field. Open data practices are particularly relevant, as they allow scholars to work with large and complex datasets, including digitised archives and social media data. Open educational resources can also be used to support the dissemination of CH literacy and skills, enabling wider audiences to engage with such resources. However, ensuring that such openness does not exacerbate inequities or introduce vulnerabilities requires thoughtful implementation. In addition to the principles of Open Science, the concept of Open Scholarship has been introduced by [@tennant_tale_2020] as a broader approach that encompasses the arts and humanities and goes beyond the research community to the wider public. Open Scholarship emphasises the importance of making research and scholarship accessible to a wider audience, including non-experts, educators and policy makers. It can be particularly relevant to the arts and humanities, as they often deal with complex cultural materials and narratives that have wider societal implications. By making their work openly accessible and engaging with non-experts, humanities researchers can contribute to public discourse, promote cultural understanding, and inform policy and decision-making. Open scholarship can also support greater collaboration and innovation within the Arts and Humanities by enabling researchers to work collaboratively across disciplines and with a wide range of constituents. 
For instance, open educational resources can be used to develop collaborative teaching and learning materials that draw on the expertise of scholars and practitioners from different disciplines, while open data practices can facilitate the sharing and reuse of CH materials. Conversely, @knochelmann_open_2019 advocates for the term Open Humanities as a dedicated discourse that would within the humanities. Notably, he argues that Open Humanities should adapt key Open Science elements to the Humanities’ unique context. In the case of preprints, the challenges in the humanities, such as limited discipline-specific preprint servers and linguistic diversity, require tailored solutions to encourage adoption. Open peer review in the humanities should accommodate the field’s subjectivity and diverse perspectives. Concerns about liberal copyright licenses revolve around potential misrepresentation and plagiarism, highlighting the importance of maintaining scholarly integrity regardless of the chosen license. Knochelmann’s proposal underscores the need for context-sensitive approaches to promote openness and collaboration while respecting humanities’ distinct characteristics. Overall, the principles of Open Science provide a framework for promoting greater collaboration, transparency and accessibility in research practices. Yet, the challenges discussed underscore the need for careful adaptation to address inequities, cybersecurity concerns, and field-specific nuances. The concept of Open Scholarship, which stresses the importance of making research and scholarship accessible to wider audiences, can be instrumental in broadening the impact of research in both natural sciences and the humanities, as Open Science encourages greater collaboration and innovation across disciplines. 
Ultimately, this underscores the need for adaptation and positions all academic disciplines as essential contributors to societal understanding, cultural preservation and informed decision-making, while ensuring the sustainability and integrity of open practices. 3.3.2.2 Citizen Science, Citizen Humanities (…) 3.3.2.3 FAIR Data Principles The FAIR data principles[90] were developed to ensure that three types of entities – namely data, metadata, as well as infrastructures – are Findable, Accessible, Interoperable, and Reusable. The four key principles of FAIR and their underlying 15 sub-elements or facets are as follows [@wilkinson_fair_2016]: (…) 3.3.2.4 CARE Principles for Indigenous Data Governance (…) 3.3.2.5 Collections as Data (…) (…) 4. Exploring Relationships through an Actor-Network Theory Lens As Jim Clifford taught me, we need stories (and theories) that are just big enough to gather up the complexities and keep the edges open and greedy for surprising new and old connections. [@haraway_staying_2016 p. 101] This chapter serves as the theoretical framework of the dissertation, and its primary goals are to elucidate the theoretical underpinnings and provide a comprehensive toolbox for addressing the identified problem. In the preceding literature review chapter, I highlighted the issue that necessitates attention around interlinking CH. The theoretical framework, sometimes referred to as the ‘toolbox’, can be likened to the ‘tools’ that will be employed to understand and address this problem. Here, the primary purpose of this chapter is to offer an in-depth exploration of the tools – which comprise various theories, propositions, and concepts – delineating their characteristics, behaviours, historical applications, interrelationships, relevance to the study’s objectives, and potential limitations. Subsequently, the next chapter will elucidate how these tools will be operationalised in the research process. (…) 5. 
Research Scope and Methodology This chapter delineates the Research Scope and Methodology, laying the groundwork for the empirical exploration within this thesis. (…) 6. The Social Fabrics of IIIF and Linked Art (…) 7. PIA as a Laboratory (…) 8. Yale’s LUX and LOUD Consistency (…) 9. Discussion [Il] faut renoncer à l’idée d’une interopérabilité syntaxique ou structurelle par l’utilisation d’un modèle unique, qu’il s’agisse de la production, de stockage ou de l’exploitation au sein même d’un [système d’information]. [@poupeau_reflexions_2018] [91] This chapter presents a comprehensive discussion where I interpret, analyse and critically examine my findings in relation to the thesis and the wider application of LOUD. Through an in-depth analysis of the design principles of LOUD and their implications for CH, this discussion aims to demonstrate the many challenges and opportunities inherent in this framework. The focus is on achieving community-driven consensus, rather than simply pursuing technological breakthrough. The following sections are organised to provide a comprehensive review of the empirical findings, an evaluation abstracting LOUD, and a retrospective analysis of the research journey. Firstly, in Section 9.1, I will present a summary of the empirical findings from my research. This will include key themes and insights, structured to reflect the different areas of study and practice within LOUD. Secondly, in Section 9.2 I will provide an evaluation of LOUD by means of using the LoA approach. This evaluation will focus on the impact of LOUD on the perception of Linked Data within the CH domain and the wider DH field. This will include the key themes and insights that have emerged, structured in a way that reflects four levels of abstraction. I will also explore the dual nature of LOUD implementation, involving both simplicity and complexity, and discuss the various factors that influence such dynamics. 
Finally, in Section 9.3, I will offer a retrospective analysis of the research journey. This section will interpret the findings to situate LOUD as fully-fledged actors. It will reflect on the challenges, achievements, and lessons learned throughout the research process, providing a holistic view of the project’s trajectory and its implications for the future of LOUD. 9.1 Empirical Findings This section summarises the empirical findings of my research and already offers some suggestions. The structure does not follow the exact order of the three empirical chapters but is organised around overarching topics that emerged throughout the study. The seven topics include Community Practices and Standards, Inclusion and Marginalised Groups, Maintenance and Community Engagement, Interoperability and Usability, Future Directions and Sustainability, Digital Materiality and Representation, as well as Challenges of Scaling and Implementation. Community Practices and Standards GitHub serves as a vital hub for community involvement, with a core group of active contributors often attending meetings regularly. This platform simplifies decision-making within the community, although it also reflects biases similar to those in FLOSS communities. Behind visible activities like meetings, there is substantial preparatory work managed by co-chairs, editorial boards, or driven by community-generated use cases. This foundational work often determines the direction and outcomes of formal gatherings. The LUX project at Yale, as seen in , has successfully fostered collaboration across various units, bringing together libraries and museums on a unified platform. The technological foundation of LUX, based on open standards, facilitates data integration and cross-collections discovery. Not only does the deployment of FLOSS tools contribute to these achievements, but it also emphasises the social advantages of working collaboratively. 
The concept of the Tragedy of the Commons, as described by @hardin_tragedy_1968, highlights the potential for individual self-interest to deplete shared resources. However, @ostrom_governing_1990 offers a counterpoint by demonstrating how communities can successfully manage common resources through collective action and shared norms. In this context, initiatives like the CHAOSS initiative[92] play a significant role by providing metrics that help evaluate the health and sustainability of open source communities. These metrics include contributions, issue resolution times, and community growth, offering valuable insights into how collaborative efforts can be maintained and improved. Reaching consensus is another critical aspect of community practices and standards. While the minutes of meetings are valuable artefacts, they often reflect an Anglo-Saxon approach to decision-making characterised by few substantive points and critical turning points. The formal aspects of conversations captured in minutes do not fully encompass the decision-making process, which frequently involves informal conversations, consensus-building through open dialogue, and subtle cues that influence outcomes. These elements are integral to the English and American approach and hold valuable lessons for an international community. IIIF and Linked Art are international communities, but decisions are made in English and the majority of participants are based in North America and the UK, significantly imprinting this approach. Understanding these nuances can help us improve our collaborative efforts within the IIIF and Linked Art communities. By recognising and appreciating these different facets of decision-making, we can learn from each other and enhance our collective ability to make effective and inclusive decisions. 
Some of the challenges associated with these practices include the major demand on resources for community building, the slowness inherent in distributed development, and the difficulty in achieving consensus. Additionally, the concept of social sustainability can be seen as an imaginary construct that papers over differences, as discussed by @fitzpatrick_generous_2019. Addressing these challenges is crucial for the long-term success and effectiveness of the IIIF and Linked Art communities. Inclusion and Marginalised Groups The demographic homogeneity in these communities can perpetuate biases and neglect issues relevant to underrepresented or marginalised groups, as seen in . Participation in these standardisation processes is itself a privilege. The assumption that internet access and digital devices are universally available is critically examined, revealing key actors in the digital landscape. This mirrors issues within the IIIF community, where generating IIIF resources presupposes means that may not be accessible to all. We need clear terms of inclusion, as highlighted by @hoffmann_terms_2021. She argues that effective inclusion requires a critical examination of the frameworks and conditions under which inclusion is offered. The framework should ensure that inclusion initiatives do not merely add diversity to existing power structures but work to transform these structures fundamentally. This involves questioning who defines the terms of inclusion, who benefits from them, and who may be inadvertently excluded. @hoffmann_terms_2021 suggests a participatory approach, where marginalised communities are actively involved in shaping inclusion policies and practices, thus making inclusion an ongoing, reflective process rather than a static goal. The inclusion of marginalised groups is a necessary step, but it is not sufficient. 
To truly make a difference, there must be a strategic and concentrated effort to appropriate technologies, as emphasised by [@morales_apropiacion_2009; @morales_imaginacion_2017; @morales_apropiacion_2018] and further articulated by [@martinez_demarco_empowering_2019; @martinez_demarco_digital_2023]. This strategic approach highlights the political significance of challenging dominant neoliberal and consumerist perspectives on technology and individual engagement. @martinez_demarco_digital_2023 underscores the critical importance of focusing on practices that go beyond mere inclusion. Instead, it requires a deep understanding and critical assessment of how technology is intertwined with social, economic, and ideological contexts. It implies a reflective and deliberate process of technology adoption in which individuals creatively tailor technology to their specific needs, beliefs, and interests. Moreover, a key aspect highlighted by @martinez_demarco_digital_2023 is the implicit and explicit critique of a universalist approach to inclusion, which often lends itself to all too easy instrumentalisation. Understanding and studying resistance to inclusion in an oppressive digital transformation context is paramount, particularly given the highly unequal conditions that prevail. In this light, a comprehensive study of socio-material and symbolic processes, practices, and involved in embedding technologies into individuals’ lives is needed. This approach also recognises technology as a catalyst for change. It envisions the use of technology to drive meaningful change at multiple dimensions and realities—national, societal, or personal. By focusing on these practices, empowering individuals to navigate and use technology thoughtfully and purposefully becomes a reality, bridging the gap between technological advances and societal progress [@martinez_demarco_empowering_2019]. 
Maintenance and Community Engagement The tension between creating advanced specifications and their practical implementation by platforms is evident in the IIIF Cookbook recipes and Linked Art patterns, as discussed in Chapter 6. This ongoing development shows that the community is still finding the best ways to achieve broad adoption and interoperability. The deployment of the Change Discovery API, as illustrated in Chapter 7, demonstrates that establishing such a protocol on top of the IIIF Presentation API is feasible and straightforward. High-level support from leadership, particularly Susan Gibbons as Vice Provost, has been crucial in building trust and ensuring the project’s success as a valuable discovery layer at Yale. This integration of diverse collections through a unified platform, based on open standards, highlights the potential for transforming teaching, learning, and research by leveraging collaborative efforts. The topic modelling exercise in LUX reveals the intricate actor-networks composed of organisations, individuals, and non-human actors. This analysis underscores the importance of ongoing processes and relationships in maintaining and evolving infrastructure, akin to the concept of ‘infrastructuring’. As detailed in Chapter 8, following best practices and guidelines such as the SHARED Principles is essential for better involvement, but it is also crucial to uphold these commitments consistently over the long term to ensure meaningful participation. Between the PIA team members, there were sometimes ‘disconnects between different communities who undertake collaborative research’ [@vienni-baptista_foundations_2023]. This was something we had to navigate and learn from, which was manageable within the context of a laboratory setting. 
However, for any follow-up projects or whatever forms the digital infrastructure we built may take, it is imperative that these disconnects are addressed and that collaboration is consolidated to ensure cohesive and sustained community engagement. Interoperability and Usability Within PIA, different APIs have been progressively deployed to meet various requirements while allowing parallel exploration of data modelling. Each API offers unique advantages, but their collective integration promotes semantic interoperability. For example, the IIIF Image API has been instrumental in rationalising image distribution across prototypes, providing efficient access to high-quality digital surrogates and the ability to resize them for different uses. Adherence to LOUD standards and schemas within LUX has generally been positive, although transitioning between versions of a specification can present challenges, highlighting the need to improve the consistency of compliant resources. Linked Art, for instance, has the capacity to generate various insights and sources of truth around different entities. However, vocabularies that complement or differ entirely from those provided by sources like the Getty, such as Homosaurus, may need to be used. Complementary to Linked Art, using WADM allows for assertions that go beyond purely descriptive narratives, though it may sacrifice some semantic richness. This complexity in managing vocabularies and maintaining semantic richness directly ties into broader usability considerations within the community. Addressing these usability concerns, Robert Sanderson has suggested focusing on the use of full URIs in Linked Art to ensure computational usability, in contrast to IIIF’s approach of minimising URIs to enhance readability. This difference highlights a fundamental question in usability: balancing readability and computational usability. Understanding developers’ perspectives on these approaches is critical. 
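As a minimal sketch of the resizing capability of the IIIF Image API mentioned above, the following Python snippet assembles request URLs according to the Image API 3.0 URI template, `{base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}`. The server address, the identifier, and the helper names `iiif_image_url` and `derivative_urls` are hypothetical and chosen for illustration only; they do not describe the actual PIA deployment.

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Build an IIIF Image API 3.0 request URL from its path segments."""
    # The identifier is percent-encoded so that any slash it contains
    # is not read as an additional path segment.
    encoded_id = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{encoded_id}/{region}/{size}/{rotation}/{quality}.{fmt}"


def derivative_urls(base, identifier, widths):
    """Map each requested width to a URL; the 'w,' size form keeps the aspect ratio."""
    return {w: iiif_image_url(base, identifier, size=f"{w},") for w in widths}


if __name__ == "__main__":
    # Hypothetical server and identifier, for illustration only.
    urls = derivative_urls("https://iiif.example.org/image/3", "album-01/page-03", [200, 800])
    for width, url in urls.items():
        print(width, url)
```

The same source image can thus serve a thumbnail in a GUI prototype and a larger derivative for close viewing, with only the size segment of the URL changing.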
I would suggest that, as a way forward, the IIIF and Linked Art communities focus on further improving the usability of the specifications. This includes conducting comprehensive usability assessments of APIs to evaluate the experiences of new developers versus existing ones, understanding the steepness of the learning curve associated with each API, and guiding improvements in documentation, on-boarding processes, and overall developer support. Efforts should be made to lower the barriers to entry for new developers by developing more intuitive and user-friendly tutorials, providing example projects, and creating a robust support community. Ensuring that developers can quickly and effectively leverage APIs will foster greater adoption. Addressing the challenges of transitioning between different versions of specifications is critical, and developing tools and guidelines that help maintain consistency across versions will reduce friction and ensure smoother updates. Future Directions and Sustainability Survey findings, as discussed in , underscore the need for ongoing efforts to develop LOUD standards that foster an inclusive, dynamic digital ecosystem. Future strategies should include creating educational resources and frameworks that support interdisciplinary collaboration and reduce barriers to participation. While the Manifest serves as the fundamental unit within IIIF, the Linked Art protocol can play a similarly central role as a semantic gateway in broader contexts, allowing round-tripping across the APIs. The topic modelling exercise in LUX, detailed in , reveals complex actor-networks of organisations, individuals, and non-human actors, providing insights into the relationships sustaining the LUX initiative. The next steps for Linked Art might involve forming a new consortium independent of a CIDOC Working Group, which could provide the necessary support to sustain the initiative. 
Alternatively, integrating Linked Art into IIIF as a new TSG and specification could address the discovery challenges within IIIF, as discussed during the birds of a feather session led by Robert Sanderson [see @raemy_notes_2024] at the 2024 IIIF Conference in Los Angeles[93]. Design principles that act as bridges across different disciplines, as proposed by @roke_pragmatic_2022, are crucial. IIIF has demonstrated that this collaborative approach is feasible, and Linked Art could follow in its footsteps. However, achieving this requires increased dedication from passive members and broader adoption of the model and the API ecosystem in the near future. Digital Materiality and Representation As explored in Chapter 7, the detailed digital representation of photographic albums, such as the Kreis Family Collection, demonstrates the need to comprehensively capture the materiality of digital objects. This includes the structure and context of images, which are crucial for maintaining their historical and social significance. The implementation of the IIIF Presentation API in creating a detailed digital replica of the Getty’s Bayard Album shows how digital materiality can be enhanced through thoughtful use of technology, but also highlights the scalability challenges for such detailed representations. Creating these detailed digital representations can be seen as a ‘boutique’ approach, which, while labour-intensive and resource-demanding, is necessary for preserving the integrity and contextual significance of cultural heritage objects. The challenge lies in developing the appropriate means and methodologies to achieve this level of detail consistently. Future endeavours, whether through research projects or collaborative efforts between GLAM institutions and DH practitioners, should aim to address these challenges and create sustainable practices for digital materiality and representation. 
As Edwards aptly notes: ‘Presentational forms equally reflect specific intent in the use and value of the photographs they embed, to the extent that the objects that embed photographs are in many cases meaningless without their photographs; for instance, empty frames or albums. These objects are only invigorated when they are again in conjunction with the images with which they have a symbiotic relationship, for display functions not only make the thing itself visible but make it more visible in certain ways’. [@edwards_photographs_2004 p. 11] Challenges of Scaling and Implementation As seen in Chapter 6, the IIIF Cookbook recipes and Linked Art patterns reflect the tension between creating advanced specifications and their practical implementation. This gap between ideation and real-world application underscores the challenges faced by the community in achieving broad adoption and interoperability. In Chapter 7, the exploration of APIs like the IIIF Change Discovery API illustrates the practical challenges and potential of scaling these technologies for wider adoption. The successful implementation in PIA demonstrates viability, but also points to the need for continued development and community engagement to fully realise the benefits. Furthermore, assessing the scalability of IIIF image servers, as discussed by @duin_webassembly_2022 and exemplified by the firm Q42 with their Edge-based service Micrio[94], highlights the importance of optimising data performance. Erwin Verbruggen aptly noted that ‘optimising data performance in my opinion means sending as little data over as needed’[95], emphasising the need for efficient data handling to enhance scalability. This insight reinforces the necessity of continual refinement in scaling digital infrastructure to support broader use and integration. 
Reflecting on these findings, I maintain that continuous participation, particularly for institutions that can afford to be part of initiatives like IIIF-C, is essential. Active members should not only focus on their own use cases but also consider the needs and perspectives of other, perhaps marginalised, groups. Achieving the dual goals of making progress within one community, whether it be IIIF or Linked Art, while also engaging in effective outreach and creating a solid baseline, will benefit everyone in the CH sector and beyond. Addressing where LOUD fits in, how people perceive this new concept or paradigm, and understanding how LOUD differs from Linked Data in general are essential. These questions help to clarify the stages at which themes related to one of the LOUD design principles emerge, crystallise, and potentially disappear. My thesis does not fully resolve these queries but offers insights and hints for further exploration. In conclusion, the empirical findings reveal the richness of the implementation and maintenance of LOUD standards in the CH domain. From the critical role of community practices and standards to the challenges of achieving interoperability and inclusivity, each theme underlines the complex interplay of social, technical and organisational factors. The next section will look at the evaluation of LOUD and explore its overall impact, delving into the delta of what to do with it, particularly in terms of Linked Data versus LOUD, where my thesis provides pointers but not definitive answers. 9.2 Evaluation: Abstracting LOUD In this section I will assess the impact of LOUD within the CH domain and the wider DH field, examining its implications for community practices and semantic interoperability, and secondarily whether LOUD has affected the perception of Linked Data. (…) 9.3 Retrospective: Trudging like an Ant (…) 10. 
Conclusion For a better understanding of the past, Our images have to be enhanced, A new dialogue in three dimensions, Must have openness at its heart, For somewhere within the archive Of our aggregated minds Are a multitude of questions And a multitude of answers, Simply awaiting to be found. [@mr_gee_day_2023] This chapter brings to a close the journey undertaken since February 2021, aiming to clearly articulate the answers to the research questions, discuss how the research aligns with the objectives, elucidate the significance of the work, outline its shortcomings, and suggest avenues for future research. I had the privilege of hearing the above poem at EuropeanaTech in The Hague in October 2023. What struck me most, and what I have tried to convey in this thesis, was the powerful dialogue and collective spirit striving to harness the potential of our (digital) heritage. With a sense of conviction after this conference, I approached the next one in Geneva in February 2024 with confidence, believing that I had made a compelling case for the concept of LOUD. When a participant asked how LOUD differed from Linked Data, however, I found myself explaining the socio-technical ethos of IIIF and Linked Art, the richness of the individuals who make them up, the ability to combine these different standards, and the common use cases that emerge from these collaborations. Whether my answer was convincing remains uncertain, but I knew it was too brief. Perhaps it is here, in this conclusion, that my thoughts can find their full expression. I believe that LOUD should be at the forefront of efforts to improve the accessibility and usability of CH data, an endeavour that is increasingly relevant in a web-centric environment. This paradigm has gained considerable traction, particularly with the advent of Linked Art and the recognition that the IIIF Presentation API has been an inspiration for the LOUD design principles. 
The development and maintenance of LOUD standards by dedicated communities are characterised by collaboration, consensus building, and transparency. In the interstices of the IIIF and Linked Art communities, frameworks for interoperability are not only exposed, but revealed as profound testaments to the power of transparent collaboration across institutional boundaries. Both communities, it is true, are still very much Anglo-Saxon efforts, where the specifications have mainly been implemented in GLAM and/or DH research projects, at least among the implementations we are aware of. Each has clear guidelines on how to propose use cases, mostly using GitHub, and hides the sometimes unnecessary RDF complexity behind a set of JSON-LD @context documents. IIIF is at the presentation layer and can really play its role as a mediator, with the Manifest as its central unit connecting other specifications, including semantic metadata, and preferably with simpatico specifications such as Linked Art. An important hypothesis arises from the observation that adherence to the LOUD design principles makes specifications more likely to be adopted. The primary benefit of adopting LOUD standards lies in their grassroots nature. This grassroots approach not only aligns with the core values of openness and collaboration within the DH community but also serves as a common denominator between DH practitioners and CHIs. This unique alignment fosters a sense of shared purpose and common ground. However, it is essential to acknowledge that while LOUD and its associated standards, including IIIF, hold immense promise, their limited recognition in the wider socio-technical ecosystem may currently hinder their full potential impact beyond the CH domain. Consideration of socio-technical requirements and the promotion of digital equity are essential to the development of specifications in line with the LOUD design principles. 
In the context of the IIIF and Linked Art communities, this means both recognising current challenges and building on existing practices. This includes forming alliances that support diverse forms of inclusion at both project and individual levels. For example, organisations should be encouraged to send representatives from diverse professional and personal backgrounds, such as underrepresented groups or non-technical fields. This can be facilitated by initiatives that lower the barriers to participation, such as financial support for travel and participation, flexible participation formats, and targeted outreach efforts. Furthermore, as these standards often align with open government data initiatives, they present opportunities for broader public engagement and institutional transparency. In the broader context of DH, understanding LOUD involves tracing the historical development of the field and its evolving relationship with technology. The interdisciplinary nature of DH has always integrated diverse scholarly and technical practices. In recent years, DH has seen a notable increase in interest in the use of Linked Data and semantic technologies to improve the discoverability and accessibility of CH collections. LOUD's emphasis on user-centred design and usability aligns well with these goals. Consequently, the principles of LOUD hold great promise for advancing the integration and use of community-driven APIs and/or Linked Data within DH. This can be seen within PIA, where the benefits of implementing IIIF helped us to streamline machine-generated annotations, integrate different thumbnails into GUI prototypes, model photo albums with different layers from the Kreis Family collection, and enable project members and students to engage in digital storytelling, an important participatory facet that can be seamlessly explored by DH efforts and CHIs with the help of the IIIF Image and Presentation APIs. 
Data reuse is a key LOUD driver, and it could have been pursued more extensively with a production instance of Linked Art. As for widening participation, this is a strategic and political decision rather than a technical one. That said, LOUD specifications can be embedded through strategic citizen science initiatives. A recent example that highlights the comprehensive value of Linked Data was presented by @newbury_linked_2024 at the CNI Spring 2024 Meeting. He delineated its significance as extending well beyond single entities, such as the Getty Research Institute, to enrich a vast ecosystem. Specifically, he identified three principal areas of value: Firstly, within the ecosystem itself, where the utility of information is amplified through its application in diverse contexts. Secondly, for the audience, by directly addressing user needs and facilitating various conceptual frameworks. And finally, within the community, by enabling wider use and adaptation of data and code. This approach to Linked Data, as articulated by Newbury, not only enhances its utility across these dimensions, but also aligns seamlessly with the LOUD proposition, underscoring a shared vision for a digital space where the interconnectedness and accessibility of (meta)data serve as foundational principles for progress and community engagement. LUX, as a catalyst for LOUD, exemplifies a practical approach to implementing Linked Data that has garnered significant local engagement and support at Yale. This initiative demonstrates how sound socio-technical practices can be effectively applied within a supportive institutional environment. The consistency of the data within LUX aligns well with IIIF and Linked Art standards, with only a few minor adjustments required for full compliance. These quick fixes are manageable and do not detract from the overall robustness of the initiative. 
While it may be too early to fully assess the wider impact of using LOUD specifications on the LUX platform within the CH domain, the initiative has already attracted considerable interest in recent months. This growing attention suggests that the LUX approach is resonating with other organisations and points to the potential for wider adoption and impact. The enthusiastic local engagement at Yale provides a strong foundation for LUX and highlights its potential to serve as a model for similar projects aimed at enriching digital heritage through effective collaboration and agreed-upon standards. In carrying out this thesis, I have adhered to the five main objectives set out at the beginning of the PhD. These objectives have been accomplished to a high degree, reflecting a substantial and well-executed project. Furthermore, most of the outputs – such as data models and scripts – from this work are available on GitHub, providing open access to the wider community. In addition, I have published several papers, both individually and collaboratively, further disseminating the findings and contributions of this research. This thesis is also relevant because it sheds light on communities and implementations that can be celebrated not only for their standards but also for their operating ethos; IIIF and Linked Art present models ripe for emulation beyond their immediate digital confines. Here, agency and authority are most typically granted to the collective over the isolated, with each actor – be it an individual, an institution or an interface – intricately interconnected. Yale’s LUX initiative also embodies this ethos, demonstrating how collaborative efforts can lead to innovative solutions and wider impact. It is to be hoped, then, that these practices of openness and multiple partnerships will not be seen as limited to their origins in digital representation. 
At the very least, I hope that these socio-technical approaches can serve as exemplars or sources of inspiration in broader arenas, where the principles of mutual visibility and concerted action can point the way towards cohesive and adaptive collaborative architectures. Despite its contribution, this thesis is far from perfect and certainly contains several shortcomings. I will name here three significant ones. First, the visualisations included and the use of FOL are primarily designed to support my own self-reflection and may be more beneficial to me than to the broader academic community. While they provide insights into my research process and findings, their applicability and usefulness to others might be limited. Second, the theoretical framework I employed, while instrumental to my research, may not serve as a universally applicable toolbox. Nevertheless, I urge readers to pay close attention to STS methodologies and practices. The works of Bruno Latour, Donna Haraway, and Susan Leigh Star have been invaluable companions throughout this dissertation. Additionally, for those involved in conceptualising semantic information, I recommend exploring Floridi’s PI, which offers profound insights into the nature and dynamics of information. These readings have greatly influenced my approach and understanding, and I believe they can offer valuable perspectives to others as well. Third, while the thesis aims to address both community practices and semantic interoperability, it leans more heavily towards the former. This emphasis on community practices may overshadow the broader discussion of semantic interoperability, potentially limiting the appeal of the thesis to those primarily interested in the technical aspects. Other shortcomings include the broad scope of the thesis, with three empirical chapters exploring different avenues. While this comprehensive approach provides a broad understanding of the research topic, it has also resulted in a rather lengthy thesis. 
This may be a challenge for the reader, as a topic of interest in one chapter may not be as compelling in another. The diversity of empirical focus, while enriching the research, may dilute the coherence for some readers, making it more difficult to maintain a consistent engagement throughout the dissertation. Despite these limitations, I hope that the different perspectives and findings contribute to a richer, more nuanced understanding of LOUD for CH. Avenues for future research are numerous and promising. One interesting area to explore is the comparative benefits experienced by early adopters of IIIF and Linked Art specifications versus those who implemented these standards later. Early adopters have the advantage of having their use cases discussed and resolved within the community, and it would be insightful to analyse the long-term impacts on their projects. Such a study is already feasible for early adopters of IIIF, and a comparison with further implementations of Linked Art will become possible within a few years. Furthermore, future exploration could focus on the full implementation of Linked Art within PIA or similar efforts, as well as more performance-oriented testing with the deployed LOUD APIs. These efforts should further validate the robustness and scalability of these specifications. Another important area for future investigation is the participation of institutions and individuals from the Global South in both the IIIF and Linked Art communities. It is crucial to explore how we can better support their uptake of these specifications and encourage their active involvement in these initiatives to ensure a more inclusive and globally representative environment. As I reflect on the journey of this thesis, I am reminded of the powerful dialogue and collective effort that has been at its heart. 
Mr Gee’s poem resonates deeply with my own aspirations for this work: to enhance our understanding of the past through openness and collaboration, as can be seen in IIIF and Linked Art. As I bring this dissertation to a close, I am filled with a sense of accomplishment and a renewed commitment to promoting sound socio-technical practices. It is my hope that the insights and methodologies presented here will inspire others to engage in this ongoing dialogue, continually asking and answering the many questions that arise as we collectively explore our cultural heritage landscapes. Throughout this dissertation, British English spelling conventions are predominantly observed. However, there are instances of American English spelling where direct quotations from sources are used as well as referring to names of institutions, standards, or concepts. ↩︎ SNSF Data Portal - Grant number 193788: https://data.snf.ch/grants/grant/193788 ↩︎ Seminar für Kulturwissenschaft und Europäische Ethnologie: https://kulturwissenschaft.philhist.unibas.ch/ ↩︎ DHLab: https://dhlab.philhist.unibas.ch/ ↩︎ HKB: https://www.hkb.bfh.ch/ ↩︎ The considerable size of the ASV collection, which includes over 90,000 analogue objects, reflects not just the work of the main authors but also the contributions from numerous explorers and additional material beyond the maps and primary publications. ↩︎ Max Frischknecht’s PhD: https://phd.maxfrischknecht.ch/ ↩︎ PIA project website: https://about.participatory-archives.ch/ ↩︎ The vision of the PIA project was first written in German and then translated into English and French. ↩︎ In our joint paper, we wrote ‘man-made’, corrected here, which makes me think of the transition within the CIDOC-CRM for the Entity E22 Human-Made Object from version 6.2.7 onward. 
↩︎ Knora Base Ontology: https://docs.dasch.swiss/2023.07.01/DSP-API/02-dsp-ontologies/knora-base/ ↩︎ SIPI documentation: https://sipi.io/ ↩︎ IIIF Working Groups Meeting, The Hague, 2016: https://iiif.io/event/2016/thehague/ ↩︎ Van Gogh, Vincent. (1889). Irises [Oil on canvas]. Getty Museum, Los Angeles, CA, USA. https://www.getty.edu/art/collection/object/103JNH ↩︎ Giacometti, Alberto. (1956). L’homme qui marche I [Sculpture]. Carnegie Museum of Art, Pittsburgh, PA, USA. https://www.wikidata.org/entity/Q706964 ↩︎ UNESCO World Heritage List: https://whc.unesco.org/en/list/ ↩︎ Blue Shield International: https://theblueshield.org/ ↩︎ The ICBS was founded by the ICA, ICOM, ICOMOS, and IFLA. ↩︎ Guro. (1900-1950). Male Face Mask (Zamble) [Wood and pigment]. Art Institute of Chicago, Chicago, IL, USA. https://www.artic.edu/artworks/239464 ↩︎ I have opted for the term ‘affordance’ and not ‘representation’ as my intention is to maintain a comprehensive scope that encompasses various modalities such as modelling endeavours. ↩︎ To some degree, parallels can be drawn between the distinctions of cultural and digital heritage with those drawn between the humanities and DH. ↩︎ Inicio - Museos Comunitarios de América: https://www.museoscomunitarios.org/ ↩︎ The descriptions of each of these nine dimensions are selected excerpts from @star_ethnography_1999. ↩︎ A PID is a long-lasting reference to a digital resource. It usually has two components: a unique identifier and a service that locates the resource over time, even if its location changes. The first helps to ensure the provenance of a digital resource (that it is what it purports to be), whilst the second will ensure that the identifier resolves to the correct current location [@digital_preservation_coalition_persistent_2017] ↩︎ Rijksmuseum: https://www.rijksmuseum.nl/ ↩︎ In the original version, these instances contained typographical or factual errors. They have been struck through and corrected here. 
↩︎ ↩︎ @zeng_metadata_2022 [p. 11] articulate that ‘as with “data”, metadata can be either singular or plural. It is used as singular in the sense of a kind of data; however, in plural form, the term refers to things one can count’. In the context of this thesis, I have chosen to favour the plural form of (meta)data. However, I acknowledge that I may occasionally use the singular form when referring to the overarching concepts or when quoting references verbatim. ↩︎ The snapshot of this bibliographic record was taken from https://swisscovery.slsp.ch/permalink/41SLSP_UBS/11jfr6m/alma991170746542405501. ↩︎ Seeing Standards: A Visualization of the Metadata Universe. 2009-2010. Jenn Riley. https://jennriley.com/metadatamap/seeingstandards.pdf ↩︎ A widespread example in the CH domain is the serialisation of metadata in XML, a W3C standard. ↩︎ It is noteworthy that the diversity of metadata standards in the heritage domain, characterised primarily by a common emphasis on descriptive attributes, is not counter-intuitive. This variation reflects the diverse nature of CH resources and the nuanced needs of GLAMs. ↩︎ MARC Standards: https://www.loc.gov/marc/ ↩︎ RDA: https://www.loc.gov/aba/rda/ ↩︎ If RDA was initially envisioned as the third edition of AACR, it faces the challenge of maintaining a delicate balance between preserving the AACR tradition while embracing the necessary shifts required for a successful and relevant future for library catalogues that can easily be interconnected with standards from archives, museums, and other communities [see @coyle_resource_2007]. ↩︎ MODS: https://www.loc.gov/standards/mods/ ↩︎ METS: https://www.loc.gov/standards/mets/ ↩︎ People might even argue that FRBR is only interesting as an ‘intellectual exercise’ [@zumer_functional_2007 p. 27]. 
↩︎ LRMer: https://www.iflastandards.info/lrm/lrmer ↩︎ BibFrame: https://www.loc.gov/bibframe/ ↩︎ EAD: https://www.loc.gov/ead/ ↩︎ ISAD(G): General International Standard Archival Description - Second edition https://www.ica.org/en/isadg-general-international-standard-archival-description-second-edition ↩︎ PREMIS: https://www.loc.gov/standards/premis/ ↩︎ RiC Conceptual Model: https://www.ica.org/en/records-in-contexts-conceptual-model ↩︎ RiC-O: https://www.ica.org/standards/RiC/ontology ↩︎ CDWA: https://www.getty.edu/research/publications/electronic_publications/cdwa/ ↩︎ CCO: https://www.vraweb.org/cco ↩︎ VRA: https://www.vraweb.org/ ↩︎ VRA Core 4.0 and CCO have a symbiotic relationship, with CCO providing data content guidelines and incorporating the VRA Core 4.0 methodology. The latter has also been leveraged in other contexts to form the basis for more granular Linked Data vocabularies [see @mixter_using_2014]. ↩︎ In French, the original language used for this acronym, CIDOC stands for Comité international pour la documentation du Conseil international des musées. ↩︎ LIDO: https://cidoc.mini.icom.museum/working-groups/lido/ ↩︎ CIDOC Working Groups: https://cidoc.mini.icom.museum/working-groups/ ↩︎ CIDOC-CRM: https://cidoc-crm.org/ ↩︎ CRM-SIG Meetings: https://www.cidoc-crm.org/meetings_all ↩︎ CIDOC-CRM V7.1.2: https://www.cidoc-crm.org/html/cidoc_crm_v7.1.2.html ↩︎ For a quick overview of the classes and properties of CIDOC-CRM, I recommend visiting the dynamic periodic table created by Remo Grillo (Digital Humanities Research Associate at I Tatti, Harvard University Center for Italian Renaissance Studies): https://remogrillo.github.io/cidoc-crm_periodic_table/ ↩︎ CIDOC-CRM compatible models and collaborations: https://www.cidoc-crm.org/collaborations ↩︎ At the time of writing none of these CIDOC-CRM extensions have been formally approved by CRM-SIG. 
It is also worth mentioning that other extensions based on CIDOC-CRM have been developed by the wider community, such as Bio CRM, a data model for representing biographical data for prosopographical research [see @tuominen_bio_2017] or ArchOnto, which is a model created for archives [see @hall_archonto_2020]. ↩︎ CRMact: https://www.cidoc-crm.org/crmact/ ↩︎ CRMarchaeo: https://cidoc-crm.org/crmarchaeo/ ↩︎ CRMba: https://www.cidoc-crm.org/crmba/ ↩︎ CRMdig: https://www.cidoc-crm.org/crmdig/ ↩︎ CRMgeo: https://www.cidoc-crm.org/crmgeo/ ↩︎ CRMinf: https://www.cidoc-crm.org/crminf/ ↩︎ CRMsci: https://www.cidoc-crm.org/crmsci/ ↩︎ CRMsoc: https://www.cidoc-crm.org/crmsoc/ ↩︎ CRMtex: https://www.cidoc-crm.org/crmtex/ ↩︎ FRBRoo: https://www.cidoc-crm.org/frbroo/ ↩︎ PRESSoo: https://www.cidoc-crm.org/pressoo/ ↩︎ Linked Art: https://linked.art ↩︎ DCMI Metadata Terms: https://www.dublincore.org/specifications/dublin-core/dcmi-terms/ ↩︎ Getty Vocabularies: https://www.getty.edu/research/tools/vocabularies/ ↩︎ Mastodon: https://joinmastodon.org/ ↩︎ Homosaurus: https://homosaurus.org/ ↩︎ DOLCE: www.loa.istc.cnr.it/dolce/overview.html ↩︎ It must be noted though that the use of DLs in KR predates the emergence of ontological modelling in the context of the Web, with its origins going back to the creation of the first DL modelling languages in the mid-1980s [@krotzsch_description_2013]. ↩︎ LinkedDataGPT: https://ld.gpt.liip.ch/ ↩︎ Neo4j: https://neo4j.com/ ↩︎ GB, TB, and PB are units of digital information storage capacity. 1 GB is equal to 1,000,000,000 ($10^{9}$) bytes, 1 TB is equal to 1,000,000,000,000 ($10^{12}$) bytes, and 1 PB is equal to 1,000,000,000,000,000 ($10^{15}$) bytes. If a standard high-definition movie is around 4-5 GB, then 1 PB could store roughly 200,000 to 250,000 movies. In 2011, @gomes_survey_2011 [p. 414] reported that the Internet Archive held 150,000 million contents of archived websites – crawled through the Wayback Machine – or approximately 5.5 PB.
As of December 2021, it was about 57 PB of archived websites and a total used storage of 212 PB, see https://archive.org/web/petabox.php. ↩︎ In this context, UX is understood as an umbrella term encompassing both user and/or customer service, emphasising that the focus is on individuals who need or use a given service, regardless of their categorisation as users or customers. ↩︎ According to @nargesian_data_2019 [p. 1986], a data lake is a vast collection of datasets that has four characteristics. It can be stored in different storage systems, exhibit varying formats, may lack useful metadata or use differing metadata formats, and can change autonomously over time. ↩︎ An interesting initiative in this area is the use of RAIL, which empowers developers to restrict the use of AI on the software they develop to prevent irresponsible and harmful applications: https://www.licenses.ai/ ↩︎ Common Objects in Context: https://cocodataset.org/ ↩︎ Viscounth – A Large Dataset for Visual Question Answering for Cultural Heritage: https://github.com/misaelmongiovi/IDEHAdataset ↩︎ Artificial Intelligence for Libraries, Archives & Museums: https://sites.google.com/view/ai4lam ↩︎ AEOLIAN Network: https://www.aeolian-network.net/ ↩︎ Newspaper Navigator: https://news-navigator.labs.loc.gov/ ↩︎ @perrigo_exclusive_2023 reported that Kenyan workers made less than USD 2 an hour to identify and filter out harmful content for ChatGPT. ↩︎ FOSTER Plus (Fostering the practical implementation of Open Science in Horizon 2020 and beyond) was a 2-year EU-funded project initiated in 2017 with 11 partners across 6 countries. Its main goal was to promote a lasting shift in European researchers’ behaviour towards Open Science becoming the norm.
↩︎ The Open Knowledge Foundation, a non-profit network established in 2004 in the U.K. to promote the idea of open knowledge, sets out some principles around the concept of openness and defines it as follows: ‘Open means anyone can freely access, use, modify, and share for any purpose (subject, at most, to requirements that preserve provenance and openness)’. https://opendefinition.org/ ↩︎ FAIR Principles: https://www.go-fair.org/fair-principles/ ↩︎ Author’s translation: ‘We need to give up on the idea of syntactic or structural interoperability through the use of a single model, whether for producing, storing or managing data within an information system’. ↩︎ CHAOSS: https://chaoss.community/ ↩︎ IIIF Annual Conference and Showcase - Los Angeles, CA, USA - June 4-7, 2024: https://iiif.io/event/2024/los-angeles/ ↩︎ Micrio: https://micr.io/ ↩︎ Message written on the IIIF Slack Workspace on 28 October 2022. ↩︎", "date_published": "2024-11-18T00:00:00.000Z" } diff --git a/_site/feed.xml b/_site/feed.xml index 51b365a..2ea09c6 100644 --- a/_site/feed.xml +++ b/_site/feed.xml @@ -43,7 +43,7 @@ https://phd.julsraemy.ch/thesis.html https://phd.julsraemy.ch/thesis.html 2024-11-18T00:00:00Z - + diff --git a/_site/thesis.html b/_site/thesis.html index d7b1c2c..7f440d0 100644 --- a/_site/thesis.html +++ b/_site/thesis.html @@ -828,8 +828,98 @@

3.3 Trends, Movements, and Principles

The advent of open data principles marked the next phase in CH development. This approach facilitated broader access to information, aligning with contemporary values of transparency and inclusivity in governmental, academic, and cultural contexts. Subsequently, the focus expanded to enhancing the utility of this data. This stage involved contextualising and enriching CH data, thereby increasing their applicability and relevance across various domains.

The current frontier in CH involves developing applications that leverage rich CH data. These applications serve not only as tools for engagement and education but also as justifications for the ongoing costs associated with data storage and archival. They illustrate the tangible benefits derived from preserving heritage resources, encompassing both cultural and economic returns. In summary, the trajectory of CH development mirrors broader technological and societal trends, transitioning from preservation to active utilisation. This progression underscores the dynamic nature of research and CH processes, highlighting the evolving requirements for transparency, inclusivity, and accountability in CH management.

While automation has significantly enhanced the efficiency of digitisation processes in CH, cataloguing and indexing remain complex challenges. The intricacies involved in accurately understanding and categorising resources necessitate more than just technological solutions; they require context-aware and culturally sensitive approaches. Here, ML offers promising perspectives. ML, particularly in its advanced forms like deep learning, can assist in cataloguing and indexing by analysing large datasets to identify patterns, categorise content, and even suggest metadata. This can be particularly useful in handling large volumes of CH data, where manual processing is time-consuming and prone to human error. Typical applications of ML in this field include image recognition for identifying and classifying visual elements in artefacts, NLP for analysing textual content, and pattern recognition for sorting and organising data based on specific characteristics. Furthermore, prospective developments may entail the refinement of metadata mapping and the enhancement of quality control mechanisms. Moreover, ML algorithms can be trained to recognise stylistic elements, historical contexts, and other nuances that are essential for accurate cataloguing in CH. However, it is crucial to note that the effectiveness of ML depends heavily on the quality and diversity of the training data. Biases in this data can lead to inaccuracies in cataloguing and indexing. Thus, a collaborative approach, where ML is supplemented by expert human oversight, is often the most effective strategy.

-

Overall, this section provides a comprehensive overview of six technological trends as well as five key scientific movements and guiding principles that are shaping research and how universities and GLAMs should provide environments, services, and tools with a view to collecting and disseminating content. By exploring each of these trends, movements, and principles, we can gain a deeper understanding of how research and CH processes are permeated by dynamic movements and how resources can be made more transparent, inclusive and accountable, as well as how data can be made available to human and non-human users.

+

Overall, this section provides a comprehensive overview of three[26:1] technological trends as well as five key scientific movements and guiding principles that are shaping research and how universities and GLAMs should provide environments, services, and tools with a view to collecting and disseminating content. By exploring each of these trends, movements, and principles, we can gain a deeper understanding of how research and CH processes are permeated by dynamic movements and how resources can be made more transparent, inclusive and accountable, as well as how data can be made available to human and non-human users.

+

I will explore some current and emerging technological trends in CH, organised into three components: Linked Data, big data, and AI. Each represents a critical driver shaping the landscape and practices of heritage data. The three trends have been around for a few decades, with the ‘Linked Data’ principles and underlying standards coming from the late 1990s, ‘big data’ being coined in 1990 and AI in 1956.

+

Before considering the trends discussed hereafter, note that current technological developments do not exist in isolation, but tend to intertwine and act synergistically. A vivid example of this interplay is the latent impact of AI on the semantic web, particularly in facilitating more efficient querying and crawling. One instance is LinkedDataGPT[76], a proof-of-concept service built by Liip for the City of Zurich, which layers ChatGPT (a generative AI solution) on top of a Linked Data portal to facilitate querying open datasets (Stocker, 2023). Conversely, AI can be fed by data on the web to learn and reason, as outlined by +

3.3.1.1 Linked Data
+

Linked Data, and more precisely LOD, is a set of design principles built on RDF for interconnecting data on the web so that semantic queries become more useful (Berners-Lee et al., 2001). In other words, this standardisation allows data to be not only linked, but also openly accessible and reusable. As noted by (Gandon, 2019) [p. 115, citing (Gandon, 2017)]:

+
+

The Web was initially perceived and used as a globally distributed hypertext space for humans. But from its inception, the Web has always been more: its hypermedia architecture is in fact linking programs world-wide through remote procedure calls.

+
+

This deeper understanding of the web’s architecture as a conduit for linking programs on a global scale holds profound implications. It signifies that the web is not merely a medium for accessing information but a dynamic environment where data-driven programs interact, exchange data, and collaborate across geographical boundaries. In this context, Linked Data emerges as a powerful enabler, providing a structured and standardised approach for these programs to communicate and share meaningful data (Bizer et al., 2008).

+

In the context of CH, institutions such as museums, libraries and archives can publish their collections using Linked Data principles, enabling a web of linked information that is accessible to all. As this dissertation’s main topic revolves around Linked (Open) (Usable) Data, two dedicated sections have been written within this literature review in Section 3.4 and Section 3.5.
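To make the idea of publishing a collection item as Linked Data more concrete, the following is a minimal sketch in Python, not a description of any particular institution's pipeline. It describes one photograph as subject-predicate-object statements and serialises them as N-Triples. The `https://example.org/` URIs, the item, and the helper names (`uri`, `literal`, `to_ntriples`, `objects`) are assumptions made for illustration; only the RDF, RDFS, and Dublin Core predicate URIs are real vocabulary terms.

```python
# Sketch only: every https://example.org/ URI below is a hypothetical placeholder.
EX = "https://example.org/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
DCT_CREATOR = "http://purl.org/dc/terms/creator"


def uri(value):
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(value, lang=None):
    """Quote a plain string literal, optionally with a language tag."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{lang}' if lang else f'"{escaped}"'


# Each statement is a (subject, predicate, object) triple.
triples = [
    (uri(EX + "photo/1"), uri(RDF_TYPE), uri(EX + "Photograph")),
    (uri(EX + "photo/1"), uri(RDFS_LABEL), literal("Alpine procession", "en")),
    (uri(EX + "photo/1"), uri(DCT_CREATOR), uri(EX + "person/42")),
    (uri(EX + "person/42"), uri(RDFS_LABEL), literal("Example Photographer")),
]


def to_ntriples(statements):
    """Serialise triples, one statement per line, each terminated by ' .'."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in statements)


def objects(statements, subject, predicate):
    """Follow a link: return all objects for a given subject and predicate."""
    return [o for s, p, o in statements if s == subject and p == predicate]


if __name__ == "__main__":
    print(to_ntriples(triples))
```

Because the creator is identified by a URI rather than a string, another dataset could make statements about the same person, which is what allows collections to be linked across institutions.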

+

Beyond formal LOD, CHIs may also link their databases or collections in more informal ways. This interconnection may take the form of shared metadata, common identifiers, or simply hyperlinks. These links can enhance the user experience by supporting more seamless navigation between related items or pieces of information. A parallel strategy is graph-based data representation, i.e. a property graph, which consists of a set of objects or vertices and a set of arrows or edges connecting them, and which is most likely not RDF-compliant (see Bermès, 2023). Graph databases such as Neo4j[77], which is quite prevalent in DH (Darmont et al., 2020; Drakopoulos et al., 2019; see Webber, 2012), allow for efficient storage and retrieval of interconnected data through nodes representing entities and relationships linking them.
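The property graph model mentioned above can be illustrated with a small in-memory sketch. This is not Neo4j's API; the classes `Node`, `Edge`, and `PropertyGraph`, the identifiers, and the sample data are hypothetical. The point it tries to show is that both vertices and edges carry their own key-value properties, whereas a plain RDF triple cannot attach attributes to a relationship without additional modelling.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    id: str
    labels: set
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    source: str
    target: str
    type: str
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    """Toy property graph: labelled nodes plus typed, attributed edges."""

    def __init__(self):
        self.nodes = {}
        self.edges = []

    def add_node(self, node_id, labels, **properties):
        self.nodes[node_id] = Node(node_id, set(labels), properties)

    def add_edge(self, source, target, edge_type, **properties):
        # Refuse dangling edges so that traversal never hits a missing node.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before linking them")
        self.edges.append(Edge(source, target, edge_type, properties))

    def neighbours(self, node_id, edge_type=None):
        """Return nodes reachable from node_id, optionally filtered by edge type."""
        return [
            self.nodes[e.target]
            for e in self.edges
            if e.source == node_id and (edge_type is None or e.type == edge_type)
        ]


# Hypothetical sample data.
graph = PropertyGraph()
graph.add_node("photo1", {"Photograph"}, title="Alpine procession")
graph.add_node("event1", {"Event"}, name="Procession")
graph.add_node("person1", {"Person"}, name="Example Photographer")
graph.add_edge("photo1", "event1", "DEPICTS")
graph.add_edge("photo1", "person1", "CREATED_BY", role="photographer")
```

Here the `role` attribute sits directly on the `CREATED_BY` relationship, which is the kind of convenience that makes property graphs attractive even when the result is not RDF-compliant.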

+
3.3.1.2 Big Data
+

Big Data refers to extremely large and complex datasets that exceed the capabilities of traditional data processing methods and tools. It encompasses a massive volume of structured, semi-structured and unstructured data that is currently flooding across a variety of sectors, companies and organisations (see Emmanuel & Stanier, 2016). The characteristics of big data are often described by the three Vs model (Laney, 2001):

+ +

In addition to the three Vs model, two more characteristics are often included (Saha & Srivastava, 2014, p. 1294):

+ +

Regarding the two latter dimensions, (Debattista et al., 2015) argue that Linked Data is the most suitable technology to increase the value of data over conventional formats, thus contributing towards the value challenge in Big Data. As for veracity, they describe a semantic pipeline with eight key metrics to address the veracity dimension. Building on this technological foundation, the integration of Linked Data and Big Data analytics takes centre stage.

+

Big data analytics can be employed on CH content to uncover insights and correlations that can be used in decision-making. (Barrile & Bernardo, 2022) [p. 2708] highlight the transformative potential of using big data by investigating how an analytical approach can enhance conservation strategies, aid resource allocation and optimise the management of CH resources. (Poulopoulos & Wallace, 2022) [pp. 188-189] emphasise that emerging technology trends, including big data, have a significant impact on related research areas such as CH. Big data primarily originates from sources such as social media, online gaming, data lakes[80], logs and frameworks that generate or use significant amounts of data. They stress that the incorporation of multi-faceted analytics in the CH domain is an area of active research, and present a data lake that provides essential user and data/knowledge management functionalities. However, they emphasise a crucial consideration: the need to bridge the theoretical foundations of disciplines such as cultural sociology with the technological advances of big data.

+
3.3.1.3 Artificial Intelligence
+

The term AI was first coined by John McCarthy, an American computer scientist and cognitive scientist, during the 1956 Dartmouth Conference, which is often considered the birth of AI as an academic field (Andresen, 2002, p. 84). According to the (Oxford English Dictionary, 2023), AI is described as follows:

+
+

The capacity of computers or other machines to exhibit or simulate intelligent behaviour; the field of study concerned with this. In later use also: software used to perform tasks or produce output previously thought to require human intelligence, esp. by using machine learning to extrapolate from large collections of data.

+
+

While AI is not the central focus of my PhD thesis, I acknowledge its impact in several instances. As a rapidly developing technology, AI has the potential to significantly transform various aspects of society, including the way we describe, analyse, and disseminate CH resources. It is worth mentioning that I endeavour to engage in a broader discourse concerning the domain of AI. In this context, I use the acronym AI to talk about the overarching domain or its ethics, and ML to discuss the specifics of methodologies and algorithmic approaches, while refraining from delving into the intricacies of Deep Learning, which is a distinct subdomain within ML.

+

AI and ML offer great potential for digitising, curating and analysing CH, leveraging the vast digital datasets from CHIs. Examples include text recognition mechanisms using OCR and HTR, NLP and NER for enriching unstructured text, as well as object detection methods for finding patterns within still and moving images (Neudecker, 2022; Sporleder, 2010). Textual works can also be analysed, for instance through sentiment analysis (see Susnjak, 2023), and generated using LLMs, a variety of NLP models such as BERT or ChatGPT that predict the likelihood of a word given the previous words present in recorded texts. However, challenges such as data quality and biases in AI persist (Neudecker, 2022).
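The notion of predicting the likelihood of a word given the previous words can be illustrated with a deliberately tiny stand-in. The sketch below is a bigram count model, which conditions on only one preceding word; it is not how BERT or ChatGPT work internally, and the corpus and function names (`train_bigram`, `next_word_probabilities`) are invented for illustration.

```python
from collections import Counter, defaultdict


def train_bigram(corpus):
    """Count how often each word follows each preceding word."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = ["<s>"] + sentence.lower().split()  # <s> marks sentence start
        for previous, following in zip(tokens, tokens[1:]):
            counts[previous][following] += 1
    return counts


def next_word_probabilities(counts, previous):
    """Turn the raw counts for one context word into relative frequencies."""
    following = counts.get(previous, Counter())
    total = sum(following.values())
    if total == 0:
        return {}
    return {word: count / total for word, count in following.items()}


# Invented toy corpus standing in for "recorded texts".
corpus = [
    "the archive holds photographs",
    "the archive holds maps",
    "the museum holds photographs",
]

model = train_bigram(corpus)
probabilities = next_word_probabilities(model, "holds")
# In this corpus "photographs" follows "holds" twice and "maps" once.
```

The same dependence on recorded texts explains why such models inherit the gaps and biases of their training data: a word that never follows a given context in the corpus receives no probability at all here.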

+

In addition, there are still uncertainties regarding the licensing and reuse of CH datasets by ML algorithms[81]. (Neudecker, 2022) emphasises the importance of well-curated digitised CH resources that are openly licensed, accompanied by relevant metadata, and accessible through APIs or download dumps in various formats. These curated resources have the potential to address the existing gap in this domain.

+

Building on the theme of enhancing CH through digital technologies, (McGillivray et al., 2020) explore the synergies and challenges found at the intersection of DH and NLP. DH is aptly described as ‘a nexus of fields within which scholars use computing technologies to investigate the kinds of questions that are traditional to the humanities […] or who ask traditional kinds of humanities-oriented questions about computing technologies’ (Fitzpatrick, 2010). This broad characterisation encapsulates the transformative potential of digital tools, including ML techniques, in enriching humanities research.

+

(McGillivray et al., 2020) highlight the critical need for bridging the communication gap between DH and NLP to drive progress in both fields. They propose increased interdisciplinary collaboration, encouraging DH researchers to actively utilise NLP tools to refine their research methodologies. A primary challenge in this convergence is the application of NLP to the complex, historical, or noisy texts often encountered in DH research. They conclude by advocating for stronger cooperation between practitioners in these fields. This collaborative effort is vital for harnessing the full potential of ML in analysing and interpreting CH.

+

The use of ML scripts in the context of CH, and beyond, is inherently limited by their applicability, notably when dealing with historical photographs. In such cases, the use of algorithms that are mostly trained and grounded in contemporary image data becomes quite incongruous due to the dissimilarity in temporal contexts. This dilemma is exemplified by datasets such as Microsoft’s Common Objects in Context (COCO)[82] (Lin et al., 2014), where the available data are predominantly contemporary photographic content, which is misaligned with the historical nuances inherent in most of the digitised CH images. (Coleman, 2020) argues that a sound approach would be for ML practitioners to collaborate with libraries, as they can draw practical lessons from critical data studies and the thoughtful integration of AI into their collections, using guidelines from DH. She also contends that handing over datasets would be a disservice to library patrons and that ‘Librarians need to master the instruments of AI and employ them both to learn more about their own resources—to see and analyze them in new ways—and to help shape applications of AI with the expertise and ethos of libraries.’

+

Ethical concerns, particularly regarding social biases and racism, are prevalent in technologies like ImageNet, where facial recognition may yield AI statements with strong negative connotations (Neudecker, 2022). Addressing this, (Gandon, 2019) suggests the production of AI services that are ‘benevolent-by-design for the good of the Web and society’. Furthermore, (Floridi, 2023) introduces the double-charge thesis, asserting that all technology design is a moral act, challenging the neutrality thesis. He emphasises that technologies are not neutral and can be influenced by a dynamic equilibrium of values, predisposing them towards morally good or evil directions.

+

As mentioned previously, ML training datasets are often not representative enough to be properly leveraged in the CH sector (Strien et al., 2022). Fine-tuning is now an active topic, however, and new ground truth datasets have been created and tailored to the needs of CH, such as Viscounth[83], a large-scale VQA dataset for CH in English and Italian (see Becattini et al., 2023). A VQA dataset contains open-ended questions about images which require an understanding of vision, language and commonsense knowledge to answer (Goyal et al., 2017).
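As a rough illustration of what a record in a VQA ground truth dataset might look like, the sketch below pairs an image identifier with a question and a reference answer, and scores predictions by exact match. The field names, the sample records, and the scoring function are assumptions made for illustration; they do not reproduce the actual schema or evaluation protocol of Viscounth.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VQAExample:
    image_id: str   # hypothetical identifier of the digitised image
    question: str   # open-ended question about the image
    answer: str     # reference ("ground truth") answer
    language: str   # e.g. "en" or "it"


def normalise(text):
    """Lower-case and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def exact_match_accuracy(examples, predictions):
    """Share of examples whose predicted answer equals the reference answer."""
    if not examples:
        return 0.0
    correct = sum(
        normalise(predictions.get((e.image_id, e.question), "")) == normalise(e.answer)
        for e in examples
    )
    return correct / len(examples)


# Invented sample records and model outputs.
examples = [
    VQAExample("img-001", "What technique was used?", "Oil on canvas", "en"),
    VQAExample("img-001", "Quale tecnica è stata usata?", "Olio su tela", "it"),
]
predictions = {
    ("img-001", "What technique was used?"): "oil on canvas",
    ("img-001", "Quale tecnica è stata usata?"): "tempera",
}
accuracy = exact_match_accuracy(examples, predictions)
```

Exact match is a strict measure chosen here for brevity; open-ended answers in practice usually call for more tolerant comparison.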

+

(Jaillant & Caputo, 2022) argue that the governance of AI ought to be carried out in partnership with GLAM institutions. However, while this collaboration has been proposed as a promising way forward, it still requires further exploration and evaluation, particularly with regard to the specific challenges and opportunities that it presents. On the one hand, the involvement of GLAMs in AI governance could enhance the development of digital CH projects that promote social justice and equity. On the other hand, this collaboration raises several challenges, such as the need to address issues of privacy, data protection, and intellectual property rights, and to ensure that the values and perspectives of GLAM professionals are adequately represented in the development of AI algorithms and systems. Therefore, it is crucial to examine the specific challenges and opportunities of this collaboration and to develop appropriate frameworks and guidelines that enable effective and ethical governance of AI in the GLAM sector.

+

One initiative that addresses these issues is AI4LAM, an international and participatory community focused on advancing the use of AI in, for and by libraries, archives, and museums[84]. The initiative was launched by the National Library of Norway and Stanford University Libraries in 2018, inspired by the success of the IIIF community. Another is the AEOLIAN Network[85], AI for Cultural Organisations, which investigates the role that AI can play in making born-digital and digitised cultural records more accessible to users (Jaillant & Rees, 2023, p. 582).

+

As an illustrative case, the LoC's exploration into ML technologies, as highlighted by (Allen, 2023), demonstrates a strategic commitment to enhancing the accessibility and utility of its diverse collections. This initiative reflects the LoC's acknowledgement of the transformative potential of ML, balanced with a cautious approach due to the necessity for accurate and responsible information stewardship. The LoC faces several challenges in applying ML, particularly the limitations of commercial AI systems in handling its varied materials and the requirement for substantial human intervention. This cautious exploration into ML is indicative of a broader trend in CHIs, where maintaining a balance between embracing technological advancements and preserving authenticity and integrity is crucial.

+

The specific experiments and projects undertaken by the LoC in the realm of ML are diverse and illustrative of the institution’s comprehensive approach to innovation. For instance, image recognition systems have been tested for identifying and classifying visual elements in artefacts, a task that requires a nuanced understanding of historical and cultural contexts. In another initiative, speech-to-text technology was employed to transcribe spoken word collections, confronting challenges such as accent recognition and audio quality variation. Additionally, the LoC explored the potential of ML in enhancing search and discovery capabilities through projects like Newspaper Navigator[86], which aimed to identify and extract images from digitised newspaper pages.

+

These experiments not only highlight the potential of ML in transforming the way the LoC manages and disseminates its collections but also reveal the complexities and limitations inherent in these technologies. As (Allen, 2023) notes, the ongoing research and experimentation in ML at the LoC are critical in revolutionising access and discovery in the cultural heritage sector. These efforts, while facing challenges, represent a diligent integration of advanced technologies, upholding principles of responsible custodianship and setting a precedent for similar institutions globally in the adoption and adaptation of ML and AI in CHIs.

+

The integration of LLM and KG presents a groundbreaking opportunity, particularly within the realm of CHIs, where there is already considerable expertise. This is aptly demonstrated in the work of (Pan et al., 2023), which elucidates the harmonisation between explicit knowledge and parametric knowledge, i.e. knowledge derived from patterns in data, as learned by models such as LLMs. The authors highlight three key areas for the advancement of KR and processing:

+
    +
  1. Knowledge Extraction, where LLMs improve the extraction of knowledge from diverse sources for applications such as information retrieval and KG construction;
  +
  2. Knowledge Graph Construction, which involves LLMs in tasks such as link prediction and triple extraction from data, albeit with challenges in precision and management of long tail entities;
  +
  3. Training LLMs Using KGs, where KGs provide structured knowledge for LLMs, helping to build retrieval-augmented models on the fly, enriching LLMs with world knowledge and increasing their adaptability.
  +
+
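The retrieval-augmented idea in the third area can be sketched as follows. This is a minimal illustration under stated assumptions, not the approach of (Pan et al., 2023): the tiny KG, the naive string-matching retrieval, and the functions `retrieve` and `build_prompt` are hypothetical, and no actual LLM is called. The sketch only shows how explicit facts from a KG could be selected and placed in front of a model as context.

```python
# Hypothetical miniature knowledge graph as (subject, predicate, object) triples.
KG = [
    ("photo/1", "depicts", "Basel Carnival"),
    ("photo/1", "created_by", "Example Photographer"),
    ("Basel Carnival", "located_in", "Basel"),
    ("photo/2", "depicts", "Alpine pasture"),
]


def retrieve(kg, question, limit=5):
    """Naive retrieval: keep triples whose subject or object occurs in the question."""
    text = question.lower()
    hits = [t for t in kg if t[0].lower() in text or t[2].lower() in text]
    return hits[:limit]


def build_prompt(question, facts):
    """Assemble a prompt that grounds the question in the retrieved facts."""
    lines = [f"- {s} {p.replace('_', ' ')} {o}" for s, p, o in facts]
    return (
        "Answer using only these facts:\n"
        + "\n".join(lines)
        + f"\n\nQuestion: {question}"
    )


question = "Where does the Basel Carnival take place?"
facts = retrieve(KG, question)
prompt = build_prompt(question, facts)

if __name__ == "__main__":
    print(prompt)  # this text would be passed to a language model of one's choice
```

A production system would replace the string matching with entity linking or embedding-based search, but the division of labour stays the same: the KG supplies explicit knowledge, and the model supplies the parametric knowledge needed to phrase an answer.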

In a report for the University of Leeds in the UK, (Pirgova-Morgan, 2023) explores the potential and practical implications of AI in libraries. The project, forming part of the university’s ambitious vision for digital transformation, aims to understand how AI can be effectively integrated into library services. This research looks at both the use of general AI for long term strategic planning and specific AI applications for improving UX, process optimisation and enhancing the discoverability of collections. The methodology used in this study involves a multi-faceted approach including desk-based assessments, a university-wide survey and expert interviews. Specifically, the study highlights the following key findings:

+ +

These insights from the University of Leeds report illustrate the complex impact of AI on library services, from enhancing user interaction to influencing strategic decision-making, while also emphasising the importance of adapting AI applications to specific institutional needs.

+

It must also be stated that AI lacks inherent intelligence and consciousness, and that AI systems have ultimately been built by people. An important concern, notably with LLMs, is the perceptual illusion of cognitive interaction, where the machine appears to be engaging in dialogue and reasoning, when in fact it is generating content through predictive algorithms (see Ridge, 2023). Furthermore, regarding the topic of data colonialism, poor people in underprivileged nations are often burdened with the responsibility of cleaning up the toxic repercussions of AI, shielding affluent individuals and prosperous countries from direct exposure to its harmful effects[87].

+

Concluding this segment, it is essential to perceive ML algorithms as uncertain ‘socio-material configurations’, which can be seen as both powerful and inscrutable, demanding an axiomatic and problem-oriented approach in their understanding and application. (Jaton, 2017) elaborates on this by examining how these algorithms, while technologically complex, are firmly rooted in and shaped by the social, material, and human contexts in which they are developed. Beyond their computational complexity, these algorithms are deeply embedded in the process of constructing ‘ground truths’. These ground truths are not inherent or fixed; instead, they emerge from collaborative efforts that reflect the varied inputs of actors. This process underscores the algorithms as socio-material constructs, influenced by the characteristics and contexts of their creators. Understanding algorithms in this light highlights their deep integration with human actions and societal norms, offering a more nuanced view of their design and implementation (see Jaton, 2021, 2023).

+

3.3.2 Scientific Movements and Guiding Principles

+

First, 3.3.2.1 examines the movement towards more open and transparent forms of research. Open scholarship is a broad concept that encompasses practices such as open access publishing, open data, open source software, and open educational resources. The subsection explores the benefits and challenges of open scholarship, and how it can help to increase the accessibility and impact of research data.

+

Then, 3.3.2.2 explores the growing trend of involving members of the public in scientific research. Citizen science and citizen humanities involve collaborations between scientists and non-expert individuals, with the aim of generating new knowledge or solving complex problems. The subsection examines the benefits and challenges of citizen science and citizen humanities, and how they can help to democratise research.

+

3.3.2.3 examines the set of guiding principles designed to ensure that research outputs are FAIR. It explores the importance of each data principle for research integrity, reproducibility, and collaboration, and provides examples of how they can be implemented in practice.

+

3.3.2.4 explores the importance of ethical and culturally sensitive data governance practices for indigenous communities that are materialised through CARE. These principles provide a framework for managing data in a way that is consistent with the values and cultural traditions of indigenous communities. This part also explores the challenges and opportunities of implementing the CARE Principles for Indigenous Data Governance.

+

Finally, 3.3.2.5 explores the concept of ‘Collections as Data’, a perspective that has emerged from the practical need and desire to improve decades of digital collecting practice. This approach re-conceptualises collections as ordered digital information that is inherently amenable to computational processing.

+
3.3.2.1 Towards Open Scholarship
+

According to FOSTER[88], Open Science can be described as ‘[…] the practice of science in such a way that others can collaborate and contribute, where research data, lab notes and other research processes are freely available, under terms that enable reuse, redistribution and reproduction of the research and its underlying data and methods.’ (FOSTER, 2019).

+

In recent years, the principles of Open Science, which historically include Open methodology, Open source, Open data, OA, Open peer review, as well as open educational resources, have become increasingly important as they emphasise transparency, collaboration and accessibility in scientific research (Bezjak et al., 2019). Open methodology refers to the sharing of research processes and methods, allowing other researchers to reproduce and build on existing work (see Vicente-Saez & Martinez-Fuentes, 2018). Open source software and tools enable researchers to collaborate, while open data practices promote the sharing of research data in ways that are accessible, discoverable and reusable by others[89]. Open access seeks to remove financial and other barriers to accessing scientific knowledge, while open peer review provides greater transparency and accountability in the publication process. Finally, open educational resources encourage the sharing of teaching and learning materials, thereby facilitating the dissemination of knowledge and skills.

+

(UNESCO, 2019) conducted a preliminary study of the technical, financial and legal considerations related to the promotion of Open Science. This research underscored the necessity for a holistic approach to Open Science and stressed the significance of tackling international legal matters, as well as the existing challenges stemming from unequal access to justice, which can hinder global scientific collaboration. This study laid the groundwork for a recommendation on making ‘[…] multilingual scientific knowledge openly available, accessible and reusable for everyone, to increase scientific collaborations and sharing of information for the benefits of science and society, and to open the processes of scientific knowledge creation’ (UNESCO, 2021, p. 7). UNESCO identified five types of access related to Open Science: infrastructures, societal actors, as well as associated and diverse knowledge systems where dialogue is needed. This includes acknowledging the rights of indigenous peoples and local communities to govern and make decisions on the custodianship, ownership, and administration of data on traditional knowledge and on their lands and resources. Figure 3.10 provides a visual summary of this.

+
+ Figure 3.10: Open Science Elements, Redrawn Slide from Presentation of Ana Persic [(Morrison, 2021) citing (Persic, 2021)] +
+
+

While Open Science offers numerous benefits, it also presents challenges and potential drawbacks that warrant careful consideration. One major concern is the risk of exacerbating inequities between researchers from well-resourced institutions and those from less privileged backgrounds. Open access publishing often entails significant costs in the form of article processing charges, which can disproportionately burden researchers without adequate funding support (Burchardt, 2014). Additionally, Open Science practices relying on open protocols may be vulnerable to misuse, such as automated bots excessively crawling open repositories or datasets. This can lead to overloading systems, unauthorised data extraction, or unintended uses of research outputs (see Irish & Saba, 2023; Li et al., 2021). These risks underscore the importance of balancing openness with safeguards that ensure equitable participation and secure, sustainable access to research materials.

+

These challenges are particularly relevant in the context of DH, a field that harnesses the promise and impact of digital technologies and methodologies for the study and understanding of cultural phenomena. The adoption of Open Science principles has contributed to greater collaboration, transparency and accessibility in research practices in this field. Open data practices are particularly relevant, as they allow scholars to work with large and complex datasets, including digitised archives and social media data. Open educational resources can also be used to support the dissemination of CH literacy and skills, enabling wider audiences to engage with such resources. However, ensuring that such openness does not exacerbate inequities or introduce vulnerabilities requires thoughtful implementation.

+

In addition to the principles of Open Science, the concept of Open Scholarship has been introduced by (Tennant et al., 2020) as a broader approach that encompasses the arts and humanities and goes beyond the research community to the wider public. Open Scholarship emphasises the importance of making research and scholarship accessible to a wider audience, including non-experts, educators and policy makers. It can be particularly relevant to the arts and humanities, as they often deal with complex cultural materials and narratives that have wider societal implications. By making their work openly accessible and engaging with non-experts, humanities researchers can contribute to public discourse, promote cultural understanding, and inform policy and decision-making. Open Scholarship can also support greater collaboration and innovation within the arts and humanities by enabling researchers to work collaboratively across disciplines and with a wide range of constituents. For instance, open educational resources can be used to develop collaborative teaching and learning materials that draw on the expertise of scholars and practitioners from different disciplines, while open data practices can facilitate the sharing and reuse of CH materials.

+

Conversely, (Knöchelmann, 2019) advocates for the term Open Humanities as a dedicated discourse within the humanities. Notably, he argues that Open Humanities should adapt key Open Science elements to the humanities’ unique context. In the case of preprints, the challenges in the humanities, such as limited discipline-specific preprint servers and linguistic diversity, require tailored solutions to encourage adoption. Open peer review in the humanities should accommodate the field’s subjectivity and diverse perspectives. Concerns about liberal copyright licenses revolve around potential misrepresentation and plagiarism, highlighting the importance of maintaining scholarly integrity regardless of the chosen license. Knöchelmann’s proposal underscores the need for context-sensitive approaches to promote openness and collaboration while respecting the humanities’ distinct characteristics.

+

Overall, the principles of Open Science provide a framework for promoting greater collaboration, transparency and accessibility in research practices. Yet, the challenges discussed underscore the need for careful adaptation to address inequities, cybersecurity concerns, and field-specific nuances. The concept of Open Scholarship, which stresses the importance of making research and scholarship accessible to wider audiences, can be instrumental in broadening the impact of research in both natural sciences and the humanities, as Open Science encourages greater collaboration and innovation across disciplines. Ultimately, this underscores the need for adaptation and positions all academic disciplines as essential contributors to societal understanding, cultural preservation and informed decision-making, while ensuring the sustainability and integrity of open practices.

+
3.3.2.2 Citizen Science, Citizen Humanities
+

(…)

+
3.3.2.3 FAIR Data Principles
+

The FAIR data principles[90] were developed to ensure that three types of entities – namely data, metadata, as well as infrastructures – are Findable, Accessible, Interoperable, and Reusable. The four key principles of FAIR and their underlying 15 sub-elements or facets are as follows (Wilkinson et al., 2016):

+

(…)

+
3.3.2.4 CARE Principles for Indigenous Data Governance
+

(…)

+
3.3.2.5 Collections as Data
+

(…)

(…)

4. Exploring Relationships through an Actor-Network Theory Lens

@@ -849,7 +939,7 @@

8. Yale’s LUX and LOUD Consistency

(…)

9. Discussion

-

[Il] faut renoncer à l’idée d’une interopérabilité syntaxique ou structurelle par l’utilisation d’un modèle unique, qu’il s’agisse de la production, de stockage ou de l’exploitation au sein même d’un [système d’information]. (Poupeau, 2018) [76]

+

[Il] faut renoncer à l’idée d’une interopérabilité syntaxique ou structurelle par l’utilisation d’un modèle unique, qu’il s’agisse de la production, de stockage ou de l’exploitation au sein même d’un [système d’information]. (Poupeau, 2018) [91]

This chapter presents a comprehensive discussion where I interpret, analyse and critically examine my findings in relation to the thesis and the wider application of LOUD. Through an in-depth analysis of the design principles of LOUD and their implications for CH, this discussion aims to demonstrate the many challenges and opportunities inherent in this framework. The focus is on achieving community-driven consensus, rather than simply pursuing technological breakthroughs.

The following sections are organised to provide a comprehensive review of the empirical findings, an evaluation abstracting LOUD, and a retrospective analysis of the research journey. Firstly, in Section 9.1, I will present a summary of the empirical findings from my research. This will include key themes and insights, structured to reflect the different areas of study and practice within LOUD.

@@ -859,7 +949,7 @@

9.1 Empirical Findings

This section summarises the empirical findings of my research and already offers some suggestions. The structure does not follow the exact order of the three empirical chapters but is organised around overarching topics that emerged throughout the study. The seven topics include Community Practices and Standards, Inclusion and Marginalised Groups, Maintenance and Community Engagement, Interoperability and Usability, Future Directions and Sustainability, Digital Materiality and Representation, as well as Challenges of Scaling and Implementation.

Community Practices and Standards

GitHub serves as a vital hub for community involvement, with a core group of active contributors often attending meetings regularly. This platform simplifies decision-making within the community, although it also reflects biases similar to those in FLOSS communities. Behind visible activities like meetings, there is substantial preparatory work managed by co-chairs, editorial boards, or driven by community-generated use cases. This foundational work often determines the direction and outcomes of formal gatherings. The LUX project at Yale has successfully fostered collaboration across various units, bringing together libraries and museums on a unified platform. The technological foundation of LUX, based on open standards, facilitates data integration and cross-collections discovery.

-

Not only does the deployment of FLOSS tools contribute to these achievements, but it also emphasises the social advantages of working collaboratively. The concept of the Tragedy of the Commons, as described by (Hardin, 1968), highlights the potential for individual self-interest to deplete shared resources. However, (Ostrom, 1990) offers a counterpoint by demonstrating how communities can successfully manage common resources through collective action and shared norms. In this context, initiatives like CHAOSS[77] play a significant role by providing metrics that help evaluate the health and sustainability of open source communities. These metrics include contributions, issue resolution times, and community growth, offering valuable insights into how collaborative efforts can be maintained and improved.

+

Not only does the deployment of FLOSS tools contribute to these achievements, but it also emphasises the social advantages of working collaboratively. The concept of the Tragedy of the Commons, as described by (Hardin, 1968), highlights the potential for individual self-interest to deplete shared resources. However, (Ostrom, 1990) offers a counterpoint by demonstrating how communities can successfully manage common resources through collective action and shared norms. In this context, initiatives like CHAOSS[92] play a significant role by providing metrics that help evaluate the health and sustainability of open source communities. These metrics include contributions, issue resolution times, and community growth, offering valuable insights into how collaborative efforts can be maintained and improved.
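
As an illustration of one metric of this kind, the following minimal sketch computes a median issue resolution time. It is not CHAOSS tooling: the issue records, the `ISSUES` list and both function names are hypothetical and serve only to show how such a figure could be derived from opening and closing dates.

```python
from datetime import datetime
from statistics import median

# Hypothetical (opened, closed) dates; not real data from any repository.
ISSUES = [
    ("2024-01-02", "2024-01-05"),
    ("2024-01-10", "2024-01-11"),
    ("2024-02-01", "2024-02-15"),
]


def resolution_days(opened: str, closed: str) -> int:
    """Number of whole days between opening and closing an issue."""
    fmt = "%Y-%m-%d"
    return (datetime.strptime(closed, fmt) - datetime.strptime(opened, fmt)).days


def median_resolution_days(issues) -> float:
    """Median resolution time in days across all (opened, closed) pairs."""
    return median(resolution_days(opened, closed) for opened, closed in issues)
```

The median is used here rather than the mean so that a single long-running issue does not dominate the figure; this is an assumption of the sketch, not a prescription.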

Reaching consensus is another critical aspect of community practices and standards. While the minutes of meetings are valuable artefacts, they often reflect an Anglo-Saxon approach to decision-making characterised by few substantive points and critical turning points. The formal aspects of conversations captured in minutes do not fully encompass the decision-making process, which frequently involves informal conversations, consensus-building through open dialogue, and subtle cues that influence outcomes. These elements are integral to the English and American approach and hold valuable lessons for an international community. IIIF and Linked Art are international communities, but decisions are made in English and the majority of participants are based in North America and the UK, significantly imprinting this approach. Understanding these nuances can help us improve our collaborative efforts within the IIIF and Linked Art communities. By recognising and appreciating these different facets of decision-making, we can learn from each other and enhance our collective ability to make effective and inclusive decisions.

Some of the challenges associated with these practices include the major demand on resources for community building, the slowness inherent in distributed development, and the difficulty in achieving consensus. Additionally, the concept of social sustainability can be seen as an imaginary construct that papers over differences. Addressing these challenges is crucial for the long-term success and effectiveness of the IIIF and Linked Art communities.

Inclusion and Marginalised Groups

@@ -878,7 +968,7 @@

9.1 Empirical Findings

As a way forward, I would suggest that the IIIF and Linked Art communities focus on further improving the usability of the specifications. This includes conducting comprehensive usability assessments of APIs to evaluate the experiences of new developers versus existing ones, understanding the steepness of the learning curve associated with each API, and guiding improvements in documentation, on-boarding processes, and overall developer support. Efforts should be made to lower the barriers to entry for new developers by developing more intuitive and user-friendly tutorials, providing example projects, and creating a robust support community. Ensuring that developers can quickly and effectively leverage APIs will foster greater adoption. Addressing the challenges of transitioning between different versions of specifications is critical, and developing tools and guidelines that help maintain consistency across versions will reduce friction and ensure smoother updates.

Future Directions and Sustainability

Survey findings underscore the need for ongoing efforts to develop LOUD standards that foster an inclusive, dynamic digital ecosystem. Future strategies should include creating educational resources and frameworks that support interdisciplinary collaboration and reduce barriers to participation. While the Manifest serves as the fundamental unit within IIIF, the Linked Art protocol can play a similarly central role as a semantic gateway in broader contexts, allowing round-tripping across the APIs. The topic modelling exercise in LUX reveals complex actor-networks of organisations, individuals, and non-human actors, providing insights into the relationships sustaining the LUX initiative.

-

The next steps for Linked Art might involve forming a new consortium independent of a CIDOC Working Group, which could provide the necessary support to sustain the initiative. Alternatively, integrating Linked Art into IIIF as a new TSG and specification could address the discovery challenges within IIIF, as discussed during the birds of a feather session led by Robert Sanderson (see Raemy, 2024) at the 2024 IIIF Conference in Los Angeles[78]. Design principles that act as bridges across different disciplines, as proposed by (Roke & Tillman, 2022), are crucial. IIIF has demonstrated that this collaborative approach is feasible, and Linked Art could follow in its footsteps. However, achieving this requires increased dedication from passive members and broader adoption of the model and the API ecosystem in the near future.

+

The next steps for Linked Art might involve forming a new consortium independent of a CIDOC Working Group, which could provide the necessary support to sustain the initiative. Alternatively, integrating Linked Art into IIIF as a new TSG and specification could address the discovery challenges within IIIF, as discussed during the birds of a feather session led by Robert Sanderson (see Raemy, 2024) at the 2024 IIIF Conference in Los Angeles[93]. Design principles that act as bridges across different disciplines, as proposed by (Roke & Tillman, 2022), are crucial. IIIF has demonstrated that this collaborative approach is feasible, and Linked Art could follow in its footsteps. However, achieving this requires increased dedication from passive members and broader adoption of the model and the API ecosystem in the near future.

Digital Materiality and Representation

As explored in Chapter 7, the detailed digital representation of photographic albums, such as the Kreis Family Collection, demonstrates the need to comprehensively capture the materiality of digital objects. This includes the structure and context of images, which are crucial for maintaining their historical and social significance. The implementation of the IIIF Presentation API in creating a detailed digital replica of the Getty’s Bayard Album shows how digital materiality can be enhanced through thoughtful use of technology, but also highlights the scalability challenges for such detailed representations.

Creating these detailed digital representations can be seen as a ‘boutique’ approach, which, while labour-intensive and resource-demanding, is necessary for preserving the integrity and contextual significance of cultural heritage objects. The challenge lies in developing the appropriate means and methodologies to achieve this level of detail consistently. Future endeavours, whether through research projects or collaborative efforts between GLAM institutions and DH practitioners, should aim to address these challenges and create sustainable practices for digital materiality and representation. As Edwards aptly notes:

@@ -888,7 +978,7 @@

9.1 Empirical Findings

Challenges of Scaling and Implementation

As seen in Chapter 6, the IIIF Cookbook recipes and Linked Art patterns reflect the tension between creating advanced specifications and their practical implementation. This gap between ideation and real-world application underscores the challenges faced by the community in achieving broad adoption and interoperability. In Chapter 7, the exploration of APIs like the IIIF Change Discovery API illustrates the practical challenges and potential of scaling these technologies for wider adoption. The successful implementation in PIA demonstrates viability, but also points to the need for continued development and community engagement to fully realise the benefits.

-

Furthermore, assessing the scalability of IIIF image servers, as discussed by (Duin, 2022) and exemplified by the firm Q42 with their Edge-based service Micrio[79], highlights the importance of optimising data performance. Erwin Verbruggen aptly noted that ‘optimising data performance in my opinion mens sending as little data over as needed’[80], emphasising the need for efficient data handling to enhance scalability. This insight reinforces the necessity of continual refinement in scaling digital infrastructure to support broader use and integration.

+

Furthermore, assessing the scalability of IIIF image servers, as discussed by (Duin, 2022) and exemplified by the firm Q42 with their Edge-based service Micrio[94], highlights the importance of optimising data performance. Erwin Verbruggen aptly noted that ‘optimising data performance in my opinion mens sending as little data over as needed’[95], emphasising the need for efficient data handling to enhance scalability. This insight reinforces the necessity of continual refinement in scaling digital infrastructure to support broader use and integration.
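
The idea of sending as little data as needed can be illustrated with the IIIF Image API, whose URL syntax lets a client request only a region of an image at a reduced size. The sketch below is a minimal example under stated assumptions: the base URL `https://example.org/iiif`, the identifier `album-page-1` and the helper name `iiif_image_url` are hypothetical, and the parameter order follows the Image API pattern `{identifier}/{region}/{size}/{rotation}/{quality}.{format}`.

```python
from urllib.parse import quote


def iiif_image_url(base: str, identifier: str, region: str = "full",
                   size: str = "max", rotation: str = "0",
                   quality: str = "default", fmt: str = "jpg") -> str:
    """Build an IIIF Image API request URL for a (hypothetical) image service."""
    # The identifier is percent-encoded so that reserved characters such as
    # slashes do not break the path segments.
    encoded_id = quote(identifier, safe="")
    return f"{base}/{encoded_id}/{region}/{size}/{rotation}/{quality}.{fmt}"


# Request a 1024 x 1024 pixel region from the top-left corner, scaled down to
# a width of 512 pixels, rather than the full-resolution image.
tile_url = iiif_image_url(
    "https://example.org/iiif", "album-page-1",
    region="0,0,1024,1024", size="512,",
)
```

Requesting a scaled region in this way means the server only has to deliver the pixels a viewer actually displays, which is one reading of the remark quoted above.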

Reflecting on these findings, I would like to assert that continuous participation, particularly for institutions that can afford to be part of initiatives like IIIF-C, is essential. Active members should not only focus on their own use cases but also consider the needs and perspectives of other, perhaps marginalised, groups. Achieving the dual goals of making progress within one community, whether it be IIIF or Linked Art, while also engaging in effective outreach and creating a solid baseline, will benefit everyone in the CH sector and beyond. Addressing where LOUD fits in, how people perceive this new concept or paradigm, and understanding how LOUD differs from Linked Data in general are essential. These questions help to clarify the stages at which themes related to one of the LOUD design principles emerge, crystallise, and potentially disappear. My thesis does not fully resolve these queries but offers insights and hints for further exploration.

In conclusion, the empirical findings reveal the richness of the implementation and maintenance of LOUD standards in the CH domain. From the critical role of community practices and standards to the challenges of achieving interoperability and inclusivity, each theme underlines the complex interplay of social, technical and organisational factors. The next section will look at the evaluation of LOUD and explore its overall impact, delving into the delta of what to do with it, particularly in terms of Linked Data versus LOUD, where my thesis provides pointers but does not provide definitive answers.

9.2 Evaluation: Abstracting LOUD

@@ -1005,7 +1095,7 @@

10. Conclusion

  • Rijksmuseum: https://www.rijksmuseum.nl/ ↩︎

  • In the original version, these instances contained typographical -or factual errors. They have been struck through and corrected here. ↩︎

    +or factual errors. They have been struck through and corrected here. ↩︎ ↩︎

  • (Zeng & Qin, 2022) [p. 11] articulate that ‘as with “data”, metadata can be either singular or plural. It is used as singular in the sense of a kind of data; however, in plural form, the term refers to things one can count’. In the context of this thesis, I have chosen to favour the plural form of (meta)data. However, I acknowledge that I may occasionally use the singular form when referring to the overarching concepts or when quoting references verbatim. ↩︎

  • @@ -1170,24 +1260,87 @@

    10. Conclusion

    the creation of the first DL modelling languages in the mid-1980s (Krötzsch et al., 2013). ↩︎

    -
  • Author’s translation: ‘We need to give up on the idea of syntactic or structural interoperability through the use of a single model, whether for producing, storing or managing data within an information system’. ↩︎

    +
  • LinkedDataGPT: https://ld.gpt.liip.ch/ ↩︎

    +
  • +
  • Neo4j: https://neo4j.com/ ↩︎

    +
  • +
  • GB and +PB are units of +digital information storage capacity. 1 GB is equal to 1,000,000,000 ($10^{9}$) +bytes, 1 TB is +equal to 1,000,000,000,000 ($10^{12}$) bytes, and 1 +PB is equal to +1,000,000,000,000,000 ($10^{15}$) bytes. If a standard +high-definition movie is around 4-5 GB, then 1 PB could store hundreds of thousands of +movies. In 2011, (Gomes et al., 2011) [p. 414] reported that the +Internet Archive held 150,000 million contents of archived websites +– crawled through the Wayback Machine – or approximately 5.5 +PB. As of +December 2021, it was about 57 PB of archived websites and a total used +storage of 212 PB, see +https://archive.org/web/petabox.php. ↩︎

    +
  • +
  • In this context, UX is understood as an umbrella term +encompassing both user and/or customer service, emphasising that the +focus is on individuals who need or use a given service, regardless +of their categorisation as users or customers. ↩︎

    +
  • +
  • According to (Nargesian et al., 2019) [p. 1986], a data lake is a +vast collection of datasets that has four characteristics. It can be +stored in different storage systems, exhibit varying formats, may +lack useful metadata or use differing metadata formats, and can +change autonomously over time. ↩︎

    +
  • +
  • An interesting initiative in this area is the use of +RAIL, which +empower developers to restrict the use of AI on the software they develop to +prevent irresponsible and harmful applications: +https://www.licenses.ai/ ↩︎

    +
  • +
  • Common Objects in Context: https://cocodataset.org/ ↩︎

    +
  • +
  • Viscounth – A Large Dataset for Visual Question Answering for Cultural Heritage: https://github.com/misaelmongiovi/IDEHAdataset ↩︎

    +
  • +
  • Artificial Intelligence for Libraries, Archives & Museums: +https://sites.google.com/view/ai4lam ↩︎

    +
  • +
  • AEOLIAN Network: https://www.aeolian-network.net/ ↩︎

    +
  • +
  • Newspaper Navigator: https://news-navigator.labs.loc.gov/ ↩︎

    +
  • +
  • (Perrigo, 2023) reported that Kenyan workers made +less than USD 2 an hour to identify and filter out harmful content +for ChatGPT. ↩︎

    +
  • +
  • FOSTER Plus (Fostering the practical implementation of Open +Science in Horizon 2020 and beyond) was a 2-year EU-funded project +initiated in 2017 with 11 partners across 6 countries. Its main goal +was to promote a lasting shift in European researchers’ behaviour +towards Open Science becoming the norm. ↩︎

    +
  • +
  • The Open Knowledge Foundation, a non-profit network established in 2004 in the U.K. that aims to promote the idea of open knowledge, sets out some principles around the concept of openness and defines it as follows: ‘Open means anyone can freely access, use, modify, and share for any purpose (subject, at most, to requirements that preserve provenance and openness)’. https://opendefinition.org/ ↩︎

    +
  • +
  • FAIR +Principles: https://www.go-fair.org/fair-principles/ ↩︎

    +
  • +
  • Author’s translation: ‘We need to give up on the idea of syntactic or structural interoperability through the use of a single model, whether for producing, storing or managing data within an information system’. ↩︎

  • -
  • CHAOSS: -https://chaoss.community/ ↩︎

    +
  • CHAOSS: +https://chaoss.community/ ↩︎

  • -
  • IIIF +

  • IIIF Annual Conference and Showcase - Los Angeles, CA, USA - June 4-7, -2024: https://iiif.io/event/2024/los-angeles/ ↩︎

    +2024: https://iiif.io/event/2024/los-angeles/ ↩︎

  • -
  • Micrio: https://micr.io/ ↩︎

    +
  • Micrio: https://micr.io/ ↩︎

  • -
  • Message written on the IIIF Slack Workspace on 28 October 2022. ↩︎

    +
  • Message written on the IIIF Slack Workspace on 28 October 2022. ↩︎

  • -

    Bibliography

    Adamou, Alessandro, Picca, Davide, Hou, Yumeng, & Loreto Granados-García, Paula. (2023). The Facets of Intangible Heritage in Southern Chinese Martial Arts: Applying a Knowledge-driven Cultural Contact Detection Approach. Journal on Computing and Cultural Heritage, 16(3), 63:1-63:27. https://doi.org/10.1145/3606702
    Ahmad, Yahaya. (2006). The Scope and Definitions of Heritage: From Tangible to Intangible. International Journal of Heritage Studies, 12(3), 292–300. https://doi.org/10.1080/13527250600604639
    Akrich, Madeleine, Callon, Michel, & Latour, Bruno (Eds.). (2006). Sociologie de la traduction: Textes fondateurs. Presses des Mines. https://doi.org/10.4000/books.pressesmines.1181
    Alter, George, Rizzolo, Flavio, & Schleidt, Kathi. (2023). View points on data points: A shared vocabulary for cross-domain conversations on data and metadata. IASSIST Quarterly, 47(1), 1–39. https://doi.org/10.29173/iq1051
    Aslan, Zaki. (1997). Protective Structures for the Conservation and Presentation of Archaeological Sites. Journal of Conservation and Museum Studies, 3(0), 16. https://doi.org/10.5334/jcms.3974
    Avram, Henriette D. (1968). The MARC Pilot Project. Final Report (ED029663; p. 173). Library of Congress. https://eric.ed.gov/?id=ED029663
    Azzopardi, Elaine, Kenter, Jasper O., Young, Juliette, Leakey, Chris, O’Connor, Seb, Martino, Simone, Flannery, Wesley, Sousa, Lisa P., Mylona, Dimitra, Frangoudes, Katia, Béguier, Irène, Pafi, Maria, Silva, Arturo Rey da, Ainscough, Jacob, Koutrakis, Manos, Silva, Margarida Ferreira da, & Pita, Cristina. (2023). What are heritage values? Integrating natural and cultural heritage into environmental valuation. People and Nature, 5(2), 368–383. https://doi.org/10.1002/pan3.10386
    Baader, Franz, & Lutz, Carsten. (2007). 13 Description logic. In Patrick Blackburn, Johan Van Benthem, & Frank Wolter (Eds.), Studies in Logic and Practical Reasoning (Vol. 3, pp. 757–819). Elsevier. https://doi.org/10.1016/S1570-2464(07)80016-4
    Baca, Murtha, & Harpring, Patricia. (2017). Categories for the description of works of art [Report]. Getty Research Institute. https://apo.org.au/node/14985
    Bekiari, Chryssoula, Bruseker, George, Doerr, Martin, Ore, Christian-Emil, Stead, Stephen, & Velios, Athanasios. (2021). CIDOC Conceptual Reference Model 7.1.1. https://doi.org/10.26225/FDZH-X261
    Beretta, Francesco. (2022). Interopérabilité des données de la recherche et ontologies fondationnelles : Un éco-système d’extensions du CIDOC CRM pour les sciences humaines et sociales. In Nicolas Lasolle, Olivier Bruneau, & Jean Lieber (Eds.), Actes des journées humanités numériques et Web sémantique (pp. 2–22). Les Archives Henri-Poincaré - Philosophie et Recherches sur les Sciences et les Technologies (AHP-PReST); Laboratoire lorrain de recherche en informatique et ses applications (LORIA). https://doi.org/10.5281/zenodo.7014341
    Berressem, Hanjo. (2015). Déjà Vu: Serres after Latour, Deleuze after Harman, ‘Nature Writing’ after ‘Network Theory’. Amerikastudien / American Studies, 60(1), 59–79. https://www.jstor.org/stable/44071895
    Blue Shield. (2016). Blue Shield Statutes (Articles of Association) (p. 16). https://web.archive.org/web/20230802104458/https://theblueshield.org/wp-content/uploads/2021/12/statute-Amendments_BSI_2016.pdf
    Borgo, Stefano, Ferrario, Roberta, Gangemi, Aldo, Guarino, Nicola, Masolo, Claudio, Porello, Daniele, Sanfilippo, Emilio M., & Vieu, Laure. (2022). DOLCE: A descriptive ontology for linguistic and cognitive engineering. Applied Ontology, 17(1), 45–69. https://doi.org/10.3233/AO-210259
    Bowman, Blythe A. (2008). Transnational Crimes Against Culture: Looting at Archaeological Sites and the “Grey” Market in Antiquities. Journal of Contemporary Criminal Justice, 24(3), 225–242. https://doi.org/10.1177/1043986208318210
    Brown, Karen, Cummins, Alissandra, & González Rueda, Ana S. (Eds.). (2023). Communities and Museums in the 21st Century: Shared Histories and Climate Action (1st ed.). Routledge. https://doi.org/10.4324/9781003288138
    Bruseker, George, Carboni, Nicola, & Guillem, Anaïs. (2017). Cultural Heritage Data Management: The Role of Formal Ontology and CIDOC CRM. In Matthew L. Vincent, Víctor Manuel López-Menchero Bendicho, Marinos Ioannides, & Thomas E. Levy (Eds.), Heritage and Archaeology in the Digital Age: Acquisition, Curation, and Dissemination of Spatial Cultural Heritage Data (pp. 93–131). Springer International Publishing. https://doi.org/10.1007/978-3-319-65370-9_6
    Callon, Michel. (2001). Actor Network Theory. In Neil J. Smelser & Paul B. Baltes (Eds.), International Encyclopedia of the Social & Behavioral Sciences (pp. 62–66). Pergamon. https://doi.org/10.1016/B0-08-043076-7/03168-5
    Cameron, Fiona. (2007). Beyond the Cult of the Replicant: Museums and Historical Digital Objects—Traditional Concerns, New Discourses. In Fiona Cameron & Sarah Kenderdine (Eds.), Theorizing Digital Cultural Heritage: A Critical Discourse. The MIT Press. https://doi.org/10.7551/mitpress/9780262033534.003.0004
    Canning, Erin, Brown, Susan, Roger, Sarah, & Martin, Kimberley. (2022). The Power to Structure : Making Meaning from Metadata Through Ontologies. KULA: Knowledge Creation, Dissemination, and Preservation Studies, 6(3), 1–15. https://doi.org/10.18357/kula.169
    Cantara, Linda. (2005). METS: The Metadata Encoding and Transmission Standard. Cataloging & Classification Quarterly, 40(3–4), 237–253. https://doi.org/10.1300/J104v40n03_11
    Caplan, Priscilla, & Guenther, Rebecca S. (2005). Practical Preservation: The PREMIS Experience. Library Trends, 54(1), 111–124. https://muse.jhu.edu/pub/1/article/193223
    Carman, John. (2009). Where the Value Lies: The importance of materiality to the immaterial aspects of heritage. In Emma Waterton & Laurajane Smith (Eds.), Taking Archaeology Out of Heritage (pp. 192–208). Cambridge Scholars Publishing.
    Chang, Liang, Sattler, Uli, & Gu, Tianlong. (2014). An ABox Revision Algorithm for the Description Logic EL_bot. In Meghyn Bienvenu, Magdalena Ortiz, Riccardo Rosati, & Mantas Simkus (Eds.), Informal Proceedings of the 27th International Workshop on Description Logics (Vol. 1193, pp. 459–470). CEUR. https://ceur-ws.org/Vol-1193/#paper_64
    Charles, Valentine, & Isaac, Antoine. (2015). Enhancing the Europeana Data Model (EDM) (p. 21) [White paper]. Europeana Foundation. http://pro.europeana.eu/files/Europeana_Professional/Publications/EDM_WhitePaper_17062015.pdf
    Chiquet, Vera, Felsing, Ulrike, & Fornaro, Peter. (2023). A Participatory Interface for a Photo Archives. Archiving Conference, 20, 109–111. https://doi.org/10.2352/issn.2168-3204.2023.20.1.23
    Chiquet, Vera. (2023). How to digitally preserve UNESCO intangible cultural heritage? A web-archive for ephemeral events at the Basler Carnival. Archiving Conference, 20, 105–108. https://doi.org/10.2352/issn.2168-3204.2023.20.1.22
    Clavaud, Florence, & Wildi, Tobias. (2021). ICA Records in Contexts-Ontology (RiC-O): A Semantic Framework for Describing Archival Resources. Linked Archives International Workshop 2021, 3019, 79–92. https://enc.hal.science/hal-03965776
    Coburn, Erin, Lanzi, Elisa, O’Keefe, Elizabeth, Stein, Regine, & Whiteside, Ann. (2010). The Cataloging Cultural Objects experience: Codifying practice for the cultural heritage community. IFLA Journal, 36(1), 16–29. https://doi.org/10.1177/0340035209359561
    Coburn, Erin, Light, Richard, McKenna, Gordon, Stein, Regine, & Vitzthum, Axel. (2010). LIDO - Lightweight Information Describing Objects Version 1.0. https://lido-schema.org/schema/v1.0/lido-v1.0-specification.pdf
    Constantopoulos, Panos, & Dallas, Costis. (2008). Aspects of a digital curation agenda for cultural heritage. 2008 IEEE International Conference on Distributed Human-Machine Systems. Athens, Greece: IEEE, 1–6.
    Conway, Paul. (2015). Digital transformations and the archival nature of surrogates. Archival Science, 15(1), 51–69. https://doi.org/10.1007/s10502-014-9219-z
    Cornut, Murielle, Raemy, Julien Antoine, & Spiess, Florian. (2023). Annotations as Knowledge Practices in Image Archives: Application of Linked Open Usable Data and Machine Learning. Journal on Computing and Cultural Heritage, 16(4), 1–19. https://doi.org/10.1145/3625301
    Cornut, Murielle. (2023). Open, edit, save: Über die performative Materialität privater Fotoalben. In Ulrich Hägele (Ed.), Kuratierte Erinnerungen: Das Fotoalbum (pp. 157–170). Waxmann.
    Cossham, Amanda Frances. (2017). Models of the bibliographic universe [PhD Thesis, Monash University]. https://doi.org/10.4225/03/596e9bc6c1d09
    Coyle, Karen, & Hillmann, Diane. (2007). Resource Description and Access (RDA): Cataloging Rules for the 20th Century. D-Lib Magazine, 13(1/2). https://doi.org/10.1045/january2007-coyle
    Dahlgren, Anna, & Hansson, Karin. (2020). The Diversity Paradox: Conflicting Demands on Metadata Production in Cultural Heritage Collections. Digital Culture & Society, 6(2), 239–256. https://doi.org/10.14361/dcs-2020-0212
    De Muynke, Julien, Baltazar, Marie, Monferran, Martin, Voisenat, Claudie, & Katz, Brian F. G. (2022). Ears of the past, an inquiry into the sonic memory of the acoustics of Notre-Dame before the fire of 2019. Journal of Cultural Heritage. https://doi.org/10.1016/j.culher.2022.09.006
    Delmas-Glass, Emmanuelle, & Sanderson, Robert. (2020). Fostering a community of PHAROS scholars through the adoption of open standards. Art Libraries Journal, 45(1), 19–23. https://doi.org/10.1017/alj.2019.32
    Denton, William. (2006). Functional Requirements for Bibliographic Records (FRBR): Hype or Cure-All? Portal: Libraries and the Academy, 6(2), 231–232. https://doi.org/10.1353/pla.2006.0018
    Digital Preservation Coalition. (2017). Persistent Identifiers. In Digital Preservation Handbook. DPC. https://www.dpconline.org/handbook/technical-solutions-and-tools/persistent-identifiers
    Dijkshoorn, Chris. (2023, October). Building Collection Data Infrastructure at the Rijksmuseum. EuropeanaTech 2023.
    Doerr, Martin. (2003). The CIDOC Conceptual Reference Module: An Ontological Approach to Semantic Interoperability of Metadata. AI Magazine, 24(3), 75–92. https://doi.org/10.1609/aimag.v24i3.1720
    Duin, Marcel. (2022). WebAssembly: Beyond the Browser. In Q42 Engineering. https://engineering.q42.nl/webassembly-beyond-the-browser/
    Edmunds, Jeff. (2023). BIBFRAME Must Die. ScholarSphere, 1–7. https://doi.org/10.26207/V18M-0G05
    Edwards, Elizabeth, & Hart, Janice (Eds.). (2004). Photographs Objects Histories (1st Edition). Routledge. https://doi.org/10.4324/9780203506493
    Ehrlinger, Lisa, & Wöß, Wolfram. (2016). Towards a Definition of Knowledge Graphs. In Michael Martin, Martí Cuquet, & Erwin Folmer (Eds.), Joint Proceedings of the Posters and Demos Track of the 12th International Conference on Semantic Systems - SEMANTiCS2016 and the 1st International Workshop on Semantic Change & Evolving Semantics (SuCCESS’16) (Vol. 1695). CEUR. https://ceur-ws.org/Vol-1695/#paper4
    Endres, Bill. (2019). Digitizing Medieval Manuscripts: The St. Chad Gospels, Materiality, Recoveries, and Representation in 2D & 3D. In Digitizing Medieval Manuscripts. ARC, Amsterdam University Press. https://doi.org/10.1515/9781942401803
    Felsing, Ulrike, & Cornut, Murielle. (2024). Re-Imagining the Collection of the Kreis Family. Research in Arts and Education, 2024(1), 41–53. https://doi.org/10.54916/rae.142567
    Felsing, Ulrike, & Frischknecht, Max. (2021). Critical Map Visualizations. In Christine Schranz (Ed.), Shifts in Mapping (pp. 95–124). transcript Verlag. https://doi.org/10.1515/9783839460412-008
    Felsing, Ulrike, Fornaro, Peter, Frischknecht, Max, & Raemy, Julien Antoine. (2023). Community and Interoperability at the Core of Sustaining Image Archives. Digital Humanities in the Nordic and Baltic Countries Publications, 5(1), 40–54. https://doi.org/10.5617/dhnbpub.10649
    Ferrazzi, Sabrina. (2021). The Notion of “Cultural Heritage” in the International Field: Behind Origin and Evolution of a Concept. International Journal for the Semiotics of Law - Revue Internationale de Sémiotique Juridique, 34(3), 743–768. https://doi.org/10.1007/s11196-020-09739-0
    Fiorentino, Sara, & Chinni, Tania. (2023). The Persistence of Memory. Exploring the Significance of Glass from Materiality to Intangible Values. Heritage, 6(6), 4834–4842. https://doi.org/10.3390/heritage6060257
    Floridi, Luciano. (2005). Is Semantic Information Meaningful Data? Philosophy and Phenomenological Research, 70(2), 351–370. https://www.jstor.org/stable/40040796
    Floridi, Luciano. (2010). Information: A very short introduction. Oxford University Press.
    Force, Donald C., & Smith, Randy. (2021). Context Lost: Digital Surrogates, Their Physical Counterparts, and the Metadata that Is Keeping Them Apart. The American Archivist, 84(1), 91–118. https://doi.org/10.17723/0360-9081-84.1.91
    Freire, Nuno, & Isaac, Antoine. (2019). Technical Usability of Wikidata’s Linked Data. In Witold Abramowicz & Rafael Corchuelo (Eds.), Business Information Systems Workshops (pp. 556–567). Springer International Publishing. https://doi.org/10.1007/978-3-030-36691-9_47
    Freire, Nuno, Calado, Pável, & Martins, Bruno. (2018). Availability of Cultural Heritage Structured Metadata in the World Wide Web. In Leslie Chan & Pierre Mounier (Eds.), ELPUB 2018. https://doi.org/10.4000/proceedings.elpub.2018.20
    Freire, Nuno, Isaac, Antoine, Robson, Glen, Brooks, John, & Manguinhas, Hugo. (2017). A survey of Web technology for metadata aggregation in cultural heritage. Information Services & Use, 37(4), 425–436. https://doi.org/10.3233/ISU-170859
    Freire, Nuno, Meijers, Enno, Valk, Sjors de, Raemy, Julien A., & Isaac, Antoine. (2021). Metadata Aggregation via Linked Data: Results of the Europeana Common Culture Project. In Emmanouel Garoufallou & María-Antonia Ovalle-Perandones (Eds.), Metadata and Semantic Research (pp. 383–394). Springer International Publishing. https://doi.org/10.1007/978-3-030-71903-6_35
    Fresa, Antonella. (2013). A Data Infrastructure for Digital Cultural Heritage: Characteristics, Requirements and Priority Services. International Journal of Humanities and Arts Computing, 7(supplement), 29–46. https://doi.org/10.3366/ijhac.2013.0058
    Frischknecht, Max. (2022). Generating Perspectives: Applying Generative Design to critically explore the Atlas of Swiss Folklore. DARIAH-CH Study Day 2022 Posters. https://doi.org/10.24451/arbor.17911
    Georgopoulos, Andreas. (2018). CIPA’s Perspectives on Cultural Heritage. In Sander Münster, Kristina Friedrichs, Florian Niebling, & Agnieszka Seidel-Grzesińska (Eds.), Digital Research and Education in Architectural Heritage (pp. 215–245). Springer International Publishing. https://doi.org/10.1007/978-3-319-76992-9_13
    Giacomo, Giuseppe De, & Lenzerini, Maurizio. (1996). TBox and ABox reasoning in expressive description logics. Proceedings of the Fifth International Conference on Principles of Knowledge Representation and Reasoning, 316–327. https://dl.acm.org/doi/10.5555/3087368.3087406
    Gilliland, Anne J. (2016). Setting the Stage. In Murtha Baca (Ed.), Introduction to metadata (Third edition). Getty Research Institute. https://www.getty.edu/publications/intrometadata/setting-the-stage/
    Greenberg, Jane. (2005). Understanding Metadata and Metadata Schemes. Cataloging & Classification Quarterly, 40(3–4), 17–36. https://doi.org/10.1300/J104v40n03_02
    Gruber, Thomas R. (1993). A translation approach to portable ontology specifications. Knowledge Acquisition, 5(2), 199–220. https://doi.org/10.1006/knac.1993.1008
    Guenther, Rebecca S. (2003). MODS: The Metadata Object Description Schema. Portal: Libraries and the Academy, 3(1), 137–150. https://doi.org/10.1353/pla.2003.0006
    Guillem, Anaïs, Gros, Antoine, & De Luca, Livio. (2023, June). Faire parler les claveaux effondrés de la cathédrale Notre-Dame de Paris. Recueil Des Communications Du 4e Colloque Humanistica. https://hal.science/hal-04106101
    Guillem, Anaïs, Gros, Antoine, Reby, Kevin, Abergel, Violette, & De Luca, Livio. (2023). RCC8 for CIDOC CRM: Semantic Modeling of Mereological and Topological Spatial Relations in Notre-Dame de Paris. In Antonis Bikakis, Roberta Ferrario, Stéphane Jean, Béatrice Markhoff, Alessandro Mosca, & Marianna Nicolosi Asmundo (Eds.), Proceedings of the International Workshop on Semantic Web and Ontology Design for Cultural Heritage (Vol. 3540). CEUR. https://ceur-ws.org/Vol-3540/#paper2
    Hacıgüzeller, Piraye, Taylor, James Stuart, & Perry, Sara. (2021). On the Emerging Supremacy of Structured Digital Data in Archaeology: A Preliminary Assessment of Information, Knowledge and Wisdom Left Behind. Open Archaeology, 7(1), 1709–1730. https://doi.org/10.1515/opar-2020-0220
    Haraway, Donna Jeanne. (2003). The companion species manifesto: Dogs, people, and significant otherness. Prickly Paradigm Press.
    Haraway, Donna Jeanne. (2016). Staying with the trouble: Making kin in the Chthulucene. Duke University Press.
    Haraway, Donna. (2008). Encounters with Companion Species: Entangling Dogs, Baboons, Philosophers, and Biologists. Configurations, 14(1), 97–114. https://doi.org/10.1353/con.0.0002
    Hardesty, Juliet, & Nolan, Allison. (2021). Mitigating Bias in Metadata: A Use Case Using Homosaurus Linked Data. Information Technology and Libraries, 40(3). https://doi.org/10.6017/ital.v40i3.13053
    Hardin, Garrett. (1968). The Tragedy of the Commons. Science, 162(3859), 1243–1248. https://doi.org/10.1126/science.162.3859.1243
    Harpring, Patricia. (2010). Development of the Getty Vocabularies: AAT, TGN, ULAN, and CONA. Art Documentation: Journal of the Art Libraries Society of North America, 29(1), 67–72. https://doi.org/10.1086/adx.29.1.27949541
    Haxaire, Claudie. (2009). The Power of Ambiguity: The Nature and Efficacy of the Zamble Masks Revealed by “Disease Masks” Among the Gouro People (Côte d’Ivoire). Africa, 79(4), 543–569. https://doi.org/10.3366/E0001972009001065
    He, Y., Ma, Y. H., & Zhang, X. R. (2017). “Digital Heritage” Theory and Innovative Practice. The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, XLII-2/W5, 335–342. https://doi.org/10.5194/isprs-archives-XLII-2-W5-335-2017
    Hertz, Ellen, Graezer Bideau, Florence, Leimgruber, Walter, & Munz, Hervé. (2018). Politiques de la tradition. Le patrimoine culturel immatériel (Vol. 131). Presses polytechniques et universitaires romandes. https://edoc.unibas.ch/68569/
    Hill, Linda, Buchel, Olha, Janée, Greg, & Zeng, Marcia Lei. (2002). Integration of Knowledge Organization Systems into Digital Library Architectures: Position Paper for 13th ASIST SIGICR Workshop, Reconceptualizing Classification Research. Advances in Classification Research Online, 13(1), 46–52. https://doi.org/10.7152/acro.v13i1.13835
    Hillmann, Diane I., Marker, Rhonda, & Brady, Chris. (2008). Metadata Standards and Applications. The Serials Librarian, 54(1–2), 7–21. https://doi.org/10.1080/03615260801973364
    Hodge, Gail M. (2000). Systems of knowledge organization for digital libraries: Beyond traditional authority files. Digital Library Federation, Council on Library and Information Resources.
    Hoffmann, Anna Lauren. (2021). Terms of inclusion: Data, discourse, violence. New Media & Society, 23(12), 3539–3556. https://doi.org/10.1177/1461444820958725
    Hou, Yumeng, & Kenderdine, Sarah. (2024). Ontology-based knowledge representation for traditional martial arts. Digital Scholarship in the Humanities, 1–18. https://doi.org/10.1093/llc/fqae005
    Hou, Yumeng, Kenderdine, Sarah, Picca, Davide, Egloff, Mattia, & Adamou, Alessandro. (2022). Digitizing Intangible Cultural Heritage Embodied: State of the Art. Journal on Computing and Cultural Heritage, 15(3), 55:1–55:20. https://doi.org/10.1145/3494837
    Huber, Birgit, & Frischknecht, Max. (2024). Digitalisierung und (De-)Konstruktion. Überlegungen zur Entwicklung eines Prototyps für die digitale Zugänglichmachung des «Atlas der Schweizerischen Volkskunde». In Sabine Eggmann & Konrad J. Kuhn (Eds.), Schweizerisches Archiv für Volkskunde / Archives suisses des traditions populaires (Vol. 2024/1, pp. 27–52). Chronos. https://doi.org/10.33057/CHRONOS.1785/27-51
    Huber, Birgit. (2023). Die Entdeckung der «Brünig-Napf-Reuss-Linie». In Blog zur Schweizer Geschichte - Schweizerisches Nationalmuseum. https://blog.nationalmuseum.ch/2023/10/die-entdeckung-der-bruenig-napf-reuss-linie/
    Hyvönen, Eero. (2012). Cultural Heritage on the Semantic Web. In Publishing and Using Cultural Heritage Linked Data on the Semantic Web (pp. 1–11). Springer International Publishing. https://doi.org/10.1007/978-3-031-79438-4_1
    Hyvönen, Eero. (2020). Using the Semantic Web in digital humanities: Shift from data publishing to data-analysis and serendipitous knowledge discovery. Semantic Web, 11(1), 187–193. https://doi.org/10.3233/SW-190386
    ICA Expert Group on Archival Description. (2023). Records in Context Conceptual Model 1.0. https://www.ica.org/sites/default/files/ric-cm-1.0_0.pdf
    Ioannides, Marinos, & Davies, Robert. (2019). Towards a Holistic Documentation and Wider Use of Digital Cultural Heritage. In Emmanouel Garoufallou, Fabio Sartori, Rania Siatri, & Marios Zervas (Eds.), Metadata and Semantic Research (pp. 76–88). Springer International Publishing. https://doi.org/10.1007/978-3-030-14401-2_7
    Izu, Benjamin Obeghare. (2022). The Sociocultural Significance of the Emedjo (Masquerade) Dance Among the Abraka People in Delta State, Nigeria. E-Journal of Humanities, Arts and Social Sciences, 413–423. https://doi.org/10.38159/ehass.2022394
    Jackson, Steven J., Edwards, Paul N., Bowker, Geoffrey C., & Knobel, Cory P. (2007). Understanding infrastructure: History, heuristics and cyberinfrastructure policy. First Monday, 12(6). https://doi.org/10.5210/fm.v12i6.1904
    Katz, Brian F. G. (2023, October). Digitally exploring the acoustic history of Notre-Dame Cathedral. EuropeanaTech 2023. https://youtu.be/JDcNV_X54oQ
    Koch, Inês, Ribeiro, Cristina, & Teixeira Lopes, Carla. (2020). ArchOnto, a CIDOC-CRM-Based Linked Data Model for the Portuguese Archives. In Mark Hall, Tanja Merčun, Thomas Risse, & Fabien Duchateau (Eds.), Digital Libraries for Open Knowledge (Vol. 12246, pp. 133–146). Springer International Publishing. https://doi.org/10.1007/978-3-030-54956-5_10
    Krötzsch, Markus, Simancik, Frantisek, & Horrocks, Ian. (2013). A Description Logic Primer. arXiv. https://doi.org/10.48550/arXiv.1201.4089
    Lagoze, Carl, Van de Sompel, Herbert, Nelson, Michael, & Warner, Simeon. (2002). The Open Archives Initiative Protocol for Metadata Harvesting - v.2.0. In Open Archives Initiative. http://www.openarchives.org/OAI/openarchivesprotocol.html
    Latour, Bruno. (1990). Postmodern? No, simply amodern! Steps towards an anthropology of science. Studies in History and Philosophy of Science Part A, 21(1), 145–171. https://doi.org/10.1016/0039-3681(90)90018-4
    Latour, Bruno. (1993). We have never been modern. Harvard University Press.
    Latour, Bruno. (1996). On actor-network theory: A few clarifications. Soziale Welt, 47(4), 369–381. https://www.jstor.org/stable/40878163
    Latour, Bruno. (2005). Reassembling the social: An introduction to actor-network-theory. Oxford University Press.
    Latour, Bruno. (2022). Habiter la Terre : Entretiens avec Nicolas Truong. Éditions Les Liens qui libèrent ; Arte éditions.
    Lave, Jean, & Wenger, Etienne. (1991). Situated learning: Legitimate peripheral participation. Cambridge University Press.
    Lee, Christopher A. (2009). Open Archival Information System (OAIS) Reference Model. In Marcia J. Bates & Mary Niles Maack (Eds.), Encyclopedia of Library and Information Sciences, Third Edition (3rd ed., pp. 4020–4030). CRC Press. https://doi.org/10.1081/E-ELIS3-120044377
    Leimgruber, Walter. (2008). Was ist immaterielles Kulturerbe? Bulletin / Schweizerische Akademie Der Geistes- Und Sozialwissenschaften, 2008, H. 2, 24–25. http://edoc.unibas.ch/dok/A5251330
    Leimgruber, Walter. (2010). Switzerland and the UNESCO Convention on Intangible Cultural Heritage. Journal of Folklore Research, 47(1–2), 161–196. https://doi.org/10.2979/JFR.2010.47.1-2.161
    Lemmer-Webber, Christine, & Tallon, Jessica. (2018). ActivityPub. In W3C. https://www.w3.org/TR/activitypub/
    Lenzerini, Federico. (2011). Intangible Cultural Heritage: The Living Culture of Peoples. European Journal of International Law, 22(1), 101–120. https://doi.org/10.1093/ejil/chr006
    Lim, Shirley, & Li Liew, Chern. (2011). Metadata quality and interoperability of GLAM digital images. Aslib Proceedings, 63(5), 484–498. https://doi.org/10.1108/00012531111164978
    Lindenthal, Jutta, Meiners, Hanna-Lena, & Balzer, Detlev. (2023). LIDO Primer. In LIDO. https://lido-schema.org/documents/primer/latest/lido-primer.html
    Lit, L. W. C. van. (2020). The Digital Materiality of Digitized Manuscripts. In Among Digitized Manuscripts. Philology, Codicology, Paleography in a Digital World (pp. 51–72). Brill. https://www.jstor.org/stable/10.1163/j.ctv2gjwzrd.6
    Loulanski, Tolina. (2006). Revising the Concept for Cultural Heritage: The Argument for a Functional Approach. International Journal of Cultural Property, 13(2), 207–233. https://doi.org/10.1017/S0940739106060085
    Lowenthal, David. (2005). Natural and cultural heritage. International Journal of Heritage Studies, 11(1), 81–92. https://doi.org/10.1080/13527250500037088
    Mahony, Simon. (2018). Cultural Diversity and the Digital Humanities. Fudan Journal of the Humanities and Social Sciences, 11(3), 371–388. https://doi.org/10.1007/s40647-018-0216-0
    Marcondes, Carlos Henrique. (2021). Integrated classification schemas to interlink cultural heritage collections over the web using LOD technologies. International Journal of Metadata, Semantics and Ontologies, 15(3), 170. https://doi.org/10.1504/IJMSO.2021.123040
    Martinez Demarco, Sol. (2019). Empowering women through digital skills in Argentina: A tale of two stories. TATuP - Zeitschrift Für Technikfolgenabschätzung in Theorie Und Praxis, 28(2), 23–28. https://doi.org/10.14512/tatup.28.2.s23
    Martinez Demarco, Sol. (2023). From digital inclusion to IT appropriation: Gendered aspects of appropriation imaginary and practices. GENDER – Zeitschrift Für Geschlecht, Kultur Und Gesellschaft, 15(1), 72–86. https://doi.org/10.3224/gender.v15i1.06
    Masolo, Claudio, Borgo, Stefano, Gangemi, Aldo, Guarino, Nicola, & Oltramari, Alessandro. (2003). Wonder Web Deliverable D18: Ontology Library (Ontology Infrastructure for the Semantic Web Del 18; p. 343). Laboratory For Applied Ontology - ISTC-CNR. http://www.loa.istc.cnr.it/old/Papers/D18.pdf
    Mathieualexhache. (2021). OAIS Functional Model. https://commons.wikimedia.org/wiki/File:OAIS_Functional_Model_(en).svg
    Mazzocchi, Fulvio. (2018). Knowledge organization system (KOS). Knowledge Organization, 45(1), 54–78. https://doi.org/10.5771/0943-7444-2018-1-54
    Micle, Dorel. (2014). Archaeological Heritage Between Natural Hazard and Anthropic Destruction: The Negative Impact of Social Non-involvement in the Protection of Archaeological Sites. Procedia - Social and Behavioral Sciences, 163, 269–278. https://doi.org/10.1016/j.sbspro.2014.12.316
    Mikhaylova, Daria, & Metilli, Daniele. (2023). Extending RiC-O to Model Historical Architectural Archives: The ITDT Ontology. Journal on Computing and Cultural Heritage, 16(4), 67:1–67:15. https://doi.org/10.1145/3606706
    Mixter, Jeff. (2014). Using a Common Model: Mapping VRA Core 4.0 Into an RDF Ontology. Journal of Library Metadata, 14(1), 1–23. https://doi.org/10.1080/19386389.2014.891890
    Morales, Susana. (2009). La apropiación de TIC: Una perspectiva. In Susana Morales & M. I. Loyola (Eds.), Los jóvenes y las TIC. Apropiación y uso en educación (pp. 99–120). Edición de las autoras.
    Morales, Susana. (2017). Imaginación y software: Aportes para la construcción del paradigma de la apropiación. Del Gato Gris. http://hdl.handle.net/11086/27405
    Morales, Susana. (2018). La apropiación de tecnologías. Ideas para un paradigma en construcción. In Acerca de la apropiación de tecnologías. Teoría, estudios y debates (pp. 23–33). Del Gato Gris.
    Mr Gee. (2023, October). Day 2 Closing – A multitude of tools. EuropeanaTech 2023. https://youtu.be/pOX9CrvAG7I
    Müller, Katja. (2018). Digitale Objekte - subjektive Materie. Zur Materialität digitalisierter Objekte in Museum und Archiv. In Hans Peter Hahn & Friedemann Neumann (Eds.), Edition Kulturwissenschaft (1st ed., Vol. 182, pp. 49–66). transcript Verlag. https://doi.org/10.14361/9783839445136-004
    Munjeri, Dawson. (2004). Tangible and Intangible Heritage: From difference to convergence. Museum International, 56(1–2), 12–20. https://doi.org/10.1111/j.1350-0775.2004.00453.x
    Münster, S., Apollonio, F. I., Bell, P., Kuroczynski, P., Di Lenardo, I., Rinaudo, F., & Tamborrino, R. (2019). Digital Cultural Heritage meets Digital Humanities. The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, XLII-2/W15, 813–820. https://doi.org/10.5194/isprs-archives-XLII-2-W15-813-2019
    Münster, Sander, Utescher, Ronja, & Ulutas Aydogan, Selda. (2021). Digital topics on cultural heritage investigated: How can data-driven and data-guided methods support to identify current topics and trends in digital heritage? Built Heritage, 5(1), 25. https://doi.org/10.1186/s43238-021-00045-7
    Nelson, Peter A. (2021). The Role of GPR in Community-Driven Compliance Archaeology with Tribal and Non-tribal Communities in Central California. Advances in Archaeological Practice, 9(3), 215–225. https://doi.org/10.1017/aap.2021.14
    Newbury, David. (2018). LOUD: Linked Open Usable Data and linked.art. 2018 CIDOC Conference, 1–11. https://cidoc.mini.icom.museum/wp-content/uploads/sites/6/2021/03/CIDOC2018_paper_153.pdf
    Newbury, David. (2024). Linked Data in Production: Moving Beyond Ontologies. https://www.slideshare.net/slideshow/linked-data-in-production-moving-beyond-ontologies/266976602
    Nielsen, Erland Kolding. (2008). Digitisation of Library Material in Europe: Problems, Obstacles and Perspectives anno 2007. LIBER Quarterly: The Journal of the Association of European Research Libraries, 18(1), 20–27. https://doi.org/10.18352/lq.7901
    NISO. (2010). Guidelines for the Construction, Format, and Management of Monolingual Controlled Vocabularies (American National Standard ANSI/NISO Z39.19-2005 (R2010)). National Information Standards Organization. https://groups.niso.org/higherlogic/ws/public/download/12591/z39-19-2005r2010.pdf
    Ostrom, Elinor. (1990). Governing the Commons: The Evolution of Institutions for Collective Action (1st ed.). Cambridge University Press. https://doi.org/10.1017/CBO9780511807763
    Owens, Trevor. (2011). Defining Data for Humanists: Text, Artifact, Information or Evidence? Journal of Digital Humanities, 1(1). https://journalofdigitalhumanities.org/1-1/defining-data-for-humanists-by-trevor-owens/
    Padfield, Joseph, Bolland, Charlotte, Fitzgerald, Neil, McLaughlin, Anne, Robson, Glen, & Terras, Melissa. (2022). Practical applications of IIIF as a building block towards a digital National Collection [Arts and Humanities Research Council Final Report]. Towards a National Collection. https://doi.org/10.5281/zenodo.6884885
    Page, Kevin R., Delmas-Glass, Emmanuelle, Beaudet, David, Norling, Samantha, Rother, Lynn, & Hänsli, Thomas. (2020). Linked Art: Networking Digital Collections and Scholarship. Digital Humanities 2020 Book of Abstracts, 504–509. https://dh2020.adho.org/wp-content/uploads/2020/07/139_LinkedArtNetworkingDigitalCollectionsandScholarship.html
    Pagenstecher, Cord. (2009). Private Fotoalben als historische Quelle. Zeithistorische Forschungen/Studies in Contemporary History, 6(3), 449–463. https://doi.org/10.14765/ZZF.DOK-1803
    Patrón, Pedro, Miguelañez, Emilio, & Petillot, Yvan R. (2011). Embedded Knowledge and Autonomous Planning: The Path Towards Permanent Presence of Underwater Networks. In Autonomous Underwater Vehicles (pp. 199–224). IntechOpen. https://doi.org/10.5772/24649
    Peterhans, Simon, Sauter, Loris, Spiess, Florian, & Schuldt, Heiko. (2022). Automatic Generation of Coherent Image Galleries in Virtual Reality. In Gianmaria Silvello, Oscar Corcho, Paolo Manghi, Giorgio Maria Di Nunzio, Koraljka Golub, Nicola Ferro, & Antonella Poggi (Eds.), Linking Theory and Practice of Digital Libraries (Vol. 13541, pp. 282–288). Springer International Publishing. https://doi.org/10.1007/978-3-031-16802-4_23
    Pfrunder, Peter. (1995). Ernst Brunner: Photographien, 1937-1962 (2. Aufl). Schweizerische Gesellschaft für Volkskunde ; Offizin.
    Philip, Kavita. (2021). The Internet Will Be Decolonized. In Thomas S. Mullaney, Benjamin Peters, Mar Hicks, & Kavita Philip (Eds.), Your Computer Is on Fire (pp. 91–116). The MIT Press. https://doi.org/10.7551/mitpress/10993.003.0002
    Pitti, Daniel V. (1999). Encoded archival description: An introduction and overview. New Review of Information Networking, 5(1), 61–69. https://doi.org/10.1080/13614579909516936
    Portalés, Cristina, Rodrigues, João M. F., Rodrigues Gonçalves, Alexandra, Alba, Ester, & Sebastián, Jorge. (2018). Digital Cultural Heritage. Multimodal Technologies and Interaction, 2(3), 58. https://doi.org/10.3390/mti2030058
    Poupeau, Gautier. (2018). Réflexions et questions autour du Web sémantique. In Les petites cases. https://web.archive.org/web/20240813032044/https://www.lespetitescases.net/reflexions-et-questions-autour-du-web-semantique
    Raemy, Julien Antoine, Gray, Tanya, Collinson, Alwyn, & Page, Kevin R. (2023). Enabling Participatory Data Perspectives for Image Archives through a Linked Art Workflow. In Anne Baillot, Walter Scholger, Toma Tasovac, & Georg Vogeler (Eds.), Digital Humanities 2023 Book of Abstracts (Vol. 2023, pp. 515–516). Alliance of Digital Humanities Organizations (ADHO). https://doi.org/10.5451/unibas-ep95099
    Raemy, Julien Antoine. (2017). The International Image Interoperability Framework (IIIF): Raising awareness of the user benefits for scholarly editions [Bachelor’s thesis, HES-SO University of Applied Sciences and Arts, Haute école de gestion de Genève]. https://sonar.ch/hesso/documents/314853
    Raemy, Julien Antoine. (2020). Enabling better aggregation and discovery of cultural heritage content for Europeana and its partner institutions [Master’s thesis, HES-SO University of Applied Sciences and Arts, Haute école de gestion de Genève]. https://sonar.ch/hesso/documents/315109
    Raemy, Julien Antoine. (2021). Applying Effective Data Modelling Approaches for the Creation of a Participatory Archive Platform. In Yumeng Hou (Ed.), Human Factors in Digital Humanities (pp. 1–5). Institut des humanités digitales. https://doi.org/10.5451/unibas-ep87517
    Raemy, Julien Antoine. (2022). Améliorer la valorisation des données du patrimoine culturel grâce au Linked Open Usable Data (LOUD). In Nicolas Lasolle, Olivier Bruneau, & Jean Lieber (Eds.), Actes des journées humanités numériques et Web sémantique (pp. 132–149). Les Archives Henri-Poincaré - Philosophie et Recherches sur les Sciences et les Technologies (AHP-PReST); Laboratoire lorrain de recherche en informatique et ses applications (LORIA). https://doi.org/10.5451/unibas-ep89725
    Raemy, Julien Antoine. (2024). Interlinking Cultural Heritage Data with Community-driven Principles and Standards. https://julsraemy.ch/prezi/pia-ringvorlesung-2024.html
    Raemy, Julien Antoine. (2024). Some notes from the 2024 IIIF Conference held in Los Angeles. In Thoughts and discombobulations of Julien A. Raemy. https://julsraemy.ch/posts/2024/06/26/iiif-conference-la/
    Rautenberg, Michel. (1998). L’émergence patrimoniale de l’ethnologie : Entre mémoire et politiques publiques. In Patrimoine et modernité (pp. 279–289). L’Harmattan.
    Respaldiza Hidalgo, María Aránzazu, Wachowicz, Monica, & Vázquez Hoehne, Antonio. (2011). Metadata Visualization of Cultural Heritage Information within a Collaborative Environment. Proceedings of XXIIIrd International CIPA Symposium. https://oa.upm.es/11636/
    Ribes, David, & Lee, Charlotte P. (2010). Sociotechnical Studies of Cyberinfrastructure and e-Research: Current Themes and Future Trajectories. Computer Supported Cooperative Work (CSCW), 19(3), 231–244. https://doi.org/10.1007/s10606-010-9120-0
    Ridge, Mia, Blickhan, Samantha, Ferriter, Meghan, Mast, Austin, Brumfield, Ben, Wilkins, Brendon, Cybulska, Daria, Burgher, Denise, Casey, Jim, Luther, Kurt, Goldman, Michael Haley, White, Nick, Willcox, Pip, Brumfield, Sara Carlstead, Coleman, Sonya J., & Prytz, Ylva Berglund. (2021). 12. Connecting with communities. In The Collective Wisdom Handbook: Perspectives on Crowdsourcing in Cultural Heritage - community review version (1st edition). British Library. https://doi.org/10.21428/a5d7554f.1b80974b
    Riley, Jenn. (2009). Seeing Standards: A Visualization of the Metadata Universe. In Jenn Riley. https://jennriley.com/metadatamap/
    Riley, Jenn. (2017). Understanding metadata. What is metadata, and what is it for? National Information Standards Organization (NISO).
    Riva, Pat, Le Boeuf, Patrick, & Žumer, Maja. (2017). IFLA Library Reference Model: A Conceptual Model for Bibliographic Information [IFLA-LRM]. International Federation of Library Associations and Institutions. https://repository.ifla.org/handle/123456789/40
    Rodighiero, Dario. (2021). Mapping Affinities: Democratizing Data Visualization. Métis Presses. https://dash.harvard.edu/handle/1/37368046
    Roke, Elizabeth Russey, & Tillman, Ruth Kitchin. (2022). Pragmatic Principles for Archival Linked Data. The American Archivist, 85(1), 173–201. https://doi.org/10.17723/2327-9702-85.1.173
    Rossenova, Lozana, & Di Franco, Karen. (2022). Iterative Pasts and Linked Futures: A Feminist Approach to Modeling Data in Archives and Collections of Artists’ Publishing. Perspectives on Data, 2. https://doi.org/10.53269/9780865593152/05
    SAA Dictionary. (2023). Taxonomy. In Dictionary of Archives Terminology. Society of American Archivists. https://dictionary.archivists.org/entry/taxonomy.html
    Sabharwal, Arjun. (2015). 2 - Archives and special collections in the digital humanities. In Arjun Sabharwal (Ed.), Digital Curation in the Digital Humanities (pp. 27–47). Chandos Publishing. https://doi.org/10.1016/B978-0-08-100143-1.00002-7
    Sanderson, Robert. (2013). RDF: Resource Description Failures and Linked Data Letdowns. Journal of Digital Humanities, 2(3). https://journalofdigitalhumanities.org/2-3/rdf-resource-description-failures-and-linked-data-letdowns/
    Sanderson, Robert. (2015). Linked Data Best Practices and BibFrame. https://www.slideshare.net/azaroth42/linked-data-best-practices-and-bibframe
    Sanderson, Robert. (2019). Keynote: Standards and Communities: Connected People, Consistent Data, Usable Applications. 2019 ACM/IEEE Joint Conference on Digital Libraries (JCDL), 28. https://doi.org/10.1109/JCDL.2019.00009
    Schmoll, Friedemann. (2009a). Die Vermessung der Kultur: Der ‘Atlas der deutschen Volkskunde’ und die Deutsche Forschungsgemeinschaft, 1928-1980. Steiner.
    Schmoll, Friedemann. (2009b). Richard Weiss : Skizzen zum internationalen Wirken des Schweizer Volkskundlers. Schweizerisches Archiv Für Volkskunde/ Archives Suisses Des Traditions Populaires, 2009(105), 15–32. https://doi.org/10.5169/SEALS-118266
    Schöch, Christof. (2013). Big? Smart? Clean? Messy? Data in the Humanities. Journal of Digital Humanities, 2(3). https://journalofdigitalhumanities.org/2-3/big-smart-clean-messy-data-in-the-humanities/
    Semeraro, Concetta, Lezoche, Mario, Panetto, Hervé, & Dassisti, Michele. (2021). Digital twin paradigm: A systematic literature review. Computers in Industry, 130, 103469. https://doi.org/10.1016/j.compind.2021.103469
    Serres, Michel. (2014). Le parasite. Pluriel.
    Shao, Guodong, & Kibira, Deogratias. (2018). Digital Manufacturing: Requirements and Challenges for Implementing Digital Surrogates. 2018 Winter Simulation Conference (WSC), 1226–1237. https://doi.org/10.1109/WSC.2018.8632242
    Shepherd, Elizabeth, & Smith, Charlotte. (2000). The Application of ISAD(G) to the Description of Archival Datasets. Journal of the Society of Archivists, 21(1), 55–86. https://doi.org/10.1080/00379810050006911
    Simandiraki-Grimshaw, Anna. (2023). What is a museum object according to a museum database? In TETRARCHs. https://www.tetrarchs.org/index.php/2023/09/19/what-is-a-museum-object-according-to-a-museum-database/
    Snell, James M., & Prodromou, Evan. (2017). Activity Streams 2.0. In W3C. https://www.w3.org/TR/activitystreams-core/
    Snydman, Stuart, Sanderson, Robert, & Cramer, Tom. (2015). The International Image Interoperability Framework (IIIF): A community & technology approach for web-based images. Archiving Conference, 12, 16–21. https://doi.org/10.2352/issn.2168-3204.2015.12.1.art00005
    Spiess, Florian, & Schuldt, Heiko. (2022). Multimodal Interactive Lifelog Retrieval with vitrivr-VR. Proceedings of the 5th Annual on Lifelog Search Challenge, 38–42. https://doi.org/10.1145/3512729.3533008
    Spiess, Florian, & Stauffiger, Markus. (2023). Forschung und Archive: Erschliessung und Zugänglichkeit neu gedacht. Arbido, 2023(1). https://arbido.ch/de/ausgaben-artikel/2023/archiv-der-zukunft/forschung-und-archive-erschliessung-und-zugaenglichkeit-neu-gedacht
    Spiess, Florian, Rossetto, Luca, & Schuldt, Heiko. (2024). Exploring Multimedia Vector Spaces with vitrivr-VR. In Stevan Rudinac, Alan Hanjalic, Cynthia Liem, Marcel Worring, Björn Thor Jónsson, Bei Liu, & Yoko Yamakata (Eds.), MultiMedia Modeling (Vol. 14557, pp. 317–323). Springer Nature Switzerland. https://doi.org/10.1007/978-3-031-53302-0_27
    Sprochi, Amanda. (2016). Where Are We Headed? Resource Description and Access, Bibliographic Framework, and the Functional Requirements for Bibliographic Records Library Reference Model. The International Information & Library Review, 48(2), 129–136. https://doi.org/10.1080/10572317.2016.1176455
    Star, Susan Leigh, & Griesemer, James R. (1989). Institutional Ecology, ’Translations’ and Boundary Objects: Amateurs and Professionals in Berkeley’s Museum of Vertebrate Zoology, 1907-39. Social Studies of Science, 19(3), 387–420. https://www.jstor.org/stable/285080
    Star, Susan Leigh, & Ruhleder, Karen. (1994). Steps towards an ecology of infrastructure: Complex problems in design and access for large-scale collaborative systems. Proceedings of the 1994 ACM Conference on Computer Supported Cooperative Work, 253–264. https://doi.org/10.1145/192844.193021
    Star, Susan Leigh. (1999). The Ethnography of Infrastructure. American Behavioral Scientist, 43(3), 377–391. https://doi.org/10.1177/00027649921955326
    Stein, Regine, & Balandi, Oguzhan. (2019). Using LIDO for Evolving Object Documentation into CIDOC CRM. Heritage, 2(1), 1023–1031. https://doi.org/10.3390/heritage2010066
    Tasovac, Toma, Chambers, Sally, & Tóth-Czifra, Erzsébet. (2020). Cultural Heritage Data from a Humanities Research Perspective: A DARIAH Position Paper. DARIAH-EU. https://hal.science/hal-02961317
    Tennant, Roy. (2002). MARC must die. Library Journal, 127(17), 26–27. http://soiscompsfall2007.pbworks.com/f/marc%20must%20die.pdf
    Terras, Melissa, Coleman, Stephen, Drost, Steven, Elsden, Chris, Helgason, Ingi, Lechelt, Susan, Osborne, Nicola, Panneels, Inge, Pegado, Briana, Schafer, Burkhard, Smyth, Michael, Thornton, Pip, & Speed, Chris. (2021). The value of mass-digitised cultural heritage content in creative contexts. Big Data & Society, 8(1), 20539517211006165. https://doi.org/10.1177/20539517211006165
    Tuominen, Jouni, Hyvönen, Eero, & Leskinen, Petri. (2017). Bio CRM: A Data Model for Representing Biographical Data for Prosopographical Research. In Antske Fokkens, Serge ter Braake, Ronald Sluijter, Paul Arthur, & Eveline Wandl-Vogt (Eds.), Proceedings of the Second Conference on Biographical Data in a Digital World 2017 (Vol. 2119, pp. 59–66). CEUR. https://ceur-ws.org/Vol-2119/#paper10
    Tweed, Christopher, & Sutherland, Margaret. (2007). Built cultural heritage and sustainable urban development. Landscape and Urban Planning, 83(1), 62–69. https://doi.org/10.1016/j.landurbplan.2007.05.008
    UNESCO Institute for Statistics. (2009). UNESCO Framework for Cultural Statistics (FCS). United Nations Educational, Scientific and Cultural Organization. https://doi.org/10.15220/978-92-9189-075-0-en
    UNESCO. (2009). Charter on the Preservation of the Digital Heritage (Circular Letter CL/3865). United Nations Educational, Scientific and Cultural Organization. https://n2t.net/ark:/48223/pf0000179529
    UNESCO. (2022). Basic texts of the 2003 Convention for the Safeguarding of the Intangible Cultural Heritage (Programme and Meeting Document CLT-2022/WS/3). United Nations Educational, Scientific and Cultural Organization. https://n2t.net/ark:/48223/pf0000383762
    UNESCO. Culture for Development Indicators. (2014). Methodology Manual. United Nations Educational, Scientific and Cultural Organization. https://n2t.net/ark:/48223/pf0000229608
    Van de Sompel, Herbert, & Nelson, Michael L. (2015). Reminiscing About 15 Years of Interoperability Efforts. D-Lib Magazine, 21(11/12). https://doi.org/10.1045/november2015-vandesompel
    Van der Auwera, Sigrid. (2013). UNESCO and the protection of cultural property during armed conflict. International Journal of Cultural Policy, 19(1), 1–19. https://doi.org/10.1080/10286632.2011.625415
    Vandenhende, Lise, & Van Hoorick, Geert. (2017). The management of cultural heritage and nature : Complementary or conflicting regulations? EELF Annual Conference, 5th, Abstracts. http://hdl.handle.net/1854/LU-8722614
    Vecco, Marilena. (2010). A definition of cultural heritage: From the tangible to the intangible. Journal of Cultural Heritage, 11(3), 321–324. https://doi.org/10.1016/j.culher.2010.01.006
    Vienni-Baptista, Bianca, Fletcher, Isabel, & Lyall, Catherine (Eds.). (2023). Foundations of Interdisciplinary and Transdisciplinary Research: A Reader. Bristol University Press. https://doi.org/10.56687/9781529235012
    Vinck, Dominique. (2019). Les métiers de l’ombre de la Fête des Vignerons. Editions Antipodes. https://doi.org/10.33056/antipodes.1711
    Weibel, Stuart L., & Koch, Traugott. (2000). The Dublin Core Metadata Initiative: Mission, Current Activities, and Future Directions. D-Lib Magazine, 6(12). https://doi.org/10.1045/december2000-weibel
    Weinthal, Dianne, & Childress, Dawn. (2019). IIIF for Open Access. https://escholarship.org/uc/item/260616w7
    Weiss, Richard. (1940). Atlas der schweizerischen Volkskunde : Die bisherigen Erfahrungen der Exploratoren. Schweizerisches Archiv Für Volkskunde/ Archives Suisses Des Traditions Populaires, 38(1), 105–118. https://doi.org/10.5169/SEALS-113634
    Wenger, Etienne. (2011). Communities of practice: A brief introduction. National Science Foundation, 1–7. http://hdl.handle.net/1794/11736
    Windhager, Florian, Federico, Paolo, Schreder, Gunther, Glinka, Katrin, Dörk, Marian, Miksch, Silvia, & Mayr, Eva. (2019). Visualization of Cultural Heritage Collection Data: State of the Art and Future Challenges. IEEE Transactions on Visualization and Computer Graphics, 25(6), 2311–2330. https://doi.org/10.1109/TVCG.2018.2830759
    Zeng, Marcia Lei, & Qin, Jian. (2022). Metadata (Third edition). ALA Neal-Schuman.
    Zeng, Marcia Lei. (2008). Knowledge Organization Systems (KOS). Knowledge Organization, 35(2–3), 160–182. https://doi.org/10.5771/0943-7444-2008-2-3-160
    Zou, Xiaozhu, Xiong, Siyi, Li, Zhi, & Jiang, Ping. (2018). Constructing Metadata Schema of Scientific and Technical Report Based on FRBR. Computer and Information Science, 11(2), 34–39. https://doi.org/10.5539/cis.v11n2p34
    Žumer, Maja. (2007). Functional requirements for bibliographic records: FRBR: The end of the road or a new beginning. Bulletin of the American Society for Information Science and Technology, 33(6), 27–29. https://doi.org/10.1002/bult.2007.1720330608
    +

    Bibliography

    Adamou, Alessandro, Picca, Davide, Hou, Yumeng, & Loreto Granados-García, Paula. (2023). The Facets of Intangible Heritage in Southern Chinese Martial Arts: Applying a Knowledge-driven Cultural Contact Detection Approach. Journal on Computing and Cultural Heritage, 16(3), 63:1-63:27. https://doi.org/10.1145/3606702
    Ahmad, Yahaya. (2006). The Scope and Definitions of Heritage: From Tangible to Intangible. International Journal of Heritage Studies, 12(3), 292–300. https://doi.org/10.1080/13527250600604639
    Akrich, Madeleine, Callon, Michel, & Latour, Bruno (Eds.). (2006). Sociologie de la traduction: Textes fondateurs. Presses des Mines. https://doi.org/10.4000/books.pressesmines.1181
    Allen, Laurie. (2023). Why Experiment: Machine Learning at the Library of Congress. In The Library of Congress. https://blogs.loc.gov/thesignal/2023/11/why-experiment-machine-learning-at-the-library-of-congress/
    Alter, George, Rizzolo, Flavio, & Schleidt, Kathi. (2023). View points on data points: A shared vocabulary for cross-domain conversations on data and metadata. IASSIST Quarterly, 47(1), 1–39. https://doi.org/10.29173/iq1051
    Andresen, S. L. (2002). John McCarthy: Father of AI. IEEE Intelligent Systems, 17(5), 84–85. https://doi.org/10.1109/MIS.2002.1039837
    Aslan, Zaki. (1997). Protective Structures for the Conservation and Presentation of Archaeological Sites. Journal of Conservation and Museum Studies, 3(0), 16. https://doi.org/10.5334/jcms.3974
    Avram, Henriette D. (1968). The MARC Pilot Project. Final Report (ED029663; p. 173). Library of Congress. https://eric.ed.gov/?id=ED029663
    Azzopardi, Elaine, Kenter, Jasper O., Young, Juliette, Leakey, Chris, O’Connor, Seb, Martino, Simone, Flannery, Wesley, Sousa, Lisa P., Mylona, Dimitra, Frangoudes, Katia, Béguier, Irène, Pafi, Maria, Silva, Arturo Rey da, Ainscough, Jacob, Koutrakis, Manos, Silva, Margarida Ferreira da, & Pita, Cristina. (2023). What are heritage values? Integrating natural and cultural heritage into environmental valuation. People and Nature, 5(2), 368–383. https://doi.org/10.1002/pan3.10386
    Baader, Franz, & Lutz, Carsten. (2007). 13 Description logic. In Patrick Blackburn, Johan Van Benthem, & Frank Wolter (Eds.), Studies in Logic and Practical Reasoning (Vol. 3, pp. 757–819). Elsevier. https://doi.org/10.1016/S1570-2464(07)80016-4
    Baca, Murtha, & Harpring, Patricia. (2017). Categories for the description of works of art [Report]. Getty Research Institute. https://apo.org.au/node/14985
    Barrile, Vincenzo, & Bernardo, Ernesto. (2022). Big Data and Cultural Heritage. In Francesco Calabrò, Lucia Della Spina, & María José Piñeira Mantiñán (Eds.), New Metropolitan Perspectives (Vol. 482, pp. 2708–2716). Springer International Publishing. https://doi.org/10.1007/978-3-031-06825-6_259
    Becattini, Federico, Bongini, Pietro, Bulla, Luana, Bimbo, Alberto Del, Marinucci, Ludovica, Mongiovì, Misael, & Presutti, Valentina. (2023). VISCOUNTH: A Large-scale Multilingual Visual Question Answering Dataset for Cultural Heritage. ACM Transactions on Multimedia Computing, Communications, and Applications, 19(6), 193:1-193:20. https://doi.org/10.1145/3590773
    Bekiari, Chryssoula, Bruseker, George, Doerr, Martin, Ore, Christian-Emil, Stead, Stephen, & Velios, Athanasios. (2021). CIDOC Conceptual Reference Model 7.1.1. https://doi.org/10.26225/FDZH-X261
    Beretta, Francesco. (2022). Interopérabilité des données de la recherche et ontologies fondationnelles : Un éco-système d’extensions du CIDOC CRM pour les sciences humaines et sociales. In Nicolas Lasolle, Olivier Bruneau, & Jean Lieber (Eds.), Actes des journées humanités numériques et Web sémantique (pp. 2–22). Les Archives Henri-Poincaré - Philosophie et Recherches sur les Sciences et les Technologies (AHP-PReST); Laboratoire lorrain de recherche en informatique et ses applications (LORIA). https://doi.org/10.5281/zenodo.7014341
    Bermès, Emmanuelle. (2023). Modélisons un peu : Le choix d’un type de bases de données. In Figoblog. https://figoblog.org/2023/12/13/modelisons-un-peu-le-choix-dun-type-de-bases-de-donnees/
    Berners-Lee, Tim, Hendler, James, & Lassila, Ora. (2001). The Semantic Web. Scientific American, 284(5), 34–43. https://www.jstor.org/stable/26059207
    Berressem, Hanjo. (2015). Déjà Vu: Serres after Latour, Deleuze after Harman, ‘Nature Writing’ after ‘Network Theory’. Amerikastudien / American Studies, 60(1), 59–79. https://www.jstor.org/stable/44071895
    Bezjak, Sonja, Conzett, Philipp, Fernandes, Pedro L., Görögh, Edit, Helbig, Kerstin, Kramer, Bianca, Labastida, Ignasi, Niemeyer, Kyle, Psomopoulos, Fotis, Ross-Hellauer, Tony, Schneider, René, Tennant, Jon, Verbakel, Ellen, & Clyburne-Sherin, April. (2019). The Open Science Training Handbook. FOSTER. https://doi.org/10.5281/zenodo.2587951
    Bizer, Christian, Heath, Tom, Idehen, Kingsley, & Berners-Lee, Tim. (2008). Linked data on the web (LDOW2008). Proceedings of the 17th International Conference on World Wide Web, 1265–1266. https://doi.org/10.1145/1367497.1367760
    Blue Shield. (2016). Blue Shield Statutes (Articles of Association) (p. 16). https://web.archive.org/web/20230802104458/https://theblueshield.org/wp-content/uploads/2021/12/statute-Amendments_BSI_2016.pdf
    Borgo, Stefano, Ferrario, Roberta, Gangemi, Aldo, Guarino, Nicola, Masolo, Claudio, Porello, Daniele, Sanfilippo, Emilio M., & Vieu, Laure. (2022). DOLCE: A descriptive ontology for linguistic and cognitive engineering. Applied Ontology, 17(1), 45–69. https://doi.org/10.3233/AO-210259
    Bowman, Blythe A. (2008). Transnational Crimes Against Culture: Looting at Archaeological Sites and the “Grey” Market in Antiquities. Journal of Contemporary Criminal Justice, 24(3), 225–242. https://doi.org/10.1177/1043986208318210
    Brown, Karen, Cummins, Alissandra, & González Rueda, Ana S. (Eds.). (2023). Communities and Museums in the 21st Century: Shared Histories and Climate Action (1st ed.). Routledge. https://doi.org/10.4324/9781003288138
    Bruseker, George, Carboni, Nicola, & Guillem, Anaïs. (2017). Cultural Heritage Data Management: The Role of Formal Ontology and CIDOC CRM. In Matthew L. Vincent, Víctor Manuel López-Menchero Bendicho, Marinos Ioannides, & Thomas E. Levy (Eds.), Heritage and Archaeology in the Digital Age: Acquisition, Curation, and Dissemination of Spatial Cultural Heritage Data (pp. 93–131). Springer International Publishing. https://doi.org/10.1007/978-3-319-65370-9_6
    Burchardt, Jørgen. (2014). Researchers Outside APC-Financed Open Access: Implications for Scholars Without a Paying Institution. Sage Open, 4(4), 2158244014551714. https://doi.org/10.1177/2158244014551714
    Callon, Michel. (2001). Actor Network Theory. In Neil J. Smelser & Paul B. Baltes (Eds.), International Encyclopedia of the Social & Behavioral Sciences (pp. 62–66). Pergamon. https://doi.org/10.1016/B0-08-043076-7/03168-5
    Cameron, Fiona. (2007). Beyond the Cult of the Replicant: Museums and Historical Digital Objects—Traditional Concerns, New Discourses. In Fiona Cameron & Sarah Kenderdine (Eds.), Theorizing Digital Cultural Heritage: A Critical Discourse. The MIT Press. https://doi.org/10.7551/mitpress/9780262033534.003.0004
    Canning, Erin, Brown, Susan, Roger, Sarah, & Martin, Kimberley. (2022). The Power to Structure : Making Meaning from Metadata Through Ontologies. KULA: Knowledge Creation, Dissemination, and Preservation Studies, 6(3), 1–15. https://doi.org/10.18357/kula.169
    Cantara, Linda. (2005). METS: The Metadata Encoding and Transmission Standard. Cataloging & Classification Quarterly, 40(3–4), 237–253. https://doi.org/10.1300/J104v40n03_11
    Caplan, Priscilla, & Guenther, Rebecca S. (2005). Practical Preservation: The PREMIS Experience. Library Trends, 54(1), 111–124. https://muse.jhu.edu/pub/1/article/193223
    Carman, John. (2009). Where the Value Lies: The importance of materiality to the immaterial aspects of heritage. In Emma Waterton & Laurajane Smith (Eds.), Taking Archaeology Out of Heritage (pp. 192–208). Cambridge Scholars Publishing.
    Chang, Liang, Sattler, Uli, & Gu, Tianlong. (2014). An ABox Revision Algorithm for the Description Logic EL_bot. In Meghyn Bienvenu, Magdalena Ortiz, Riccardo Rosati, & Mantas Simkus (Eds.), Informal Proceedings of the 27th International Workshop on Description Logics (Vol. 1193, pp. 459–470). CEUR. https://ceur-ws.org/Vol-1193/#paper_64
    Charles, Valentine, & Isaac, Antoine. (2015). Enhancing the Europeana Data Model (EDM) (p. 21) [White paper]. Europeana Foundation. http://pro.europeana.eu/files/Europeana_Professional/Publications/EDM_WhitePaper_17062015.pdf
    Chiquet, Vera, Felsing, Ulrike, & Fornaro, Peter. (2023). A Participatory Interface for a Photo Archives. Archiving Conference, 20, 109–111. https://doi.org/10.2352/issn.2168-3204.2023.20.1.23
    Chiquet, Vera. (2023). How to digitally preserve UNESCO intangible cultural heritage? A web-archive for ephemeral events at the Basler Carnival. Archiving Conference, 20, 105–108. https://doi.org/10.2352/issn.2168-3204.2023.20.1.22
    Clavaud, Florence, & Wildi, Tobias. (2021). ICA Records in Contexts-Ontology (RiC-O): A Semantic Framework for Describing Archival Resources. Linked Archives International Workshop 2021, 3019, 79–92. https://enc.hal.science/hal-03965776
    Coburn, Erin, Lanzi, Elisa, O’Keefe, Elizabeth, Stein, Regine, & Whiteside, Ann. (2010). The Cataloging Cultural Objects experience: Codifying practice for the cultural heritage community. IFLA Journal, 36(1), 16–29. https://doi.org/10.1177/0340035209359561
    Coburn, Erin, Light, Richard, McKenna, Gordon, Stein, Regine, & Vitzthum, Axel. (2010). LIDO - Lightweight Information Describing Objects Version 1.0. https://lido-schema.org/schema/v1.0/lido-v1.0-specification.pdf
    Coleman, Catherine Nicole. (2020). Managing Bias When Library Collections Become Data. International Journal of Librarianship, 5(1), 8–19. https://doi.org/10.23974/ijol.2020.vol5.1.162
    Constantopoulos, Panos, & Dallas, Costis. (2008). Aspects of a digital curation agenda for cultural heritage. 2008 IEEE International Conference on Distributed Human-Machine Systems. Athens, Greece: IEEE, 1–6.
    Conway, Paul. (2015). Digital transformations and the archival nature of surrogates. Archival Science, 15(1), 51–69. https://doi.org/10.1007/s10502-014-9219-z
    Cornut, Murielle, Raemy, Julien Antoine, & Spiess, Florian. (2023). Annotations as Knowledge Practices in Image Archives: Application of Linked Open Usable Data and Machine Learning. Journal on Computing and Cultural Heritage, 16(4), 1–19. https://doi.org/10.1145/3625301
    Cornut, Murielle. (2023). Open, edit, save: Über die performative Materialität privater Fotoalben. In Ulrich Hägele (Ed.), Kuratierte Erinnerungen: Das Fotoalbum (pp. 157–170). Waxmann.
    Cossham, Amanda Frances. (2017). Models of the bibliographic universe [PhD Thesis, Monash University]. https://doi.org/10.4225/03/596e9bc6c1d09
    Coyle, Karen, & Hillmann, Diane. (2007). Resource Description and Access (RDA): Cataloging Rules for the 20th Century. D-Lib Magazine, 13(1/2). https://doi.org/10.1045/january2007-coyle
    Dahlgren, Anna, & Hansson, Karin. (2020). The Diversity Paradox: Conflicting Demands on Metadata Production in Cultural Heritage Collections. Digital Culture & Society, 6(2), 239–256. https://doi.org/10.14361/dcs-2020-0212
    Darmont, Jérôme, Favre, Cécile, Loudcher, Sabine, & Noûs, Camille. (2020). Data lakes for digital humanities. Proceedings of the 2nd International Conference on Digital Tools & Uses Congress, 1–4. https://doi.org/10.1145/3423603.3424004
    De Muynke, Julien, Baltazar, Marie, Monferran, Martin, Voisenat, Claudie, & Katz, Brian F. G. (2022). Ears of the past, an inquiry into the sonic memory of the acoustics of Notre-Dame before the fire of 2019. Journal of Cultural Heritage. https://doi.org/10.1016/j.culher.2022.09.006
    Debattista, Jeremy, Lange, Christoph, Scerri, Simon, & Auer, Sören. (2015). Linked ’Big’ Data: Towards a Manifold Increase in Big Data Value and Veracity. 2015 IEEE/ACM 2nd International Symposium on Big Data Computing (BDC), 92–98. https://doi.org/10.1109/BDC.2015.34
    Delmas-Glass, Emmanuelle, & Sanderson, Robert. (2020). Fostering a community of PHAROS scholars through the adoption of open standards. Art Libraries Journal, 45(1), 19–23. https://doi.org/10.1017/alj.2019.32
    Denton, William. (2006). Functional Requirements for Bibliographic Records (FRBR): Hype or Cure-All? Portal: Libraries and the Academy, 6(2), 231–232. https://doi.org/10.1353/pla.2006.0018
    Digital Preservation Coalition. (2017). Persistent Identifiers. In Digital Preservation Handbook. DPC. https://www.dpconline.org/handbook/technical-solutions-and-tools/persistent-identifiers
    Dijkshoorn, Chris. (2023, October). Building Collection Data Infrastructure at the Rijksmuseum. EuropeanaTech 2023.
    Doerr, Martin. (2003). The CIDOC Conceptual Reference Module: An Ontological Approach to Semantic Interoperability of Metadata. AI Magazine, 24(3), 75–92. https://doi.org/10.1609/aimag.v24i3.1720
    Drakopoulos, Georgios, Spyrou, Evaggelos, Voutos, Yorghos, & Mylonas, Phivos. (2019). A semantically annotated JSON metadata structure for open linked cultural data in Neo4j. Proceedings of the 23rd Pan-Hellenic Conference on Informatics, 81–88. https://doi.org/10.1145/3368640.3368659
    Duin, Marcel. (2022). WebAssembly: Beyond the Browser. In Q42 Engineering. https://engineering.q42.nl/webassembly-beyond-the-browser/
    Edmunds, Jeff. (2023). BIBFRAME Must Die. ScholarSphere, 1–7. https://doi.org/10.26207/V18M-0G05
    Edwards, Elizabeth, & Hart, Janice (Eds.). (2004). Photographs Objects Histories (1st Edition). Routledge. https://doi.org/10.4324/9780203506493
    Ehrlinger, Lisa, & Wöß, Wolfram. (2016). Towards a Definition of Knowledge Graphs. In Michael Martin, Martí Cuquet, & Erwin Folmer (Eds.), Joint Proceedings of the Posters and Demos Track of the 12th International Conference on Semantic Systems - SEMANTiCS2016 and the 1st International Workshop on Semantic Change & Evolving Semantics (SuCCESS’16) (Vol. 1695). CEUR. https://ceur-ws.org/Vol-1695/#paper4
    Emmanuel, Isitor, & Stanier, Clare. (2016). Defining Big Data. Proceedings of the International Conference on Big Data and Advanced Wireless Technologies, 1–6. https://doi.org/10.1145/3010089.3010090
    Endres, Bill. (2019). Digitizing Medieval Manuscripts: The St. Chad Gospels, Materiality, Recoveries, and Representation in 2D & 3D. In Digitizing Medieval Manuscripts. ARC, Amsterdam University Press. https://doi.org/10.1515/9781942401803
    Felsing, Ulrike, & Cornut, Murielle. (2024). Re-Imagining the Collection of the Kreis Family. Research in Arts and Education, 2024(1), 41–53. https://doi.org/10.54916/rae.142567
    Felsing, Ulrike, & Frischknecht, Max. (2021). Critical Map Visualizations. In Christine Schranz (Ed.), Shifts in Mapping (pp. 95–124). transcript Verlag. https://doi.org/10.1515/9783839460412-008
    Felsing, Ulrike, Fornaro, Peter, Frischknecht, Max, & Raemy, Julien Antoine. (2023). Community and Interoperability at the Core of Sustaining Image Archives. Digital Humanities in the Nordic and Baltic Countries Publications, 5(1), 40–54. https://doi.org/10.5617/dhnbpub.10649
    Ferrazzi, Sabrina. (2021). The Notion of “Cultural Heritage” in the International Field: Behind Origin and Evolution of a Concept. International Journal for the Semiotics of Law - Revue Internationale de Sémiotique Juridique, 34(3), 743–768. https://doi.org/10.1007/s11196-020-09739-0
    Fiorentino, Sara, & Chinni, Tania. (2023). The Persistence of Memory. Exploring the Significance of Glass from Materiality to Intangible Values. Heritage, 6(6), 4834–4842. https://doi.org/10.3390/heritage6060257
    Fitzpatrick, Kathleen. (2010). Reporting from the Digital Humanities 2010 Conference. In The Chronicle of Higher Education. https://web.archive.org/web/20190829004943/https://www.chronicle.com/blogs/profhacker/reporting-from-the-digital-humanities-2010-conference/25473
    Floridi, Luciano. (2005). Is Semantic Information Meaningful Data? Philosophy and Phenomenological Research, 70(2), 351–370. https://www.jstor.org/stable/40040796
    Floridi, Luciano. (2010). Information: A very short introduction. Oxford University Press.
    Floridi, Luciano. (2023). On good and evil, the mistaken idea that technology is ever neutral, and the importance of the double-charge thesis [SSRN Scholarly Paper]. https://doi.org/10.2139/ssrn.4551487
    Force, Donald C., & Smith, Randy. (2021). Context Lost: Digital Surrogates, Their Physical Counterparts, and the Metadata that Is Keeping Them Apart. The American Archivist, 84(1), 91–118. https://doi.org/10.17723/0360-9081-84.1.91
    FOSTER. (2019). Open Science. In Foster Taxonomy. Facilitate Open Science Training for European Research. https://www.fosteropenscience.eu/taxonomy/term/100
    Freire, Nuno, & Isaac, Antoine. (2019). Technical Usability of Wikidata’s Linked Data. In Witold Abramowicz & Rafael Corchuelo (Eds.), Business Information Systems Workshops (pp. 556–567). Springer International Publishing. https://doi.org/10.1007/978-3-030-36691-9_47
    Freire, Nuno, Calado, Pável, & Martins, Bruno. (2018). Availability of Cultural Heritage Structured Metadata in the World Wide Web. In Leslie Chan & Pierre Mounier (Eds.), ELPUB 2018. https://doi.org/10.4000/proceedings.elpub.2018.20
    Freire, Nuno, Isaac, Antoine, Robson, Glen, Brooks, John, & Manguinhas, Hugo. (2017). A survey of Web technology for metadata aggregation in cultural heritage. Information Services & Use, 37(4), 425–436. https://doi.org/10.3233/ISU-170859
    Freire, Nuno, Meijers, Enno, Valk, Sjors de, Raemy, Julien A., & Isaac, Antoine. (2021). Metadata Aggregation via Linked Data: Results of the Europeana Common Culture Project. In Emmanouel Garoufallou & María-Antonia Ovalle-Perandones (Eds.), Metadata and Semantic Research (pp. 383–394). Springer International Publishing. https://doi.org/10.1007/978-3-030-71903-6_35
    Fresa, Antonella. (2013). A Data Infrastructure for Digital Cultural Heritage: Characteristics, Requirements and Priority Services. International Journal of Humanities and Arts Computing, 7(supplement), 29–46. https://doi.org/10.3366/ijhac.2013.0058
    Frischknecht, Max. (2022). Generating Perspectives: Applying Generative Design to critically explore the Atlas of Swiss Folklore. DARIAH-CH Study Day 2022 Posters. https://doi.org/10.24451/arbor.17911
    Gandon, Fabien. (2017). Pour tout le monde : Tim Berners-Lee, lauréat du prix Turing 2016 pour avoir inventé… le Web. Bulletin 1024, 2017(11), 129–154. https://doi.org/10.48556/SIF.1024.11.129
    Gandon, Fabien. (2019). Web Science, Artificial Intelligence and Intelligence Augmentation (Seminar Dagstuhl Perspectives Workshop 18262. 10 Years of Web Science: Closing The Loop; pp. 10–13). Schloss Dagstuhl — Leibniz-Zentrum für Informatik. https://inria.hal.science/hal-01976768
    Georgopoulos, Andreas. (2018). CIPA’s Perspectives on Cultural Heritage. In Sander Münster, Kristina Friedrichs, Florian Niebling, & Agnieszka Seidel-Grzesińska (Eds.), Digital Research and Education in Architectural Heritage (pp. 215–245). Springer International Publishing. https://doi.org/10.1007/978-3-319-76992-9_13
    Giacomo, Giuseppe De, & Lenzerini, Maurizio. (1996). TBox and ABox reasoning in expressive description logics. Proceedings of the Fifth International Conference on Principles of Knowledge Representation and Reasoning, 316–327. https://dl.acm.org/doi/10.5555/3087368.3087406
    Gilliland, Anne J. (2016). Setting the Stage. In Murtha Baca (Ed.), Introduction to metadata (Third edition). Getty Research Institute. https://www.getty.edu/publications/intrometadata/setting-the-stage/
    Gomes, Daniel, Miranda, João, & Costa, Miguel. (2011). A Survey on Web Archiving Initiatives. In Stefan Gradmann, Francesca Borri, Carlo Meghini, & Heiko Schuldt (Eds.), Research and Advanced Technology for Digital Libraries (Vol. 6966, pp. 408–420). Springer. https://doi.org/10.1007/978-3-642-24469-8_41
    Goyal, Yash, Khot, Tejas, Summers-Stay, Douglas, Batra, Dhruv, & Parikh, Devi. (2017). Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering. arXiv. https://doi.org/10.48550/arXiv.1612.00837
    Greenberg, Jane. (2005). Understanding Metadata and Metadata Schemes. Cataloging & Classification Quarterly, 40(3–4), 17–36. https://doi.org/10.1300/J104v40n03_02
    Gruber, Thomas R. (1993). A translation approach to portable ontology specifications. Knowledge Acquisition, 5(2), 199–220. https://doi.org/10.1006/knac.1993.1008
    Guenther, Rebecca S. (2003). MODS: The Metadata Object Description Schema. Portal: Libraries and the Academy, 3(1), 137–150. https://doi.org/10.1353/pla.2003.0006
    Guillem, Anaïs, Gros, Antoine, & De Luca, Livio. (2023, June). Faire parler les claveaux effondrés de la cathédrale Notre-Dame de Paris. Recueil Des Communications Du 4e Colloque Humanistica. https://hal.science/hal-04106101
    Guillem, Anaïs, Gros, Antoine, Reby, Kevin, Abergel, Violette, & De Luca, Livio. (2023). RCC8 for CIDOC CRM: Semantic Modeling of Mereological and Topological Spatial Relations in Notre-Dame de Paris. In Antonis Bikakis, Roberta Ferrario, Stéphane Jean, Béatrice Markhoff, Alessandro Mosca, & Marianna Nicolosi Asmundo (Eds.), Proceedings of the International Workshop on Semantic Web and Ontology Design for Cultural Heritage (Vol. 3540). CEUR. https://ceur-ws.org/Vol-3540/#paper2
    Hacıgüzeller, Piraye, Taylor, James Stuart, & Perry, Sara. (2021). On the Emerging Supremacy of Structured Digital Data in Archaeology: A Preliminary Assessment of Information, Knowledge and Wisdom Left Behind. Open Archaeology, 7(1), 1709–1730. https://doi.org/10.1515/opar-2020-0220
    Haraway, Donna Jeanne. (2003). The companion species manifesto: Dogs, people, and significant otherness. Prickly Paradigm Press.
    Haraway, Donna Jeanne. (2016). Staying with the trouble: Making kin in the Chthulucene. Duke University Press.
    Haraway, Donna. (2008). Encounters with Companion Species: Entangling Dogs, Baboons, Philosophers, and Biologists. Configurations, 14(1), 97–114. https://doi.org/10.1353/con.0.0002
    Hardesty, Juliet, & Nolan, Allison. (2021). Mitigating Bias in Metadata: A Use Case Using Homosaurus Linked Data. Information Technology and Libraries, 40(3). https://doi.org/10.6017/ital.v40i3.13053
    Hardin, Garrett. (1968). The Tragedy of the Commons. Science, 162(3859), 1243–1248. https://doi.org/10.1126/science.162.3859.1243
    Harpring, Patricia. (2010). Development of the Getty Vocabularies: AAT, TGN, ULAN, and CONA. Art Documentation: Journal of the Art Libraries Society of North America, 29(1), 67–72. https://doi.org/10.1086/adx.29.1.27949541
    Haxaire, Claudie. (2009). The Power of Ambiguity: The Nature and Efficacy of the Zamble Masks Revealed by “Disease Masks” Among the Gouro People (Côte d’Ivoire). Africa, 79(4), 543–569. https://doi.org/10.3366/E0001972009001065
    He, Y., Ma, Y. H., & Zhang, X. R. (2017). “Digital Heritage” Theory and Innovative Practice. The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, XLII-2/W5, 335–342. https://doi.org/10.5194/isprs-archives-XLII-2-W5-335-2017
    Hertz, Ellen, Graezer Bideau, Florence, Leimgruber, Walter, & Munz, Hervé. (2018). Politiques de la tradition. Le patrimoine culturel immatériel (Vol. 131). Presses polytechniques et universitaires romandes. https://edoc.unibas.ch/68569/
    Hill, Linda, Buchel, Olha, Janée, Greg, & Zeng, Marcia Lei. (2002). Integration of Knowledge Organization Systems into Digital Library Architectures: Position Paper for 13th ASIST SIGICR Workshop, Reconceptualizing Classification Research. Advances in Classification Research Online, 13(1), 46–52. https://doi.org/10.7152/acro.v13i1.13835
    Hillmann, Diane I., Marker, Rhonda, & Brady, Chris. (2008). Metadata Standards and Applications. The Serials Librarian, 54(1–2), 7–21. https://doi.org/10.1080/03615260801973364
    Hodge, Gail M. (2000). Systems of knowledge organization for digital libraries: Beyond traditional authority files. Digital Library Federation, Council on Library and Information Resources.
    Hoffmann, Anna Lauren. (2021). Terms of inclusion: Data, discourse, violence. New Media & Society, 23(12), 3539–3556. https://doi.org/10.1177/1461444820958725
    Hou, Yumeng, & Kenderdine, Sarah. (2024). Ontology-based knowledge representation for traditional martial arts. Digital Scholarship in the Humanities, 1–18. https://doi.org/10.1093/llc/fqae005
    Hou, Yumeng, Kenderdine, Sarah, Picca, Davide, Egloff, Mattia, & Adamou, Alessandro. (2022). Digitizing Intangible Cultural Heritage Embodied: State of the Art. Journal on Computing and Cultural Heritage, 15(3), 55:1–55:20. https://doi.org/10.1145/3494837
    Huber, Birgit, & Frischknecht, Max. (2024). Digitalisierung und (De-)Konstruktion. Überlegungen zur Entwicklung eines Prototyps für die digitale Zugänglichmachung des «Atlas der Schweizerischen Volkskunde». In Sabine Eggmann & Konrad J. Kuhn (Eds.), Schweizerisches Archiv für Volkskunde / Archives suisses des traditions populaires (Vol. 2024/1, pp. 27–52). Chronos. https://doi.org/10.33057/CHRONOS.1785/27-51
    Huber, Birgit. (2023). Die Entdeckung der «Brünig-Napf-Reuss-Linie». In Blog zur Schweizer Geschichte - Schweizerisches Nationalmuseum. https://blog.nationalmuseum.ch/2023/10/die-entdeckung-der-bruenig-napf-reuss-linie/
    Hyvönen, Eero. (2012). Cultural Heritage on the Semantic Web. In Publishing and Using Cultural Heritage Linked Data on the Semantic Web (pp. 1–11). Springer International Publishing. https://doi.org/10.1007/978-3-031-79438-4_1
    Hyvönen, Eero. (2020). Using the Semantic Web in digital humanities: Shift from data publishing to data-analysis and serendipitous knowledge discovery. Semantic Web, 11(1), 187–193. https://doi.org/10.3233/SW-190386
    ICA Expert Group on Archival Description. (2023). Records in Contexts Conceptual Model 1.0. https://www.ica.org/sites/default/files/ric-cm-1.0_0.pdf
    Ioannides, Marinos, & Davies, Robert. (2019). Towards a Holistic Documentation and Wider Use of Digital Cultural Heritage. In Emmanouel Garoufallou, Fabio Sartori, Rania Siatri, & Marios Zervas (Eds.), Metadata and Semantic Research (pp. 76–88). Springer International Publishing. https://doi.org/10.1007/978-3-030-14401-2_7
    Irish, Kathryn, & Saba, Jessica. (2023). Bots are the new fraud: A post-hoc exploration of statistical methods to identify bot-generated responses in a corrupt data set. Personality and Individual Differences, 213. https://doi.org/10.1016/j.paid.2023.112289
    Izu, Benjamin Obeghare. (2022). The Sociocultural Significance of the Emedjo (Masquerade) Dance Among the Abraka People in Delta State, Nigeria. E-Journal of Humanities, Arts and Social Sciences, 413–423. https://doi.org/10.38159/ehass.2022394
    Jackson, Steven J., Edwards, Paul N., Bowker, Geoffrey C., & Knobel, Cory P. (2007). Understanding infrastructure: History, heuristics and cyberinfrastructure policy. First Monday, 12(6). https://doi.org/10.5210/fm.v12i6.1904
    Jaillant, Lise, & Caputo, Annalina. (2022). Unlocking digital archives: Cross-disciplinary perspectives on AI and born-digital data. AI & SOCIETY, 37(3), 823–835. https://doi.org/10.1007/s00146-021-01367-x
    Jaillant, Lise, & Rees, Arran. (2023). Applying AI to digital archives: Trust, collaboration and shared professional ethics. Digital Scholarship in the Humanities, 38(2), 571–585. https://doi.org/10.1093/llc/fqac073
    Jaton, Florian. (2017). We get the algorithms of our ground truths: Designing referential databases in digital image processing. Social Studies of Science, 47(6), 811–840. https://doi.org/10.1177/0306312717730428
    Jaton, Florian. (2021). Assessing biases, relaxing moralism: On ground-truthing practices in machine learning design and application. Big Data & Society, 8(1), 1–15. https://doi.org/10.1177/20539517211013569
    Jaton, Florian. (2023). Groundwork for AI: Enforcing a benchmark for neoantigen prediction in personalized cancer immunotherapy. Social Studies of Science, 53(5), 787–810. https://doi.org/10.1177/03063127231192857
    Katz, Brian F. G. (2023, October). Digitally exploring the acoustic history of Notre-Dame Cathedral. EuropeanaTech 2023. https://youtu.be/JDcNV_X54oQ
    Knöchelmann, Marcel. (2019). Open Science in the Humanities, or: Open Humanities? Publications, 7(4), 65. https://doi.org/10.3390/publications7040065
    Koch, Inês, Ribeiro, Cristina, & Teixeira Lopes, Carla. (2020). ArchOnto, a CIDOC-CRM-Based Linked Data Model for the Portuguese Archives. In Mark Hall, Tanja Merčun, Thomas Risse, & Fabien Duchateau (Eds.), Digital Libraries for Open Knowledge (Vol. 12246, pp. 133–146). Springer International Publishing. https://doi.org/10.1007/978-3-030-54956-5_10
    Krötzsch, Markus, Simancik, Frantisek, & Horrocks, Ian. (2013). A Description Logic Primer. arXiv. https://doi.org/10.48550/arXiv.1201.4089
    Lagoze, Carl, Van de Sompel, Herbert, Nelson, Michael, & Warner, Simeon. (2002). The Open Archives Initiative Protocol for Metadata Harvesting - v.2.0. In Open Archives Initiative. http://www.openarchives.org/OAI/openarchivesprotocol.html
    Laney, Doug. (2001). 3D data management: Controlling data volume, velocity and variety. META Group Research Note, 6(70), 1.
    Latour, Bruno. (1990). Postmodern? No, simply amodern! Steps towards an anthropology of science. Studies in History and Philosophy of Science Part A, 21(1), 145–171. https://doi.org/10.1016/0039-3681(90)90018-4
    Latour, Bruno. (1993). We have never been modern. Harvard University Press.
    Latour, Bruno. (1996). On actor-network theory: A few clarifications. Soziale Welt, 47(4), 369–381. https://www.jstor.org/stable/40878163
    Latour, Bruno. (2005). Reassembling the social: An introduction to actor-network-theory. Oxford University Press.
    Latour, Bruno. (2022). Habiter la Terre : Entretiens avec Nicolas Truong. Éditions Les Liens qui libèrent ; Arte éditions.
    Lave, Jean, & Wenger, Etienne. (1991). Situated learning: Legitimate peripheral participation. Cambridge University Press.
    Lee, Christopher A. (2009). Open Archival Information System (OAIS) Reference Model. In Marcia J. Bates & Mary Niles Maack (Eds.), Encyclopedia of Library and Information Sciences, Third Edition (3rd ed., pp. 4020–4030). CRC Press. https://doi.org/10.1081/E-ELIS3-120044377
    Leimgruber, Walter. (2008). Was ist immaterielles Kulturerbe? Bulletin / Schweizerische Akademie Der Geistes- Und Sozialwissenschaften, 2008, H. 2, 24–25. http://edoc.unibas.ch/dok/A5251330
    Leimgruber, Walter. (2010). Switzerland and the UNESCO Convention on Intangible Cultural Heritage. Journal of Folklore Research, 47(1–2), 161–196. https://doi.org/10.2979/JFR.2010.47.1-2.161
    Lemmer-Webber, Christine, & Tallon, Jessica. (2018). ActivityPub. In W3C. https://www.w3.org/TR/activitypub/
    Lenzerini, Federico. (2011). Intangible Cultural Heritage: The Living Culture of Peoples. European Journal of International Law, 22(1), 101–120. https://doi.org/10.1093/ejil/chr006
    Li, Xigao, Azad, Babak Amin, Rahmati, Amir, & Nikiforakis, Nick. (2021). Good Bot, Bad Bot: Characterizing Automated Browsing Activity. 2021 IEEE Symposium on Security and Privacy (SP), 1589–1605. https://doi.org/10.1109/SP40001.2021.00079
    Lim, Shirley, & Li Liew, Chern. (2011). Metadata quality and interoperability of GLAM digital images. Aslib Proceedings, 63(5), 484–498. https://doi.org/10.1108/00012531111164978
    Lin, Tsung-Yi, Maire, Michael, Belongie, Serge, Hays, James, Perona, Pietro, Ramanan, Deva, Dollár, Piotr, & Zitnick, C. Lawrence. (2014). Microsoft COCO: Common Objects in Context. In David Fleet, Tomas Pajdla, Bernt Schiele, & Tinne Tuytelaars (Eds.), Computer Vision – ECCV 2014 (Vol. 8693, pp. 740–755). Springer International Publishing. https://doi.org/10.1007/978-3-319-10602-1_48
    Lindenthal, Jutta, Meiners, Hanna-Lena, & Balzer, Detlev. (2023). LIDO Primer. In LIDO. https://lido-schema.org/documents/primer/latest/lido-primer.html
    Lit, L. W. C. van. (2020). The Digital Materiality of Digitized Manuscripts. In Among Digitized Manuscripts. Philology, Codicology, Paleography in a Digital World (pp. 51–72). Brill. https://www.jstor.org/stable/10.1163/j.ctv2gjwzrd.6
    Loulanski, Tolina. (2006). Revising the Concept for Cultural Heritage: The Argument for a Functional Approach. International Journal of Cultural Property, 13(2), 207–233. https://doi.org/10.1017/S0940739106060085
    Lowenthal, David. (2005). Natural and cultural heritage. International Journal of Heritage Studies, 11(1), 81–92. https://doi.org/10.1080/13527250500037088
    Mahony, Simon. (2018). Cultural Diversity and the Digital Humanities. Fudan Journal of the Humanities and Social Sciences, 11(3), 371–388. https://doi.org/10.1007/s40647-018-0216-0
    Marcondes, Carlos Henrique. (2021). Integrated classification schemas to interlink cultural heritage collections over the web using LOD technologies. International Journal of Metadata, Semantics and Ontologies, 15(3), 170. https://doi.org/10.1504/IJMSO.2021.123040
    Martinez Demarco, Sol. (2019). Empowering women through digital skills in Argentina: A tale of two stories. TATuP - Zeitschrift Für Technikfolgenabschätzung in Theorie Und Praxis, 28(2), 23–28. https://doi.org/10.14512/tatup.28.2.s23
    Martinez Demarco, Sol. (2023). From digital inclusion to IT appropriation: Gendered aspects of appropriation imaginary and practices. GENDER – Zeitschrift Für Geschlecht, Kultur Und Gesellschaft, 15(1), 72–86. https://doi.org/10.3224/gender.v15i1.06
    Masolo, Claudio, Borgo, Stefano, Gangemi, Aldo, Guarino, Nicola, & Oltramari, Alessandro. (2003). Wonder Web Deliverable D18: Ontology Library (Ontology Infrastructure for the Semantic Web Del 18; p. 343). Laboratory For Applied Ontology - ISTC-CNR. http://www.loa.istc.cnr.it/old/Papers/D18.pdf
    Mathieualexhache. (2021). OAIS Functional Model. https://commons.wikimedia.org/wiki/File:OAIS_Functional_Model_(en).svg
    Mazzocchi, Fulvio. (2018). Knowledge organization system (KOS). Knowledge Organization, 45(1), 54–78. https://doi.org/10.5771/0943-7444-2018-1-54
    McGillivray, Barbara, Poibeau, Thierry, & Fabo, Pablo Ruiz. (2020). Digital Humanities and Natural Language Processing: Je t’aime... Moi non plus. Digital Humanities Quarterly, 14(2). https://www.digitalhumanities.org/dhq/vol/14/2/000454/000454.html
    Micle, Dorel. (2014). Archaeological Heritage Between Natural Hazard and Anthropic Destruction: The Negative Impact of Social Non-involvement in the Protection of Archaeological Sites. Procedia - Social and Behavioral Sciences, 163, 269–278. https://doi.org/10.1016/j.sbspro.2014.12.316
    Mikhaylova, Daria, & Metilli, Daniele. (2023). Extending RiC-O to Model Historical Architectural Archives: The ITDT Ontology. Journal on Computing and Cultural Heritage, 16(4), 67:1–67:15. https://doi.org/10.1145/3606706
    Mixter, Jeff. (2014). Using a Common Model: Mapping VRA Core 4.0 Into an RDF Ontology. Journal of Library Metadata, 14(1), 1–23. https://doi.org/10.1080/19386389.2014.891890
    Morales, Susana. (2009). La apropiación de TIC: Una perspectiva. In Susana Morales & M. I. Loyola (Eds.), Los jóvenes y las TIC. Apropiación y uso en educación (pp. 99–120). Edición de las autoras.
    Morales, Susana. (2017). Imaginación y software: Aportes para la construcción del paradigma de la apropiación. Del Gato Gris. http://hdl.handle.net/11086/27405
    Morales, Susana. (2018). La apropiación de tecnologías. Ideas para un paradigma en construcción. In Acerca de la apropiación de tecnologías. Teoría, estudios y debates (pp. 23–33). Del Gato Gris.
    Morrison, Robbie. (2021). Redrawn slide from presentation of Ana Persic, Division of Science Policy and Capacity-Building (SC/PCB), UNESCO (France) presentation to Open Science Conference 2021, ZBW — Leibniz Information Centre for Economics, Germany. https://commons.wikimedia.org/wiki/File:Osc2021-unesco-open-science-no-gray.png
    Mr Gee. (2023, October). Day 2 Closing – A multitude of tools. EuropeanaTech 2023. https://youtu.be/pOX9CrvAG7I
    Müller, Katja. (2018). Digitale Objekte - subjektive Materie. Zur Materialität digitalisierter Objekte in Museum und Archiv. In Hans Peter Hahn & Friedemann Neumann (Eds.), Edition Kulturwissenschaft (1st ed., Vol. 182, pp. 49–66). transcript Verlag. https://doi.org/10.14361/9783839445136-004
    Munjeri, Dawson. (2004). Tangible and Intangible Heritage: From difference to convergence. Museum International, 56(1–2), 12–20. https://doi.org/10.1111/j.1350-0775.2004.00453.x
    Münster, S., Apollonio, F. I., Bell, P., Kuroczynski, P., Di Lenardo, I., Rinaudo, F., & Tamborrino, R. (2019). Digital Cultural Heritage meets Digital Humanities. The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, XLII-2/W15, 813–820. https://doi.org/10.5194/isprs-archives-XLII-2-W15-813-2019
    Münster, Sander, Utescher, Ronja, & Ulutas Aydogan, Selda. (2021). Digital topics on cultural heritage investigated: How can data-driven and data-guided methods support to identify current topics and trends in digital heritage? Built Heritage, 5(1), 25. https://doi.org/10.1186/s43238-021-00045-7
    Nargesian, Fatemeh, Zhu, Erkang, Miller, Renée J., Pu, Ken Q., & Arocena, Patricia C. (2019). Data lake management: Challenges and opportunities. Proceedings of the VLDB Endowment, 12(12), 1986–1989. https://doi.org/10.14778/3352063.3352116
    Nelson, Peter A. (2021). The Role of GPR in Community-Driven Compliance Archaeology with Tribal and Non-tribal Communities in Central California. Advances in Archaeological Practice, 9(3), 215–225. https://doi.org/10.1017/aap.2021.14
    Neudecker, Clemens. (2022). Cultural Heritage as Data: Digital Curation and Artificial Intelligence in Libraries. In Adrian Paschke, Georg Rehm, Clemens Neudecker, & Lydia Pintscher (Eds.), Proceedings of the Third Conference on Digital Curation Technologies (Qurator 2022) (Vol. 3234). CEUR. https://ceur-ws.org/Vol-3234/#paper2
    Newbury, David. (2018). LOUD: Linked Open Usable Data and linked.art. 2018 CIDOC Conference, 1–11. https://cidoc.mini.icom.museum/wp-content/uploads/sites/6/2021/03/CIDOC2018_paper_153.pdf
    Newbury, David. (2024). Linked Data in Production: Moving Beyond Ontologies. https://www.slideshare.net/slideshow/linked-data-in-production-moving-beyond-ontologies/266976602
    Nielsen, Erland Kolding. (2008). Digitisation of Library Material in Europe: Problems, Obstacles and Perspectives anno 2007. LIBER Quarterly: The Journal of the Association of European Research Libraries, 18(1), 20–27. https://doi.org/10.18352/lq.7901
    NISO. (2010). Guidelines for the Construction, Format, and Management of Monolingual Controlled Vocabularies (American National Standard ANSI/NISO Z39.19-2005 (R2010)). National Information Standards Organization. https://groups.niso.org/higherlogic/ws/public/download/12591/z39-19-2005r2010.pdf
    Ostrom, Elinor. (1990). Governing the Commons: The Evolution of Institutions for Collective Action (1st ed.). Cambridge University Press. https://doi.org/10.1017/CBO9780511807763
    Owens, Trevor. (2011). Defining Data for Humanists: Text, Artifact, Information or Evidence? Journal of Digital Humanities, 1(1). https://journalofdigitalhumanities.org/1-1/defining-data-for-humanists-by-trevor-owens/
    Oxford English Dictionary. (2023). Artificial Intelligence. In Oxford English Dictionary (OED). Oxford University Press. https://doi.org/10.1093/OED/3194963277
    Padfield, Joseph, Bolland, Charlotte, Fitzgerald, Neil, McLaughlin, Anne, Robson, Glen, & Terras, Melissa. (2022). Practical applications of IIIF as a building block towards a digital National Collection [Arts and Humanities Research Council Final Report]. Towards a National Collection. https://doi.org/10.5281/zenodo.6884885
    Page, Kevin R., Delmas-Glass, Emmanuelle, Beaudet, David, Norling, Samantha, Rother, Lynn, & Hänsli, Thomas. (2020). Linked Art: Networking Digital Collections and Scholarship. Digital Humanities 2020 Book of Abstracts, 504–509. https://dh2020.adho.org/wp-content/uploads/2020/07/139_LinkedArtNetworkingDigitalCollectionsandScholarship.html
    Pagenstecher, Cord. (2009). Private Fotoalben als historische Quelle. Zeithistorische Forschungen/Studies in Contemporary History, 6(3), 449–463. https://doi.org/10.14765/ZZF.DOK-1803
    Pan, Jeff Z., Razniewski, Simon, Kalo, Jan-Christoph, Singhania, Sneha, Chen, Jiaoyan, Dietze, Stefan, Jabeen, Hajira, Omeliyanenko, Janna, Zhang, Wen, Lissandrini, Matteo, Biswas, Russa, Melo, Gerard de, Bonifati, Angela, Vakaj, Edlira, Dragoni, Mauro, & Graux, Damien. (2023). Large Language Models and Knowledge Graphs: Opportunities and Challenges. arXiv. https://doi.org/10.48550/arXiv.2308.06374
    Patrón, Pedro, Miguelañez, Emilio, & Petillot, Yvan R. (2011). Embedded Knowledge and Autonomous Planning: The Path Towards Permanent Presence of Underwater Networks. In Autonomous Underwater Vehicles (pp. 199–224). IntechOpen. https://doi.org/10.5772/24649
    Perrigo, Billy. (2023). Exclusive: OpenAI Used Kenyan Workers on Less Than $2 Per Hour to Make ChatGPT Less Toxic. Time. https://time.com/6247678/openai-chatgpt-kenya-workers/
    Persic, Ana. (2021). Building a Global Consensus on Open Science – the future UNESCO Recommendation on Open Science. https://doi.org/10.5446/53434
    Peterhans, Simon, Sauter, Loris, Spiess, Florian, & Schuldt, Heiko. (2022). Automatic Generation of Coherent Image Galleries in Virtual Reality. In Gianmaria Silvello, Oscar Corcho, Paolo Manghi, Giorgio Maria Di Nunzio, Koraljka Golub, Nicola Ferro, & Antonella Poggi (Eds.), Linking Theory and Practice of Digital Libraries (Vol. 13541, pp. 282–288). Springer International Publishing. https://doi.org/10.1007/978-3-031-16802-4_23
    Pfrunder, Peter. (1995). Ernst Brunner: Photographien, 1937-1962 (2nd ed.). Schweizerische Gesellschaft für Volkskunde ; Offizin.
    Philip, Kavita. (2021). The Internet Will Be Decolonized. In Thomas S. Mullaney, Benjamin Peters, Mar Hicks, & Kavita Philip (Eds.), Your Computer Is on Fire (pp. 91–116). The MIT Press. https://doi.org/10.7551/mitpress/10993.003.0002
    Pirgova-Morgan, Luba. (2023). Looking towards a brighter future: The potentiality of AI and digital transformations to library spaces (p. 111) [Digital Futures Research Report]. University of Leeds Libraries. https://library.leeds.ac.uk/downloads/download/196/artificial-intelligence-ai-in-libraries
    Pitti, Daniel V. (1999). Encoded archival description: An introduction and overview. New Review of Information Networking, 5(1), 61–69. https://doi.org/10.1080/13614579909516936
    Portalés, Cristina, Rodrigues, João M. F., Rodrigues Gonçalves, Alexandra, Alba, Ester, & Sebastián, Jorge. (2018). Digital Cultural Heritage. Multimodal Technologies and Interaction, 2(3), 58. https://doi.org/10.3390/mti2030058
    Poulopoulos, Vassilis, & Wallace, Manolis. (2022). Digital Technologies and the Role of Data in Cultural Heritage: The Past, the Present, and the Future. Big Data and Cognitive Computing, 6(3), 181–199. https://doi.org/10.3390/bdcc6030073
    Poupeau, Gautier. (2018). Réflexions et questions autour du Web sémantique. In Les petites cases. https://web.archive.org/web/20240813032044/https://www.lespetitescases.net/reflexions-et-questions-autour-du-web-semantique
    Raemy, Julien Antoine, Gray, Tanya, Collinson, Alwyn, & Page, Kevin R. (2023). Enabling Participatory Data Perspectives for Image Archives through a Linked Art Workflow. In Anne Baillot, Walter Scholger, Toma Tasovac, & Georg Vogeler (Eds.), Digital Humanities 2023 Book of Abstracts (Vol. 2023, pp. 515–516). Alliance of Digital Humanities Organizations (ADHO). https://doi.org/10.5451/unibas-ep95099
    Raemy, Julien Antoine. (2017). The International Image Interoperability Framework (IIIF): Raising awareness of the user benefits for scholarly editions [Bachelor’s thesis, HES-SO University of Applied Sciences and Arts, Haute école de gestion de Genève]. https://sonar.ch/hesso/documents/314853
    Raemy, Julien Antoine. (2020). Enabling better aggregation and discovery of cultural heritage content for Europeana and its partner institutions [Master’s thesis, HES-SO University of Applied Sciences and Arts, Haute école de gestion de Genève]. https://sonar.ch/hesso/documents/315109
    Raemy, Julien Antoine. (2021). Applying Effective Data Modelling Approaches for the Creation of a Participatory Archive Platform. In Yumeng Hou (Ed.), Human Factors in Digital Humanities (pp. 1–5). Institut des humanités digitales. https://doi.org/10.5451/unibas-ep87517
    Raemy, Julien Antoine. (2022). Améliorer la valorisation des données du patrimoine culturel grâce au Linked Open Usable Data (LOUD). In Nicolas Lasolle, Olivier Bruneau, & Jean Lieber (Eds.), Actes des journées humanités numériques et Web sémantique (pp. 132–149). Les Archives Henri-Poincaré - Philosophie et Recherches sur les Sciences et les Technologies (AHP-PReST); Laboratoire lorrain de recherche en informatique et ses applications (LORIA). https://doi.org/10.5451/unibas-ep89725
    Raemy, Julien Antoine. (2024). Interlinking Cultural Heritage Data with Community-driven Principles and Standards. https://julsraemy.ch/prezi/pia-ringvorlesung-2024.html
    Raemy, Julien Antoine. (2024). Some notes from the 2024 IIIF Conference held in Los Angeles. In Thoughts and discombobulations of Julien A. Raemy. https://julsraemy.ch/posts/2024/06/26/iiif-conference-la/
    Rautenberg, Michel. (1998). L’émergence patrimoniale de l’ethnologie : Entre mémoire et politiques publiques. In Patrimoine et modernité (pp. 279–289). L’Harmattan.
    Respaldiza Hidalgo, María Aránzazu, Wachowicz, Monica, & Vázquez Hoehne, Antonio. (2011). Metadata Visualization of Cultural Heritage Information within a Collaborative Environment. Proceedings of XXIIIrd International CIPA Symposium. https://oa.upm.es/11636/
    Ribes, David, & Lee, Charlotte P. (2010). Sociotechnical Studies of Cyberinfrastructure and e-Research: Current Themes and Future Trajectories. Computer Supported Cooperative Work (CSCW), 19(3), 231–244. https://doi.org/10.1007/s10606-010-9120-0
    Ridge, Mia, Blickhan, Samantha, Ferriter, Meghan, Mast, Austin, Brumfield, Ben, Wilkins, Brendon, Cybulska, Daria, Burgher, Denise, Casey, Jim, Luther, Kurt, Goldman, Michael Haley, White, Nick, Willcox, Pip, Brumfield, Sara Carlstead, Coleman, Sonya J., & Prytz, Ylva Berglund. (2021). 12. Connecting with communities. In The Collective Wisdom Handbook: Perspectives on Crowdsourcing in Cultural Heritage - community review version (1st edition). British Library. https://doi.org/10.21428/a5d7554f.1b80974b
    Ridge, Mia. (2023, October). Enriching lives: Connecting communities and culture with the help of machines. EuropeanaTech 2023. https://doi.org/10.5281/zenodo.8429858
    Riley, Jenn. (2009). Seeing Standards: A Visualization of the Metadata Universe. In Jenn Riley. https://jennriley.com/metadatamap/
    Riley, Jenn. (2017). Understanding metadata. What is metadata, and what is it for? National Information Standards Organization (NISO).
    Riva, Pat, Le Boeuf, Patrick, & Žumer, Maja. (2017). IFLA Library Reference Model: A Conceptual Model for Bibliographic Information [IFLA-LRM]. International Federation of Library Associations and Institutions. https://repository.ifla.org/handle/123456789/40
    Rodighiero, Dario. (2021). Mapping Affinities: Democratizing Data Visualization. Métis Presses. https://dash.harvard.edu/handle/1/37368046
    Roke, Elizabeth Russey, & Tillman, Ruth Kitchin. (2022). Pragmatic Principles for Archival Linked Data. The American Archivist, 85(1), 173–201. https://doi.org/10.17723/2327-9702-85.1.173
    Rossenova, Lozana, & Di Franco, Karen. (2022). Iterative Pasts and Linked Futures: A Feminist Approach to Modeling Data in Archives and Collections of Artists’ Publishing. Perspectives on Data, 2. https://doi.org/10.53269/9780865593152/05
    SAA Dictionary. (2023). Taxonomy. In Dictionary of Archives Terminology. Society of American Archivists. https://dictionary.archivists.org/entry/taxonomy.html
    Sabharwal, Arjun. (2015). 2 - Archives and special collections in the digital humanities. In Arjun Sabharwal (Ed.), Digital Curation in the Digital Humanities (pp. 27–47). Chandos Publishing. https://doi.org/10.1016/B978-0-08-100143-1.00002-7
    Saha, Barna, & Srivastava, Divesh. (2014). Data quality: The other face of Big Data. 2014 IEEE 30th International Conference on Data Engineering, 1294–1297. https://doi.org/10.1109/ICDE.2014.6816764
    Sanderson, Robert. (2013). RDF: Resource Description Failures and Linked Data Letdowns. Journal of Digital Humanities, 2(3). https://journalofdigitalhumanities.org/2-3/rdf-resource-description-failures-and-linked-data-letdowns/
    Sanderson, Robert. (2015). Linked Data Best Practices and BibFrame. https://www.slideshare.net/azaroth42/linked-data-best-practices-and-bibframe
    Sanderson, Robert. (2019). Keynote: Standards and Communities: Connected People, Consistent Data, Usable Applications. 2019 ACM/IEEE Joint Conference on Digital Libraries (JCDL), 28. https://doi.org/10.1109/JCDL.2019.00009
    Schmoll, Friedemann. (2009a). Die Vermessung der Kultur: Der ‘Atlas der deutschen Volkskunde’ und die Deutsche Forschungsgemeinschaft, 1928-1980. Steiner.
    Schmoll, Friedemann. (2009b). Richard Weiss : Skizzen zum internationalen Wirken des Schweizer Volkskundlers. Schweizerisches Archiv Für Volkskunde/ Archives Suisses Des Traditions Populaires, 2009(105), 15–32. https://doi.org/10.5169/SEALS-118266
    Schöch, Christof. (2013). Big? Smart? Clean? Messy? Data in the Humanities. Journal of Digital Humanities, 2(3). https://journalofdigitalhumanities.org/2-3/big-smart-clean-messy-data-in-the-humanities/
    Semeraro, Concetta, Lezoche, Mario, Panetto, Hervé, & Dassisti, Michele. (2021). Digital twin paradigm: A systematic literature review. Computers in Industry, 130, 103469. https://doi.org/10.1016/j.compind.2021.103469
    Serres, Michel. (2014). Le parasite. Pluriel.
    Shao, Guodong, & Kibira, Deogratias. (2018). Digital Manufacturing: Requirements and Challenges for Implementing Digital Surrogates. 2018 Winter Simulation Conference (WSC), 1226–1237. https://doi.org/10.1109/WSC.2018.8632242
    Shepherd, Elizabeth, & Smith, Charlotte. (2000). The Application of ISAD(G) to the Description of Archival Datasets. Journal of the Society of Archivists, 21(1), 55–86. https://doi.org/10.1080/00379810050006911
    Simandiraki-Grimshaw, Anna. (2023). What is a museum object according to a museum database? In TETRARCHs. https://www.tetrarchs.org/index.php/2023/09/19/what-is-a-museum-object-according-to-a-museum-database/
    Snell, James M., & Prodromou, Evan. (2017). Activity Streams 2.0. In W3C. https://www.w3.org/TR/activitystreams-core/
    Snydman, Stuart, Sanderson, Robert, & Cramer, Tom. (2015). The International Image Interoperability Framework (IIIF): A community & technology approach for web-based images. Archiving Conference, 12, 16–21. https://doi.org/10.2352/issn.2168-3204.2015.12.1.art00005
    Spiess, Florian, & Schuldt, Heiko. (2022). Multimodal Interactive Lifelog Retrieval with vitrivr-VR. Proceedings of the 5th Annual on Lifelog Search Challenge, 38–42. https://doi.org/10.1145/3512729.3533008
    Spiess, Florian, & Stauffiger, Markus. (2023). Forschung und Archive: Erschliessung und Zugänglichkeit neu gedacht. Arbido, 2023(1). https://arbido.ch/de/ausgaben-artikel/2023/archiv-der-zukunft/forschung-und-archive-erschliessung-und-zugaenglichkeit-neu-gedacht
    Spiess, Florian, Rossetto, Luca, & Schuldt, Heiko. (2024). Exploring Multimedia Vector Spaces with vitrivr-VR. In Stevan Rudinac, Alan Hanjalic, Cynthia Liem, Marcel Worring, Björn Thor Jónsson, Bei Liu, & Yoko Yamakata (Eds.), MultiMedia Modeling (Vol. 14557, pp. 317–323). Springer Nature Switzerland. https://doi.org/10.1007/978-3-031-53302-0_27
    Sporleder, Caroline. (2010). Natural Language Processing for Cultural Heritage Domains. Language and Linguistics Compass, 4(9), 750–768. https://doi.org/10.1111/j.1749-818X.2010.00230.x
    Sprochi, Amanda. (2016). Where Are We Headed? Resource Description and Access, Bibliographic Framework, and the Functional Requirements for Bibliographic Records Library Reference Model. The International Information & Library Review, 48(2), 129–136. https://doi.org/10.1080/10572317.2016.1176455
    Star, Susan Leigh, & Griesemer, James R. (1989). Institutional Ecology, ‘Translations’ and Boundary Objects: Amateurs and Professionals in Berkeley’s Museum of Vertebrate Zoology, 1907-39. Social Studies of Science, 19(3), 387–420. https://www.jstor.org/stable/285080
    Star, Susan Leigh, & Ruhleder, Karen. (1994). Steps towards an ecology of infrastructure: Complex problems in design and access for large-scale collaborative systems. Proceedings of the 1994 ACM Conference on Computer Supported Cooperative Work, 253–264. https://doi.org/10.1145/192844.193021
    Star, Susan Leigh. (1999). The Ethnography of Infrastructure. American Behavioral Scientist, 43(3), 377–391. https://doi.org/10.1177/00027649921955326
    Stein, Regine, & Balandi, Oguzhan. (2019). Using LIDO for Evolving Object Documentation into CIDOC CRM. Heritage, 2(1), 1023–1031. https://doi.org/10.3390/heritage2010066
    Stocker, Christian. (2023). Use LinkedDataGPT to query Open Linked Data from the City of Zurich. In Liip. https://www.liip.ch/en/blog/use-linkeddatagpt-to-query-open-linked-data-from-the-city-of-zurich
    Strien, Daniel van, Bell, Mark, McGregor, Nora Rose, & Trizna, Michael. (2022). An Introduction to AI for GLAM. Proceedings of the Second Teaching Machine Learning and Artificial Intelligence Workshop, 20–24. https://proceedings.mlr.press/v170/strien22a.html
    Susnjak, Teo. (2023). Applying BERT and ChatGPT for Sentiment Analysis of Lyme Disease in Scientific Literature. arXiv. https://doi.org/10.48550/arXiv.2302.06474
    Tasovac, Toma, Chambers, Sally, & Tóth-Czifra, Erzsébet. (2020). Cultural Heritage Data from a Humanities Research Perspective: A DARIAH Position Paper. DARIAH-EU. https://hal.science/hal-02961317
    Tennant, Jonathan, Agarwal, Ritwik, Baždarić, Ksenija, Brassard, David, Crick, Tom, Dunleavy, Daniel J., Evans, Thomas Rhys, Gardner, Nicholas, Gonzalez-Marquez, Monica, Graziotin, Daniel, Greshake Tzovaras, Bastian, Gunnarsson, Daniel, Havemann, Johanna, Hosseini, Mohammad, Katz, Daniel S., Knöchelmann, Marcel, Madan, Christopher R., Manghi, Paolo, Marocchino, Alberto, … Yarkoni, Tal. (2020). A tale of two ‘opens’: Intersections between Free and Open Source Software and Open Scholarship [Preprint]. SocArXiv. https://doi.org/10.31235/osf.io/2kxq8
    Tennant, Roy. (2002). MARC must die. Library Journal, 127(17), 26–27. http://soiscompsfall2007.pbworks.com/f/marc%20must%20die.pdf
    Terras, Melissa, Coleman, Stephen, Drost, Steven, Elsden, Chris, Helgason, Ingi, Lechelt, Susan, Osborne, Nicola, Panneels, Inge, Pegado, Briana, Schafer, Burkhard, Smyth, Michael, Thornton, Pip, & Speed, Chris. (2021). The value of mass-digitised cultural heritage content in creative contexts. Big Data & Society, 8(1), 20539517211006165. https://doi.org/10.1177/20539517211006165
    Tuominen, Jouni, Hyvönen, Eero, & Leskinen, Petri. (2017). Bio CRM: A Data Model for Representing Biographical Data for Prosopographical Research. In Antske Fokkens, Serge ter Braake, Ronald Sluijter, Paul Arthur, & Eveline Wandl-Vogt (Eds.), Proceedings of the Second Conference on Biographical Data in a Digital World 2017 (Vol. 2119, pp. 59–66). CEUR. https://ceur-ws.org/Vol-2119/#paper10
    Tweed, Christopher, & Sutherland, Margaret. (2007). Built cultural heritage and sustainable urban development. Landscape and Urban Planning, 83(1), 62–69. https://doi.org/10.1016/j.landurbplan.2007.05.008
    UNESCO Institute for Statistics. (2009). UNESCO Framework for Cultural Statistics (FCS). United Nations Educational, Scientific and Cultural Organization. https://doi.org/10.15220/978-92-9189-075-0-en
    UNESCO. (2009). Charter on the Preservation of the Digital Heritage (Circular Letter CL/3865). United Nations Educational, Scientific and Cultural Organization. https://n2t.net/ark:/48223/pf0000179529
    UNESCO. (2019). Preliminary study of the technical, financial and legal aspects of the desirability of a UNESCO recommendation on Open Science (Programme and Meeting Document 40 C/63; p. 24). United Nations Educational, Scientific and Cultural Organization. https://n2t.net/ark:/48223/pf0000370291
    UNESCO. (2021). Implementation of the UNESCO Recommendation on Open Science (Programme and Meeting Document SC-PCB-SPP/2021/OS/UROS; p. 36). United Nations Educational, Scientific and Cultural Organization. https://n2t.net/ark:/48223/pf0000379949
    UNESCO. (2022). Basic texts of the 2003 Convention for the Safeguarding of the Intangible Cultural Heritage (Programme and Meeting Document CLT-2022/WS/3). United Nations Educational, Scientific and Cultural Organization. https://n2t.net/ark:/48223/pf0000383762
    UNESCO. Culture for Development Indicators. (2014). Methodology Manual. United Nations Educational, Scientific and Cultural Organization. https://n2t.net/ark:/48223/pf0000229608
    Van de Sompel, Herbert, & Nelson, Michael L. (2015). Reminiscing About 15 Years of Interoperability Efforts. D-Lib Magazine, 21(11/12). https://doi.org/10.1045/november2015-vandesompel
    Van der Auwera, Sigrid. (2013). UNESCO and the protection of cultural property during armed conflict. International Journal of Cultural Policy, 19(1), 1–19. https://doi.org/10.1080/10286632.2011.625415
    Vandenhende, Lise, & Van Hoorick, Geert. (2017). The management of cultural heritage and nature: Complementary or conflicting regulations? EELF Annual Conference, 5th, Abstracts. http://hdl.handle.net/1854/LU-8722614
    Vecco, Marilena. (2010). A definition of cultural heritage: From the tangible to the intangible. Journal of Cultural Heritage, 11(3), 321–324. https://doi.org/10.1016/j.culher.2010.01.006
    Vicente-Saez, Ruben, & Martinez-Fuentes, Clara. (2018). Open Science now: A systematic literature review for an integrated definition. Journal of Business Research, 88, 428–436. https://doi.org/10.1016/j.jbusres.2017.12.043
    Vienni-Baptista, Bianca, Fletcher, Isabel, & Lyall, Catherine (Eds.). (2023). Foundations of Interdisciplinary and Transdisciplinary Research: A Reader. Bristol University Press. https://doi.org/10.56687/9781529235012
    Vinck, Dominique. (2019). Les métiers de l’ombre de la Fête des Vignerons. Editions Antipodes. https://doi.org/10.33056/antipodes.1711
    Webber, Jim. (2012). A programmatic introduction to Neo4j. Proceedings of the 3rd Annual Conference on Systems, Programming, and Applications: Software for Humanity, 217–218. https://doi.org/10.1145/2384716.2384777
    Weibel, Stuart L., & Koch, Traugott. (2000). The Dublin Core Metadata Initiative: Mission, Current Activities, and Future Directions. D-Lib Magazine, 6(12). https://doi.org/10.1045/december2000-weibel
    Weinthal, Dianne, & Childress, Dawn. (2019). IIIF for Open Access. https://escholarship.org/uc/item/260616w7
    Weiss, Richard. (1940). Atlas der schweizerischen Volkskunde: Die bisherigen Erfahrungen der Exploratoren. Schweizerisches Archiv für Volkskunde / Archives suisses des traditions populaires, 38(1), 105–118. https://doi.org/10.5169/SEALS-113634
    Wenger, Etienne. (2011). Communities of practice: A brief introduction. National Science Foundation, 1–7. http://hdl.handle.net/1794/11736
    Wilkinson, Mark D., Dumontier, Michel, Aalbersberg, IJsbrand Jan, Appleton, Gabrielle, Axton, Myles, Baak, Arie, Blomberg, Niklas, Boiten, Jan-Willem, Silva Santos, Luiz Bonino da, Bourne, Philip E., Bouwman, Jildau, Brookes, Anthony J., Clark, Tim, Crosas, Mercè, Dillo, Ingrid, Dumon, Olivier, Edmunds, Scott, Evelo, Chris T., Finkers, Richard, … Mons, Barend. (2016). The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data, 3, 160018. https://doi.org/10.1038/sdata.2016.18
    Windhager, Florian, Federico, Paolo, Schreder, Günther, Glinka, Katrin, Dörk, Marian, Miksch, Silvia, & Mayr, Eva. (2019). Visualization of Cultural Heritage Collection Data: State of the Art and Future Challenges. IEEE Transactions on Visualization and Computer Graphics, 25(6), 2311–2330. https://doi.org/10.1109/TVCG.2018.2830759
    Zeng, Marcia Lei, & Qin, Jian. (2022). Metadata (Third edition). ALA Neal-Schuman.
    Zeng, Marcia Lei. (2008). Knowledge Organization Systems (KOS). Knowledge Organization, 35(2–3), 160–182. https://doi.org/10.5771/0943-7444-2008-2-3-160
    Zou, Xiaozhu, Xiong, Siyi, Li, Zhi, & Jiang, Ping. (2018). Constructing Metadata Schema of Scientific and Technical Report Based on FRBR. Computer and Information Science, 11(2), 34–39. https://doi.org/10.5539/cis.v11n2p34
    Žumer, Maja. (2007). Functional requirements for bibliographic records: FRBR: The end of the road or a new beginning. Bulletin of the American Society for Information Science and Technology, 33(6), 27–29. https://doi.org/10.1002/bult.2007.1720330608
    diff --git a/src/thesis.md b/src/thesis.md index 352a47b..5ce04e5 100644 --- a/src/thesis.md +++ b/src/thesis.md @@ -973,14 +973,160 @@ The current frontier in {{ "CH" | abbr | safe }} involves developing application While automation has significantly enhanced the efficiency of digitisation processes in {{ "CH" | abbr | safe }}, cataloguing and indexing remain complex challenges. The intricacies involved in accurately understanding and categorising resources necessitate more than just technological solutions; they require context-aware and culturally sensitive approaches. Here, {{ "ML" | abbr | safe }} offers promising perspectives. {{ "ML" | abbr | safe }}, particularly in its advanced forms like deep learning, can assist in cataloguing and indexing by analysing large datasets to identify patterns, categorise content, and even suggest metadata. This can be particularly useful in handling large volumes of {{ "CH" | abbr | safe }} data, where manual processing is time-consuming and prone to human error. Typical applications of {{ "ML" | abbr | safe }} in this field include image recognition for identifying and classifying visual elements in artefacts, {{ "NLP" | abbr | safe }} for analysing textual content, and pattern recognition for sorting and organising data based on specific characteristics. Furthermore, prospective developments may entail the refinement of metadata mapping and the enhancement of quality control mechanisms. Moreover, {{ "ML" | abbr | safe }} algorithms can be trained to recognise stylistic elements, historical contexts, and other nuances that are essential for accurate cataloguing in {{ "CH" | abbr | safe }}. However, it is crucial to note that the effectiveness of {{ "ML" | abbr | safe }} depends heavily on the quality and diversity of the training data. Biases in this data can lead to inaccuracies in cataloguing and indexing. 
Thus, a collaborative approach, where {{ "ML" | abbr | safe }} is supplemented by expert human oversight, is often the most effective strategy. -Overall, this section provides a comprehensive overview of six technological trends as well as five key scientific movements and guiding principles that are shaping research and how universities and {{ "GLAM" | abbr | safe }}s should provide environments, services, and tools with a view to collecting and disseminating content. By exploring each of these trends, movements, and principles, we can gain a deeper understanding of how research and {{ "CH" | abbr | safe }} processes are permeated by dynamic movements and how resources can be made more transparent, inclusive and accountable, as well as how data can be made available to human and non-human users. +Overall, this section provides a comprehensive overview of ~~six~~ **three**[^400] technological trends as well as five key scientific movements and guiding principles that are shaping research and how universities and {{ "GLAM" | abbr | safe }}s should provide environments, services, and tools with a view to collecting and disseminating content. By exploring each of these trends, movements, and principles, we can gain a deeper understanding of how research and {{ "CH" | abbr | safe }} processes are permeated by dynamic movements and how resources can be made more transparent, inclusive and accountable, as well as how data can be made available to human and non-human users. #### 3.3.1 Current and Emerging Technological Trends in Cultural Heritage {id="subsec:current-emerging-trends"} +I will explore some current and emerging technological trends in {{ "CH" | abbr | safe }}, organised into three components: Linked Data, big data, and {{ "AI" | abbr | safe }}. Each represents a critical driver shaping the landscape and practices of heritage data. 
The three trends have been around for a few decades, with the ‘Linked Data’ principles and underlying standards coming from the late 1990s, ‘big data’ being coined in 1990 and {{ "AI" | abbr | safe }} in 1956. +Before considering the trends discussed hereafter, note that current technological developments do not exist in isolation, but tend to intertwine and act synergistically. A vivid example of this interplay can be seen in {{ "AI" | abbr | safe }} and its latent impact on the semantic web, particularly in facilitating more efficient querying and crawling processes such as the LinkedDataGPT proof-of-concept service[^82] from Liip on the City of Zurich that layers ChatGPT --- a generative {{ "AI" | abbr | safe }} solution --- on top of a Linked Data portal to facilitate querying open datasets [@stocker_use_2023]. Conversely, {{ "AI" | abbr | safe }} can be fed by data on the web to learn and reason, as outlined by @gandonWebScienceArtificial2019. + +##### 3.3.1.1 Linked Data {id="subsubsec:linked_data"} + +Linked Data, and more precisely {{ "LOD" | abbr | safe }}, is a set of design principles adhering to {{ "RDF" | abbr | safe }} which is a significant approach to interconnect data on the web in order to make semantic queries more useful [@berners-lee_semantic_2001]. In other words, this standardisation allows data to be not only linked, but also openly accessible and reusable. As noted by @gandonWebScienceArtificial2019 [p. 115, citing [@gandon_pour_2017]]: + +> The Web was initially perceived and used as a globally distributed hypertext space for humans. But from its inception, the Web has always been more: its hypermedia architecture is in fact linking programs world-wide through remote procedure calls. + +This deeper understanding of the web's architecture as a conduit for linking programs on a global scale holds profound implications.
It signifies that the web is not merely a medium for accessing information but a dynamic environment where data-driven programs interact, exchange data, and collaborate across geographical boundaries. In this context, Linked Data emerges as a powerful enabler, providing a structured and standardised approach for these programs to communicate and share meaningful data [@bizer_linked_2008]. + +In the context of {{ "CH" | abbr | safe }}, institutions such as museums, libraries and archives can publish their collections using Linked Data principles, enabling a web of linked information that is accessible to all. As this dissertation's main topic revolves around Linked (Open) (Usable) Data, two dedicated sections have been written within this literature review in [Section 3.4](#sec:linkeddata) and [Section 3.5](#sec:loud). + +Beyond formal {{ "LOD" | abbr | safe }}, {{ "CHI" | abbr | safe }}s may also link their databases or collections in more informal ways. This interconnection may take the form of shared metadata, common identifiers, or simply hyperlinks. These links can enhance the user experience by supporting a more seamless navigation between related items or pieces of information. For instance, a parallel strategy is the use of graph-based data representation, i.e. property graph which consists of a set of objects or vertices, and a set of arrows or edges connecting the objects, that are most likely not {{ "RDF" | abbr | safe }}-compliant [see @bermes_modelisons_2023]. Graph databases, such as Neo4j[^83] which is quite prevalent in {{ "DH" | abbr | safe }} [see @webber_programmatic_2012; @drakopoulos_semantically_2019; @darmont_data_2020], allow for efficient storage and retrieval of interconnected data through nodes representing entities and relationships linking them. + +##### 3.3.1.2 Big Data {id="subsubsec:big_data"} + +Big Data refers to extremely large and complex datasets that exceed the capabilities of traditional data processing methods and tools. 
It encompasses a massive volume of structured, semi-structured and unstructured data that is currently flooding across a variety of sectors, companies and organisations [see @emmanuel_defining_2016]. The characteristics of big data are often described by the three ⋁ model [@laney_3d_2001]: + +- **Volume**: Big data refers to a massive amount of data. This can encompass a spectrum of data sizes, extending from {{ "GB" | abbr | safe }} and {{ "TB" | abbr | safe }}, to {{ "PB" | abbr | safe }}[^84] and beyond. The sheer size of the data is a key aspect of big data, making traditional database systems inadequate for storage and analysis. +- **Velocity**: Data is being generated and collected at an unprecedented rate. Social media posts, sensor data, online transactions and more are constantly being generated, requiring real-time or near real-time processing and analysis. +- **Variety**: Big data comes in a variety of formats, including structured data (e.g. databases), semi-structured data (e.g. {{ "XML" | abbr | safe }}, {{ "JSON" | abbr | safe }}) and unstructured data (e.g. text, images, video). The variety of data types requires flexible processing methods. + +In addition to the three ⋁ model, two more characteristics are often included [@saha_data_2014 p. 1294]: + +- **Veracity**: It refers to the quality of the data, including its accuracy, reliability and trustworthiness. Big data sources can be inherently uncertain or inaccurate, and addressing data quality is a critical challenge. +- **Value**: Extracting value and actionable insights from big data is the ultimate goal. Analysing and interpreting big data should lead to better decision-making, improved business strategies, as well as enhanced {{ "UX" | abbr | safe }}[^85]. + +Regarding the latter two dimensions, @debattista_linked_2015 argue that Linked Data is the most suitable technology to increase the value of data over conventional formats, thus contributing towards the value challenge in Big Data.
As for veracity, they describe a semantic pipeline with eight key metrics to address this dimension. Building on this technological foundation, the integration of Linked Data and Big Data analytics takes centre stage. + +Big data analytics can be employed on {{ "CH" | abbr | safe }} content to uncover insights and correlations that can be used in decision-making. @barrile_big_2022 [p. 2708] highlight the transformative potential of using big data by investigating how an analytical approach can enhance conservation strategies, aid resource allocation and optimise the management of {{ "CH" | abbr | safe }} resources. @poulopoulos_digital_2022 [pp. 188-189] emphasise that emerging technology trends, including big data, have a significant impact on related research areas such as {{ "CH" | abbr | safe }}. Big data primarily originates from sources such as social media, online gaming, data lakes[^86], logs and frameworks that generate or use significant amounts of data. They stress that the incorporation of multi-faceted analytics in the {{ "CH" | abbr | safe }} domain is an area of active research, and present a data lake that provides essential user and data/knowledge management functionalities. However, they emphasise a crucial consideration: the need to bridge the theoretical foundations of disciplines such as cultural sociology with the technological advances of big data. + +##### 3.3.1.3 Artificial Intelligence {id="subsubsec:ch_ai"} + +The term {{ "AI" | abbr | safe }} was first coined by John McCarthy, an American computer scientist and cognitive scientist, during the 1956 Dartmouth Conference, which is often considered the birth of {{ "AI" | abbr | safe }} as an academic field [@andresen_john_2002 p. 84]. According to the @oxford_english_dictionary_artificial_2023, {{ "AI" | abbr | safe }} is described as follows: + +> The capacity of computers or other machines to exhibit or simulate intelligent behaviour; the field of study concerned with this.
In later use also: software used to perform tasks or produce output previously thought to require human intelligence, esp. by using machine learning to extrapolate from large collections of data. + +While {{ "AI" | abbr | safe }} is not the central focus of my PhD thesis, I acknowledge its impact in several instances. As a rapidly developing technology, {{ "AI" | abbr | safe }} has the potential to significantly transform various aspects of society, including the way we describe, analyse, and disseminate {{ "CH" | abbr | safe }} resources. It is worth mentioning that I endeavour to engage in a broader discourse concerning the domain of {{ "AI" | abbr | safe }}. In this context, I use the acronym {{ "AI" | abbr | safe }} to talk about the overarching domain or its ethics, and {{ "ML" | abbr | safe }} to discuss the specifics of methodologies and algorithmic approaches, while refraining from delving into the intricacies of Deep Learning, which is a distinct subdomain within {{ "ML" | abbr | safe }}. + +{{ "AI" | abbr | safe }} and {{ "ML" | abbr | safe }} offer great potential for digitising, curating and analysing {{ "CH" | abbr | safe }}, leveraging the vast digital datasets from {{ "CHI" | abbr | safe }}s. Examples include text recognition mechanisms using {{ "OCR" | abbr | safe }} and {{ "HTR" | abbr | safe }}, {{ "NLP" | abbr | safe }} and {{ "NER" | abbr | safe }} for enriching unstructured text, as well as object detection methods for finding patterns within still and moving images [@neudecker_cultural_2022; @sporleder_natural_2010]. Textual works can also be analysed, for instance through sentiment analysis [see @susnjak_applying_2023], and generated using {{ "LLM" | abbr | safe }} -- a variety of {{ "NLP" | abbr | safe }}, such as BERT or ChatGPT, which predicts the likelihood of a word given the previous words present in recorded texts.
However, challenges such as data quality and biases in {{ "AI" | abbr | safe }} persist [@neudecker_cultural_2022]. + +In addition, there are still uncertainties regarding the licensing and reuse of {{ "CH" | abbr | safe }} datasets by {{ "ML" | abbr | safe }} algorithms[^87]. @neudecker_cultural_2022 emphasises the importance of well-curated digitised {{ "CH" | abbr | safe }} resources that are openly licensed, accompanied by relevant metadata, and accessible through {{ "API" | abbr | safe }}s or download dumps in various formats. These curated resources have the potential to address the existing gap in this domain. + +Building on the theme of enhancing {{ "CH" | abbr | safe }} through digital technologies, @mcgillivray_digital_2020 explore the synergies and challenges found at the intersection of {{ "DH" | abbr | safe }} and {{ "NLP" | abbr | safe }}. {{ "DH" | abbr | safe }} is aptly described as *‘a nexus of fields within which scholars use computing technologies to investigate the kinds of questions that are traditional to the humanities [...] or who ask traditional kinds of humanities-oriented questions about computing technologies’* [@fitzpatrick_reporting_2010]. This broad characterisation encapsulates the transformative potential of digital tools, including {{ "ML" | abbr | safe }} techniques, in enriching humanities research. + +@mcgillivray_digital_2020 highlight the critical need for bridging the communication gap between {{ "DH" | abbr | safe }} and {{ "NLP" | abbr | safe }} to drive progress in both fields. They propose increased interdisciplinary collaboration, encouraging {{ "DH" | abbr | safe }} researchers to actively utilise {{ "NLP" | abbr | safe }} tools to refine their research methodologies. A primary challenge in this convergence is the application of {{ "NLP" | abbr | safe }} to the complex, historical, or noisy texts often encountered in {{ "DH" | abbr | safe }} research. 
They conclude by advocating for stronger cooperation between practitioners in these fields. This collaborative effort is vital for harnessing the full potential of {{ "ML" | abbr | safe }} in analysing and interpreting {{ "CH" | abbr | safe }}. + +The use of {{ "ML" | abbr | safe }} scripts in the context of {{ "CH" | abbr | safe }} --- and beyond --- is inherently limited by their applicability, particularly when dealing with historical photographs. In such cases, the use of algorithms that are mostly trained and grounded in contemporary image data becomes quite incongruous due to the dissimilarity in temporal contexts. This dilemma is exemplified by datasets such as Microsoft's Common Objects in Context (COCO)[^88] [@fleet_microsoft_2014], where the available data are predominantly contemporary photographic content, which is misaligned with the historical nuances inherent in most of the digitised {{ "CH" | abbr | safe }} images. @coleman_managing_2020 corroborates that a sound approach would be for {{ "ML" | abbr | safe }} practitioners to collaborate with libraries as they can draw practical lessons from critical data studies and the thoughtful integration of {{ "AI" | abbr | safe }} into their collections, using guidelines from {{ "DH" | abbr | safe }}. She also argues that handing over datasets would be a disservice to library patrons and that *‘Librarians need to master the instruments of AI and employ them both to learn more about their own resources—to see and analyze them in new ways—and to help shape applications of AI with the expertise and ethos of libraries.’* + +Ethical concerns, particularly regarding social biases and racism, are prevalent in technologies like ImageNet, where facial recognition may yield {{ "AI" | abbr | safe }} statements with strong negative connotations [@neudecker_cultural_2022].
Addressing this, @gandonWebScienceArtificial2019 suggest the production of {{ "AI" | abbr | safe }} services that are *‘benevolent-by-design for the good of the Web and society’*. Furthermore, @floridi_good_2023 introduces the double-charge thesis, asserting that all technology design is a moral act, challenging the neutrality thesis. He emphasises that technologies are not neutral and can be influenced by a dynamic equilibrium of values, predisposing them towards morally good or evil directions. + +As mentioned previously, the {{ "ML" | abbr | safe }} training datasets are often not representative enough to be properly leveraged in the {{ "CH" | abbr | safe }} sector [@strien_introduction_2022]. Fine-tuning is now an active topic, though, and new ground truth datasets have been created and tailored for the needs of {{ "CH" | abbr | safe }}, such as Viscounth[^89], a large-scale {{ "VQA" | abbr | safe }} dataset --- i.e. a dataset containing open-ended questions about images that require an understanding of vision, language and commonsense knowledge to answer [@goyal_making_2017] --- for {{ "CH" | abbr | safe }} in English and Italian [see @becattini_viscounth_2023]. + +@jaillant_unlocking_2022 argue that the governance of {{ "AI" | abbr | safe }} ought to be carried out in partnership with {{ "GLAM" | abbr | safe }} institutions. However, while this collaboration has been proposed as a promising way forward, it still requires further exploration and evaluation, particularly with regard to the specific challenges and opportunities that it presents. On the one hand, the involvement of {{ "GLAM" | abbr | safe }}s in {{ "AI" | abbr | safe }} governance could enhance the development of digital {{ "CH" | abbr | safe }} projects that promote social justice and equity.
On the other hand, this collaboration raises several challenges, such as the need to address issues of privacy, data protection, and intellectual property rights, and to ensure that the values and perspectives of {{ "GLAM" | abbr | safe }} professionals are adequately represented in the development of {{ "AI" | abbr | safe }} algorithms and systems. Therefore, it is crucial to examine the specific challenges and opportunities of this collaboration and to develop appropriate frameworks and guidelines that enable effective and ethical governance of {{ "AI" | abbr | safe }} in the {{ "GLAM" | abbr | safe }} sector. + +One platform that addresses these issues is AI4LAM, which is an international and participatory community focused on advancing the use of {{ "AI" | abbr | safe }} in, for and by libraries, archives, and museums[^90]. The initiative was launched by the National Library of Norway and Stanford University Libraries in 2018, inspired by the success of the {{ "IIIF" | abbr | safe }} community. Another agency is the AEOLIAN Network[^91], {{ "AI" | abbr | safe }} for Cultural Organisations, which investigates the role that {{ "AI" | abbr | safe }} can play in making born-digital and digitised cultural records more accessible to users [@jaillant_applying_2023 p. 582]. + +As an illustrative case, the {{ "LoC" | abbr | safe }}'s exploration into {{ "ML" | abbr | safe }} technologies, as highlighted by @allen_why_2023, demonstrates a strategic commitment to enhancing the accessibility and utility of its diverse collections. This initiative reflects the {{ "LoC" | abbr | safe }}'s acknowledgement of the transformative potential of {{ "ML" | abbr | safe }}, balanced with a cautious approach due to the necessity for accurate and responsible information stewardship.
The {{ "LoC" | abbr | safe }} faces several challenges in applying {{ "ML" | abbr | safe }}, particularly the limitations of commercial {{ "AI" | abbr | safe }} systems in handling its varied materials and the requirement for substantial human intervention. This cautious exploration into {{ "ML" | abbr | safe }} is indicative of a broader trend in {{ "CHI" | abbr | safe }}s, where maintaining a balance between embracing technological advancements and preserving authenticity and integrity is crucial. + +The specific experiments and projects undertaken by the {{ "LoC" | abbr | safe }} in the realm of {{ "ML" | abbr | safe }} are diverse and illustrative of the institution's comprehensive approach to innovation. For instance, image recognition systems have been tested for identifying and classifying visual elements in artefacts, a task that requires a nuanced understanding of historical and cultural contexts. In another initiative, speech-to-text technology was employed to transcribe spoken word collections, confronting challenges such as accent recognition and audio quality variation. Additionally, the {{ "LoC" | abbr | safe }} explored the potential of {{ "ML" | abbr | safe }} in enhancing search and discovery capabilities through projects like Newspaper Navigator[^92], which aimed to identify and extract images from digitised newspaper pages. + +These experiments not only highlight the potential of {{ "ML" | abbr | safe }} in transforming the way {{ "LoC" | abbr | safe }} manages and disseminates its collections but also reveal the complexities and limitations inherent in these technologies. As @allen_why_2023 notes, the ongoing research and experimentation in {{ "ML" | abbr | safe }} at the {{ "LoC" | abbr | safe }} are critical in revolutionising access and discovery in the cultural heritage sector. 
These efforts, while facing challenges, represent a diligent integration of advanced technologies, upholding principles of responsible custodianship and setting a precedent for similar institutions globally in the adoption and adaptation of {{ "ML" | abbr | safe }} and {{ "AI" | abbr | safe }} in {{ "CHI" | abbr | safe }}s. + +The integration of {{ "LLM" | abbr | safe }}s and {{ "KG" | abbr | safe }}s presents a groundbreaking opportunity, particularly within the realm of {{ "CHI" | abbr | safe }}s, where there is already considerable expertise. This is aptly demonstrated in the work of @pan_large_2023, which elucidates the harmonisation between explicit knowledge and parametric knowledge, i.e. knowledge derived from patterns in data, as learned by models such as {{ "LLM" | abbr | safe }}s. The authors highlight three key areas for the advancement of {{ "KR" | abbr | safe }} and processing: + +1. **Knowledge Extraction**, where {{ "LLM" | abbr | safe }}s improve the extraction of knowledge from diverse sources for applications such as information retrieval and {{ "KG" | abbr | safe }} construction; +2. **Knowledge Graph Construction**, which involves {{ "LLM" | abbr | safe }}s in tasks such as link prediction and triple extraction from data, albeit with challenges in precision and management of long-tail entities; +3. **Training {{ "LLM" | abbr | safe }}s Using {{ "KG" | abbr | safe }}s**, where {{ "KG" | abbr | safe }}s provide structured knowledge for {{ "LLM" | abbr | safe }}s, helping to build retrieval-augmented models on the fly, enriching {{ "LLM" | abbr | safe }}s with world knowledge and increasing their adaptability. + +In a report for the University of Leeds in the UK, @pirgova-morgan_looking_2023 explores the potential and practical implications of {{ "AI" | abbr | safe }} in libraries.
The project, forming part of the university's ambitious vision for digital transformation, aims to understand how {{ "AI" | abbr | safe }} can be effectively integrated into library services. This research looks at both the use of general {{ "AI" | abbr | safe }} for long term strategic planning and specific {{ "AI" | abbr | safe }} applications for improving {{ "UX" | abbr | safe }}, process optimisation and enhancing the discoverability of collections. The methodology used in this study involves a multi-faceted approach including desk-based assessments, a university-wide survey and expert interviews. Specifically, the study highlights the following key findings: + +- **{{ "AI" | abbr | safe }} for {{ "UX" | abbr | safe }} and Process Optimisation**: The integration of {{ "AI" | abbr | safe }} technologies offers substantial opportunities for improving user experiences in libraries. This includes optimising library processes, enhancing collections descriptions, and improving their discoverability. +- **Challenges and Opportunities of {{ "AI" | abbr | safe }} Application**: While {{ "AI" | abbr | safe }} presents exciting possibilities, its practical application in library settings faces challenges. These include evaluating specific {{ "AI" | abbr | safe }} technologies in the unique context of the University of Leeds, ensuring they align with the institution's needs and goals. +- **Perceptions of {{ "AI" | abbr | safe }} in Libraries**: The report reveals varying perceptions among librarians and users regarding {{ "AI" | abbr | safe }}. This includes views on how {{ "AI" | abbr | safe }} can contribute to resilience, awareness of climate change, and practices promoting equality, diversity, and inclusion. 
+- **Role of {{ "AI" | abbr | safe }} in Strategic Library Development**: General {{ "AI" | abbr | safe }} technologies are seen as instrumental in shaping long-term strategies for libraries, highlighting the need for ongoing adaptation and development in response to evolving {{ "AI" | abbr | safe }} capabilities. +- **Expert Perspectives on {{ "AI" | abbr | safe }} in Libraries**: Interviews with experts from around the world underscore the importance of understanding both general and specific applications of {{ "AI" | abbr | safe }}. These insights help in identifying priority areas where {{ "AI" | abbr | safe }} can significantly enhance library operations and services. + +These insights from the University of Leeds report illustrate the complex impact of {{ "AI" | abbr | safe }} on library services, from enhancing user interaction to influencing strategic decision-making, while also emphasising the importance of adapting {{ "AI" | abbr | safe }} applications to specific institutional needs. + +It must also be stated that {{ "AI" | abbr | safe }} lacks inherent intelligence and consciousness, and has ultimately been built by people. An important concern, particularly with {{ "LLM" | abbr | safe }}s, is the perceptual illusion of cognitive interaction, where the machine appears to be engaging in dialogue and reasoning, when in fact it is generating content through predictive algorithms [see @ridge_enriching_2023]. Furthermore, regarding the topic of data colonialism, poor people in underprivileged nations are often burdened with the responsibility of cleaning up the toxic repercussions of {{ "AI" | abbr | safe }}, shielding affluent individuals and prosperous countries from direct exposure to its harmful effects[^93].
+ +Concluding this segment, it is essential to perceive {{ "ML" | abbr | safe }} algorithms as uncertain *‘socio-material configurations’*, which can be seen as both powerful and inscrutable, demanding an axiomatic and problem-oriented approach in their understanding and application. @jaton_we_2017 elaborates on this by examining how these algorithms, while technologically complex, are firmly rooted in and shaped by the social, material, and human contexts in which they are developed. Beyond their computational complexity, these algorithms are deeply embedded in the process of constructing ground truths. These ground truths are not inherent or fixed; instead, they emerge from collaborative efforts that reflect the varied inputs of actors. This process underscores the algorithms as socio-material constructs, influenced by the characteristics and contexts of their creators. Understanding algorithms in this light highlights their deep integration with human actions and societal norms, offering a more nuanced view of their design and implementation [see @jaton_assessing_2021; @jaton_groundwork_2023]. + + +#### 3.3.2 Scientific Movements and Guiding Principles {id="subsec:scientific-movements-guiding-principles"} + +First, [3.3.2.1](#subsubsec:open-scholarship) examines the movement towards more open and transparent forms of research. Open scholarship is a broad concept that encompasses practices such as open access publishing, open data, open source software, and open educational resources. The subsection explores the benefits and challenges of open scholarship, and how it can help to increase the accessibility and impact of research data. + +Then, [3.3.2.2](#subsubsec:citizen) explores the growing trend of involving members of the public in scientific research. Citizen science and citizen humanities involve collaborations between scientists and non-expert individuals, with the aim of generating new knowledge or solving complex problems.
The subsubsection examines the benefits and challenges of citizen science and citizen humanities, and how they can help to democratise research. + +[3.3.2.3](#subsubsec:fair) examines the set of guiding principles designed to ensure that research outputs are {{ "FAIR" | abbr | safe }}. It explores the importance of each data principle for research integrity, reproducibility, and collaboration, and provides examples of how they can be implemented in practice. + +[3.3.2.4](#subsubsec:care) explores the importance of ethical and culturally sensitive data governance practices for indigenous communities that are materialised through {{ "CARE" | abbr | safe }}. These principles provide a framework for managing data in a way that is consistent with the values and cultural traditions of indigenous communities. This part also explores the challenges and opportunities of implementing the {{ "CARE" | abbr | safe }} Principles for Indigenous Data Governance. + +Finally, [3.3.2.5](#subsubsec:collections-data) explores the concept of ‘Collections as Data’, a perspective that has emerged from the practical need and desire to improve decades of digital collecting practice. This approach re-conceptualises collections as ordered digital information that is inherently amenable to computational processing. + +##### 3.3.2.1 Towards Open Scholarship {id="subsubsec:open-scholarship"} + +According to FOSTER[^94], Open Science can be described as *‘[...] the practice of science in such a way that others can collaborate and contribute, where research data, lab notes and other research processes are freely available, under terms that enable reuse, redistribution and reproduction of the research and its underlying data and methods.’* [@foster_open_2019].
+ +In recent years, the principles of Open Science, which historically include Open methodology, Open source, Open data, {{ "OA" | abbr | safe }}, Open peer review, as well as open educational resources, have become increasingly important as they emphasise transparency, collaboration and accessibility in scientific research [@bezjak_open_2019]. Open methodology refers to the sharing of research processes and methods, allowing other researchers to reproduce and build on existing work [see @vicente-saez_open_2018]. Open source software and tools enable researchers to collaborate, while open data practices promote the sharing of research data in ways that are accessible, discoverable and reusable by others[^95]. Open access seeks to remove financial and other barriers to accessing scientific knowledge, while open peer review provides greater transparency and accountability in the publication process. Finally, open educational resources encourage the sharing of teaching and learning materials, thereby facilitating the dissemination of knowledge and skills. + +@unesco_preliminary_2019 conducted a preliminary study of the technical, financial and legal considerations related to the promotion of Open Science. This research underscored the necessity for a holistic approach to Open Science and stressed the significance of tackling international legal matters, as well as the existing challenges stemming from unequal access to justice, which can hinder global scientific collaboration. This study laid the groundwork for a recommendation on making *‘[...] multilingual scientific knowledge openly available, accessible and reusable for everyone, to increase scientific collaborations and sharing of information for the benefits of science and society, and to open the processes of scientific knowledge creation’* [@unesco_implementation_2021 p. 7].
{{ "UNESCO" | abbr | safe }} identified five types of access related to Open Science: infrastructures, societal actors, as well as associated and diverse knowledge systems where dialogue is needed. This includes acknowledging the rights of indigenous peoples and local communities to govern and make decisions on the custodianship, ownership, and administration of data on traditional knowledge and on their lands and resources. [Figure 3.10](#fig:unesco-open-science) provides a visual summary of this. + +
    + Figure 3.10: Open Science Elements, Redrawn Slide from Presentation of Ana Persic [@morrison_redrawn_2021 citing [@persic_building_2021]] +
    + +While Open Science offers numerous benefits, it also presents challenges and potential drawbacks that warrant careful consideration. One major concern is the risk of exacerbating inequities between researchers from well-resourced institutions and those from less privileged backgrounds. Open access publishing often entails significant costs in the form of article processing charges, which can disproportionately burden researchers without adequate funding support [@burchardt_researchers_2014]. Additionally, Open Science practices relying on open protocols may be vulnerable to misuse, such as automated bots excessively crawling open repositories or datasets. This can lead to overloading systems, unauthorised data extraction, or unintended uses of research outputs [see @irish_bots_2023; @li_good_2021]. These risks underscore the importance of balancing openness with safeguards that ensure equitable participation and secure, sustainable access to research materials. + +These challenges are particularly relevant in the context of {{ "DH" | abbr | safe }}, a field that harnesses the promise and impact of digital technologies and methodologies for the study and understanding of cultural phenomena. The adoption of Open Science principles has contributed to greater collaboration, transparency and accessibility in research practices in this field. Open data practices are particularly relevant, as they allow scholars to work with large and complex datasets, including digitised archives and social media data. Open educational resources can also be used to support the dissemination of {{ "CH" | abbr | safe }} literacy and skills, enabling wider audiences to engage with such resources. However, ensuring that such openness does not exacerbate inequities or introduce vulnerabilities requires thoughtful implementation. 
+ +In addition to the principles of Open Science, the concept of Open Scholarship has been introduced by @tennant_tale_2020 as a broader approach that encompasses the arts and humanities and goes beyond the research community to the wider public. Open Scholarship emphasises the importance of making research and scholarship accessible to a wider audience, including non-experts, educators and policy makers. It can be particularly relevant to the arts and humanities, as they often deal with complex cultural materials and narratives that have wider societal implications. By making their work openly accessible and engaging with non-experts, humanities researchers can contribute to public discourse, promote cultural understanding, and inform policy and decision-making. Open Scholarship can also support greater collaboration and innovation within the arts and humanities by enabling researchers to work collaboratively across disciplines and with a wide range of constituents. For instance, open educational resources can be used to develop collaborative teaching and learning materials that draw on the expertise of scholars and practitioners from different disciplines, while open data practices can facilitate the sharing and reuse of {{ "CH" | abbr | safe }} materials. + +Conversely, @knochelmann_open_2019 advocates for the term Open Humanities as a dedicated discourse within the humanities. Notably, he argues that Open Humanities should adapt key Open Science elements to the humanities' unique context. In the case of preprints, the challenges in the humanities, such as limited discipline-specific preprint servers and linguistic diversity, require tailored solutions to encourage adoption. Open peer review in the humanities should accommodate the field's subjectivity and diverse perspectives.
Concerns about liberal copyright licences revolve around potential misrepresentation and plagiarism, highlighting the importance of maintaining scholarly integrity regardless of the chosen licence. Knochelmann's proposal underscores the need for context-sensitive approaches to promote openness and collaboration while respecting the humanities' distinct characteristics. + +Overall, the principles of Open Science provide a framework for promoting greater collaboration, transparency and accessibility in research practices. Yet, the challenges discussed underscore the need for careful adaptation to address inequities, cybersecurity concerns, and field-specific nuances. The concept of Open Scholarship, which stresses the importance of making research and scholarship accessible to wider audiences, can be instrumental in broadening the impact of research in both the natural sciences and the humanities, just as Open Science encourages greater collaboration and innovation across disciplines. Ultimately, this underscores the need for adaptation and positions all academic disciplines as essential contributors to societal understanding, cultural preservation and informed decision-making, while ensuring the sustainability and integrity of open practices. + +##### 3.3.2.2 Citizen Science, Citizen Humanities {id="subsubsec:citizen"} + +(...) + +##### 3.3.2.3 FAIR Data Principles {id="subsubsec:fair"} + +The {{ "FAIR" | abbr | safe }} data principles[^99] were developed to ensure that three types of entities -- namely data, metadata, and infrastructures -- are Findable, Accessible, Interoperable, and Reusable. The four key principles of {{ "FAIR" | abbr | safe }} and their underlying 15 sub-elements or facets are as follows [@wilkinson_fair_2016]: + +(...) + +##### 3.3.2.4 CARE Principles for Indigenous Data Governance {id="subsubsec:care"} + +(...) + +##### 3.3.2.5 Collections as Data {id="subsubsec:collections-data"} (...) +(...) + + + ## 4.
Exploring Relationships through an Actor-Network Theory Lens {id="cha:theoretical"} > As Jim Clifford taught me, we need stories (and theories) that are just big enough to gather up the complexities and keep the edges open and greedy for surprising new and old connections. [@haraway_staying_2016 p. 101] @@ -1444,9 +1590,7 @@ As I reflect on the journey of this thesis, I am reminded of the powerful dialog [^88]: Common Objects in Context: -[^89]: Viscounth -- A Large Dataset for Visual Question Answering for - Cultural Heritage:\ - +[^89]: Viscounth -- A Large Dataset for Visual Question Answering for Cultural Heritage: [^90]: Artificial Intelligence for Libraries, Archives & Museums: @@ -1465,11 +1609,7 @@ As I reflect on the journey of this thesis, I am reminded of the powerful dialog was to promote a lasting shift in European researchers' behaviour towards Open Science becoming the norm. -[^95]: According to the Open Knowledge Foundation, a non-profit network - established in 2004 in the U.K., which aims to promote the idea of - open knowledge, sets out some some principles around the concept of - openness and defines it as follows: .\ - See +[^95]: The Open Knowledge Foundation, a non-profit network established in 2004 in the U.K. to promote the idea of open knowledge, sets out some principles around the concept of openness and defines it as follows: *‘Open means anyone can freely access, use, modify, and share for any purpose (subject, at most, to requirements that preserve provenance and openness)’*. [^96]: Phrenosis in philosophy is related to [@oxford_english_dictionary_phronesis_2023]