The most interesting work that got completed in this phase was the Linked Data API, planned for last phase. All entities in the core catalog now support exports in linked data formats, and except for very specific, redundant fields all data is covered. Most of the catalog is described with the CIDOC-CRM ontology (Doerr et al., 2020; ICOM/CIDOC, 2011), though other schemas are used for bibliographical, archaeological, geospatial and linguistic information.
The linguistic information is generated with (Chiarcos & Fäth, 2017) and linked based on the work described in (Chiarcos et al., 2018). In fact, the entire model is mostly compatible with that work, except for some discrepancies caused by the ontological difference between physical artifacts and conceptual composites.
The general final blog post will contain more information on the contents of the linked data specifically.
I have also worked on smaller things, finishing previous objectives and working on the relatively smaller objectives of this phase.
Chiarcos, C., & Fäth, C. (2017). CoNLL-RDF: Linked Corpora Done in an NLP-Friendly Way. In J. Gracia, F. Bond, J. P. McCrae, P. Buitelaar, C. Chiarcos, & S. Hellmann (Eds.), Language, Data, and Knowledge (pp. 74–88). Springer International Publishing. https://doi.org/10.1007/978-3-319-59888-8_6
Chiarcos, C., Pagé-Perron, É., Khait, I., Schenk, N., & Reckling, L. (2018, May). Towards a Linked Open Data Edition of Sumerian Corpora. Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). https://www.aclweb.org/anthology/L18-1387
Doerr, M., Light, R., & Hiebel, G. (2020, January). Implementing the CIDOC Conceptual Reference Model in RDF. 45th joint meeting of the CIDOC CRM SIG and SO/TC46/SC4/WG9; 38th FRBR – CIDOC CRM Harmonization, Heraklion. http://www.cidoc-crm.org/Issue/ID-443-implementing-the-cidoc-conceptual-reference-model-in-rdf
ICOM/CIDOC Documentation Standards Group, & CIDOC CRM Special Interest Group. (2011). Definition of the CIDOC Conceptual Reference Model. CIDOC-CRM. http://www.cidoc-crm.org/html/5.0.4/cidoc-crm.html
Week | Objectives | Deliverables | Status |
---|---|---|---|
9 | Setup search API | Functional search API for the catalogue | partially complete |
10 | Annotate HTML pages | HTML pages with microdata and metadata | partially complete |
11 | Document APIs | API documentation | complete |
12 | Quality assurance and/or SDK | Integration tests, simple SDK | partially complete |
I really enjoyed working on the linked data. In total I spent at least two weeks working on the final model, discussing with people inside and outside the project along the way. In the end I think I have good results, too.
Two deliverables are, as of writing, partially or completely blocked by other students’ deliverables. In some cases I could implement my functionality, but at this point in the project I do not want to cause merge conflicts, interfering with their work.
Getting non-PHP libraries to work from PHP was a bit of a hassle, but combined with a local cache my ideas from earlier in the project turned out to be good enough to be applied here as well.
I could unfortunately not compare my interpretation of the CIDOC-CRM ontology because the only major implementations I could find information about, the British Museum and the French MoDyCo, seem to have no actual information available since at least 2018. I asked the British Museum about this at the end of July, but they have not gotten back to me, nor acknowledged my message in any way. In fact, there is a Twitter bot made in October 2018 that tweets a variation of “no” every 10 hours or so, answering the simple question “Is the British Museum’s endpoint working?” Little to no hope there.