diff --git a/README.md b/README.md
index a35c5e0794e8c57453b3ca7cd32997c94d8d5670..8d6dd1999c91497f055f629833a031b2f009ead1 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,6 @@
 # Codex palatinus graecus 23
 
-Ground Truth dataset for the Codex palatinus graecus 23 (Palatine Anthology), byzantine writing from the X^th^ century.
+Ground Truth dataset for the Codex palatinus graecus 23 (Palatine Anthology), byzantine writing from the X<sup>th</sup> century.
 
 ## License
 
@@ -50,6 +50,10 @@ All abbreviations have been transcribed in expanded form: for example, "ȣ" is t
 
 The training has been done with images of the codex palatinus graecus 23 digitized by the Universitätsbibliothek Heidelberg (where the first part of the manuscript is kept -- the second one being in the BNF, as Supplementum graecum 384), and then uploaded to eScriptorium using IIIF. Find the manuscript [here](https://doi.org/10.11588/diglit.3449).
 
+## Segmentation
+
+The [SegmOnto](https://segmonto.github.io/) ontology was used to classify regions and lines of the manuscript.
+
 ## How to cite
 
 This dataset was built and is maintained by Maxime Guénette (@mguenette), Mathilde Verstraete (@mverstraete), Alix Chagué (@achague), Marcello Vitali-Rosati (@marviro). The digitization is not copyright-free, but the transcription is. However, properly annotating a corpus takes time and is a task that should be recognized. If you use any item from this corpus of ground truth, cite the dataset using the following information: