Data and Tools list
Data
- Daily transliterations dump from CDLI https://github.com/cdli-gh/data
- Daily text metadata dump from CDLI https://github.com/cdli-gh/data
- Manually annotated morphology gold corpus https://github.com/cdli-gh/mtaac_gold_corpus/tree/workflow/morph/to_dict
- Syntactically pre-annotated texts https://github.com/cdli-gh/MTAAC_syntax_preannotated
- RDF Ur III texts https://github.com/cdli-gh/rdf_converted_data
- Data prepared for SMT https://github.com/cdli-gh/mtaac_cdli_ur3_corpus
- MTAAC and Framework Google Drive https://drive.google.com/drive/u/0/folders/0B8-deXARunnhU2FHVzVqLXA4N3M
Converters and checkers
- Metadata converter (CSV to TTL-RDF) https://github.com/cdli-gh/mtaac_work/tree/master/lod/metadata
- C-ATF to CDLI-CoNLL converter https://github.com/cdli-gh/atf2conll-convertor
- ATF to TEI https://github.com/cdli-gh/atf2tei
- Morphology pre-annotation tool https://github.com/cdli-gh/morphology-pre-annotation-tool
- CDLI-CoNLL to CoNLL-U converter https://github.com/cdli-gh/CDLI-CoNLL-to-CoNLLU-Converter
- CoNLL-U to Brat standalone converter https://github.com/cdli-gh/conllu.py
- Brat standalone to CDLI-CoNLL converter https://github.com/cdli-gh/brat_to_cdli_conll_converter
- CoNLL-RDF https://github.com/acoli-repo/conll-rdf
- CDLI version of PyOracc https://github.com/cdli-gh/pyoracc (unfinished)
- JTF ATF checker (unfinished)
- ATF Normalizer https://github.com/cdli-gh/mtaac_work/tree/master/ATF_transliteration_processor
- ETCSRI corpus (metadata and annotations) to RDF https://github.com/cdli-gh/mtaac_work/tree/master/lod
- CDLI data conversion for MT ingestion https://github.com/cdli-gh/cdli-data-extractor
- CoNLL merge https://github.com/acoli-repo/conll-merge
Pre-annotators
- Syntax parser? https://github.com/cdli-gh/mtaac_work/tree/master/parse
- Syntax pre-annotator https://github.com/cdli-gh/mtaac_syntax_pipeline
ML tools and other tools
- ATF parser? https://github.com/cdli-gh/mtaac-package
- SRL tool for Sumerian https://github.com/cdli-gh/Semantic-Role-Labeler
- Translation pipeline (POS and NE tagging; ATF in, ATF with translation out, and POS/NE CoNLL out) https://github.com/cdli-gh/Sumerian-Translation-Pipeline
- NMT (Summer 2019) https://github.com/cdli-gh/Machine-Translation
- New MT models (Summer 2020) https://github.com/cdli-gh/Semi-Supervised-NMT-for-Sumerian-English
Visualization, search, browse
- Annotation assistant (for manual annotators) https://github.com/cdli-gh/annotation_assistant
- CDLI new framework (alpha) https://gitlab.com/cdli/framework
- Multilayer annotation query tool CQP4RDF https://github.com/cdli-gh/cqp4rdf
- Visualize and manipulate CoNLL-U syntax https://github.com/cdli-gh/sumerian-syntax-tree
- Commodities visualization https://github.com/cdli-gh/cdli-accounting-viz
- CTS Server https://github.com/cdli-gh/cdli-cts-server
- Scaife viewer https://github.com/cdli-gh/scaife
- CDLI Framework API client https://github.com/cdli-gh/framework-api-client
Émilie Pagé-Perron