Data and Tools list

Data

  • Daily transliterations dump from CDLI https://github.com/cdli-gh/data
  • Daily text metadata dump from CDLI https://github.com/cdli-gh/data
  • Manually annotated morphology gold corpus https://github.com/cdli-gh/mtaac_gold_corpus/tree/workflow/morph/to_dict
  • Syntactically pre-annotated texts https://github.com/cdli-gh/MTAAC_syntax_preannotated
  • RDF Ur III texts https://github.com/cdli-gh/rdf_converted_data
  • Data prepared for SMT https://github.com/cdli-gh/mtaac_cdli_ur3_corpus
  • MTAAC and Framework Google drive https://drive.google.com/drive/u/0/folders/0B8-deXARunnhU2FHVzVqLXA4N3M

Converters and checkers

  • Metadata converter (CSV to TTL-RDF) https://github.com/cdli-gh/mtaac_work/tree/master/lod/metadata
  • C-ATF to CDLI-CoNLL converter https://github.com/cdli-gh/atf2conll-convertor
  • ATF to TEI https://github.com/cdli-gh/atf2tei
  • Morphology pre-annotation tool https://github.com/cdli-gh/morphology-pre-annotation-tool
  • CDLI-CoNLL to CoNLL-U converter https://github.com/cdli-gh/CDLI-CoNLL-to-CoNLLU-Converter
  • CoNLL-U to Brat standalone converter https://github.com/cdli-gh/conllu.py
  • Brat standalone to CDLI-CoNLL converter https://github.com/cdli-gh/brat_to_cdli_conll_converter
  • Conll2rdf https://github.com/acoli-repo/conll-rdf
  • CDLI version of PyOracc https://github.com/cdli-gh/pyoracc (unfinished)
  • JTF ATF checker (unfinished)
  • ATF Normalizer https://github.com/cdli-gh/mtaac_work/tree/master/ATF_transliteration_processor
  • ETCSRI corpus (metadata and annotations) to RDF https://github.com/cdli-gh/mtaac_work/tree/master/lod
  • CDLI data conversion for MT ingestion https://github.com/cdli-gh/cdli-data-extractor
  • CoNLL merge https://github.com/acoli-repo/conll-merge

Pre-annotators

  • Syntax parser? https://github.com/cdli-gh/mtaac_work/tree/master/parse
  • Syntax pre-annotator https://github.com/cdli-gh/mtaac_syntax_pipeline

ML Tools and Other Tools

  • ATF parser? https://github.com/cdli-gh/mtaac-package
  • SRL tool for Sumerian https://github.com/cdli-gh/Semantic-Role-Labeler
  • Translation pipeline (POS and NE tagging; ATF in, ATF with translation out, and POS/NE CoNLL out) https://github.com/cdli-gh/Sumerian-Translation-Pipeline
  • NMT (Summer 2019) https://github.com/cdli-gh/Machine-Translation
  • New MT models (Summer 2020) https://github.com/cdli-gh/Semi-Supervised-NMT-for-Sumerian-English

Visualization, search, browse

  • Annotation assistant (for manual annotators) https://github.com/cdli-gh/annotation_assistant
  • CDLI new framework (alpha) https://gitlab.com/cdli/framework
  • Multilayer annotation query tool CQP4RDF https://github.com/cdli-gh/cqp4rdf
  • Visualize and manipulate CoNLL-U syntax https://github.com/cdli-gh/sumerian-syntax-tree
  • Commodities visualization https://github.com/cdli-gh/cdli-accounting-viz
  • CTS Server https://github.com/cdli-gh/cdli-cts-server
  • Scaife viewer https://github.com/cdli-gh/scaife
  • CDLI Framework API client https://github.com/cdli-gh/framework-api-client

Émilie Pagé-Perron