Week 6 - XLM and Pre-Training

by Rachit



Week Summary

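This week was spent getting Facebook’s XLM implementation running on Sumerian-English data. After cloning and studying the code, I rewrote the data preparation step for Sumerian-English texts and started pre-training on 1M Sumerian and 20M English monolingual data (general texts). That run reached 200 epochs, but evaluation gave poor results, probably because the English data was very far out of domain. I then prepared a second dataset (data_prep_2) that uses English data from Ur III admin texts, restarted training on it, and wrote an end-to-end inference script for evaluating the model and translating a given input.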

Daily Work Update

#  Day        Date        A short description of the work done
1  Monday     2020/06/01  Cloned Facebook’s implementation of XLM and studied the code
2  Tuesday    2020/06/02  Rewrote (heavily modified) the data preparation code for Sumerian-English texts
3  Wednesday  2020/06/03  Resolved all issues and errors; started pre-training on 1M Sumerian and 20M English monolingual data (general texts)
4  Thursday   2020/06/04  Pre-training still running, reached 200 epochs; prepared scripts for the next steps
5  Friday     2020/06/05  Stopped training and evaluated the model. Poor results, probably because the English data was very out-of-domain. Created data_prep_2
6  Saturday   2020/06/06  Built a new dataset with English data from Ur III admin texts and started training on it
7  Sunday     2020/06/07  Created an end-to-end inference script for evaluation and for translating a given input
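
One rough way to check the out-of-domain suspicion from Friday is to measure how many tokens of the target-domain English never appear in the English used for pre-training. The snippet below is only a minimal sketch of that idea, not the script used in the project: the function names (`tokenize`, `vocab_counts`, `oov_rate`) are hypothetical, the sentences are made-up toy examples, and plain whitespace tokenization stands in for the real BPE pipeline.

```python
from collections import Counter


def tokenize(line):
    """Lowercase whitespace tokenization (an assumed stand-in for BPE)."""
    return line.lower().split()


def vocab_counts(lines):
    """Count token frequencies over an iterable of text lines."""
    counts = Counter()
    for line in lines:
        counts.update(tokenize(line))
    return counts


def oov_rate(train_lines, target_lines):
    """Fraction of target-domain token occurrences never seen in train_lines."""
    train_vocab = set(vocab_counts(train_lines))
    target_counts = vocab_counts(target_lines)
    total = sum(target_counts.values())
    if total == 0:
        return 0.0
    unseen = sum(c for tok, c in target_counts.items() if tok not in train_vocab)
    return unseen / total


# Toy, made-up sentences purely for illustration (not the project's data).
general_english = [
    "the stock market fell sharply today",
    "she went to the market to buy bread",
]
admin_english = [
    "5 sheep delivered to the palace",
    "2 gur of barley for the temple",
]

print(f"OOV rate: {oov_rate(general_english, admin_english):.2f}")
```

A high rate suggests that the pre-training English shares little vocabulary with the translation targets, which would be consistent with the switch to Ur III admin text English on Saturday. On real data the same check would be more informative after applying the actual tokenization.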