Add the missing SRCMF texts
During the conversion from the original annotations to UD, some SRCMF texts were not manually checked and did not receive the automatic corrections applied for the UD 2.6 version. For now, they exist only in the project's internal Dropbox.
Generate the UD tools violation report
Apply manual corrections
Apply automatic corrections
Determine a standard split for UD
- Check the UD policy on data extensions: should we take the opportunity to generate a whole new pseudo-random split of the corpus? It would be a chance to have full control over (and documentation of!) the split generation process.
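
If we do regenerate the split, a seeded shuffle is one simple way to make it reproducible and easy to document. The sketch below is only an illustration: the function name `split_ids`, the 80/10/10 proportions, and splitting at sentence level (rather than by text) are all assumptions, not decisions taken by the project or requirements of UD.

```python
import random


def split_ids(sent_ids, seed=0, dev_frac=0.1, test_frac=0.1):
    """Return a dict mapping 'train'/'dev'/'test' to lists of sentence ids.

    Hypothetical helper: the proportions and the sentence-level granularity
    are placeholders to be replaced once the UD policy has been checked.
    """
    ids = sorted(sent_ids)     # sort first so the input order does not matter
    rng = random.Random(seed)  # local generator: the seed alone fixes the shuffle
    rng.shuffle(ids)
    n_dev = round(len(ids) * dev_frac)
    n_test = round(len(ids) * test_frac)
    return {
        "dev": ids[:n_dev],
        "test": ids[n_dev:n_dev + n_test],
        "train": ids[n_dev + n_test:],
    }


if __name__ == "__main__":
    # Made-up sentence ids, for illustration only.
    example_ids = [f"s{i}" for i in range(100)]
    split = split_ids(example_ids, seed=42)
    print({name: len(ids) for name, ids in split.items()})
```

Recording the seed and the proportions next to the released files would be enough to document how the split was produced. If the split should keep whole texts together, the same function can be applied to text identifiers instead of sentence identifiers.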