Highlights of the Year

The main highlight for ALMAnaCH in 2019 is the publication of CamemBERT, a French neural language model trained on the French section of OSCAR, our very large multilingual web-based raw corpus. This first Transformer-based language model for French allowed us to improve the state of the art on several classical NLP tasks, and its release was met with considerable success in both the academic and industrial worlds. It has also been the subject of an article in the major French daily newspaper Le Monde and of a broadcast on the national radio station France Culture.