
AUTOMATIC TRANSLATION OF SQUAD AND RACE QUESTION ANSWERING DATASETS IN BULGARIAN LANGUAGE

Simeon Monov, Detelinka Trifonova, Nikolay Pavlov, Andrey Nikolov

Abstract


Many question-answering (QA) datasets are used across natural language processing (NLP) tasks, and SQuAD is one of the most popular among them. RACE is a popular dataset for the multiple-choice question answering (MCQA) task, used to train and evaluate MCQA models. Both datasets are available only in English.

We translated these two datasets into Bulgarian using automated translation techniques and then evaluated the translated datasets on extractive QA and MCQA tasks. Experimental results show that our datasets can be used effectively to improve the performance of transformer models on QA tasks in Bulgarian.
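A common difficulty when machine-translating an extractive QA dataset such as SQuAD is that the answer's character offset in the context no longer holds after translation. The sketch below illustrates one simple way to handle this: translate the context, question, and answer separately, then keep the record only if the translated answer can still be found verbatim in the translated context. This is not necessarily the procedure used in the paper. The function names, the flat record layout, and the toy dictionary translator are all assumptions made for illustration; a real pipeline would call a machine translation system in place of `toy_translate`.

```python
from typing import Callable, Optional

# Hypothetical stand-in for a real machine translation system:
# a tiny lookup table that returns unknown strings unchanged.
_TOY_TABLE = {
    "Sofia is the capital of Bulgaria.": "София е столицата на България.",
    "What is the capital of Bulgaria?": "Коя е столицата на България?",
    "Sofia": "София",
}


def toy_translate(text: str) -> str:
    return _TOY_TABLE.get(text, text)


def translate_squad_record(
    record: dict, translate: Callable[[str], str]
) -> Optional[dict]:
    """Translate one simplified SQuAD-style record (assumed flat layout).

    Returns None when the translated answer is not a substring of the
    translated context, so the record cannot be used for extractive QA.
    """
    context = translate(record["context"])
    question = translate(record["question"])
    answer = translate(record["answer_text"])

    # Recompute the answer offset in the translated context.
    start = context.find(answer)
    if start == -1:
        return None

    return {
        "context": context,
        "question": question,
        "answer_text": answer,
        "answer_start": start,
    }


if __name__ == "__main__":
    example = {
        "context": "Sofia is the capital of Bulgaria.",
        "question": "What is the capital of Bulgaria?",
        "answer_text": "Sofia",
    }
    print(translate_squad_record(example, toy_translate))
```

Discarding records whose answer span cannot be recovered trades dataset size for label accuracy; multiple-choice data such as RACE does not need this step, since its answers are option labels rather than spans.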



Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.