NYU Abu Dhabi Researchers Develop First Of Its Kind Large-Scale Readability Leveled Thesaurus For Arabic

Researchers from NYU Abu Dhabi (NYUAD) have developed an Online Readability Leveled Arabic Thesaurus. The work was conducted by Associate Professor of Practice of Arabic Language Muhamed Al Khalil in collaboration with Professor of Computer Science Nizar Habash, who also leads the Computational Approaches to Modeling Language (CAMeL) Lab.

For any Arabic word a user enters, the one-of-a-kind interface provides the possible roots, English glosses, related Arabic words and phrases, and a readability rating on a five-level scale. It also connects multiple existing Arabic resources and processing tools, enabling Arabic speakers and learners to benefit from recent advances in Arabic computational linguistics technologies.

The interface is one of the products of the NYUAD-funded project Simplification of Arabic Masterpieces for Extensive Reading (SAMER), and a demo version of it is available for public use here.

A collaboration between NYUAD’s Arabic Studies Program and CAMeL Lab, SAMER seeks to create a standard for simplifying modern Arabic fiction for school-age learners and to use this standard to simplify a number of Arabic fiction masterpieces.

Commenting on the research paper, Al Khalil said: “Arabic is one of the UN’s six official languages; it is the language of hundreds of millions of people in the Arab world and beyond. It is extraordinarily rich linguistically but with that comes higher complexity and a steeper learning curve. Add to this the fact that the standard form of Arabic used in education and media is not the daily form spoken by modern-day Arabs who speak a variety of its dialects. As such, there is a great need to have user-friendly tools supporting Arabic teachers and learners. We hope this will be an important aid in filling this learning gap.”

Habash commented: “Arabic poses many difficulties for Artificial Intelligence, some of which are similar to those facing new learners: it has a very rich word structure, a highly ambiguous spelling system, and many dialects. The resources we developed have great potential for developing smart technologies that can assist natives and learners interested in writing and reading in Arabic.”

Established in September 2014, CAMeL’s mission is research and education in artificial intelligence, specifically focusing on natural language processing, computational linguistics, and data science. The lab’s main research areas are Arabic natural language processing, machine translation, text analytics, and dialogue systems.

The interface was presented as part of the International Conference on Computational Linguistics (COLING) 2020. The paper “A Large-Scale Leveled Readability Lexicon for Standard Arabic,” presented at the 12th Language Resources and Evaluation Conference in Marseille, France, provides further research background on the thesaurus.