I am a PhD candidate in the MAPi joint Doctoral Program in Computer Science, offered by the University of Minho, the University of Aveiro, and the University of Porto. I am also a researcher at the Laboratory of Artificial Intelligence and Decision Support (LIAAD-INESC TEC).
My current research interests focus on Natural Language Processing and Machine Learning. I am also interested in creating language resources for low-resource languages (especially Hausa). I received a Master’s degree from the University of Manchester, UK, and a Bachelor’s degree from Bayero University, Kano, Nigeria. I am also a faculty member at the Faculty of Computer Science and Information Technology, Bayero University, Kano, Nigeria. I spend my free time reading books and playing table tennis.
PhD in Computer Science, 2018 - ongoing
University of Porto, Portugal
MSc in Computer Science, 2013
University of Manchester, UK
BSc in Computer Science, 2008
Bayero University, Kano, Nigeria
With the success of large-scale pre-training and multilingual modeling in Natural Language Processing (NLP), recent years have seen a proliferation of large, web-mined text datasets covering hundreds of languages. However, to date there has been no systematic analysis of the quality of these publicly available datasets, or whether the datasets actually contain content in the languages they claim to represent.