Dr. Marcos Zampieri is a research fellow at the Research Institute for Information and Language Processing (RIILP). Marcos received his PhD from Saarland University in Germany with a thesis on computational methods applied to varieties of pluricentric languages. He has an interdisciplinary profile with degrees in Information Systems, Translation Studies, and Natural Language Processing.
His research interests include language and dialect variation, low-resource languages, second language acquisition and educational NLP applications, author profiling and forensic linguistics, text categorization, and translation technology.
Identifying and Categorizing Offensive Language in Social Media
Offensive language is pervasive in social media. Individuals frequently take advantage of the perceived anonymity of computer-mediated communication to engage in behaviour that many of them would not consider in real life. Online communities, social media platforms, and technology companies have been investing heavily in ways to cope with offensive language and prevent abusive behaviour in social media. One of the most effective strategies for tackling this problem is to use computational methods to identify offense, aggression, and hate speech in user-generated content (e.g. posts, comments, and microblogs).
In this project, I propose the creation of a new, high-quality, manually annotated dataset focusing on the type and target of offensive language in social media posts. The dataset will be used to train robust computational models for offensive language identification.