Hateful meme detection is a well-known research area that requires both visual and linguistic understanding. It matters because, in today's world, information and opinions increasingly arrive as multimedia, and people can disguise hateful intent behind apparently harmless images and text that, when combined within a cultural and societal context, hurt the sentiments of various minority groups. There is therefore a dire need to detect such hateful multimedia in a multimodal setting. For this purpose, we used Facebook's Hateful Memes dataset, which is specially annotated so that unimodal priors are bound to fail: the images and text individually carry little signal. We used ResNext and RoBERTa unimodal models as baselines. To exploit the multimodality of the dataset, we adopted an early-fusion approach: we concatenated the ResNext embeddings of the raw images (2047-dimensional) with the RoBERTa embeddings of the text (768-dimensional) and then performed classification using various fine-tuned models, including a shallow feed-forward network, a deep feed-forward network, CatBoost, LGBM, XGBoost, and logistic regression.
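The early-fusion pipeline above can be sketched as follows. This is a minimal illustration, not the project's actual code: random arrays stand in for the precomputed ResNext and RoBERTa embeddings (only their dimensions, 2047 and 768, come from the text), and scikit-learn's `LogisticRegression` stands in for the logistic-regression classifier.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

# Stand-ins for precomputed unimodal embeddings; in the real pipeline
# these would come from ResNext (images) and RoBERTa (meme text).
n_memes = 200
image_emb = rng.normal(size=(n_memes, 2047))  # ResNext image features
text_emb = rng.normal(size=(n_memes, 768))    # RoBERTa text features
labels = rng.integers(0, 2, size=n_memes)     # 1 = hateful, 0 = benign

# Early fusion: concatenate the two modalities into one feature vector.
fused = np.concatenate([image_emb, text_emb], axis=1)  # (n_memes, 2815)

X_train, X_test, y_train, y_test = train_test_split(
    fused, labels, test_size=0.2, random_state=0
)

# Any downstream classifier can consume the fused vectors; logistic
# regression is shown here as the simplest of the models listed above.
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("fused dimension:", fused.shape[1])
print("test accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```

The same `fused` matrix could be fed to the feed-forward networks or the gradient-boosting models (CatBoost, LGBM, XGBoost) mentioned above in place of the logistic-regression classifier.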