A Hybrid Deep Learning Model with Squeeze-and-Excitation Gated Attention for Multilingual Hate Speech Detection in Code-Mixed Social Media
DOI: https://doi.org/10.7492/v9hww194
Abstract
Social media platforms have become major centres of communication, but they are increasingly exploited to spread hate speech (HS), especially in multilingual, code-mixed contexts such as Hinglish, which causes considerable societal harm. Despite recent work, HS detection models still struggle with accuracy, scalability, and fairness on diverse datasets that are imbalanced and noisily code-mixed. This paper presents CBLSEGA (CNN-BiLSTM with Squeeze-and-Excitation Gated Attention), a lightweight hybrid deep learning framework for robust multilingual HS detection. Using multilingual XLM-R embeddings, CNN layers extract local hateful patterns (language-specific slurs), while a BiLSTM captures the sequential context in code-mixed text. The novel SEGA mechanism efficiently recalibrates features and adaptively fuses multi-level representations while keeping the parameter count minimal (fewer than 2M) for scalability. A phased evaluation on code-mixed HASOC data covers the baselines (CNN-LSTM, CNN-BiLSTM), attention integration, and a CBLSEGA ablation. CBLSEGA achieves a superior macro-F1 of 90.0%, an AUC-ROC of 0.87 for the HS class, and a +30% F1 improvement over the baselines, demonstrating its effectiveness in real-world social network scenarios.
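The abstract does not spell out the internals of SEGA, so the following is only a minimal sketch of the two ideas its name suggests: squeeze-and-excitation channel recalibration followed by a sigmoid-gated fusion of two feature vectors. The function names (`se_recalibrate`, `gated_fusion`), the weight shapes, and the toy values are all assumptions made for illustration; they are not taken from the paper, and a real model would learn the weights and operate on CNN and BiLSTM outputs rather than hand-written lists.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dense(vec, weights, bias):
    # weights holds one row per output unit; returns weights @ vec + bias
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def se_recalibrate(features, w1, b1, w2, b2):
    """Squeeze-and-excitation over a sequence (hypothetical helper).

    features: list of T time steps, each a list of C channel values.
    Returns the rescaled features and the per-channel scales.
    """
    steps = len(features)
    channels = len(features[0])
    # Squeeze: average each channel over time, giving one scalar per channel
    squeezed = [sum(step[c] for step in features) / steps
                for c in range(channels)]
    # Excitation: bottleneck layer with ReLU, then a sigmoid layer
    hidden = [max(0.0, h) for h in dense(squeezed, w1, b1)]
    scales = [sigmoid(s) for s in dense(hidden, w2, b2)]
    # Recalibrate: multiply every time step by the channel scales
    rescaled = [[x * s for x, s in zip(step, scales)] for step in features]
    return rescaled, scales


def gated_fusion(a, b, gate_w, gate_b):
    """Blend two equal-length vectors with an elementwise sigmoid gate
    computed from their concatenation (an assumed fusion rule)."""
    gate = [sigmoid(g) for g in dense(a + b, gate_w, gate_b)]
    return [g * x + (1.0 - g) * y for g, x, y in zip(gate, a, b)]


# Toy input: 2 time steps, 2 channels, bottleneck of size 1
features = [[1.0, 2.0], [3.0, 4.0]]
rescaled, scales = se_recalibrate(
    features,
    w1=[[0.5, 0.5]], b1=[0.0],
    w2=[[1.0], [-1.0]], b2=[0.0, 0.0],
)

# With all-zero gate weights the gate is 0.5 everywhere,
# so the fusion reduces to a plain average of the two vectors.
fused = gated_fusion([1.0, 3.0], [3.0, 5.0],
                     gate_w=[[0.0] * 4, [0.0] * 4], gate_b=[0.0, 0.0])
print(scales, fused)
```

In this sketch the gate decides, per dimension, how much of the first representation versus the second reaches the classifier. The extra parameters are limited to the small bottleneck and gate layers, which is consistent with the lightweight design the abstract emphasises.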








