Open Access Journal

ISSN : 2394-2320 (Online)

International Journal of Engineering Research in Computer Science and Engineering (IJERCSE)

Monthly Journal for Computer Science and Engineering


A Machine Learning Approach to Regional Text Classification in Multilingual Social Media

Authors : Tejasmani, Guhan, Raja S

Date of Publication : 17th September 2025

Abstract: This paper presents a supervised natural language processing approach to detect the geographic region and implied user interests from social media text, specifically YouTube comments from India and China. Using a dataset of 10,000 region-labeled comments, we implement a DistilBERT-based classifier enhanced with data augmentation to address class imbalance and noisy, code-mixed inputs. Our model achieves a test accuracy of 91.2%, with recall above 85% for both regions. The extracted insights on user background enable personalized content recommendations, addressing cold-start challenges in recommendation systems. The study contributes an effective pipeline for region-aware user profiling in multilingual, noisy social media environments.
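The abstract reports overall test accuracy alongside per-region recall. As a minimal sketch of how those two metrics relate, the snippet below computes both from lists of true and predicted labels. The function name `accuracy_and_recall` and the toy labels are illustrative assumptions; they are not the authors' code or data, and the numbers produced are unrelated to the reported 91.2%.

```python
from collections import Counter


def accuracy_and_recall(y_true, y_pred):
    """Return (overall accuracy, {label: recall}) for two label lists.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("label lists must not be empty")

    pairs = list(zip(y_true, y_pred))
    # Number of examples per true class (the recall denominator).
    support = Counter(y_true)
    # Number of correctly predicted examples per true class.
    hits = Counter(t for t, p in pairs if t == p)

    accuracy = sum(hits.values()) / len(pairs)
    recall = {label: hits[label] / n for label, n in support.items()}
    return accuracy, recall


# Toy, made-up labels for the two regions named in the abstract.
y_true = ["India", "India", "India", "China", "China"]
y_pred = ["India", "India", "China", "China", "China"]

acc, rec = accuracy_and_recall(y_true, y_pred)
print(f"accuracy={acc:.3f}")
for label, value in sorted(rec.items()):
    print(f"recall[{label}]={value:.3f}")
```

Reporting recall per region next to accuracy matters when classes are imbalanced, since a classifier can score high accuracy while missing most examples of the smaller class.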

