Unveiling Exclusion by Machine Learning

The Case for North Korean Defectors

Introduction

North Korean defectors in South Korea face economic struggles, social stigma, and difficulties in cultural adaptation. This study explores whether defectors are genuinely embraced or whether their acceptance is conditional on meeting favorable expectations, such as embodying positive stereotypes.

The study tests the hypothesis that societal acceptance of North Korean defectors in South Korea is conditional. It investigates whether positive, empowering narratives enhance societal acceptance, and by analyzing these dynamics it aims to contribute to the broader discussion on migration and integration.

Methodology

This study collected Naver blog posts through the Naver API for sentiment analysis (SA) and thematic analysis (TA). The SA focused on posts that mentioned "North Korean Defector" (탈북민/t'albungmin). The posts were translated into English before analysis.
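The collection step might look roughly like the sketch below. It only builds a request and does not send it. The choice of Naver's blog search endpoint, the `display`, `start`, and `sort` values, and the helper name `build_blog_search_request` are assumptions for illustration; the study does not state which endpoint or parameters it used.

```python
import urllib.parse
import urllib.request

# Assumed endpoint: Naver's blog search API. The study only says "Naver API".
NAVER_BLOG_SEARCH_URL = "https://openapi.naver.com/v1/search/blog.json"


def build_blog_search_request(query, client_id, client_secret, display=100, start=1):
    """Build (but do not send) one page of a blog search request."""
    params = urllib.parse.urlencode(
        {"query": query, "display": display, "start": start, "sort": "date"}
    )
    return urllib.request.Request(
        f"{NAVER_BLOG_SEARCH_URL}?{params}",
        headers={
            "X-Naver-Client-Id": client_id,
            "X-Naver-Client-Secret": client_secret,
        },
    )


# Placeholder credentials; real ones come from a registered Naver application.
request = build_blog_search_request("탈북민", "YOUR_CLIENT_ID", "YOUR_CLIENT_SECRET")
print(request.full_url)
```

Passing the request to `urllib.request.urlopen` would fetch one page of results; further pages would be requested by increasing `start`.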

The SA used the VADER library to score each post and capture sentiment trends. The TA used NLP techniques and machine learning models, including Logistic Regression, Support Vector Machines (SVM), and Latent Dirichlet Allocation (LDA), to classify the posts into political, empowering, and discriminating themes.
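VADER's `polarity_scores` returns a `compound` score between -1 and 1 for each text. A minimal sketch of turning that score into a label follows. The cutoffs of 0.05 and -0.05 are the thresholds commonly used with VADER and are assumed here; the study does not report which cutoffs it applied, and `label_compound` is a hypothetical helper.

```python
def label_compound(compound, threshold=0.05):
    """Map a VADER compound score in [-1, 1] to a sentiment label."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# Illustrative scores only, not values from the study.
for score in (0.62, 0.0, -0.41):
    print(score, label_compound(score))
```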

SA Results

The share of posts with positive sentiment varies from year to year (Figure 1).
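A yearly percentage of this kind can be computed by counting labeled posts per year. The sketch below assumes each post has already been reduced to a (year, label) pair; the function name, the placeholder year keys, and the rows are all hypothetical and do not reproduce the study's data.

```python
from collections import Counter


def positive_share_by_year(rows):
    """Return {year: percent of posts labeled 'positive'} from (year, label) rows."""
    totals = Counter()
    positives = Counter()
    for year, label in rows:
        totals[year] += 1
        if label == "positive":
            positives[year] += 1
    return {year: 100.0 * positives[year] / totals[year] for year in sorted(totals)}


# Made-up rows for illustration; these are not the study's data.
toy_rows = [
    ("Y1", "positive"), ("Y1", "negative"),
    ("Y2", "positive"), ("Y2", "positive"), ("Y2", "positive"), ("Y2", "neutral"),
]
shares = positive_share_by_year(toy_rows)
print(shares)
```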

Figure 1: Sentiment Analysis Results

TA Results

The thematic analysis shows that empowerment was consistently the most prominent theme, yet it explains only a portion of the positive sentiment, which points to conditional inclusion.

Figure 2: Thematic Analysis Results

Discussion

This study underscores the challenges of conditional inclusion, where defectors must conform to societal expectations to gain acceptance.

Despite the influence of empowering narratives, the significant role of non-empowerment themes suggests that societal acceptance remains conditional. This highlights the need for a societal framework that embraces defectors unconditionally, recognizing their inherent worth and contributions.

Limitations and Future Study

This study explores how public opinion affects the life satisfaction of North Korean defectors and the conditional inclusion they face in South Korean society. However, it has two limitations: Naver blog posts alone may not fully capture societal views, and translating the posts from Korean to English could distort the sentiment and thematic analyses.

Future analyses will use KoBERT to assess the Korean text directly, which should improve accuracy. The thematic prediction model's low accuracy (less than 30%), discovered late in the research, suggests that a larger sample size (potentially over 15,000) is needed.
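For context, the reported accuracy can be compared with simple baselines. The sketch below assumes the model chooses among exactly the three themes named in the Methodology. The helper names and the label distribution are hypothetical and are not taken from the study.

```python
from collections import Counter


def uniform_baseline(n_classes):
    """Expected accuracy of guessing uniformly at random among n_classes."""
    return 1.0 / n_classes


def majority_baseline(labels):
    """Accuracy of always predicting the most frequent label."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


THEMES = ("political", "empowering", "discriminating")
print(f"uniform baseline: {uniform_baseline(len(THEMES)):.1%}")

# Hypothetical label distribution, for illustration only.
toy_labels = ["empowering"] * 5 + ["political"] * 3 + ["discriminating"] * 2
print(f"majority baseline: {majority_baseline(toy_labels):.1%}")
```

Under the three-theme assumption, uniform guessing is expected to be right about a third of the time, so an accuracy below 30% may reflect issues in labeling or features as well as sample size.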

Acknowledgements

Thank you to Angela Perkins, the Director of Digital Scholarship Services at Lafayette College, for supporting me with my first machine learning research project. I am also grateful to Associate Professor Caleb Gallemore from the International Affairs Department at Lafayette College for his insights into statistical methods. Yazdan Basir, a Data Engineer at SEI, provided valuable advice on data analysis. I also thank Sidath Chandrasena, DHSS'2023 teaching fellow, and Tanushree Sow Mondal, DHSS'2024, for their persistent feedback. Lastly, I am grateful for the mentorship provided by Professor Seo-Hyun Park of the Government and Law Department and the Chair of the Asian Studies Program at Lafayette College, who generously shared her expertise and guidance.