Exploring lavender tongue from social media texts
Journal
Proceedings of the 29th Conference on Computational Linguistics and Speech Processing, ROCLING 2017
Pages
68-80
Date Issued
2017
Author(s)
Wu H.-H
Abstract
On the topic of gender and Natural Language Processing (NLP), most papers focus on gender-normative language spoken by biological males and females with opposite-sex desires. This study instead takes the perspective of sexual orientation and presents the first work on the task of Chinese homosexual identification. First, we collect homosexual texts from social media; second, we examine the linguistic behavior found in gay and lesbian texts. In addition, we provide sets of linguistic features to automatically predict homosexual language, using Support Vector Machine (SVM) and Naive Bayes (NB) models with 5-fold cross-validation. The training procedure resulted in a promising F-score of around 70% with a particular lexicon-based feature set. © The Association for Computational Linguistics and Chinese Language Processing
Subjects
Barium compounds; Computational linguistics; Natural language processing systems; Speech processing; Support vector machines; Cross validation; Feature sets; Lexicon-based; Linguistic features; Natural language processing; Sexual orientations; Social media; Training procedures; Social networking (online)
Type
conference paper
