Issues and perspectives from 10,000 annotated financial social media data
Journal
LREC 2020 - 12th International Conference on Language Resources and Evaluation, Conference Proceedings
Pages
6106-6110
Date Issued
2020
Author(s)
Abstract
In this paper, we investigate the annotation of financial social media data from several angles. We present Fin-SoMe, a dataset with 10,000 labeled financial tweets annotated by experts from both the front desk and the middle desk in a bank's treasury. These annotated results reveal that (1) writer-labeled market sentiment may be a misleading label; (2) writer's sentiment and market sentiment of an investor may be different; (3) most financial tweets provide unfounded analysis results; and (4) almost no investors write down the gain/loss results for their positions, which would otherwise greatly facilitate detailed evaluation of their performance. Based on these results, we address various open problems and suggest possible directions for future work on financial social media data. We also provide an experiment on the key snippet extraction task to compare the performance of using a general sentiment dictionary and using the domain-specific dictionary. The results echo our findings from the experts' annotations. © European Language Resources Association (ELRA), licensed under CC-BY-NC
Subjects
Commerce; Social networking (online); Domain specific; Sentiment dictionaries; Social media datum; Investments
Type
conference paper
