Scalable System for Textual Analysis based Stock Market Prediction
Date Issued
2014
Date
2014
Author(s)
Lin, Roy Guanyu
Abstract
Stock market prediction is the problem of forecasting market trends. For short-term investment, news is one of the most important factors influencing stock prices. Based on this idea, our goal is to build a scalable stock market prediction system that processes Chinese news articles to produce a prediction model. Such a system can speed up model training and take into account more training sources, e.g., posts from China’s microblog service, Sina Weibo. Also, with the emergence of cloud computing, a scalable system can lease more resources from the cloud to handle a growing workload. Our solution builds the system on mature open source projects, such as Hadoop for parallel computing, Mahout for scalable machine learning, and Jieba for Chinese text segmentation. We provide a basic algorithm for stock trend prediction, build the software stack, collect news published in Taiwan from March 2009 to May 2014, and run experiments to evaluate the scalability of the system. The results show that in this application, the Jieba Chinese text segmentation tool scales well with multiprocessing, namely, it is 80 percent faster with four parallel processes than in sequential mode. However, Mahout does not show significant speedup in this scenario.
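The abstract does not say how "80 percent faster" was measured, so the figure can be read in two ways. The sketch below is only an illustration with hypothetical helper names, not the thesis's own evaluation code. It converts each reading into a speedup factor and a per-process efficiency for four processes. Under a throughput reading, the speedup is about 1.8x. Under a wall-time reading, it would be about 5x, which exceeds the number of processes and so seems the less likely interpretation.

```python
def speedup_from_throughput_gain(gain: float) -> float:
    """Speedup if `gain` is the fractional increase in throughput (0.8 means 80% more work per unit time)."""
    if gain < 0.0:
        raise ValueError("gain must be non-negative")
    return 1.0 + gain


def speedup_from_time_reduction(reduction: float) -> float:
    """Speedup if `reduction` is the fractional drop in wall-clock time (0.8 means 80% less time)."""
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    return 1.0 / (1.0 - reduction)


def parallel_efficiency(speedup: float, workers: int) -> float:
    """Speedup divided by the number of parallel workers; 1.0 would be ideal linear scaling."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return speedup / workers


if __name__ == "__main__":
    workers = 4  # number of parallel processes stated in the abstract
    for label, s in [
        ("throughput reading", speedup_from_throughput_gain(0.8)),
        ("wall-time reading", speedup_from_time_reduction(0.8)),
    ]:
        print(f"{label}: speedup {s:.2f}x, efficiency {parallel_efficiency(s, workers):.2f}")
```

An efficiency above 1.0 would mean superlinear scaling, which is why the throughput reading is assumed to be the intended one here.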
Subjects
Distributed System
Scalability
Stock Market Prediction
Cloud Computing
Type
thesis
File(s)
Name
ntu-103-R00922096-1.pdf
Size
23.32 KB
Format
Adobe PDF
Checksum
(MD5):f829546e549c4b0689f03c2b20dcf9c9
