Textual Analysis Learning Algorithm for Stock Market Prediction
Harmeet Cheema, Roger He, Liyuan Li, Nathan Mak



Stock Predictions Today
Financial analysts seek to create models that can be used to predict movements in the financial sector. Stock trading, in particular, makes up a significant portion of the financial industry. Developments in modeling have led to various theories based on technical analysis.

One way to account for the apparent randomness of stock movements is to observe how stocks react to news. The public reads the New York Times, watches CNN and attends company announcements in hopes of gaining insight into the day's trading. Natural disasters, elections, quarterly reports and other breaking news have the potential to create large market movements. The cause of these changes is not fully understood and is likely a combination of group psychology and fundamental valuations. Regardless of cause, news is a reliable instigator of change.

Research Goals
1. Develop a system to process vast amounts of news and ticker data
2. Find a correlation between published news and market changes
3. Exploit this correlation to make predictions in stock movements

A Textual Approach
The project consists of four highly modular components:
• A news and financial data collection Python script, which includes basic text processing and metric calculations. Thousands of articles are scraped from Google.com news and filtered with a blacklist. Day-to-day stock prices are collected from Marketwatch.com. The results, regardless of source, are temporarily stored in an XML-formatted file.
• A learning algorithm, which includes additional text processing for calculating word count correlation and setting article weightings. Regression analysis is conducted over a training set using least-squares error minimization.
• A MySQL database hosted on an Amazon EC2 server, which provides a reliable base for running scraping and processing operations. The structure of the database contains both learning-algorithm-dependent and learning-algorithm-independent sections for future algorithm flexibility.
• A webpage heavily structured around dynamic Java widgets, used to view results pulled from the database.

Performance
• Nearly 100,000 articles processed for only 10 stocks
• Clear correlation between news and market changes
• 71% predictive accuracy

Challenges for the Future
• More efficient algorithms for handling large dynamic training sets
• Filtering news based on industry, time, location, quantity, writing style and content type
• Incorporation of natural language processing

References
Jang, J.-S. (1993). ANFIS: adaptive-network-based fuzzy inference system. Systems, Man and Cybernetics, IEEE Transactions on, 665-685.
Schumaker, R., & Chen, H. (2009). Textual Analysis of Stock Market Prediction Using Breaking Financial News. Association for Computing Machinery Transactions on Information Systems, 27(2).
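
The learning-algorithm component described in the poster (word counts per article, with weightings fitted by least-squares regression over a training set) might look roughly like the sketch below. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the two-word vocabulary, the toy articles and the price changes are all hypothetical, and reading the sign of the fitted value as an up/down prediction is an assumption the poster does not confirm.

```python
from collections import Counter

import numpy as np


def word_count_matrix(articles, vocabulary):
    """Build one row per article: counts of each vocabulary word plus a bias term."""
    rows = []
    for text in articles:
        counts = Counter(text.lower().split())
        rows.append([counts[word] for word in vocabulary] + [1.0])
    return np.array(rows, dtype=float)


def fit_word_weights(articles, price_changes, vocabulary):
    """Fit per-word weights by minimizing the squared prediction error."""
    X = word_count_matrix(articles, vocabulary)
    y = np.asarray(price_changes, dtype=float)
    weights, *_ = np.linalg.lstsq(X, y, rcond=None)
    return weights


def predict_direction(article, weights, vocabulary):
    """Assumed decision rule: a positive fitted change means 'up', otherwise 'down'."""
    x = word_count_matrix([article], vocabulary)[0]
    return "up" if x @ weights > 0 else "down"


# Hypothetical toy training set: article text and the price change that followed.
vocabulary = ["profit", "loss"]
articles = [
    "record profit reported",
    "profit beats forecast profit up",
    "unexpected loss announced",
    "loss widens loss deepens",
]
price_changes = [1.0, 2.0, -1.0, -2.0]

weights = fit_word_weights(articles, price_changes, vocabulary)
print(predict_direction("quarterly profit rises", weights, vocabulary))
print(predict_direction("heavy loss expected", weights, vocabulary))
```

On real data the vocabulary would be far larger than the training signal can support cleanly, so some form of word filtering or regularization would likely be needed; the sketch leaves that out.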

