Prediction of Securities
This project contains the files produced while working on the coursework.
Project Structure
data/stocks
CSV Files: Various CSV files containing stock data and sentiment scores.
- nytimes.csv: sentiment scores from NYTimes.
- reuters.csv: sentiment scores from Reuters.
- final_data/: Contains the final processed stock data for specific companies plus sentiment scores from NYT and Reuters. These files were used on Kaggle to optimise and test models.
  - AAPL.csv: Apple Inc. stock data.
  - JPM.csv: JPMorgan Chase & Co. stock data.
  - PG.csv: Procter & Gamble Co. stock data.
  - TM.csv: Toyota Motor Corporation stock data.
  - XOM.csv: Exxon Mobil Corporation stock data.
Python Scripts: Scripts related to data preprocessing and sentiment analysis.
- preprocessing.py: Script for preprocessing stock data.
- stock_loader.py: Script for loading stock data.
- __init__.py: Initialization file for the package.
notebooks
- Local: Contains local Jupyter notebooks that were used for the early stages of optimisation and testing.
  - nyt_titles_loader.ipynb: one of the web-scraping notebooks. There were too many to include, and they were spread across Colab and Kaggle.
  - Other files showcase early attempts to use torch with optuna to tune RNNs.
- Kaggle: Contains files from Kaggle, covering the later stages of optimisation using a GPU, with pruning callbacks for Keras and XGBoost.
  - regression_plots_and_metrics.ipynb: final values and plots used in the report.
  - classification_plots_and_metrics.ipynb: final values and plots used in the report.
rnn_model
Using Keras: Contains RNN models implemented using Keras.
- models.py: Model getters.
- optimise.py: Optimisation functions for Keras. This file only defines the functions; the optimisation itself was run on Kaggle using their Tesla P100 GPU.
- __init__.py: Initialization file for the package.
Using Torch: Contains RNN models implemented using PyTorch.
- classification.py: Classification RNN models using PyTorch.
- early_stopping.py: Early stopping utility for RNN models in PyTorch.
- loaders.py: Data loaders for RNN models in PyTorch.
- optimise.py: Optimization routines for RNN models in PyTorch.
- regression.py: Regression RNN models using PyTorch.
- train_eval.py: Training and evaluation scripts for RNN models in PyTorch.
- __init__.py: Initialization file for the package.
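For readers unfamiliar with early stopping, the idea behind a utility like early_stopping.py can be sketched as follows. This is a minimal, framework-free illustration and not the code in this repository: the class name `EarlyStopping`, its `patience` and `min_delta` parameters, and the loss values are all assumptions made for the example.

```python
class EarlyStopping:
    """Hypothetical helper: signals a stop when validation loss stops improving."""

    def __init__(self, patience: int = 3, min_delta: float = 0.0):
        self.patience = patience      # epochs to wait without improvement
        self.min_delta = min_delta    # minimum decrease that counts as improvement
        self.best_loss = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss: float) -> bool:
        """Record one epoch's validation loss; return True if training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Made-up validation losses standing in for a real training loop.
stopper = EarlyStopping(patience=2)
losses = [0.90, 0.80, 0.85, 0.82, 0.70]
stopped_at = None
for epoch, loss in enumerate(losses):
    if stopper.step(loss):
        stopped_at = epoch
        break
```

In a real training loop, `step` would be called once per epoch with the loss measured on the validation set, typically alongside saving the best model weights.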
utils
- Utility Scripts: Various utility scripts to support the main functionality.
  - sequences.py: Utility functions for getting sequences.
  - stock_loader_utils.py: Utility functions for loading stock data.
  - torch_train_util.py: Utility functions for training PyTorch models.
  - utils.py: General utility functions.
  - __init__.py: Initialization file for the package.
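As an illustration of what "getting sequences" usually means when preparing time series for an RNN, the sketch below turns a price series into fixed-length input windows, each paired with the next value as its target. It is an assumption about the general technique, not the contents of sequences.py; the function name `make_sequences` and the sample prices are hypothetical.

```python
from typing import List, Sequence, Tuple


def make_sequences(
    values: Sequence[float], window: int
) -> Tuple[List[List[float]], List[float]]:
    """Split a series into sliding windows and next-step targets (hypothetical helper)."""
    if window < 1 or window >= len(values):
        raise ValueError("window must be at least 1 and shorter than the series")
    inputs, targets = [], []
    for start in range(len(values) - window):
        inputs.append(list(values[start:start + window]))  # `window` consecutive values
        targets.append(values[start + window])             # the value right after them
    return inputs, targets


# Made-up closing prices, used only to show the shape of the result.
prices = [10.0, 11.0, 12.0, 13.0, 14.0]
X, y = make_sequences(prices, window=3)
```

With a window of 3, five prices yield two training examples; longer windows give the model more history per example but fewer examples overall.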