Vineel Rayapati
Data Scientist
Unveiling Patterns: Neural Models for Stock Market Prediction
This project uses neural network models to analyze the relationship between financial news sentiment and Nvidia’s stock prices. The primary models are LSTMs (Long Short-Term Memory networks), GRUs (Gated Recurrent Units), and a CNN-LSTM hybrid. Each model was chosen for its ability to handle temporal data and capture patterns in time series, which is critical for understanding stock price movements influenced by news sentiment.
Why These Models?
- LSTM:
  - LSTMs are designed for sequential data and can capture long-term dependencies, which suits stock price prediction, where past performance influences future trends.
  - Their gating mechanism mitigates the vanishing gradient problem, allowing better learning of long-range dependencies in time-series data.
  - Strengths:
    - Robust at capturing long-term dependencies.
    - Effective at learning patterns in noisy financial data.
- GRU:
  - GRUs are a simpler alternative to LSTMs, with fewer parameters and faster training. They retain the ability to model temporal relationships, which makes them efficient for smaller datasets.
  - Strengths:
    - Faster training and inference than LSTMs.
    - Require less computational power while often delivering comparable results.
- CNN-LSTM Hybrid:
  - This model combines convolutional (CNN) layers for local feature extraction with LSTM layers for temporal modeling. The CNN extracts features from short-term patterns in the data, while the LSTM handles long-term dependencies.
  - Strengths:
    - Combines the strengths of CNNs (local pattern detection) and LSTMs (long-term modeling).
    - Particularly useful when sentiment and stock data contain both short-term spikes and long-term trends.
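To make the "fewer parameters" point concrete, the sketch below counts trainable weights for a single LSTM layer and a single GRU layer using the standard textbook gate formulation. The layer width of 128 matches the first recurrent layer described in this write-up; the input feature count of 8 is an assumed placeholder, and some framework implementations add extra bias terms, so exact counts may differ slightly.

```python
def lstm_params(n_features: int, units: int) -> int:
    """Trainable weights in one LSTM layer (four gate-like blocks)."""
    # Each block has input weights, recurrent weights, and a bias vector.
    return 4 * (units * (n_features + units) + units)


def gru_params(n_features: int, units: int) -> int:
    """Trainable weights in one GRU layer (three gate-like blocks)."""
    return 3 * (units * (n_features + units) + units)


if __name__ == "__main__":
    n_features, units = 8, 128  # n_features is an assumed value
    lstm = lstm_params(n_features, units)
    gru = gru_params(n_features, units)
    print(f"LSTM: {lstm:,} parameters")
    print(f"GRU:  {gru:,} parameters ({gru / lstm:.0%} of the LSTM)")
```

Under this formulation a GRU layer always has three quarters of the weights of an LSTM layer of the same width, which is one reason it tends to train faster.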
Model Implementation
LSTM Implementation
The LSTM model consists of:
- Two LSTM layers (128 and 64 units) with tanh activation functions.
- Dropout layers to reduce overfitting.
- Dense layers for regression output (predicting stock close prices).
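The layer stack above could be sketched as follows. This is a minimal illustration and not the project's actual code: it assumes the Keras API in TensorFlow (suggested by layer and callback names such as Conv1D and ReduceLROnPlateau used in this write-up), and the feature count of 8, the dropout rate of 0.2, and the 32-unit hidden Dense layer are assumed values that the text does not specify.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_lstm_model(timesteps: int = 5, n_features: int = 8, dropout: float = 0.2):
    """Two stacked LSTM layers with dropout, followed by a regression head."""
    model = keras.Sequential([
        keras.Input(shape=(timesteps, n_features)),
        # return_sequences=True passes the full sequence to the second LSTM.
        layers.LSTM(128, activation="tanh", return_sequences=True),
        layers.Dropout(dropout),              # assumed rate
        layers.LSTM(64, activation="tanh"),
        layers.Dropout(dropout),
        layers.Dense(32, activation="relu"),  # assumed hidden size
        layers.Dense(1),                      # predicted close price
    ])
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=0.001),
        loss="mse",
    )
    return model


if __name__ == "__main__":
    build_lstm_model().summary()
```

The first LSTM layer needs `return_sequences=True` so that the second one receives one vector per timestep instead of only the final state.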
GRU Implementation
The GRU model mirrors the LSTM architecture but replaces the LSTM layers with GRU layers, giving a simpler model with faster computation.
CNN-LSTM Hybrid
This model uses:
- A Conv1D layer to extract features from sequences.
- A MaxPooling1D layer for dimensionality reduction.
- LSTM layers to process the sequence-level data.
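Because the input windows in this project are only 5 timesteps long, it can help to check how much of the sequence is left for the LSTM after convolution and pooling. The helper functions below are hypothetical and apply the standard output-length formulas for a stride-1 convolution and non-overlapping pooling; the kernel size of 3 and pool size of 2 are assumed, since the original settings are not stated.

```python
def conv1d_length(length: int, kernel_size: int, padding: str = "valid") -> int:
    """Output length of a stride-1 1D convolution."""
    if padding == "same":
        return length                # zero-padding keeps the sequence length
    return length - kernel_size + 1  # "valid": no padding


def maxpool1d_length(length: int, pool_size: int = 2) -> int:
    """Output length of non-overlapping max pooling without padding."""
    return length // pool_size


if __name__ == "__main__":
    timesteps = 5  # window length used in this project
    for padding in ("valid", "same"):
        after_conv = conv1d_length(timesteps, kernel_size=3, padding=padding)
        after_pool = maxpool1d_length(after_conv, pool_size=2)
        print(f"padding={padding!r}: {timesteps} -> conv {after_conv} -> pool {after_pool}")
```

With these assumed sizes, "valid" padding would leave the LSTM a single timestep, so "same" padding or a smaller pool size may be preferable for windows this short.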
Why Neural Networks?
Neural networks were chosen for their ability to model complex, non-linear relationships between input features (e.g., sentiment, stock prices). The sequential nature of stock price data and the qualitative nature of sentiment analysis make neural networks well suited to this task.
Model Training and Optimization
Input Data Preparation:
- Features: Sentiment scores, stock prices, volatility, and technical indicators.
- Targets: Next-day stock close price.
- Input sequences: Rolling windows of 5 timesteps (e.g., 5 days of data).
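The windowing step above can be sketched roughly as follows. The function name `make_windows` is hypothetical, and the synthetic random data stands in for the real feature table, which is not shown here.

```python
import numpy as np


def make_windows(features: np.ndarray, close: np.ndarray, window: int = 5):
    """Build (samples, window, n_features) inputs and next-day close targets."""
    X, y = [], []
    # The window covering days i .. i+window-1 predicts the close on day i+window.
    for i in range(len(features) - window):
        X.append(features[i : i + window])
        y.append(close[i + window])
    return np.array(X), np.array(y)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_days, n_features = 30, 4  # synthetic placeholder data
    features = rng.normal(size=(n_days, n_features))
    close = rng.normal(loc=100.0, size=n_days)
    X, y = make_windows(features, close)
    print(X.shape, y.shape)  # (25, 5, 4) (25,)
```

Each target comes strictly after its input window, so no future information leaks into the features.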
Training Details:
- Optimizer: Adam with a learning rate of 0.001, chosen for stable and efficient convergence.
- Loss Function: Mean Squared Error (MSE), which penalizes the squared difference between predicted and actual stock prices.
- Validation: A portion of the training data is held out for validation to monitor performance during training.
- Callbacks: Early stopping halts training once validation performance stops improving, which limits overfitting, while ReduceLROnPlateau lowers the learning rate when the monitored metric plateaus.
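To illustrate how the two callbacks interact, here is a simplified pure-Python simulation of their behavior on a sequence of validation losses. It is an approximation for intuition only, not Keras's implementation; the function name, the patience values, the reduction factor of 0.5, and the loss values are all assumed.

```python
def simulate_callbacks(val_losses, lr=0.001, factor=0.5, lr_patience=2, stop_patience=4):
    """Return (epoch, learning_rate) pairs for the epochs that actually run."""
    best = float("inf")
    epochs_since_best = 0  # early-stopping counter
    epochs_since_cut = 0   # plateau counter, reset after each LR reduction
    schedule = []
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            epochs_since_best = 0
            epochs_since_cut = 0
        else:
            epochs_since_best += 1
            epochs_since_cut += 1
            if epochs_since_cut >= lr_patience:
                lr *= factor  # plateau detected: shrink the learning rate
                epochs_since_cut = 0
        schedule.append((epoch, lr))
        if epochs_since_best >= stop_patience:
            break  # early stopping: no improvement for too long
    return schedule


if __name__ == "__main__":
    val_losses = [1.00, 0.80, 0.90, 0.85, 0.82, 0.81, 0.90, 0.95]  # made-up values
    for epoch, lr in simulate_callbacks(val_losses):
        print(f"epoch {epoch}: lr={lr:g}")
```

In this made-up run the learning rate is halved twice while the loss fails to beat its best value, and training stops after epoch 6 without using the remaining epochs.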
Model Visualizations
LSTM Architecture:
CNN-LSTM Architecture: