Sentiment-Driven Financial Market Forecasting Using VADER and Machine Learning Models
Keywords:
sentiment analysis, VADER, machine learning, financial forecasting, news headlines
Abstract
This study develops a sentiment-augmented machine learning framework that combines textual and numerical information to improve short-term financial market forecasting. News headlines sourced from the Financial News Market Events Dataset for NLP 2025 on Kaggle were processed with the Valence Aware Dictionary and sEntiment Reasoner (VADER) to obtain compound sentiment scores. These scores were combined with market indicators, namely trading volume, event impact level, and percentage index change, to construct a supervised learning dataset. Preliminary correlation analysis indicates that sentiment polarity is positively associated with market direction, suggesting that headline tone carries signals relevant to investor behaviour. To evaluate predictive performance, four machine learning algorithms (Random Forest, Gradient Boosting, Support Vector Regression, and Long Short-Term Memory networks) were trained and validated. Random Forest achieved the strongest performance, with an R² of 0.89 and a mean absolute error of 0.025. The LSTM model additionally captured sequential dependencies between news events and market responses, demonstrating the benefit of temporal modelling in sentiment-driven prediction tasks. Feature importance analysis further showed that sentiment-derived variables contribute meaningfully alongside traditional numerical indicators. Overall, the findings indicate that incorporating transparent, lexicon-based sentiment extraction into machine learning pipelines improves the accuracy and interpretability of short-horizon financial forecasting. The proposed framework provides a scalable foundation for future work on context-aware sentiment models, broader multi-market validation, and real-time decision-support systems.
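The feature-construction step described above can be illustrated with a minimal sketch. It is not the study's implementation: the toy lexicon, the record keys (`headline`, `volume`, `impact_level`), and the function names are assumptions introduced here for illustration. Only the final normalisation, which maps a summed valence x to x / sqrt(x² + 15), follows the way VADER computes its compound score; VADER's full lexicon and its heuristics for negation, intensifiers, punctuation, and capitalisation are omitted.

```python
import math

# Hypothetical toy lexicon: a stand-in for VADER's full valence lexicon.
TOY_LEXICON = {
    "surge": 2.0, "gain": 1.5, "record": 1.0,
    "slump": -2.0, "fear": -1.8, "loss": -1.5,
}


def compound_score(headline, alpha=15.0):
    """Sum word valences, then squash the total into (-1, 1).

    The squashing x / sqrt(x^2 + alpha) with alpha = 15 mirrors VADER's
    compound-score normalisation; the word lookup is deliberately simplified.
    """
    tokens = (tok.strip(".,!?:;").lower() for tok in headline.split())
    total = sum(TOY_LEXICON.get(tok, 0.0) for tok in tokens)
    return total / math.sqrt(total * total + alpha)


def build_features(record):
    """Combine the sentiment score with numerical market indicators.

    The keys 'headline', 'volume', and 'impact_level' are assumed names,
    not the dataset's actual column names.
    """
    return [
        compound_score(record["headline"]),
        float(record["volume"]),
        float(record["impact_level"]),
    ]


# Illustrative record with made-up values (not taken from the dataset).
example = {"headline": "Stocks surge to record high", "volume": 1.2e6, "impact_level": 3}
features = build_features(example)
```

Each feature row would be paired with the percentage index change as the regression target before being passed to a model such as Random Forest.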










