Volume 12, Issue 12 (December 2025), Pages: 266-279
----------------------------------------------
Original Research Paper
Prediction of TASI returns using sentiment analysis and hybrid modeling methods: ARIMAX, random forest, and XGBoost
Author(s):
Somaiyah Alalmai *
Affiliation(s):
Finance Department, Faculty of Economics and Administration, King Abdulaziz University, Jeddah, Saudi Arabia
* Corresponding Author.
Corresponding author's ORCID profile: https://orcid.org/0000-0001-6623-3171
Digital Object Identifier (DOI)
https://doi.org/10.21833/ijaas.2025.12.024
Abstract
This study examines the predictive relationship between financial news sentiment and the performance of the Saudi stock market, measured by the Tadawul All Share Index (TASI). A news sentiment index is developed using financial headlines from the Saudi Gazette published between March 2017 and March 2025. The FinBERT model, a natural language processing tool designed for financial text, is used to calculate sentiment scores, which are then averaged on a monthly basis. These sentiment measures are combined with key macroeconomic and market variables, including crude oil prices, interest rates, inflation, exchange rates, and trading volume. For prediction, a hybrid modeling framework is applied, integrating ARIMAX, Random Forest, and XGBoost to capture both linear and nonlinear relationships between TASI returns and sentiment. Model performance is evaluated using root mean squared error (RMSE), mean absolute error (MAE), and the coefficient of determination (R²). The results show that news sentiment and oil price movements have a significant effect on market returns, with important implications for investors, analysts, and policymakers in sentiment-sensitive emerging markets such as Saudi Arabia.
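Two steps named in the abstract, the monthly averaging of headline sentiment scores and the evaluation by RMSE, MAE, and R², can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the function names are hypothetical, the per-headline scores are assumed to have already been produced by a sentiment model such as FinBERT (which is not run here), and the metric formulas are the standard textbook definitions, which the abstract does not spell out.

```python
from collections import defaultdict
from datetime import date
from math import sqrt


def monthly_mean_sentiment(scored_headlines):
    """Average per-headline sentiment scores by calendar month.

    scored_headlines: iterable of (date, score) pairs. The scores are
    assumed to come from a sentiment model such as FinBERT.
    Returns a dict mapping (year, month) to the mean score.
    """
    buckets = defaultdict(list)
    for day, score in scored_headlines:
        buckets[(day.year, day.month)].append(score)
    return {month: sum(v) / len(v) for month, v in sorted(buckets.items())}


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    absolute = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(absolute) / len(absolute)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Made-up scores and returns, for illustration only (not data from the study).
    headlines = [
        (date(2017, 3, 1), 0.5),
        (date(2017, 3, 15), -0.1),
        (date(2017, 4, 2), 0.2),
    ]
    print(monthly_mean_sentiment(headlines))

    actual_returns = [0.01, -0.02, 0.03]
    predicted_returns = [0.00, -0.01, 0.02]
    print(rmse(actual_returns, predicted_returns))
    print(mae(actual_returns, predicted_returns))
    print(r_squared(actual_returns, predicted_returns))
```

Computed this way, R² can be negative on held-out data when a model predicts worse than the sample mean of the actual returns, which is worth keeping in mind when comparing models on a short monthly series.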
© 2025 The Authors. Published by IASE.
This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Keywords
Financial news sentiment, Saudi stock market, Tadawul index, Machine learning models, Market prediction
Article history
Received 31 July 2025, Received in revised form 25 November 2025, Accepted 5 December 2025
Acknowledgment
No Acknowledgment.
Compliance with ethical standards
Conflict of interest: The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Citation:
Alalmai S (2025). Prediction of TASI returns using sentiment analysis and hybrid modeling methods: ARIMAX, random forest, and XGBoost. International Journal of Advanced and Applied Sciences, 12(12): 266-279
Figures
Fig. 1
Tables
Table 1, Table 2, Table 3, Table 4, Table 5, Table 6, Table 7