Exploring the Use of Data at Multiple Granularity Levels in Machine Learning-Based Stock Trading / Fior, Jacopo; Cagliero, Luca. - PRINT. - (2020), pp. 333-340. (Paper presented at the IEEE ICDM 2020 conference, held in Sorrento (IT), November 17-20, 2020) [10.1109/ICDMW51313.2020.00053].

Exploring the Use of Data at Multiple Granularity Levels in Machine Learning-Based Stock Trading

Fior, Jacopo; Cagliero, Luca
2020

Abstract

In the last decade, the Artificial Intelligence and Data Science communities have paid increasing attention to the problem of forecasting stock market movements. The abundance of stock-related data, including price series, news articles, financial reports, and social content, has fostered the use of Machine Learning (ML) techniques to drive quantitative stock trading. In this field, a large body of work has been devoted to identifying the most predictive features and selecting the best-performing algorithms. However, since algorithm performance is heavily affected by the granularity of the analyzed time series as well as by the amount of historical data used to train the ML models, identifying the most appropriate time granularity and ML pipeline can be challenging. This paper studies the relationship between the granularity of time series data and ML performance. It also compares the performance of established ML pipelines in order to evaluate the pros and cons of periodically retraining the ML models. Furthermore, it takes a step towards the integration of ML into real trading systems by studying how to conveniently set up the most established trading system characteristics. The results provide preliminary empirical evidence on how to profitably trade U.S. NASDAQ-100 stocks and leave room for further investigation.
Files in this item:
Exploring_the_Use_of_Data_at_Multiple_Granularity_Levels_in_Machine_Learning-Based_Stock_Trading.pdf

Not available

Description: Article
Type: 2a Post-print publisher's version / Version of Record
License: Not public - Private/restricted access
Size: 359.63 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2846300