In recent years, machine learning models for jet tagging in high-energy physics have gained considerable attention. However, many existing approaches overlook the physical invariants that jets must obey, in particular the fundamental spacetime symmetry governed by Lorentz transformations. In this study, we propose a model-agnostic training strategy that uses theory-guided data augmentation to simulate the effects of Lorentz transformations on jet data. We focus on the state-of-the-art baseline ParticleNet, a neural network architecture that processes particle clouds directly for jet tagging. To evaluate our approach, we run experiments with different augmentation strategies and assess the augmented models on the widely used top-tagging reference dataset. The results show that even a small amount of augmentation increases the robustness of the model to Lorentz boost attacks, i.e., boosts with a large parameter β. While the accuracy of the baseline model decreases rapidly as β increases, the augmented models exhibit more stable performance. Remarkably, models trained with a moderate level of augmentation show a statistically significant performance improvement on transformations beyond those seen at training time. This finding highlights the potential of the data augmentation strategy to improve model accuracy while preserving the essential physical properties of the jets.
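The augmentation idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `lorentz_boost` and `augment_jet`, the fixed boost axis, the uniform sampling of β, and the `(E, px, py, pz)` layout in natural units (c = 1) are all assumptions made for this example.

```python
import numpy as np


def lorentz_boost(p4, beta, axis=(0.0, 0.0, 1.0)):
    """Boost four-momenta (E, px, py, pz) with velocity `beta` along `axis`.

    Hypothetical helper: p4 has shape (N, 4), natural units with c = 1.
    """
    if not -1.0 < beta < 1.0:
        raise ValueError("beta must lie strictly between -1 and 1")
    p4 = np.asarray(p4, dtype=float)
    n = np.asarray(axis, dtype=float)
    n = n / np.linalg.norm(n)              # unit vector of the boost direction
    gamma = 1.0 / np.sqrt(1.0 - beta**2)   # Lorentz factor

    energy = p4[:, 0]
    momentum = p4[:, 1:]
    p_par = momentum @ n                   # momentum component along the axis

    # Only the energy and the parallel component mix under the boost.
    energy_new = gamma * (energy - beta * p_par)
    p_par_new = gamma * (p_par - beta * energy)
    momentum_new = momentum + (p_par_new - p_par)[:, None] * n
    return np.column_stack([energy_new, momentum_new])


def augment_jet(p4, beta_max, rng):
    """Apply one random boost with beta drawn uniformly from [-beta_max, beta_max].

    The uniform sampling is an assumption of this sketch.
    """
    beta = rng.uniform(-beta_max, beta_max)
    return lorentz_boost(p4, beta)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    jet = np.array([[10.0, 1.0, 2.0, 9.0], [5.0, 0.5, -1.0, 4.0]])
    print(augment_jet(jet, beta_max=0.3, rng=rng))
```

Because the invariant mass E² − |p|² of each particle is unchanged by the boost, the augmented jets remain physically valid samples, which is the property the training strategy relies on.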
Lorentz-invariant augmentation for high-energy physics / Monaco, Simone; Barresi, Sebastiano; Apiletti, Daniele. - Electronic. - (2023). (Paper presented at the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, held in Turin (ITA), September 18-22, 2023).
Lorentz-invariant augmentation for high-energy physics
Monaco, Simone; Barresi, Sebastiano; Apiletti, Daniele
2023
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/2981734