Automatic linux malware detection using binary inspection and runtime opcode tracing / Alonso, Martí; Gironés, Andreu; Costa, Juan-José; Morancho, Enric; Di Carlo, Stefano; Canal, Ramon. - In: MICROPROCESSORS AND MICROSYSTEMS. - ISSN 0141-9331. - ELECTRONIC. - 120:(2026), pp. 1-8. [10.1016/j.micpro.2025.105237]

Automatic linux malware detection using binary inspection and runtime opcode tracing

Di Carlo, Stefano;
2026

Abstract

The fast-paced evolution of cyberattacks on digital infrastructures requires new protection mechanisms to counter them. Malware attacks, a class of cyberattack ranging from viruses and worms to ransomware and spyware, have traditionally been detected using signature-based methods. Because new malware variants keep appearing, this approach is no longer sufficient, and machine learning tools look promising. In this paper we present two methods to detect Linux malware using machine learning models: (1) a dynamic approach, which traces the instructions (opcodes) an application executes at run time; and (2) a static approach, which inspects the application's binary files before execution. We evaluate five machine learning models for approach (1) (Support Vector Machine, k-Nearest Neighbor, Naive Bayes, Decision Tree and Random Forest) and, for approach (2), a deep neural network using a Long Short-Term Memory architecture with word embedding. We describe the methodology, the initial dataset preparation, the infrastructure used to obtain the traces of executed instructions, and the evaluation of the results for the different models. The results show that the dynamic approach with a Random Forest classifier achieves an accuracy of 90% or higher, while the static approach achieves an accuracy of 98%.
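
As a rough illustration of the dynamic approach described in the abstract, the sketch below turns an opcode trace into a relative-frequency vector and classifies it with k-Nearest Neighbor, one of the five models named above. The toy traces, the labels, the choice of opcode frequencies as features, and all function names are assumptions made for illustration; the abstract does not specify the features or parameters actually used in the paper.

```python
from collections import Counter
import math


def opcode_histogram(trace, vocabulary):
    """Relative frequency of each vocabulary opcode within one trace."""
    counts = Counter(trace)
    total = len(trace) or 1  # avoid division by zero on an empty trace
    return [counts[op] / total for op in vocabulary]


def knn_predict(train_vectors, train_labels, query, k=3):
    """Majority vote among the k training vectors closest to the query."""
    distances = sorted(
        (math.dist(vec, query), label)
        for vec, label in zip(train_vectors, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Toy traces invented for illustration only (not data from the paper).
TRACES = [
    (["mov", "add", "mov", "cmp", "jne", "mov"], "benign"),
    (["mov", "mov", "add", "cmp", "jne", "ret"], "benign"),
    (["mov", "add", "cmp", "mov", "ret", "mov"], "benign"),
    (["xor", "xor", "syscall", "xor", "jmp", "syscall"], "malware"),
    (["xor", "syscall", "jmp", "xor", "xor", "mov"], "malware"),
    (["syscall", "xor", "xor", "jmp", "syscall", "xor"], "malware"),
]

# Feature space: one dimension per opcode seen in the training traces.
VOCABULARY = sorted({op for trace, _ in TRACES for op in trace})
X = [opcode_histogram(trace, VOCABULARY) for trace, _ in TRACES]
y = [label for _, label in TRACES]


def classify(trace, k=3):
    """Label a new opcode trace using the toy training set above."""
    return knn_predict(X, y, opcode_histogram(trace, VOCABULARY), k)


if __name__ == "__main__":
    print(classify(["xor", "syscall", "xor", "jmp"]))
    print(classify(["mov", "add", "cmp", "mov"]))
```

A frequency vector discards the order of the opcodes, which keeps the example short; a sequence model such as the Long Short-Term Memory network mentioned for the static approach would instead consume the ordered sequence.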
Files in this record:

2025_DFTS_extension__Malware_Attacks__Martý_Andreu_.pdf
Access: embargoed until 12/12/2027
Type: 2. Post-print / Author's Accepted Manuscript
License: Creative Commons
Size: 524.17 kB
Format: Adobe PDF

1-s2.0-S0141933125001048-main.pdf
Access: restricted
Type: 2a. Post-print, publisher's version / Version of Record
License: Not public - private/restricted access
Size: 2.78 MB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/3005919