X-squatter: AI Multilingual Generation of Cross-Language Sound-squatting / VIEIRA VALENTIM, Rodolfo; Drago, Idilio; Mellia, Marco; Cerutti, Federico. - In: ACM TRANSACTIONS ON PRIVACY AND SECURITY. - ISSN 2471-2566. - ELECTRONIC. - (2024). [10.1145/3663569]

X-squatter: AI Multilingual Generation of Cross-Language Sound-squatting

Rodolfo Vieira Valentim; Idilio Drago; Marco Mellia; Federico Cerutti
2024

Abstract

Sound-squatting is a squatting technique that exploits similarities in word pronunciation to trick users into accessing malicious resources. It is an understudied threat that has gained traction with the popularity of smart speakers and audio-only content, such as podcasts. The picture gets even more complex when multiple languages are involved. We introduce X-squatter, a multi- and cross-language AI-based system that relies on a Transformer Neural Network to generate high-quality sound-squatting candidates. We illustrate the use of X-squatter by searching for domain name squatting abuse across hundreds of millions of issued TLS certificates, alongside other squatting types. Key findings show that approximately 15% of generated sound-squatting candidates have associated TLS certificates, well above the prevalence of other squatting types (7%). Furthermore, we employ X-squatter to assess the potential for abuse in PyPI packages, revealing the existence of hundreds of candidates within a three-year package history. Notably, our results suggest that the current platform checks cannot handle sound-squatting attacks, calling for better countermeasures. We believe X-squatter uncovers the use of multilingual sound-squatting on the Internet and is a crucial asset for proactive protection against the threat.
Files in this record:
File: 3663569.pdf
Access: open access
Description: just accepted
Type: 2a Post-print, publisher's version / Version of Record
License: PUBLIC - All rights reserved
Size: 2.69 MB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2988528