Sound-squatting is a phishing attack that tricks users into accessing malicious resources by exploiting similarities in the pronunciation of words. It is an understudied threat that gains traction with the popularity of smart-speakers and the resurgence of content consumption exclusively via audio, such as podcasts. Defending against sound-squatting is complex, and existing solutions rely on manually curated lists of homophones, which limits the search to a few (and mostly existing) words only. We introduce Sound-squatter, a multi-language AI-based system that generates sound-squatting candidates for a proactive defence that covers over 80\% of exact homophones and further generates thousands of high-quality approximated homophones. Sound-squatter relies on a state-of-art Transformer Network to learn transliteration. We search for Sound-squatter generated cross-language sound-squatting domains over hundreds of millions of emitted TLS certificates compared with other types of squatting candidates. Our finding reveals that around 6% of generated sound-squatting candidates have emitted TLS certificates, compared to 8% of other types of squatting candidates. We believe \Sound-squatter uncovers the usage of multilingual sound-squatting phenomenon on the Internet and it is a crucial asset for proactive protection against sound-squatting.
Lost in Translation: AI-based Generator of Cross-Language Sound-Squatting / Valentim, Rodolfo; Drago, Idilio; Mellia, Marco; Cerutti, Federico. - (2023), pp. 513-520. (Intervento presentato al convegno 2023 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW) tenutosi a Delft, Netherlands nel 2-7 July 2023) [10.1109/EuroSPW59978.2023.00063].
Lost in Translation: AI-based Generator of Cross-Language Sound-Squatting
Rodolfo Valentim;Idilio Drago;Marco Mellia;Federico Cerutti
2023
Abstract
Sound-squatting is a phishing attack that tricks users into accessing malicious resources by exploiting similarities in the pronunciation of words. It is an understudied threat that gains traction with the popularity of smart-speakers and the resurgence of content consumption exclusively via audio, such as podcasts. Defending against sound-squatting is complex, and existing solutions rely on manually curated lists of homophones, which limits the search to a few (and mostly existing) words only. We introduce Sound-squatter, a multi-language AI-based system that generates sound-squatting candidates for a proactive defence that covers over 80\% of exact homophones and further generates thousands of high-quality approximated homophones. Sound-squatter relies on a state-of-art Transformer Network to learn transliteration. We search for Sound-squatter generated cross-language sound-squatting domains over hundreds of millions of emitted TLS certificates compared with other types of squatting candidates. Our finding reveals that around 6% of generated sound-squatting candidates have emitted TLS certificates, compared to 8% of other types of squatting candidates. We believe \Sound-squatter uncovers the usage of multilingual sound-squatting phenomenon on the Internet and it is a crucial asset for proactive protection against sound-squatting.File | Dimensione | Formato | |
---|---|---|---|
_Camera_Ready__WTMC_2023.pdf
accesso aperto
Descrizione: versione finale
Tipologia:
2. Post-print / Author's Accepted Manuscript
Licenza:
PUBBLICO - Tutti i diritti riservati
Dimensione
1.45 MB
Formato
Adobe PDF
|
1.45 MB | Adobe PDF | Visualizza/Apri |
Lost_in_Translation_AI-based_Generator_of_Cross-Language_Sound-squatting.pdf
non disponibili
Tipologia:
2a Post-print versione editoriale / Version of Record
Licenza:
Non Pubblico - Accesso privato/ristretto
Dimensione
310.93 kB
Formato
Adobe PDF
|
310.93 kB | Adobe PDF | Visualizza/Apri Richiedi una copia |
Pubblicazioni consigliate
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.
https://hdl.handle.net/11583/2980600