WhaleNet: A Novel Deep Learning Architecture for Marine Mammals Vocalizations on Watkins Marine Mammal Sound Database / Licciardi, Alessandro; Carbone, Davide. - In: IEEE ACCESS. - ISSN 2169-3536. - 12:(2024), pp. 154182-154194. [10.1109/access.2024.3482117]

WhaleNet: A Novel Deep Learning Architecture for Marine Mammals Vocalizations on Watkins Marine Mammal Sound Database

Licciardi, Alessandro; Carbone, Davide
2024

Abstract

Marine mammal communication is a complex field, made difficult to study by the diversity of vocalizations and by environmental factors. Accurate classification of these vocalizations is critical for understanding species behavior, monitoring population trends, and assessing the impact of human activities on marine life. However, current classification approaches are challenged by the wide range of vocalization types and by environmental noise. The Watkins Marine Mammal Sound Database (WMMD) is a comprehensive labeled dataset used in machine learning applications. Nevertheless, the data preparation, preprocessing, and classification methods documented in the literature vary considerably and are typically not applied to the entire dataset. This study first briefly reviews the state-of-the-art benchmarks on the dataset, with a particular focus on clarifying data preparation and preprocessing techniques. We then explore the Wavelet Scattering Transform (WST) and the Mel spectrogram as preprocessing steps for feature extraction. We introduce WhaleNet (Wavelet Highly Adaptive Learning Ensemble Network), a deep ensemble architecture for classifying marine mammal vocalizations that leverages both WST and Mel spectrogram features for enhanced feature discrimination. By addressing the inconsistencies in data preparation and by using these preprocessing techniques, our approach provides a more robust framework for classifying marine mammal vocalizations, which is essential for conservation efforts and behavioral research. By integrating the WST and Mel representations, we improved classification accuracy by 8-10% over existing architectures, reaching a classification accuracy of 97.61%.
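As an illustration of the Mel spectrogram preprocessing mentioned in the abstract, the following is a minimal NumPy sketch of a log-Mel spectrogram. It is not the authors' pipeline: the function names, the HTK-style mel formula, and the parameter values (`n_fft=1024`, `hop=256`, `n_mels=64`) are assumptions chosen for illustration only, and the WST branch and the ensemble itself are not shown.

```python
import numpy as np

def hz_to_mel(f):
    # HTK-style mel scale (an assumption; other conventions exist).
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular mel filters, shape (n_mels, n_fft // 2 + 1)."""
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    # n_mels + 2 edge frequencies, equally spaced on the mel scale.
    edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fb = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        # Triangle: rises from lo to mid, falls from mid to hi, zero elsewhere.
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb

def log_mel_spectrogram(signal, sr, n_fft=1024, hop=256, n_mels=64):
    """Log-mel spectrogram in dB, shape (n_mels, n_frames).

    Assumes len(signal) >= n_fft; no padding is applied.
    """
    signal = np.asarray(signal, dtype=float)
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[t * hop : t * hop + n_fft] * window for t in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2          # (n_frames, n_fft // 2 + 1)
    mel_power = power @ mel_filterbank(sr, n_fft, n_mels).T   # (n_frames, n_mels)
    # Small constant avoids log(0) in silent bands.
    return 10.0 * np.log10(mel_power + 1e-10).T
```

The resulting two-dimensional array can be treated as an image-like input to a classifier; how WhaleNet actually consumes and combines the Mel and WST features is described in the paper itself, not here.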
Files in this item:

File: UcSp8k-WhaleNet_A_Novel_Deep_Learning_Architecture_for_Marine_Mammals_Vocalizations_on_Watkins_Marine_Mammal_Sound_Database.pdf
Access: open access
Type: 2a Post-print, publisher's version / Version of Record
License: Creative Commons
Size: 2.3 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2993875