Ventura, Francesco; Greco, Salvatore; Apiletti, Daniele; Cerquitelli, Tania: "Trusting deep learning natural-language models via local and global explanations". In: Knowledge and Information Systems, ISSN 0219-1377, vol. 64 (2022), pp. 1863-1907. DOI: 10.1007/s10115-022-01690-9

Trusting deep learning natural-language models via local and global explanations

Ventura, Francesco; Greco, Salvatore; Apiletti, Daniele; Cerquitelli, Tania
2022

Abstract

Despite the high accuracy offered by state-of-the-art deep natural-language models (e.g., LSTM, BERT), their adoption in real-life settings is still limited, as they behave like black boxes to the end user. Hence, explainability is rapidly becoming a fundamental requirement of future-generation data-driven systems based on deep-learning approaches. Several attempts have been made to bridge the gap between accuracy and interpretability. However, robust eXplainable Artificial Intelligence solutions specialized for deep natural-language models are still missing. We propose a new framework, named T-EBANO, which provides innovative prediction-local and class-based model-global explanation strategies tailored to deep-learning natural-language models. Given a deep NLP model and its textual input data, T-EBANO provides an objective, human-readable, domain-specific assessment of the reasons behind the automatic decision-making process. Specifically, the framework extracts sets of interpretable features by mining the inner knowledge of the model. Then, it quantifies the influence of each feature on the prediction process by exploiting the normalized Perturbation Influence Relation index at the local level and the novel Global Absolute Influence and Global Relative Influence indexes at the global level. The effectiveness and quality of the local and global explanations produced by T-EBANO are demonstrated in an extensive set of experiments addressing different tasks, such as a sentiment-analysis task performed by a fine-tuned BERT model and a toxic-comment classification task performed by an LSTM model. The quality of the explanations proposed by T-EBANO, and specifically the correlation between the influence index and human judgment, has been evaluated through a human survey with more than 4000 judgments. To prove the generality of T-EBANO and its model- and task-independent methodology, experiments with other models (ALBERT, ULMFiT) on popular public datasets (AG News and CoLA) are also discussed in detail.
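
The abstract describes a perturbation-based scheme: extract an interpretable feature (e.g., a set of tokens), mask it, query the black-box model again, and compare class probabilities before and after the perturbation. The Python sketch below illustrates that general idea only; the [MASK] perturbation, the softsign squashing, the toy keyword model, and the naive global aggregate are illustrative assumptions, not the paper's exact normalized Perturbation Influence Relation, Global Absolute Influence, or Global Relative Influence definitions.

# Illustrative sketch of a perturbation-based influence score.
# NOTE: this is NOT T-EBANO's exact formulation; it only assumes the general
# scheme stated in the abstract: mask an interpretable feature, query the
# model again, and compare class probabilities before and after perturbation.
from typing import Callable, List

def softsign(x: float) -> float:
    """Squash a raw influence value into (-1, 1)."""
    return x / (1.0 + abs(x))

def local_influence(
    predict_proba: Callable[[str], List[float]],  # the black-box NLP model
    text: str,
    feature_tokens: List[str],                    # one interpretable feature
    class_idx: int,
    mask_token: str = "[MASK]",
) -> float:
    """Local influence of `feature_tokens` on class `class_idx` for `text`.
    Positive: masking the feature lowers the class probability, so the
    feature supports the prediction; negative: the opposite."""
    p_orig = predict_proba(text)[class_idx]
    perturbed = " ".join(
        mask_token if tok in feature_tokens else tok for tok in text.split()
    )
    p_pert = predict_proba(perturbed)[class_idx]
    return softsign(p_orig - p_pert)

def global_absolute_aggregate(local_scores: List[float]) -> float:
    """Naive corpus-level aggregate of local scores for one feature and class;
    a stand-in for the paper's global indexes, not their actual definition."""
    return sum(abs(s) for s in local_scores) / len(local_scores)

# Toy usage with a hypothetical keyword "model" standing in for BERT/LSTM.
if __name__ == "__main__":
    def toy_model(text: str) -> List[float]:
        hits = sum(w in text.lower() for w in ("great", "excellent"))
        p = min(0.5 + 0.2 * hits, 0.99)
        return [1.0 - p, p]  # [negative, positive]

    score = local_influence(toy_model, "A great and excellent movie", ["great"], 1)
    print(score)  # ~0.17: "great" supports the positive prediction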
Files in this record:

- Trusting deep learning natural-language models via local and global explanations.pdf (open access). Type: 2. Post-print / Author's Accepted Manuscript. License: Creative Commons. Size: 3.5 MB. Format: Adobe PDF.
- s10115-022-01690-9.pdf (open access). Type: 2a. Post-print, editorial version / Version of Record. License: Creative Commons. Size: 2.49 MB. Format: Adobe PDF.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2962266