Metrics for Identifying Bias in Datasets / Simonetta, Alessandro; Trenta, Andrea; Paoletti, Maria Cristina; Vetrò, Antonio. - ELECTRONIC. - 3118:(2021), pp. 10-17. (Paper presented at ICYRIME 2021, International Conference of Yearly Reports on Informatics, Mathematics, and Engineering 2021, held online on July 9, 2021).
Metrics for Identifying Bias in Datasets
Vetrò, Antonio
2021
Abstract
Nowadays, automated decision-making systems are used pervasively and, increasingly, to take important decisions in sensitive areas such as the granting of a bank overdraft, the susceptibility of an individual to a virus infection, or even the likelihood of reoffending. The widespread use of these systems raises a growing ethical concern about the risk of a potential discriminatory impact. In particular, machine-learning systems trained on unbalanced data can give rise to systematic discrimination in the real world. One of the most important challenges is to determine metrics capable of detecting when an unbalanced training dataset may lead to discriminatory behaviour of the model built on it. In this paper, we propose an approach based on the notion of data completeness using two different metrics: one based on the combinations of the values of the dataset, which serves as our benchmark, and a second based on frame theory, which is widely used, among other purposes, for quality measures of control systems. It is important to remark that the use of metrics cannot substitute for a broader design process that takes into account the columns that could introduce bias into the data. This line of research does not end with these activities but aims to continue the path towards a standardised register of measures.
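As a rough illustration of the combination-based idea mentioned in the abstract (not the authors' metric, whose exact definition is given in the paper), the following Python sketch measures how many of the possible value combinations of selected columns actually occur in a dataset; the column names and toy data are hypothetical.

```python
# Illustrative sketch only: a simple combination-based completeness check
# over selected (e.g. protected) columns, assuming a list-of-dicts dataset.
from collections import Counter
import math

def combination_completeness(rows, columns):
    """Return the share of possible value combinations of `columns`
    that actually occur in `rows`, plus the per-combination counts."""
    observed = Counter(tuple(r[c] for c in columns) for r in rows)
    domains = [{r[c] for r in rows} for c in columns]      # observed value domains
    possible = math.prod(len(d) for d in domains)          # size of the combination space
    return len(observed) / possible, observed

# Hypothetical toy dataset with two sensitive attributes.
rows = [
    {"gender": "F", "age_group": "young"},
    {"gender": "F", "age_group": "old"},
    {"gender": "M", "age_group": "young"},
    {"gender": "M", "age_group": "young"},
]
completeness, counts = combination_completeness(rows, ["gender", "age_group"])
print(completeness)  # 0.75 -> the (M, old) combination never appears
print(counts)        # uneven counts also signal unbalanced data
```

A low completeness value, or strongly uneven counts across the observed combinations, is the kind of signal such a metric is meant to surface before a model is trained on the data.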
| File | Access | Type | Licence | Size | Format |
|---|---|---|---|---|---|
| 2021-ceur-ws.pdf (CEUR-WS PDF on the site) | Open access | 2a Post-print, editorial version / Version of Record | Creative Commons | 1.27 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11583/2961712