Detecting Risk of Biased Output with Balance Measures / Mecati, Mariachiara; Vetro', Antonio; Torchiano, Marco. - In: ACM JOURNAL OF DATA AND INFORMATION QUALITY. - ISSN 1936-1955. - 14:4(2022). [10.1145/3530787]

Detecting Risk of Biased Output with Balance Measures

Mecati, Mariachiara; Vetro', Antonio; Torchiano, Marco
2022

Abstract

Data has become a fundamental element of the management and productive infrastructures of our society, fuelling the digitization of organizational and decision-making processes at an impressive speed. This transition brings both benefits and risks, and the "bias in, bias out" problem is among the most relevant issues, encompassing technical, ethical, and social perspectives. We address this field of research by investigating how the balance of protected attributes in training data can be used to assess the risk of algorithmic unfairness. We identify four balance measures and test their ability to detect the risk of discriminatory classification by applying them to the training set. The results of this proof of concept show that the indexes can properly detect unfairness in software output. However, we found that the choice of balance measure has a considerable impact on the threshold to be considered risky; further work is necessary to deepen knowledge on this aspect.
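To make the approach more concrete, the sketch below computes a few commonly used balance indexes (normalized Shannon entropy, normalized Gini-Simpson index, and the imbalance ratio) for a protected attribute in a training set. It is a minimal illustration of the general idea under our own assumptions, not the exact set of four measures or the risk thresholds evaluated in the paper; the function and variable names are hypothetical.

```python
from collections import Counter
from math import log

def balance_measures(values):
    """Compute simple balance indexes for a categorical (protected) attribute.

    `values` is one column of a training set, e.g. ["F", "M", "M", ...].
    All measures are normalized to [0, 1], where 1 means perfectly balanced.
    """
    counts = Counter(values)
    n = sum(counts.values())
    k = len(counts)
    props = [c / n for c in counts.values()]

    # Shannon entropy, normalized by log(k), its maximum for k classes.
    shannon = -sum(p * log(p) for p in props) / log(k) if k > 1 else 1.0

    # Gini-Simpson index, normalized by its maximum value (k - 1) / k.
    gini = (1 - sum(p * p for p in props)) / ((k - 1) / k) if k > 1 else 1.0

    # Imbalance ratio: size of the rarest class over the most frequent one.
    imbalance_ratio = min(counts.values()) / max(counts.values())

    return {"shannon": shannon, "gini": gini, "imbalance_ratio": imbalance_ratio}


# Example: a protected attribute with a strong majority class.
gender = ["F"] * 80 + ["M"] * 20
print(balance_measures(gender))
# Low values of these indexes would flag the training set as at risk of
# producing biased (discriminatory) classification output.
```

In this spirit, the measures are applied to the training data before any model is trained: low balance values act as an early warning of the "bias in, bias out" risk, with the caveat (noted in the abstract) that the alert threshold depends on which measure is chosen.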
Files in this record:

JDIQ_paper___Special_Issue_on_Data_Quality_and_Ethics.pdf
Access: Open access
Description: accepted
Type: 2. Post-print / Author's Accepted Manuscript
License: Public - All rights reserved
Size: 634.89 kB
Format: Adobe PDF

3530787.pdf
Access: Restricted access
Type: 2a. Post-print editorial version / Version of Record
License: Non-public - Private/restricted access
Size: 303.11 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2970220