
Detecting Discrimination Risk in Automated Decision-Making Systems with Balance Measures on Input Data / Mecati, Mariachiara; Vetro, Antonio; Torchiano, Marco. - (2021), pp. 4287-4296. (Paper presented at the 2021 IEEE International Conference on Big Data (IEEE BigData 2021) - First International Workshop on Data Science for equality, inclusion and well-being challenges (DS4EIW 2021), held in Orlando, FL, USA, 15-18 Dec. 2021) [10.1109/BigData52589.2021.9671443].

Detecting Discrimination Risk in Automated Decision-Making Systems with Balance Measures on Input Data

Mecati, Mariachiara; Vetro, Antonio; Torchiano, Marco
2021

Abstract

Bias in the data used to train decision-making systems is a pressing socio-technical issue that has emerged in recent years and still lacks a commonly accepted solution. Indeed, the "bias in-bias out" problem represents one of the most significant risks of discrimination, spanning technical as well as ethical and social perspectives. We contribute to current research on the issue by proposing a data quality measurement approach combined with risk management, both defined in ISO/IEC standards. For this purpose, we investigate imbalance in a given dataset as a potential risk factor for discrimination in the classification outcome: specifically, we aim to evaluate whether the risk of bias in a classification output can be identified by measuring the level of (im)balance in the input data. We select four balance measures (the Gini, Shannon, Simpson, and Imbalance ratio indexes) and test their capability to identify discriminatory classification outputs by applying them to protected attributes in the training set. The results of this analysis show that the proposed approach is suitable for this goal: the balance measures correctly flag unfair software outputs, although the choice of index has a considerable impact on the detection of discriminatory outcomes; further work is therefore required to test the reliability of the balance measures as risk indicators in more depth. We believe that our approach for assessing the risk of discrimination should encourage practitioners to take more conscious and appropriate actions, and help prevent the adverse effects caused by the "bias in-bias out" problem.
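The four balance measures named in the abstract can be sketched on a categorical (protected) attribute as follows. This is a minimal illustration assuming common textbook normalizations (each index rescaled so that 1.0 means a perfectly balanced attribute); the exact definitions and thresholds used in the paper may differ.

```python
import math
from collections import Counter

def balance_measures(values):
    """Compute four balance indexes for a categorical attribute.

    All measures are normalized so that 1.0 indicates a perfectly
    balanced attribute and values near 0 signal strong imbalance.
    Assumes the attribute has at least two distinct classes.
    """
    counts = Counter(values)
    m = len(counts)                       # number of classes
    n = sum(counts.values())              # number of records
    p = [c / n for c in counts.values()]  # class frequencies

    simpson_sum = sum(pi ** 2 for pi in p)
    return {
        # Gini heterogeneity: 1 - sum(p^2), rescaled by m/(m-1) to [0, 1]
        "gini": (1 - simpson_sum) * m / (m - 1),
        # Shannon evenness: entropy divided by its maximum, log(m)
        "shannon": -sum(pi * math.log(pi) for pi in p) / math.log(m),
        # Inverse Simpson index, divided by its maximum, m
        "simpson": (1 / simpson_sum) / m,
        # Imbalance ratio: rarest class count over most frequent class count
        "imbalance_ratio": min(counts.values()) / max(counts.values()),
    }

# Example: a protected attribute with a 90/10 split scores low on
# every index, flagging a potential discrimination risk.
skewed = ["male"] * 90 + ["female"] * 10
print(balance_measures(skewed))
```

A perfectly balanced attribute (e.g. a 50/50 split) scores 1.0 on all four indexes, so any of them can serve as a risk indicator; as the abstract notes, however, the indexes diverge on skewed distributions, which is why the choice of index matters.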
ISBN: 978-1-6654-3902-2
Files in this product:

Detecting_Discrimination_Risk_in_Automated_Decision-Making_Systems_with_Balance_Measures_on_Input_Data.pdf
  Description: Main article
  Type: 2a Post-print editorial version / Version of Record
  License: Non-public - Private/restricted access
  Size: 1.64 MB
  Format: Adobe PDF

LAST-Paper_DS4EIW.pdf (open access)
  Description: Main article
  Type: 2. Post-print / Author's Accepted Manuscript
  License: Public - All rights reserved
  Size: 1.07 MB
  Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11583/2950474