Civil rights laws worldwide prohibit discrimination on the basis of race, color, religion, nationality, sex, marital status, age, and pregnancy in a number of settings, including credit and insurance; the sale, rental, and financing of housing; personnel selection and wages; and access to public accommodations, education, nursing homes, adoptions, and health care. With the advent of automatic decision support systems, such as credit scoring systems, the ease of data collection poses several challenges to data analysts in the fight against discrimination. Discrimination discovery in databases consists of discovering the discriminatory situations and practices hidden in the historical decision records under analysis. The process of data analysis must therefore be supported by tools that implement legally grounded measures and reasoning.
We approach discrimination discovery from a data analysis perspective, based on extracting and reasoning about classification rules. The various concepts and analyses, originally implemented as a stand-alone program to achieve the best performance, have been redesigned around an Oracle database that stores extracted itemsets and rules, together with a collection of functions, procedures, and snippets of SQL queries that implement the various forms of legal reasoning about discrimination. The resulting implementation, called DCUBE, can be accessed and exploited by a wider audience than a stand-alone monolithic application. Discrimination discovery is an interactive and iterative process, in which analyses take the form of deductive reasoning over extracted rules. An appropriately designed database, with optimized indexes, functions, and query snippets, can serve a large audience of users, including owners of socially sensitive decision data, government anti-discrimination analysts, technical consultants in legal cases, and researchers in social sciences, economics, and law.
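To make the idea of reasoning over stored rules with SQL concrete, the following is a minimal sketch and not DCUBE's actual schema or code. It uses Python's built-in sqlite3 module in place of Oracle. The table layout, the function names, and the toy confidence values are all invented for illustration. The measure shown is a ratio in the style of extended lift from the discrimination-discovery literature: the confidence of a rule that includes a protected-group item, divided by the confidence of the same rule without it. The text above does not say which measures DCUBE implements, so this choice is an assumption.

```python
import sqlite3


def build_rule_store():
    """Create an in-memory table of classification rules (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE rules (
            protected  TEXT NOT NULL,  -- protected-group item, '' for a base rule
            context    TEXT NOT NULL,  -- remaining premise items
            outcome    TEXT NOT NULL,  -- decision predicted by the rule
            confidence REAL NOT NULL   -- rule confidence in [0, 1]
        )
        """
    )
    # Toy values invented for this sketch; they are not results from any dataset.
    conn.executemany(
        "INSERT INTO rules VALUES (?, ?, ?, ?)",
        [
            ("", "city=A", "credit=deny", 0.25),
            ("sex=female", "city=A", "credit=deny", 0.75),
            ("", "city=B", "credit=deny", 0.5),
            ("sex=female", "city=B", "credit=deny", 0.5),
        ],
    )
    return conn


def rules_above_ratio(conn, threshold):
    """Return rules whose confidence is at least `threshold` times the
    confidence of the matching base rule (same context, same outcome)."""
    query = """
        SELECT r.protected, r.context, r.outcome,
               r.confidence / b.confidence AS ratio
        FROM rules AS r
        JOIN rules AS b
          ON b.context = r.context
         AND b.outcome = r.outcome
         AND b.protected = ''
        WHERE r.protected <> ''
          AND b.confidence > 0
          AND r.confidence / b.confidence >= ?
        ORDER BY ratio DESC
    """
    return conn.execute(query, (threshold,)).fetchall()


if __name__ == "__main__":
    store = build_rule_store()
    for row in rules_above_ratio(store, 2.0):
        print(row)
```

With the toy data, a threshold of 2.0 keeps only the rule for city=A, where the denial rate for the protected group is three times the base rate. Keeping the rules in a table means each follow-up question becomes another query over the same data, which suits the interactive and iterative style of analysis described above.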