Human Compatible Artificial Intelligence with Guarantees
In this Horizon Europe funded project, we address the matter of transparency and explainability of AI using approaches inspired by control theory. Notably, we consider a comprehensive and flexible certification of properties of AI pipelines, certain closed-loops and more complicated interconnections. At one extreme, one could consider risk averse a priori guarantees via hard constraints on certain bias measures in the training process. At the other extreme, one could consider nuanced communication of the exact tradeoffs involved in AI pipeline choices and their effect on industrial and bias outcomes, post hoc. Both extremes offer little in terms of optimizing the pipeline and inflexibility in explaining the pipeline’s fairness-related qualities. Seeking the middle-ground, we suggest a priori certification of fairness-related qualities in AI pipelines via modular compositions of pre-processing, training, inference, and post-processing steps with certain properties. Furthermore, we present an extensive programme in explainability of fairness-related qualities. We seek to inform both the developer and the user thoroughly in regards to the possible algorithmic choices and their expected effects. Overall, this will effectively support the development of AI pipelines with guaranteed levels of performance, explained clearly. Three use cases (in Human Resources automation, Financial Technology, and Advertising) will be used to assess the effectiveness of our approaches.