Speaker: Judith Louis-Alexandre (193-02 Computer Graphics)
Natural Language Processing (NLP) is a field of study concerned with the handling, understanding, and generation of human language by computers. Advanced word representations, such as word embeddings and transformer models, have been developed to process human language based on the context in which words appear, so that similarity between words is reflected directly in their vector representations. These language models are trained on large text corpora that may contain biases: direct biases, caused by sensitive features such as gender or race, or indirect biases, arising from apparently neutral features that correlate with sensitive ones. These biases can be learned by the models and affect the results of downstream applications. Users should be aware of them so that they can use the models wisely and adapt their interpretations if necessary. The goal of this master thesis is to develop a method to reveal and quantify potential indirect biases, especially multiclass biases, learned by transformer models. A visual exploration interface will then be designed and implemented to let users discover these potential indirect biases. This goes beyond current approaches, which mostly focus on direct and binary biases and, in the case of visualization tools, on word embeddings.
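To make the idea of bias in vector representations concrete, the following is a minimal sketch (not the thesis's actual method) of how a direct binary bias can be quantified from word embeddings: the toy 4-dimensional vectors below are invented for illustration, and the bias score is simply the difference in cosine similarity between a target word and the gendered anchor words "he" and "she".

```python
import numpy as np

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Toy embeddings with hypothetical values, for illustration only;
# real embeddings would come from a trained model.
emb = {
    "he":     np.array([ 1.0, 0.2, 0.0, 0.1]),
    "she":    np.array([-1.0, 0.2, 0.0, 0.1]),
    "nurse":  np.array([-0.8, 0.5, 0.3, 0.2]),
    "doctor": np.array([ 0.7, 0.5, 0.3, 0.2]),
}

def gender_bias(word):
    # Positive: the word lies closer to "he"; negative: closer to "she".
    return cosine(emb[word], emb["he"]) - cosine(emb[word], emb["she"])

for w in ("nurse", "doctor"):
    print(w, round(gender_bias(w), 3))
```

In these toy vectors, "nurse" scores negative and "doctor" positive, i.e. the occupations are associated with different genders even though neither word is explicitly gendered; methods for indirect and multiclass biases generalize this kind of association measurement beyond a single binary axis.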