A Crowdsourcing Methodology to Measure Algorithmic Bias in Black-box Systems: A Case Study with COVID-related Searches


Commercial software systems are typically opaque with regard to their inner workings. This makes it challenging to understand the nuances of complex systems, and to study their operation, particularly in the context of fairness and bias. We explore a methodology for studying aspects of the behavior of black-box systems, focusing on a commercial search engine as a case study. A crowdsourcing platform is used to collect search engine result pages for a pre-defined set of queries related to the COVID-19 pandemic, to investigate whether the returned search results vary between individuals, and whether the returned results vary for the same individual when their information need is instantiated in a positive or a negative way. We observed that crowd workers tend to obtain different search results when using positive and negative query wording of the information needs, as well as different results for the same queries depending on the country in which they reside. These results indicate that using crowdsourcing platforms to study system behavior, in a way that preserves participant privacy, is a viable approach to obtain insights into black-box systems, supporting research investigations into particular aspects of system behavior.
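As an illustration only, the comparison of result pages described above could be sketched with a simple set-overlap measure. The abstract does not state which measure the study used, so the Jaccard index here is an assumption, and the function name `jaccard`, the variable names, and the example URLs are all hypothetical.

```python
def jaccard(results_a, results_b):
    """Jaccard overlap between two result lists, ignoring rank order.

    Assumption: the study's actual comparison measure is not given in
    the abstract; Jaccard is used here purely for illustration.
    """
    set_a, set_b = set(results_a), set(results_b)
    if not set_a and not set_b:
        # Two empty result pages are treated as identical.
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


# Hypothetical result pages collected from one crowd worker for the same
# information need, worded positively and negatively.
positive_serp = [
    "https://example.org/a",
    "https://example.org/b",
    "https://example.org/c",
]
negative_serp = [
    "https://example.org/b",
    "https://example.org/c",
    "https://example.org/d",
]

overlap = jaccard(positive_serp, negative_serp)
print(f"Overlap between positive and negative wording: {overlap:.2f}")
```

A value near 1 would suggest the two wordings return largely the same results, while a value near 0 would suggest they diverge. The same function could be applied to pages collected from workers in different countries for an identical query.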

Proceedings of the Third Workshop on Bias and Social Aspects in Search and Recommendation

Talk by Damiano Spina at the 4th Symposium on Biases in Human Computation and Crowdsourcing (BHCC 2022)