#7: Behavioral Testing with RecList for Recommenders with Jacopo Tagliabue
Episode seven of Recsperts deals with behavioral testing for recommender systems. I talk to Jacopo Tagliabue, founder of Tooso and now Director of Artificial Intelligence at Coveo. He has contributed to conferences such as SIGIR, WWW, and RecSys. One of these contributions is RecList, which provides behavioral, black-box testing for recommender systems.
In this episode we introduce behavioral testing for recommender systems and the corresponding framework, RecList, which was created by Jacopo and his co-authors. Behavioral testing goes beyond pure retrieval accuracy metrics and tries to uncover unintended behavior of recommender models. RecList is an adaptation of CheckList, which applies behavioral testing to NLP and was proposed by Microsoft some time ago. RecList is an open-source framework that comes with ready-to-use datasets for different recommender use cases such as similar, sequence-based, and complementary item recommendations. Furthermore, it offers some sample tests to make it easier for newcomers to get started with behavioral testing. We also briefly touch on the upcoming CIKM data challenge, which is going to focus on the evaluation of recommender systems.
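To make the idea of going beyond a single accuracy number more concrete, here is a minimal sketch of one kind of behavioral test: computing hit rate separately per data slice and checking the gap between slices. It treats the model as a black box and only looks at its predictions. This is plain Python and does not use the RecList API; the function names (`hit_rate_by_slice`, `max_slice_gap`), the slice labels, and the toy data are all made up for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def hit_rate_by_slice(
    predictions: Sequence[List[str]],
    targets: Sequence[str],
    slices: Sequence[str],
    k: int = 3,
) -> Dict[str, float]:
    """Hit rate@k computed separately for each slice label (hypothetical helper)."""
    hits: Dict[str, int] = defaultdict(int)
    totals: Dict[str, int] = defaultdict(int)
    for preds, target, slice_label in zip(predictions, targets, slices):
        totals[slice_label] += 1
        if target in preds[:k]:
            hits[slice_label] += 1
    return {label: hits[label] / totals[label] for label in totals}


def max_slice_gap(rates: Dict[str, float]) -> float:
    """Largest difference in hit rate between any two slices."""
    return max(rates.values()) - min(rates.values())


# Toy data (assumed): top-3 predictions, the true next item, and a slice label
# per test case, e.g. whether the target item is popular or niche.
predictions = [["a", "b", "c"], ["a", "c", "d"], ["x", "y", "z"], ["a", "b", "x"]]
targets = ["b", "d", "q", "x"]
slices = ["popular", "popular", "niche", "niche"]

rates = hit_rate_by_slice(predictions, targets, slices, k=3)
gap = max_slice_gap(rates)
print(rates, gap)
```

On this toy data the overall hit rate looks decent (3 out of 4), but the per-slice view shows that niche items are served noticeably worse than popular ones. A behavioral test could fail the model whenever the gap exceeds a threshold the team has agreed on.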
At the end of this episode, Jacopo also shares his insights from years of building and using diverse MLOps tools and talks about what he refers to as the "post-modern stack".
Enjoy this enriching episode of RECSPERTS - Recommender Systems Experts.
Links from the Episode:
- Jacopo Tagliabue on LinkedIn
- GitHub: RecList
- CIKM RecEval Analyticup 2022 (sign up!)
- GitHub: You Don't Need a Bigger Boat - end-to-end (Metaflow-based) implementation of an intent prediction (and session recommendation) flow
- Coveo SIGIR eCOM 2021 Data Challenge Dataset
- Blogposts: The Post-Modern Stack - Joining the modern data stack with the modern ML stack
- TensorFlow Recommenders
- NVIDIA Merlin
- Recommenders (by Microsoft)
- Chia et al. (2022): Beyond NDCG: behavioral testing of recommender systems with RecList
- Ribeiro et al. (2020): Beyond Accuracy: Behavioral Testing of NLP models with CheckList
- Bianchi et al. (2020): Fantastic Embeddings and How to Align Them: Zero-Shot Inference in a Multi-Shop Scenario