FDS Colloquium: Tong Wang, “ProtoX: Explaining a Reinforcement Learning Agent via Prototyping”


Yale Institute for Foundations of Data Science, Kline Tower 13th Floor, Room 1327, New Haven, CT 06511

Speaker: Tong Wang
Assistant Professor of Marketing,
School of Management, Yale University

Wednesday, October 11, 2023
Tea & Reception: 3:30 pm (Kitchen)
Talk: 4:00 pm (Room #1327)
at the Yale Institute for Foundations of Data Science, Kline Tower, 13th Floor

Title: ProtoX: Explaining a Reinforcement Learning Agent via Prototyping

Abstract: While deep reinforcement learning has proven to be successful in solving control tasks, the “black-box” nature of an agent has raised increasing concerns. We propose a prototype-based post-hoc policy explainer, ProtoX, that explains a black-box agent by prototyping the agent’s behaviors into scenarios, each represented by a prototypical state. When learning prototypes, ProtoX considers both visual similarity and scenario similarity. The latter is unique to the reinforcement learning context since it explains why the same action is taken in visually different states. To teach ProtoX about visual similarity, we pre-train an encoder using self-supervised contrastive learning to recognize states as similar if they occur close together in time and receive the same action from the black-box agent. We then add an isometry layer to allow ProtoX to adapt scenario similarity to the downstream task. ProtoX is trained via imitation learning using behavior cloning, and thus requires no access to the environment or agent. In addition to explanation fidelity, we design different prototype shaping terms in the objective function to encourage better interpretability. We conduct various experiments to test ProtoX. Results show that ProtoX achieved high fidelity to the original black-box agent while providing meaningful and understandable explanations.

Speaker Bio: Tong Wang’s research interests are in developing machine learning solutions for business problems. Her work focuses on creating novel interpretable models that can effectively represent and analyze structured and unstructured data, such as texts and images. The overarching objective of these interpretable models is to extract valuable insights from the data, empowering stakeholders to make well-informed decisions while also facilitating a clear understanding of the decision-making processes employed by the models.

Prior to joining Yale, Tong actively pursued research on machine learning solutions for various real-world challenges. Her work on crime pattern detection was included in Wikipedia Crime Analysis. The ideas from her algorithm were adopted by the New York Police Department’s application Patternizr, which has been running live in NYC since 2016. Tong also contributed to the development of an interpretable model for the FICO challenge of credit risk assessment in 2018, outperforming black-box machine learning models and earning the FICO Recognition Award.

Website: https://tongwang-ai.github.io/