PROGRESSING TOWARDS PARTICIPATORY AI: A SAFETY EVALUATION FRAMEWORK

This study explores four problem areas central to making progress on participatory AI models: robustness, monitoring, alignment, and systemic safety. For each problem, it discusses promising research directions and outlines how to guard against extreme risks while developing and deploying a model. It also identifies emerging problems, such as capabilities that arise unexpectedly in massive pre-trained models, grounded in recent progress on participatory AI. It concludes that agency is an especially important property to evaluate, given its central role in various theories of AI risk.

Keywords: AI, fairness, machine learning, algorithms, computing, tech, software design