Purcell, Zoe, Köbis, Nils, Samuel, Andrew and Bonnefon, Jean-François (2026) Whistleblowers can contain the unethical externalities of human-AI delegation. PNAS Nexus. (In Press)

Full text not available from this repository.

Abstract

Prior work using controlled principal-agent experiments suggests two risks
of delegating tasks to AI systems: human principals are more likely to request
profit-maximizing misconduct from AI agents than from human agents, and AI
agents are more likely to comply. Here we test whether third-party observers
can contain the resulting harm. In an incentivized die-reporting paradigm, principals instructed either a human or an AI agent how strongly to prioritize profit over accuracy, creating potential financial harm to a charity. We first confirm, with human principals (N = 600) and three large language models as AI agents, that delegation to AI produces larger negative externalities than delegation to humans. We then study observers who could pay a personal cost to flag a principal’s instruction, cancelling the principal’s gain in favor of the charity, as a laboratory analogue of whistleblowing. In this observer study (N = 300), the probability of flagging increased with how unethical the principal’s request was, but did not depend on whether the request was directed to a human or an AI agent. Because principals made more unethical requests under AI delegation, flagging was more frequent under AI delegation. When combined with agent behavior, this increase in flagging fully neutralized the negative externalities of AI delegation in our experimental setting. These findings support institutional protections for whistleblowers as one potential organizational safeguard against the harms of human-AI delegation.

Item Type: Article
Language: English
Date: June 2026
Refereed: Yes
Subjects: B - Economics and Finance
Divisions: TSE-R (Toulouse)
Site: UT1
Date Deposited: 02 Jul 2026 09:36
Last Modified: 02 Jul 2026 09:36
OAI Identifier: oai:tse-fr.eu:131922
URI: https://publications.ut-capitole.fr/id/eprint/53805