
This paper critiques common ways of doing machine ethics in Reinforcement Learning (RL) and argues for a virtue-focused approach. We see two recurring problems: (i) rule-based (deontological) methods that encode duties as constraints or shields often break in new or uncertain settings and do not build lasting habits; and (ii) reward-based (consequentialist) methods collapse many moral goals into a single scalar, which invites gaming and hides real trade-offs. We instead treat ethics as policy-level dispositions: stable habits that hold up when incentives, partners, or contexts change. Evaluation should therefore look beyond rule checks and single returns to include trait summaries, durability under interventions, and clear reporting of trade-offs. Our roadmap comprises four components: (1) leveraging social learning in multi-agent RL to acquire behavior from exemplary agents; (2) preserving value conflicts through multi-objective or constrained formulations, complemented by risk-aware criteria to guard against harm; (3) regularizing policies toward ‘virtuous’ priors to promote trait-like stability under distribution shift; and (4) operationalizing diverse ethical traditions as practical control signals.
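
As an illustration of component (3), the following is a minimal sketch of one standard way to regularize a policy toward a prior. It assumes a KL penalty and a one-step (bandit) setting; the abstract does not specify either. The function name `kl_regularized_policy`, the weight `beta`, and the example numbers are hypothetical. In this setting the maximizer of expected reward minus `beta` times KL(policy || prior) is known in closed form: each action's probability is proportional to its prior probability times exp(reward / beta).

```python
import math


def kl_regularized_policy(rewards, prior, beta):
    """Return the policy maximizing sum_a pi(a) * r(a) - beta * KL(pi || prior).

    Assumes a one-step problem, beta > 0, and strictly positive prior
    probabilities. The closed form is pi(a) proportional to
    prior(a) * exp(r(a) / beta).
    """
    if beta <= 0:
        raise ValueError("beta must be positive")
    if any(p <= 0 for p in prior):
        raise ValueError("prior probabilities must be strictly positive")
    # Work in log space and subtract the largest logit for numerical stability.
    logits = [math.log(p) + r / beta for p, r in zip(prior, rewards)]
    top = max(logits)
    weights = [math.exp(logit - top) for logit in logits]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical example: action 0 pays more, but the prior favors action 1.
rewards = [1.0, 0.0]
prior = [0.1, 0.9]

strong = kl_regularized_policy(rewards, prior, beta=10.0)  # stays close to the prior
weak = kl_regularized_policy(rewards, prior, beta=0.1)     # follows the reward
```

With a large `beta` the resulting policy stays near the prior even though the other action pays more, which is one concrete reading of "trait-like stability"; with a small `beta` the incentive dominates and the disposition gives way.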
