Ethical issues in multi-objective reinforcement learning

Markov Decision Processes (MDPs) and reinforcement learning (RL) are two very successful paradigms adopted in artificial intelligence for designing autonomous agents capable of dealing with sequential decision problems under uncertainty. The decision problem is formalized as a tuple (S, A, T, R), where S is the set of system states, A the set of possible actions, T the transition function, and R the reward function. The transition function captures the non-deterministic dynamics of the system: T(s, a, s') gives the probability that the system moves from a state s to a state s' when an action a is taken. The reward function models the objectives of the system and defines the reward R(s, a) obtained by the system when an action a is taken in a state s. The objective is then to maximize the expected cumulative reward over the course of actions. Optimally solving an MDP consists in finding a policy that dictates the action to take in each possible state of the system.
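As a minimal illustration of this formalization, the sketch below builds a toy MDP (all states, actions, and numeric values are hypothetical) and solves it by value iteration, a standard dynamic-programming method for MDPs:

```python
import numpy as np

# Toy MDP sketch: 3 states, 2 actions (all numbers are illustrative).
n_states, n_actions = 3, 2
gamma = 0.9  # discount factor

# T[s, a, s'] = probability of moving from state s to s' under action a
T = np.array([
    [[0.8, 0.2, 0.0], [0.1, 0.9, 0.0]],
    [[0.0, 0.5, 0.5], [0.0, 0.1, 0.9]],
    [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]],
])
# R[s, a] = immediate reward for taking action a in state s
R = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 0.0]])

# Value iteration: repeatedly apply the Bellman optimality update
V = np.zeros(n_states)
for _ in range(500):
    # Q[s, a] = R[s, a] + gamma * sum_{s'} T[s, a, s'] * V[s']
    Q = R + gamma * (T @ V)
    V = Q.max(axis=1)

# The optimal policy maps each state to its best action
policy = Q.argmax(axis=1)
```

Here the policy is a simple array indexed by state, which is exactly the "strategy that dictates the action to take for each possible state" mentioned above.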

While the reward function, a basic component of both MDPs and RL, is traditionally assumed to be known from the start, it has been noted that in many situations it can be hard to fully specify. In the last decade, several works have proposed "preference-based" versions of MDPs and RL that address the problem of dealing with incompletely specified reward functions. The idea consists in eliciting the reward by querying an expert when necessary, during policy computation or execution. The fact that the reward is not specified a priori but acquired raises a number of crucial issues that need to be tackled.

One important issue is that, in an open world, any behaviour can a priori be learnt, and therefore we should investigate how to avoid "unethical" behaviours.
In a recent foundational paper, Abel et al. envision the prospect of using reinforcement learning to model an idealized ethical artificial agent. Among the several issues that this approach raises, identified by Abel et al., are the problem of teaching the agent and that of making policies interpretable. Wu and Lin proposed an ethics-shaping approach in which the reward obtained by an RL agent is modified to incorporate ethical knowledge: an additional positive reward is given for ethical actions and a negative reward for unethical ones. Ethics shaping performs quite well when the ethical behaviour is clearly identified and aligned with a single objective. However, when the agent has to respect several ethical guidelines, combining the corresponding ethical rewards may degrade performance. Moreover, it becomes difficult to analyse the resulting policy and to verify that it follows the desired ethical guidelines.
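The reward modification underlying ethics shaping can be sketched as follows; the state/action labels, bonus and penalty values are hypothetical placeholders, not taken from Wu and Lin's paper:

```python
# Sketch of ethics shaping: the task reward is augmented with an
# ethical bonus or penalty term (all labels and weights are hypothetical).

ETHICAL_BONUS = 1.0
UNETHICAL_PENALTY = -2.0

# Hypothetical labelling of (state, action) pairs by an ethics expert
ethical_pairs = {("near_pedestrian", "slow_down")}
unethical_pairs = {("near_pedestrian", "accelerate")}

def shaped_reward(state, action, task_reward):
    """Return the task reward plus the ethics-shaping term."""
    if (state, action) in ethical_pairs:
        return task_reward + ETHICAL_BONUS
    if (state, action) in unethical_pairs:
        return task_reward + UNETHICAL_PENALTY
    return task_reward
```

With several ethical guidelines, one such term would be added per guideline, which illustrates the combination problem mentioned above: the scalar sum hides which guideline drives the learned behaviour.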

This internship proposal follows these lines of research and investigates the prospect of using multi-objective MDPs/RL to compute policies that respect ethical requirements. We will investigate how different ethical objectives can be learnt and combined to produce interpretable ethical behaviours.
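In the multi-objective setting, the reward is a vector with one component per objective rather than a single scalar. A common baseline, shown here as a minimal sketch (the weights and reward values are illustrative assumptions), is linear scalarization:

```python
import numpy as np

# Sketch of linear scalarization for a multi-objective reward vector
# (weights and reward values below are illustrative assumptions).

def scalarize(reward_vector, weights):
    """Collapse a vector of per-objective rewards into one scalar."""
    r = np.asarray(reward_vector, dtype=float)
    w = np.asarray(weights, dtype=float)
    assert r.shape == w.shape
    return float(w @ r)

# e.g. components: [task reward, fairness objective, safety objective]
r = [1.0, -0.5, 0.2]
w = [0.6, 0.2, 0.2]  # normalized preference weights over objectives
```

Keeping the objectives separate until this final step is what makes the resulting behaviour easier to analyse than a single pre-mixed scalar reward.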

Supervisors: Aurélie Beynier, Nicolas Maudet, Paolo Viappiani

## Some references:

// Surveys and overviews (from the most general to the most RL-specific)
* Building ethics into artificial intelligence (survey). IJCAI-18.
* Building ethically bounded AI. Rossi and Mattei.
* Reinforcement learning as a framework for ethical decision-making. Abel, MacGlashan, Littman.
* A low-cost ethics shaping approach for designing RL agents. Wu and Lin. AAAI-18.

// Some approaches (two examples of approaches that seem interesting)
* Teaching AI agents ethical values using reinforcement learning and policy orchestration (extended abstract). Ritesh Noothigattu, Djallel Bouneffouf, Nicholas Mattei, Rachita Chandra, Piyush Madan, Kush R. Varshney, Murray Campbell, Moninder Singh, and Francesca Rossi.

LIP6, Sorbonne Université
University contact:
2022
