Empathy is often defined as understanding another person’s experience by imagining oneself in that other person’s situation: One understands the other person’s experience as if it were being experienced by the self, but without the self actually experiencing it.
Hodges and Myers, Encyclopedia of Social Psychology
Asimov’s Three Laws have long governed behavior for robotics, in both science fiction and as an accepted truth in the sciences. At the core is that a machine should ‘do no harm’. What was left out of the Three Laws is exactly how a machine should evaluate harms.
Today’s ML decisions are largely driven through reinforcement learning. Reinforcement Learning is a modern AI technique where a system can learn a sequence of actions leading to the most optimal outcome for itself. It essentially works by giving the AI a set of objectives (and sometimes penalties), and allowing the machine to learn from its own experiences what actions work best at accomplishing the objective. It’s how AIs can teach themselves to play video games, solve complex problems, and perform more sophisticated tasks as well. Reinforcement Learning is likewise one of the more concerning areas where catastrophic value alignment failures can occur. This is because it is largely centered around simplified human abstractions of rewards and penalties. As far as the machine is concerned, its job is to find the most rewarding means of accomplishing a task, with penalties only being considered if they are explicitly enumerated. Yet if control theory has taught us much, it’s that hazards cannot always be sufficiently enumerated.