A game theoretic solution to population ethics

As mentioned previously, population ethics is probably the most important area in moral philosophy for effective altruists who want to do the most good. It contains some very serious problems that bear on crucial considerations about doing the most good. Here I explain an elegant solution to these problems, which I call variable critical level utilitarianism, with the help of a game theoretic analogy. This new theory solves, for example, the ‘very repugnant conclusion’ concerning future generations and the ‘happy meat’ problem.

For a mathematical description and a discussion of population ethical theories, see here.


The population ethics game

Variable critical level utilitarianism can be described as a strategic game. Consider a building with a lot of rooms. The players are waiting outside the building. Before the players may enter, the game master explains the set-up. A player is allowed in a room only if that player’s name is written on the door of that room. In each room, a player who is allowed to enter can expect a room-specific payoff[1]: a reward (the player receives an amount of money from the game master) or a punishment (the player has to pay an amount of money to the game master). Before they enter the building, all players are fully informed about the payoffs that they and the other players can get in all the rooms in which they are allowed.

For each of the permissible rooms, a player can declare his or her avoidance amount: the willingness to pay to avoid that room. Players cannot declare an avoidance amount for the rooms in which they are not allowed. With these declared avoidance amounts, and the payoffs of the players in each room, the game master computes a net welfare value for each room: the sum of the payoffs of all allowed players in the room minus the sum of the declared avoidance amounts of those players.

The room with the highest net welfare level is selected (in case of a tie, one of the rooms is randomly selected). The game master opens this room and the players whose names are written on that door must enter that room and receive their reward or punishment. (As they do not avoid that room, they do not have to pay their avoidance amounts.)

However, there is a catch. When players declare infinitely high avoidance amounts, all rooms could receive an infinitely negative net welfare, which makes room selection impossible. To avoid this problem, the game master sets an upper bound on the sum of the avoidance amounts in a room, which is calculated as follows. Some players are allowed in all the rooms, i.e. their names are on all the doors. These are the ‘necessary players’, because they necessarily receive a payoff. The other players are the ‘contingent players’, because they only receive a payoff based on the contingent fact that their name is written on the door of the selected room. For each room, the game master calculates the sum of the positive payoffs (rewards) of all the contingent players of that room. The upper bound on the sum of avoidance amounts is the maximum of these sums, taken over all the rooms. If no room has contingent players with positive payoffs, the upper bound is simply set to zero.

The analogy with population ethics is as follows. The building corresponds with the choice set in the population ethical problem. Each room corresponds with a possible situation that can be chosen. Who exists in the future depends on our current choices. The rooms where a given player is not allowed correspond with the situations where the corresponding person does not exist. The players who are allowed in all the rooms correspond with the necessary people: they exist in all possible situations. These people include currently living people. The other players correspond with contingent people: their existence depends on the choice of situation. The payoffs are the utilities. A reward in room s corresponds with a positive utility, i.e. a life worth living in situation s. A punishment corresponds with a negative utility, e.g. a life with more negative than positive experiences. The declared avoidance amount of a player corresponds with the critical level chosen by a person.


Avoiding the very repugnant conclusion

This population ethics game is a strategic interaction between players, because the payoffs of the players depend on the strategies played by the other players. Each player can choose a strategy, which consists of his or her declared avoidance amounts for the permissible rooms. As an example, consider a building with three rooms, two necessary players (An and Ben) and a thousand contingent players. These contingent players are only allowed in room 3, where they each get a positive but minimal payoff of 1. In room 1, An receives a reward of 300 and Ben receives 100. In room 2, the payoffs are reversed: An receives 100 and Ben receives 300. In room 3, An and Ben each receive a punishment, i.e. a payoff of -100. The upper bound on the sum of avoidance amounts is the maximum of the sum of positive payoffs of the contingent players, which is 1000. In room 1, Ben can choose a positive avoidance amount of 1000 to maximally influence the selection of room 2, which is his favorite. Similarly, An can choose an avoidance amount of 1000 in room 2. In room 3, An and Ben are worst off, so they declare an infinitely high avoidance amount in order to avoid that situation. However, when they do that, the upper bound of 1000 is used to calculate the net welfare. Hence, the net welfare of rooms 1 and 2 is 400 (the total payoffs of An and Ben) minus 1000 (the total avoidance amount), which gives -600, and the net welfare of room 3 is 800 (the total payoffs of An, Ben and the contingent players) minus 1000 (the maximum avoidance amount), which gives -200. As room 3 has the highest net welfare, this room will be chosen. But this room is strongly disliked by An and Ben.
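The arithmetic of this example can be checked with a short Python sketch. The function names are illustrative, and the way the cap is applied (the declared sum is replaced by the upper bound whenever it exceeds it) is my reading of the rule above, not a definitive implementation.

```python
import math

def avoidance_cap(rooms, necessary):
    """Maximum over rooms of the summed positive payoffs of contingent players (0 if none)."""
    totals = [
        sum(p for name, p in payoffs.items() if name not in necessary and p > 0)
        for payoffs in rooms.values()
    ]
    return max(totals + [0])

def net_welfare(payoffs, avoidance, cap):
    """Sum of payoffs minus the declared avoidance amounts, with that sum capped at `cap`."""
    return sum(payoffs.values()) - min(sum(avoidance.values()), cap)

# Payoffs per room, taken from the example; contingent players are named c0..c999.
contingent = {f"c{i}": 1 for i in range(1000)}
rooms = {
    1: {"An": 300, "Ben": 100},
    2: {"An": 100, "Ben": 300},
    3: {"An": -100, "Ben": -100, **contingent},
}
# Declared avoidance amounts per room (players not listed declare zero).
declared = {
    1: {"Ben": 1000},
    2: {"An": 1000},
    3: {"An": math.inf, "Ben": math.inf},
}

cap = avoidance_cap(rooms, necessary={"An", "Ben"})
welfare = {room: net_welfare(rooms[room], declared[room], cap) for room in rooms}
selected = max(welfare, key=welfare.get)  # ties would need a random draw; none occur here

print(cap)       # 1000
print(welfare)   # {1: -600, 2: -600, 3: -200}
print(selected)  # 3
```

With these declarations room 3 is selected, which is the outcome An and Ben wanted to avoid.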

In population ethics, this is called the very repugnant conclusion: a situation where everyone (An and Ben in the example) is very happy can be worse than a situation where all those people are very miserable, with extreme suffering, if the second situation contains a huge number of extra people with lives barely worth living. Total utilitarianism faces this very repugnant conclusion, because the total utility of all the barely happy extra people in the second situation can trump the extreme misery of the extremely miserable.

However, this conclusion can be avoided in variable critical level utilitarianism: if An and Ben both chose a zero avoidance amount for rooms 1 and 2, the net welfare of those rooms would be 400, well above that of room 3, so one of their preferred rooms would be selected. By choosing between two strategies, i.e. ‘zero avoidance amount’ and ‘maximum avoidance amount’, An and Ben are in fact playing a strategic game called ‘chicken’ or the ‘hawk-dove game’. This game has two pure-strategy Nash equilibria and one mixed-strategy Nash equilibrium. Both An and Ben playing the strategy ‘maximum avoidance amount’, which results in the very repugnant conclusion, is not a Nash equilibrium: each of them can do better by unilaterally switching to a zero avoidance amount.
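The chicken structure can be made explicit with a small Python sketch. The payoff table is derived from the An and Ben example under two assumptions of mine: both players always keep the capped avoidance amount in room 3, and a tie between rooms 1 and 2 is valued at the expected payoff of a fair random draw.

```python
from fractions import Fraction
from itertools import product

ZERO, MAX = "zero", "max"

# payoff[(An's strategy, Ben's strategy)] = (An's payoff, Ben's payoff)
payoff = {
    (ZERO, ZERO): (200, 200),   # rooms 1 and 2 tie at 400: average of 300 and 100
    (MAX, ZERO): (300, 100),    # An blocks room 2, so room 1 is selected
    (ZERO, MAX): (100, 300),    # Ben blocks room 1, so room 2 is selected
    (MAX, MAX): (-100, -100),   # both block, so room 3 is selected
}

def is_nash(an, ben):
    """True if neither player gains by unilaterally switching strategy."""
    an_ok = all(payoff[(an, ben)][0] >= payoff[(alt, ben)][0] for alt in (ZERO, MAX))
    ben_ok = all(payoff[(an, ben)][1] >= payoff[(an, alt)][1] for alt in (ZERO, MAX))
    return an_ok and ben_ok

pure_equilibria = [pair for pair in product((ZERO, MAX), repeat=2) if is_nash(*pair)]

# Symmetric mixed equilibrium: the opponent plays MAX with probability p such that
# a player is indifferent between ZERO and MAX:
#   a*(1-p) + b*p = c*(1-p) + d*p
a, b = payoff[(ZERO, ZERO)][0], payoff[(ZERO, MAX)][0]
c, d = payoff[(MAX, ZERO)][0], payoff[(MAX, MAX)][0]
p = Fraction(a - c, a - b - c + d)

print(pure_equilibria)  # [('zero', 'max'), ('max', 'zero')]
print(p)                # 1/3
```

In both pure equilibria exactly one player backs down, and in the mixed equilibrium each player declares the maximum avoidance amount with probability 1/3, so the outcome in room 3 still occurs with some probability.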

We can add an extension to the game, such that the necessary players are forced to cooperate to avoid the very repugnant conclusion, by putting the necessary players behind a veil of ignorance. Those players know the distributions of payoffs of all the necessary people in all the rooms and they know that they are necessary players (allowed in all the rooms), but they do not know which of those necessary people they are going to be. Those players will then prefer the room that maximizes the expected payoff of the necessary players.
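For the An and Ben example, the veil of ignorance can be sketched as follows, assuming (my assumption) that a player behind the veil has an equal chance of turning out to be An or Ben, so that the expected payoff is the plain average over the necessary players.

```python
from statistics import mean

# Payoffs of the necessary players only, per room, from the example above.
necessary_payoffs = {
    1: {"An": 300, "Ben": 100},
    2: {"An": 100, "Ben": 300},
    3: {"An": -100, "Ben": -100},
}

# Expected payoff of a necessary player who does not yet know who he or she will be.
expected = {room: mean(p.values()) for room, p in necessary_payoffs.items()}
best = max(expected.values())
preferred = [room for room, value in expected.items() if value == best]

print(expected)   # {1: 200, 2: 200, 3: -100}
print(preferred)  # [1, 2]
```

Behind the veil, both players are indifferent between rooms 1 and 2 and prefer either of them to room 3, so neither has a reason to declare an avoidance amount against room 1 or room 2.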


The rules of variable critical level utilitarianism

Someone’s relative utility in situation s is his or her utility in that situation minus his or her declared critical level. The social welfare value of a situation s is the sum of the relative utilities of all people existing in s. If a person does not exist in situation s, both that person’s utility and critical level are zero. The socially optimal situation is the one with the maximum social welfare value of all possible situations.

A full variable critical level theory leaves individuals free to set their own critical levels. This maximally respects the autonomy of individuals. Someone may choose different critical levels in different situations, and when the set of possible situations changes (e.g. situations are no longer possible or new situations become possible), people may change their critical levels. However, there are three restrictions.

First, if a person does not exist in a situation, that non-existing person is not allowed (or able) to choose a critical level for that situation.

Second, an individual can only choose a non-negative critical level. This is a rationality constraint: a person who chose a negative critical level would in effect acknowledge that his or her existence can improve the social welfare even if that person had a negative utility. Or in other words: an individual should be willing to accept a life with a utility equal to the chosen critical level, and no-one could reasonably accept a life with negative utility.

Third, the total critical level, i.e. the sum of all critical levels set by the individuals in a situation, has an upper bound, given by the maximum over all possible situations of the sum of positive utilities of the people who exist in that situation but do not exist in all possible situations. This restriction is required to prevent people from choosing infinite critical levels.
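The rules above can be summarized in formulas. The notation is my own shorthand for this summary and is not taken from the mathematical description linked earlier: A is the choice set, P(s) the set of people existing in situation s, P_nec the people who exist in every situation of A, u_i(s) the utility of person i in s, and c_i(s) the critical level that person i declares for s.

```latex
\begin{align*}
W(s) &= \sum_{i \in P(s)} \bigl( u_i(s) - c_i(s) \bigr)
  && \text{social welfare: sum of relative utilities} \\
c_i(s) &\ge 0 \quad \text{for all } i \in P(s)
  && \text{second restriction} \\
\sum_{i \in P(s)} c_i(s) &\le C, \qquad
C = \max_{t \in A} \sum_{i \in P(t) \setminus P_{\mathrm{nec}}} \max\bigl( u_i(t), 0 \bigr)
  && \text{third restriction} \\
s^{*} &\in \arg\max_{s \in A} W(s)
  && \text{socially optimal situation}
\end{align*}
```

The first restriction is captured by summing only over P(s): a person who does not exist in s contributes neither a utility nor a critical level to W(s).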


Dynamic inconsistency

Variable critical level utilitarianism faces the possibility of dynamic inconsistency. Consider a choice set with three situations. Situation s contains N very happy people, with a high per capita utility level U_N(s). Situation s’ contains the same N people, who are slightly happier (utility U_N(s’) > U_N(s)), plus M extra people with low happiness (positive utility U_M(s’) < U_N(s)). Situation s’’ contains the same N people, slightly less happy than in situation s (utility U_N(s’’) < U_N(s)), and the extra M people, who are very happy (utility U_M(s’’) > U_M(s’)).

The game consists of two choices or stages. The first stage involves the choice of whether to add the extra M people. The second stage occurs once the M people are chosen to be added, and involves choosing low or high utilities for those M people (i.e. situations s’ or s’’). This game can be solved with backward induction, where we first consider the final subgame, i.e. the stage when the M people are chosen to be added. Although situation s’ is the best for the N people, the M people in that situation can complain and prefer situation s’’, such that they choose maximum critical levels totaling M·U_M(s’’). In situation s’’, the N people can complain and set maximum critical levels, also totaling M·U_M(s’’), to turn the balance again in favor of situation s’. The critical levels cancel, so the situation with the highest total utility will be chosen. Suppose N·U_N(s’’) + M·U_M(s’’) > N·U_N(s’) + M·U_M(s’); then situation s’’ is chosen. However, this solution for the subgame is not an equilibrium in the complete game, because situation s is preferred to situation s’’ by the N people. In other words: the N people cannot accept the existence of the M people, because if they did so, they know that the end result would be a situation s’’ that gives them a lower payoff than the situation s without the M people. If the N people choose a maximum critical level in s, then they know the selected situation will be s’’, which they do not prefer. Therefore, they can choose to set a low or zero critical level in s, which means that situation s will be selected in the first stage of the game. What is optimal in a subgame where the choice set consists of s’ and s’’ becomes suboptimal in the complete game with choice set {s, s’, s’’}. The optimal choice depends on the stage in the game. This is known as dynamic inconsistency.

Here we see again that variable critical level utilitarianism is choice set dependent. If situation s’’ was not possible (i.e. was not an element of the choice set), situation s’ could become better than s (because the M people in s’ can no longer complain that situation s’’ should have been chosen). The value of adding extra people depends on the possible situations that contain those people.
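The three-situation example can be illustrated with a minimal Python sketch. The numbers are hypothetical and chosen only to satisfy the inequalities stated above, and the critical-level strategies are hard-coded as described in the text (when both s’ and s’’ are available, the M people declare the maximum total in s’ and the N people declare the maximum total in s’’, and all other critical levels are zero).

```python
# Hypothetical numbers, chosen only to satisfy the inequalities in the text.
N, M = 10, 10
U_N = {"s": 10, "s_prime": 11, "s_double_prime": 9}  # per capita utility of the N people
U_M = {"s_prime": 1, "s_double_prime": 8}            # per capita utility of the M people

def total_utility(situation):
    """Total utility of everyone existing in the situation (the M people are absent in s)."""
    return N * U_N[situation] + M * U_M.get(situation, 0)

def select(choice_set):
    """Return the selected situation and the welfare values for a given choice set."""
    # Maximum total critical level: the best total positive utility of the M people.
    # Following the text, the same bound is used in the subgame {s', s''}.
    cap = max(M * max(U_M.get(situation, 0), 0) for situation in choice_set)
    critical = {situation: 0 for situation in choice_set}
    if "s_prime" in choice_set and "s_double_prime" in choice_set:
        critical["s_prime"] = cap          # the M people complain that s'' was possible
        critical["s_double_prime"] = cap   # the N people complain in s''
    welfare = {s: total_utility(s) - critical[s] for s in choice_set}
    return max(welfare, key=welfare.get), welfare

full_game = select(["s", "s_prime", "s_double_prime"])
subgame = select(["s_prime", "s_double_prime"])
without_s_double_prime = select(["s", "s_prime"])

print(full_game)               # ('s', {'s': 100, 's_prime': 40, 's_double_prime': 90})
print(subgame)                 # ('s_double_prime', {'s_prime': 40, 's_double_prime': 90})
print(without_s_double_prime)  # ('s_prime', {'s': 100, 's_prime': 120})
```

With these numbers, s’’ wins the subgame, s wins the complete game, and s’ wins once s’’ is removed from the choice set, which is the choice set dependence described above.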


Examples of dynamic inconsistency

The dynamic inconsistency of variable critical level utilitarianism is a virtue rather than a vice, because it avoids an old problem concerning animal farming or slavery, called the ‘Logic of the larder’. Consider the case of ‘happy meat’, i.e. meat from a livestock animal that had a life worth living (a net-positive life with more positive than negative experiences). In situation s, meat consumption and livestock farming are not allowed and N humans need to eat vegan food. In situation s’, animals are raised at happy farms (no factory farms) where they have net-positive welfare, but they are killed prematurely so that humans become a little happier by enjoying the taste of meat. In situation s’’, those animals are not killed prematurely, but can live long happy lives at farm animal sanctuaries. Their happiness increases a lot, but now humans can no longer eat meat, and they have to take care of the animals (e.g. feeding them), which comes at an extra cost. In this situation, humans get the lowest welfare (lower than in situation s), but still positive.

Henry Salt (1914) argued that eating happy meat (situation s’) is not allowed, by comparing the situation with human slavery: we are not allowed to breed human slaves, even if those slaves would have net-positive lives. It is better that those happy human slaves are not born, so Salt prefers situation s. This is also the outcome of variable critical level utilitarianism, due to the dynamic inconsistency.

Suppose the happy livestock animals or happy human slaves had such positive lives that they prefer existence (as meat animals or slaves) to non-existence. When situation s’’ is part of the choice set, those animals or slaves could complain once they exist in situation s’. However, if they complained, the already existing N humans would decide not to breed those animals or slaves, because they want to avoid situation s’’. If it were possible to exclude situation s’’ from the choice set, situation s’ could be chosen (because lower critical levels would be chosen in s’). In games with dynamic inconsistency, this can be done with a commitment device. Suppose for example that we can genetically modify a cow such that the cow will die at the age of two years (when he normally gets slaughtered in situation s’). The cow can be raised on a farm sanctuary and is not killed, but after the cow dies, he can be eaten.

Another example of dynamic inconsistency is climate change. In situation s, the current generation (N people) invests enough in climate policies and clean energy such that harmful climate change is avoided and the next generation (L people) has very happy lives. In situation s’, the current generation does nothing about climate change. They are happier because they can consume more and worry less, but their decision to travel a lot with cars and airplanes influences the exact timing of fertilization of their future children. If a couple has sex a second later, a son instead of a daughter may be born. As a consequence, the next generation does not consist of the L people: other people are born instead. These M people do not exist in situation s. Suppose the M people in situation s’ have to deal with the consequences of dangerous climate change, but they still have slightly positive lives. These people prefer a third situation s’’, where they exist and get huge compensation fees from the N people who caused climate change. In s’’, the M people are happier, but due to the compensation payments, the N people become worse off in s’’ than in situation s. In this case, variable critical level utilitarianism could pick situation s.

This consideration also influences the discount rate that is used in cost-benefit analyses of climate policies. If situation s is chosen, it implies that the welfare of the next generation should not be discounted much: it is better that the next generation is very happy (the L people in situation s) instead of slightly happy (the M people in situation s’). However, the welfare of generations in the more distant future can be strongly discounted according to variable critical level utilitarianism. For the more distant future, this population ethical theory can resemble the asymmetric person-affecting theories. This is because in the more distant future, the current generation no longer exists and hence is no longer able to pay compensation fees to e.g. the Q people of the fifth generation. Suppose those Q people had low but still positive welfare levels due to climate change. They cannot complain against the N people (the current generation), because if the N people chose policies to avoid climate change, the Q people would not be born. In other words, a situation analogous to s’’ for the Q people in the more distant future is impossible. That means the welfare of generations in the more distant future can be strongly discounted (at least when they still have positive utilities: when they get a negative utility due to climate change, their negative relative utilities strongly decrease the welfare function because their critical levels cannot go below zero).


Variable critical level utilitarianism is a theory in population ethics that uses a welfare function composed of the sum of the relative utilities of all existing people, whereby a relative utility is the actual utility of a person in a situation, minus a critical level. These critical levels are variable: people are free to choose their own critical levels (up to a well-chosen maximum), and so these critical levels can differ between situations and can even depend on the choice sets of possible situations. Traditional population ethical theories are limiting cases of variable critical level utilitarianism, with constraints on the critical levels. Due to these restrictions of the critical levels, those traditional theories face counterintuitive implications such as the very repugnant conclusion.

The flexibility of variable critical level utilitarianism makes it possible to avoid the population ethical problems. The fact that people can choose their own critical levels and take into account the choices of other existing people creates a strategic game. Variable critical level utilitarianism has a game theoretic dynamic inconsistency. Some examples (consuming meat from happy livestock animals, breeding happy human slaves, causing climate change) demonstrate that this dynamic inconsistency is a virtue rather than a vice: it can explain when and why breeding happy livestock animals or happy human slaves is not allowed, why we have to prevent climate change and why we should not strongly discount the welfare of at least the next few generations.

[1] A player’s payoff consists of the received reward or punishment, but can also include other considerations that are valued by the player. For example, if a player values equality of rewards, the player may prefer a room with a lower personal reward, if all players in that room receive the same reward. This can be compared with a situation in ethics. Suppose you can choose between two situations. In the first situation, you are very happy (a high utility level), but everyone else is miserable. In the second situation, everyone else becomes extremely happy, at the cost of a slightly lower happiness for you. With some altruistic inclination, you might prefer the second situation, even if you get a lower personal utility.
