A nowhere near miss
I was involved in what was definitely not a traffic accident and it was my fault. Ok, let me try again. What I mean is that, while we were indeed far from having an accident, our actions were also not in equilibrium and that was my fault. I didn’t do anything legally wrong, I think, but the other person involved was right to be upset about my actions. In this post I will try to explain what exactly happened in game-theoretic terms.
The situation was this: I and another person were both approaching an intersection. I was on my bike, the other person in a car. I had a stop sign in front of me and she clearly had the right of way. I was fully aware of all this and I dutifully stopped on the fat white line. So, as I said, I didn’t do anything legally wrong. My opponent in this encounter, however, also stopped. And when she stopped, she stared at me accusingly, then shook her head and eventually drove on. Why did she stop and why was she mad at me? Well, because I didn’t look at her. No, she was not offended or anything like that. The problem with me not looking at her was that she didn’t know whether I had seen her. And she was worried that I would, like many bicyclists in Graz (where I live), think that I was alone on the road, ignore the traffic signs, and just keep going. She didn’t want to run me over, so she stopped.
My lack of looking at her and, thereby, not openly acknowledging her presence meant that we did not have common knowledge of the fact that I did know that she was there. I knew that she was there and she, of course, knew that she was there, but she did not know that I knew that she was there. It feels good writing things like that. Anyway, the upshot was that she, being apparently a very considerate and careful driver and being unsure of what I would do, stopped to avoid running me over. Her action was definitely caused by me not looking at her. If I had looked at her, she would not have stopped, as she would have seen that I had seen her and, thereby, most likely rightly, assumed that I would stop.
In the end, I guess one could say that we did not play the equilibrium action profile, which would have been for me to stop and for her to breeze through the intersection. My lack of looking at her made her stop when she didn’t need to. She lost valuable time because of me and I fully understand her anger.
Predicting the outcome of the world cup
There are many teams that can win the upcoming football world cup in Qatar and nobody can at this point know which of them will win. The best one can do is to provide probabilistic predictions. I will here explain how one can check the correctness of probabilistic predictions and will use this to argue that, quite possibly, the best predictions are those that one can derive from the sports betting market. My argument is based on using a large data set of betting odds to test the so-called efficient market hypothesis.
The outcome of one game (or one world cup) is not enough to test the predictive power of probabilistic predictions. Suppose you predict that, say, France beats Australia with probability 2/3 and I predict that France wins with probability 1/3. Suppose Australia wins. Was I correct and were you wrong?
It is essentially impossible to meaningfully test a single probabilistic prediction (unless you predict probabilities of, or very close to, zero or one), but you can test a (large) set of predictions. Betting markets provide such a set of predictions in terms of the betting odds that arise from people betting on the various outcomes of games.
Betting odds are the equivalent of prices in stock markets. They are also determined by supply and demand. But because they are a bit different from prices, I should probably briefly explain how they are typically determined by the betting companies. A football game between teams A and B can end in three key outcomes: team A wins, a draw, or team B wins. You can bet on any of these three events. In fact, you can bet on a lot more, but I will not pursue this here. When you are betting you are typically given odds. If the odds are $o$ then this means that, if you put one euro on this bet and you win, you get $o$ euros back. Of course, this means that odds can never be less than one. Let me denote by $o_A, o_D, o_B$ the odds for one game for the three events: team A wins, a draw, team B wins. Let me denote by $m_A, m_D, m_B$ the amounts of money put on these three events. Then the betting company sets odds, for any event $e$ from the set $\{A, D, B\}$, as close as possible to
$$o_e = (1-\tau)\,\frac{m_A + m_D + m_B}{m_e},$$
where $\tau$ is the proportion of money that the betting company keeps, and is often around 5%. By doing so, the betting company has essentially no risk and simply makes around 5% of the total amount of money that was placed on bets, regardless of how the game ends.
If the odds that team A wins are $o_A$ then $1/o_A$ is the implicit probability that team A wins in the following sense: if $1/o_A$ were the “true” probability, you would expect a total gain from your bet given by $-1 + \frac{1}{o_A}\, o_A = 0$. (The $-1$ for placing 1 Euro on this bet; and with probability $1/o_A$ you win an amount of $o_A$, while with the remaining probability you win nothing.) One could also call them the break-even probabilities. A risk-neutral or risk-averse better would only bet on an event if he or she thinks that the actual probability of this event is greater than this implicit break-even probability.
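To make the odds-setting rule concrete, here is a minimal Python sketch. It assumes the rule described above: the odds on an event equal one minus the company's share, times the total money staked, divided by the money staked on that event. The stake amounts and the function name `set_odds` are made up purely for illustration.

```python
def set_odds(stakes, margin=0.05):
    """Odds per event: (1 - margin) * total stakes / stakes on that event."""
    total = sum(stakes.values())
    return {event: (1.0 - margin) * total / amount
            for event, amount in stakes.items()}

# Hypothetical amounts of money (in euros) placed on each outcome.
stakes = {"A wins": 500.0, "draw": 300.0, "B wins": 200.0}

odds = set_odds(stakes)

# What the company pays out if a given event occurs: odds * money on that event.
payouts = {event: odds[event] * stakes[event] for event in stakes}

# Implicit (break-even) probabilities are the reciprocals of the odds.
implied = {event: 1.0 / odds[event] for event in odds}

for event in stakes:
    print(f"{event}: odds {odds[event]:.3f}, "
          f"implied probability {implied[event]:.3f}, "
          f"payout if it happens {payouts[event]:.2f}")
print(f"sum of implied probabilities: {sum(implied.values()):.4f}")
```

With these hypothetical stakes the company pays out the same amount whichever event occurs, which is the sense in which it bears essentially no risk. The implied probabilities add up to slightly more than one, and the excess reflects the company's share.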
Let us now turn to the statistical analysis of these betting odds. I am using betting odds data for more than 200,000 professional football games from most of the major leagues in the world between 2006 and 2019 (downloaded from pinnacle.com in April 2019). I have chosen to focus on home team wins.
In the above figure I compare the actual observed win frequencies (fat line in red) with the implicit probabilities (dashed line in black) for all odds between 1 and 11 (on the x-axis). Isn’t it remarkable how close these two curves are? Look, for instance, at these two points: Of the 3138 games with betting odds of 1.2 with implied break-even probability of about 83%, the observed winning frequency is 82.3%. Of the 11045 games with betting odds of 2.2 and implied break-even probability of 45.5%, the observed win frequency is 43.2%. Note that the two curves come from entirely different data. The implicit probabilities are derived from the betting odds, which in turn, as I explained above, are derived exclusively from the betting behavior in the betting market. The observed win frequencies are derived from counting how often teams win their football games. Two entirely different things. And yet, they match almost perfectly. One could say that the implicit probabilities derived from the betting odds are really well calibrated: they are great estimates of the actual winning probabilities of, in this case, the home teams in football games.
This idea is, in fact, an implication of the so-called efficient market hypothesis: the hypothesis that market-prices aggregate all information that people may have, and, thus, deliver the best estimates for the underlying probabilities.
There are a few caveats to what I just wrote. Looking at the above figure we note that almost all actual winning frequencies are below the implicit break-even probabilities. This derives from the fact that the betting company keeps its proportion of the money placed (typically around 5%), and the implicit probabilities could easily be adjusted for this. This also means that for all these odds, if you had put one euro on every single one of these games, you would have lost money.
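The arithmetic behind "you would have lost money" can be checked with the two data points quoted earlier (odds of 1.2 with an observed win frequency of 82.3%, and odds of 2.2 with 43.2%). The following is a small sketch, and the function name is my own: staking one euro at odds $o$ when the win frequency is $f$ returns $f \cdot o - 1$ on average.

```python
def expected_return_per_euro(odds, win_frequency):
    """Average net gain from staking one euro at the given odds."""
    return win_frequency * odds - 1.0

# (odds, observed home-win frequency) pairs quoted in the text.
quoted = [(1.2, 0.823), (2.2, 0.432)]

for odds, freq in quoted:
    implied = 1.0 / odds  # break-even probability
    ret = expected_return_per_euro(odds, freq)
    print(f"odds {odds}: implied {implied:.3f}, "
          f"observed {freq:.3f}, average return {ret:+.4f}")
```

Both returns come out negative, a loss of roughly 1 cent and 5 cents per euro staked respectively, which is what one would expect when the observed frequency sits just below the break-even probability.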
But what about the few odds for which the observed win frequency is above the implicit probability? This happens for 6 out of 53 odds. It happens for odds of 4.4, 4.8, 6.5, 8, 8.5, and 9.5. Does this mean that we should bet on games with these particular odds in the future and, doing so, will make money on average? Does this contradict the efficient market hypothesis?
I don’t think so. With 2292 games in total there are not that many games with these betting odds in the data set. For the two betting odds of 4.4 and 9.5, for which the observed winning frequency is more clearly above the implied break-even probability, there are only 26 and 94 games. Even if the true winning probabilities are less than or equal to the implied break-even probabilities, one can relatively easily get higher observed win frequencies for at least some betting odds purely by chance. For the statisticians among you, one cannot reject the null hypothesis that all true win probabilities for all betting odds are less than or equal to the respective implied break-even probabilities with the given data (p-value of 0.189). Let me know if you want to know what statistical test I used.
So, we cannot reject the null hypothesis, implied by the efficient market hypothesis, that the betting odds are well calibrated. Does this mean that one couldn’t potentially produce better probabilistic predictions? Does this mean that the efficient market hypothesis is true? No. This is only one possible test of the efficient market hypothesis, which not only states that implicit probabilities should be well calibrated, but also that they include all the information that is available about the games out there in the world. There is a sense that we know that this is not true. Occasionally, there are games that are rigged. This means that there are some people who know how these games will end, and this information is often not reflected in the betting odds. But, on the whole, and this is still a bit of a leap of faith, I believe that it is hard to come up with more accurate probabilistic predictions of how football games end than those derived from the betting market.
Who will then win the World Cup? According to the betting odds of the 9th of November (from oddschecker.com), Brazil is the favorite with a win probability of 23.8% (betting odds 4.2). This is followed by Argentina with 16.1% (betting odds 6.2), France with 14.3% (betting odds 7), England with 11.4% (betting odds 8.8), Spain with 11.1% (betting odds 9) and Germany with 8.3% (betting odds 12). Saudi Arabia and Costa Rica have the lowest chances of winning with 0.1% each (betting odds 1000). However, these probability predictions can change at any time if there is relevant new information. For example, if Messi were to get injured, that would immediately change all betting odds and therefore all probability predictions. The betting market processes this information incredibly fast.
What game theory can and cannot do
Game theory offers a formal language to study strategic interaction. Is it possible to use game theory to predict the outcome of every possible strategic interaction in the real world? Of course not. There are, at least, two problems. One is that game theory, when applied to some real-life strategic interaction, needs to be empirically well informed. The game theorist needs to know who the relevant decision makers (the players) are, what they can do (their strategies), and how they care about the various outcomes (their payoff function). The game theorist also needs to know what the relevant decision makers know about each other, what they know about each other’s motives, and what they know about what others know (the players’ information). Only if the game theorist knows all this, or has a good first-order approximation of all this, can they begin to try to come up with a prediction for the strategic interaction at hand.
For instance, it may be possible for a very well-informed game theorist to use game theory to predict the outcome of the war in Ukraine, but it is certainly not possible for me, as I wouldn’t even know where to begin modeling this conflict as an empirically reasonable game. If you asked me what I believe the outcome of this conflict will be, I would first ask you questions. And as you would probably not be able to answer most of them, I would not be able to give you a prediction. For a good game theoretic prediction, you would, therefore, need a game theorist teaming up with someone who has the appropriate empirical knowledge. That is why I, in this blog, tend to write about real-life strategic interaction, typically of a smallish nature, that I have personal experience with, so that I can be reasonably confident that I will make the appropriate assumptions when I build my game theoretic model. After all, I am no expert on any particular real-life strategic interaction; I am an expert (if at all) on how to formally model strategic interaction and how to mathematically “solve” these models.
But there is also a second problem. Suppose the game theorist does have a good idea of what the right game is for the real-life situation at hand. The game theorist may still struggle to come up with good predictions. Consider tic-tac-toe. You know, the game with crosses and circles. There is a 3 by 3 grid and players alternate, one player placing crosses and the other circles in this grid. The first of the two players to have managed to place three of their symbols in a row, horizontally, vertically, or diagonally, wins (and the other player loses). If neither player can do so, it is a draw. Suppose both players want to win. With this knowledge, the game theorist has, one would think, all they need to be able to write down the appropriate game. Ok, one could argue, and in that case what I will write in a moment is just a version of the first problem, that the game theorist still doesn’t necessarily know how much the players understand the game. In fact, this is the problem that I wanted to come to. A game theorist would be able to come up with predictions for this game. This is a relatively simple game, with the feature that both players could, in principle, at least guarantee themselves a draw. So, Nash equilibrium, for instance, would predict that every such game ends in a draw.
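The claim that both players can guarantee themselves at least a draw can be checked by brute force. Here is a minimal sketch using plain backward induction (minimax) over all positions; the board encoding as a 9-character string and the function names are my own choices for illustration.

```python
from functools import lru_cache

# All eight winning lines on a board indexed 0..8, row by row.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None

@lru_cache(maxsize=None)
def value(board, player):
    """Outcome under optimal play by both sides, from X's point of view:
    +1 if X wins, 0 for a draw, -1 if O wins. `player` moves next."""
    won = winner(board)
    if won == "X":
        return 1
    if won == "O":
        return -1
    if "." not in board:
        return 0  # board full, nobody has three in a row
    other = "O" if player == "X" else "X"
    outcomes = [value(board[:i] + player + board[i + 1:], other)
                for i, cell in enumerate(board) if cell == "."]
    # X picks the best outcome for X, O picks the worst outcome for X.
    return max(outcomes) if player == "X" else min(outcomes)

print(value("." * 9, "X"))  # 0: optimal play from the empty board is a draw
```

The value of the empty board is 0, so neither player can force a win against an opponent who plays well, which is the prediction discussed above.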
I want to argue that, first, this is, of course, empirically wrong, yet, second, this is still very useful knowledge and helps us explain something about the world.
Ok, first, of course not every real-life tic-tac-toe game ends in a draw. I have often seen kids play this game and observed that sometimes one child and sometimes another wins this game. I have also involuntarily watched a grown man one row in front of me on the airplane repeatedly losing tic-tac-toe against the computer in the easy setting of the tic-tac-toe program on the back of the seat in front of him.
As a game theorist, you would really need to know a lot more than the players, strategies, and payoff functions to be able to predict all these outcomes correctly every single time. You would probably need to know what Jeeves (from the P.G. Wodehouse books – if you don’t know about this you are missing out) would call the “psychology of the individual.” I like this term because it stresses the term “individual.” It would not suffice to know something general about psychology; you need to know the individual, as obviously not everyone plays the game in the same way. In fact, you need to know even more: you need to know how the players feel at the moment, how motivated they are, what they know and what they have learned about the game so far, and probably much more, to have any hope of coming up with a good prediction of their behavior every single time.
I believe that there are cases of real-life strategic interaction, such as some random kids playing tic-tac-toe, that are essentially unpredictable. This does not discourage me from using game theory. Why not? I still think the Nash equilibrium prediction (other ideas based on the elimination of dominated strategies would also come to the same prediction) is useful. It explains why we do not have high-prize competitive tic-tac-toe tournaments. If we had such tournaments, the incentives to win in these, given the high prizes, would be very high. People would have a strong incentive to learn to play this game well. Such learning will eventually, at least in simple games such as tic-tac-toe, lead to Nash equilibrium behavior. As this behavior is finally always the same it would be very boring. Nobody would come and watch a tic-tac-toe tournament where every game ends in a draw and the winner has to be decided by coin flip every single time.
While the Nash equilibrium prediction would, thus, be empirically wrong in many real-life settings and probably also in many potential lab experiments, it is still useful to understand something about our world.
Not all sports have strategic elements. There is not much strategizing in the 100-meter dash. You just run as fast as you can. That’s all there is to it. I have also never heard of a 100-meter racer employing a strategy coach. Mental coaching, yes, physical coaching, of course, yes, but strategic coaching, no. It is just not necessary. All they would say is: “Try to run as fast as possible” and you don’t need to pay someone to give you this kind of advice. There is probably also not much strategic concern in the long jump, the high jump, pole vaulting, and javelin. I guess there is little strategic thinking in most of athletics. But there are also lots of sports, even within or close to athletics, that do have a substantial element of strategic consideration.
Take the so-called “sprint” race of track cycling. The main rules are as simple as those of the 100-meter dash. To quote from a reliable source, “two riders race off over three laps to see who is fastest.”
The first time I watched such a race at some Olympics, I expected, now I know somewhat naively, that the race would be simply a cycling version of the 100-meter dash. The two cyclists would come out flying and race as fast as possible until one of them crosses the finish line. This is not what happens. What happens is that at the crack of the starting pistol essentially nothing happens. The two racers amble off, almost going as slowly as possible. You can see this for yourself. The front “racer” meanders around, sometimes veering left (or down the track, as the track is slanted) and sometimes veering right (or up the track). The other rider carefully mimics these movements. Sometimes they almost come to a standstill. This goes on for quite some time and all this time the front “racer” constantly turns their head to look back to check up on their opponent. At some, ex-ante unclear, point in time, one of the two, finally and suddenly, starts to pedal as hard as possible, with the other racer trying to catch this moment and trying to start pedaling as hard as possible as well. Finally, the race deserves its “sprint” name.
Why don’t we see what I expected to see, that both cyclists race off as fast as possible from the start? Well, the reason is that this is simply, in the language of game theory, not a Nash equilibrium. When your opponent does race off as fast as possible, you do not want to do the same. What you want to do, this is the best response behavior as game theorists would call it, is that you go fast, yes, but behind the other cyclist, in their slipstream. Because of the physical advantage that this gives you, you do not have to exert quite as much effort as the front cyclist does, and after you have been in the slipstream for about two of the three rounds, you can come out of it with an even higher speed and overtake your opponent just before the finish line. More often than not, with that strategy, and if your opponent just races as fast as possible, you will come out on top.
Because of this, nobody just races off like that. I wonder if the current way the race pans out is what the original inventors of the sport had in mind. I wouldn’t be surprised if the original inventors were just as naïve as I was in thinking that the two cyclists would simply race as fast as possible. I could imagine that, for a while, the cyclists would even have done this. Imagine two very unevenly matched cyclists. The stronger of the two could then probably win simply by going as fast as possible. The slipstream effect is not THAT strong. It only becomes an issue, I would think, when the two cyclists are pretty similar in their athletic ability, and, thus, only when the sport becomes quite competitive. I couldn’t find any good source for the history of this event, but there are some indications that it took a while for the currently observed behavior to unfold. First, the sport has seen occasional rule changes: at some point, rules were added that cyclists are not allowed to stop completely or go backward or put their feet on the floor. These are indications that a Nash equilibrium had gradually established itself (perhaps because of a higher degree of competitiveness and a certain (slow?) learning) in which the competitors did exactly that. The rules were then introduced because otherwise the sport would have become too boring. Perhaps there were races that, instead of the minute or so that a race should last, took hours to complete, because none of the racers was willing to start the race properly, thinking they would lose the advantage by doing so.
Second, another track cycling event, the individual pursuit, was introduced, quite a bit later if I understood this correctly, and it can also be seen as a kind of rule change, although in this case both kinds of races have remained (somewhat) popular. The individual pursuit is very much like the sprint race, except that the two cyclists start from opposite positions (half a track apart). Slipstream effects cannot come into it now.
In any case, the sprint race is actually quite interesting to watch, mainly because it has interesting strategic considerations. I guess what we are seeing is pretty close to Nash equilibrium behavior. Moreover, this Nash equilibrium behavior is quite complex. As it also probably strongly depends on the relative strengths of the two competitors, it is also somewhat varied. You don’t get the same predictable outcome in every race. All this, I believe, explains why the track cycling sprint event has remained a (reasonably) popular and Olympic event, despite or perhaps even because it has such strong strategic considerations that seem in strong contrast to the perhaps originally intended pure athletic nature of the competition.
Winnie-the-Pooh’s research philosophy
This will sound like I am bragging – and I guess I am – but I follow the research methodology of Winnie-the-Pooh. Here is the excerpt from one of A. A. Milne’s Winnie-the-Pooh books that I am referring to:
“Hallo, Pooh,” said Rabbit.
“Hallo, Rabbit,” said Pooh dreamily.
“Did you make that song up?”
“Well, I sort of made it up,” said Pooh. “It isn’t Brain,” he went on humbly, “because You Know Why, Rabbit; but it comes to me sometimes.”
“Ah!” said Rabbit, who never let things come to him, but always went and fetched them.
Mind you, I think it is extremely important that we have researchers who, like Rabbit, go and fetch things. I have recently been to a very nice presentation about a very good empirical paper. While I don’t completely remember what the paper was about exactly, I do remember that it required a lot of well-informed hard work. The authors needed to acquire two data sets (or maybe more) – I think it was about patents – that they then had to semi-automatically match to each other, as both data sets had some of the same patents (or was it firms?) in them, but with different information, and what the authors required was one data set with all the information. They could only partially automate this process (and even writing the required program for that was labor intensive). After that, they had to manually go over thousands of rows in Excel to make corrections. To then identify the appropriate regression model, the authors had to create lots of variables, such as dummies and interaction terms. As most variables did not exactly measure what they needed, they had to make lots of difficult decisions as to which is the best proxy from their set of variables for each variable they were really interested in. As they then had potential endogeneity problems, they had to consider and weigh arguments in favor of one or another instrumental variable for whatever they were really interested in. They then had to make many other regression modeling choices – I forget what they were – but something like: should it have country or state fixed effects, should it have year fixed effects, should it have a trend, should they take logs, and so on.
In the end, I came away with two feelings: One, that the authors did the best they could do given the data that was available to them; and two, that it must have been a lot of hard (and uninteresting) work. Mind you, I was also still left somewhat unconvinced about the actual take-away, but this is not what I want to talk about here.
While I applauded the work and I think it could hardly have been done better, it was not a paper that I would say I wished I had done. Well, let me clarify this. There may be people who enjoy running marathons, and those are the people that should certainly run marathons. Many more people, however, would probably have loved to have run a marathon, while not enjoying the actual process of running it. I felt a bit of that, of course. I mean I would be reasonably proud to have written the paper, as it was well done – as well done as it could have been done. But, actually, I probably wouldn’t even have enjoyed having run this marathon, because it is a marathon through mud and dirt and an unclear path, and I would remember all this pain for years to come. So no, I probably would not even have wanted to have written the paper. I do not envy the authors for their achievement. They have worked unpleasantly hard for it.
I much prefer Pooh’s method of letting things come to him instead of going to get them. I am lucky to be (mostly) a game theorist. If you are working empirically, it is hard to let things like the appropriate data just come to you. Nothing much will come, I fear, and you will eventually have to get off your armchair and go and fetch it. But theory, trying to come up with an explanation for something, might come to you in the right circumstances even while you are sitting in your armchair.
Of course, just sitting in your armchair day in and day out will, most likely, not work. To have interesting and novel things come to you, you probably need to immerse yourself in something. You can’t get something out of nothing. You can’t, for instance, come up with something interesting to say about the behavior of monks in monasteries, if you don’t even know that there are such things as monasteries. So, you probably do need to read. Perhaps even empirical work, so that you can learn what there is out there that you could explain. You could also read (or watch) fiction. For instance, that kind of fiction that offers pretty good and realistic depictions of what happens in the world. Having said that, even abstract art can do the same for you as well. Or you could go and observe the world, possibly while doing something pleasant yourself in it. Or a bit of all of the above.
Once you know what it is that you are after, game theory work typically eventually reduces to a math problem. You then often spend many hours or days trying to make some headway with some math problem. That is when Pooh’s statement that “[i]t isn’t Brain” seems quite apt. I often feel quite stupid during those hours and days. But, interestingly, after tackling some math problem for hours or days, and getting a bit frustrated, you might just wake up at 3 am when suddenly a new way of tackling the problem or even a solution presents itself to you. I don’t completely understand how it works, but I believe that without the seemingly pointless hours or days of tackling the problem, nothing would suddenly come to you in the small hours of the morning.
I do believe that you need both kinds of researchers: Rabbits and Poohs. Sometimes, it makes sense to organize a comprehensive and labor-intensive search, such as Rabbit does to try to find “Small”, one of Rabbit’s many “Friends and Relations”. Sometimes it makes sense to think more deeply about a problem and wait (with some perhaps unorthodox and seemingly unrelated work and activities involved) until a solution comes to you.
Having said that, in the case of finding “Small,” Pooh (and not one of the helpers that Rabbit has organized) does so by accident, by falling into the right pit in the ground (in which “Small” was) by following his stomach – he was hungry and went to this place because he thought there was a pot of honey there. I guess this was just luck, wasn’t it?
The law of how late people choose to be
An organizer of a small research workshop, observing that all the locals were late, once remarked that “the closer people are to the venue the later they come.” If I remember my school days correctly, the same could have been said of my classmates. Of course, this is not a precise law, as people differ in how much they dislike being late (or early) to things, but on the whole, there could well be some empirical truth in this statement. In this post, I will sketch a little decision theoretic model that delivers this law.
We want to model a person deciding when to leave from home to go to work (or school or a conference venue or whatever their destination is). Let me phrase this as the problem of choosing an intended delay in getting there. Let me call this chosen intended delay $d$, which can be any real number. A negative delay means you are early. But $d$ is only the intended delay; the actual delay depends on traffic and such things. Call the actual delay $d + \varepsilon$, where $\varepsilon$ is a random variable with zero mean and some non-negative variance $\sigma^2$. The variance is my attempt to capture the random traffic conditions. A person who lives far from their destination quite possibly faces a higher variance in their travel time than a person who lives closer to their destination. The law I am looking for would then state that the higher the variance that a person faces for their travel time the earlier this person will be on average.
The decision maker chooses the intended delay, but what is their goal? I think it makes sense to assume that they prefer not to be late, and that in such a way that the later they are the worse it is. But it is probably also not great to be too early, as they could use the waiting time better at home before they leave. One option would be to consider preferences as captured by the utility function (often used as the loss function in statistics)
$$u(d, \varepsilon) = -(d + \varepsilon)^2.$$
This utility function captures both ideas: ideally people would show up exactly on time, that is the realized actual delay is zero, giving a utility of zero; they dislike being early and also being late, as in both cases the utility is negative. Also, the further away the realized actual delay is from zero (in both directions) the lower is this person’s utility.
However, a person with this utility function would choose to have zero expected delay regardless of the variance. To do so they would choose an intended delay of zero. This means that everybody will arrive on time on average, and those with a higher variance will be more random in their delay, sometimes being quite early, sometimes quite late. There is no law that links the variance of travel times to a person’s average delay.
This is because under the given utility function the decision maker cares equally about being early and being late, the two concerns exactly offsetting each other. This may not be such a reasonable assumption for most cases. In many cases people probably worry more about being late than about being early. A simple way to accommodate such preferences is to consider the utility function
$$u(d, \varepsilon) = -e^{d/\sigma}\,(d + \varepsilon)^2.$$
The multiplicative term, the function $e^{d/\sigma}$, is always positive, but is smaller for small (such as negative) delays and larger for large (such as positive) delays. Dividing by $\sigma$ in the exponent is a kind of normalization that makes my life easier below. Different such positive and increasing functions would lead to different, yet qualitatively similar, laws at the end.
So, suppose our decision maker behaves as if they had such preferences as captured by this utility function. What would be their optimal choice? Their optimal intended delay would be the one that maximizes their expected utility. This problem can be written as
$$\max_{d} \; \mathbb{E}\left[-e^{d/\sigma}\,(d + \varepsilon)^2\right],$$
where $\mathbb{E}$ indicates the expectation with respect to the random additional delay $\varepsilon$. Equivalently, the problem can be written as
$$\min_{d} \; e^{d/\sigma}\,\mathbb{E}\left[(d + \varepsilon)^2\right].$$
We can solve this by differentiating (finding the first order conditions, using the product rule) with respect to $d$. We obtain
$$\frac{1}{\sigma}\, e^{d/\sigma}\,\mathbb{E}\left[(d + \varepsilon)^2\right] + e^{d/\sigma}\,\mathbb{E}\left[2(d + \varepsilon)\right] = 0.$$
Dividing by $e^{d/\sigma}$, multiplying by $\sigma$, noting that the expectation of a sum is the sum of expectations, and recalling that $\mathbb{E}[\varepsilon] = 0$, we get
$$\mathbb{E}\left[(d + \varepsilon)^2\right] + 2\sigma d = 0.$$
By the fact that $\mathbb{E}\left[(d + \varepsilon)^2\right] = d^2 + 2d\,\mathbb{E}[\varepsilon] + \mathbb{E}[\varepsilon^2]$ and noting that the variance of $\varepsilon$ is $\mathbb{E}[\varepsilon^2] = \sigma^2$, we get
$$d^2 + 2\sigma d + \sigma^2 = 0.$$
Note that $d^2 + 2\sigma d + \sigma^2 = (d + \sigma)^2$ and, thus, the optimal intended delay is simply
$$d^* = -\sigma.$$
In words, the decision maker would optimally aim to arrive one standard deviation early, and we have our law of how late people are as a function of how far they start from their destination, or more correctly as a function of how much variance they face or at least perceive for their travel time.
Our very specific law, finally, states that people aim to be one standard deviation (of their random travel time) early! 😉
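The first-order condition behind this law can be checked numerically. The sketch below assumes that expected utility takes the form $-(d^2+\sigma^2)\,e^{d/\sigma}$, with $d$ the intended delay and $\sigma > 0$ the standard deviation of the random additional delay. The symbols, the function names, and the step size `h` are assumptions made for this illustration. It only checks that the derivative vanishes at $d = -\sigma$ and says nothing about second-order conditions.

```python
import math


def expected_utility(d, sigma):
    # Assumed form: E[-(d + eps)^2] * exp(d / sigma), with E[eps] = 0 and Var(eps) = sigma^2.
    return -(d * d + sigma * sigma) * math.exp(d / sigma)


def slope(d, sigma, h=1e-5):
    # Central finite-difference approximation of the derivative with respect to d.
    return (expected_utility(d + h, sigma) - expected_utility(d - h, sigma)) / (2 * h)


for sigma in (0.5, 1.0, 3.0):
    # First number: slope at d = -sigma (close to zero); second: slope at d = 0 (not zero).
    print(sigma, slope(-sigma, sigma), slope(0.0, sigma))
```

Analytically, the derivative of this assumed objective is $-e^{d/\sigma}(d+\sigma)^2/\sigma$, which is zero only at $d = -\sigma$, in line with the perfect square in the derivation.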
Occasionally, I discuss strategic voting in some of my game theory classes. In particular, in past classes I have highlighted the trivial point that if people have only two choices (or candidates) to vote for, then it is a (weakly) dominant strategy to vote for your more preferred choice: voting for your more preferred choice can never lead to a worse outcome for you than voting for your less preferred choice. So, everyone will vote for their favorite choice and the vote winner reflects the majority preference. I regarded it as a bit of an embarrassment that such a voting game also has other equilibria, in which some people vote for their less preferred choice. For instance, if every one of the n-1 people other than you (with n ≥ 3) votes for choice A and you prefer choice B, it is immaterial whether you vote for A or B because your vote will simply not matter. One can, in theory, even have the paradoxical situation that everyone prefers A over B and yet everyone votes for B. This is an equilibrium! Just not a very plausible one, I would have argued. I have recently learnt not to discard the weakly dominated equilibria of such a voting game too quickly.
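Both statements can be verified by brute force in a small example. The following Python sketch assumes simple majority voting between A and B, with a tie counted as a payoff of 0.5 (the tie-breaking rule, the payoff numbers, and all function names are assumptions made for this illustration, not part of the argument above).

```python
from itertools import product


def payoff(votes, preferred):
    # 1 if the voter's preferred choice wins, 0.5 on a tie (assumed), 0 otherwise.
    a = votes.count("A")
    b = len(votes) - a
    if a == b:
        return 0.5
    winner = "A" if a > b else "B"
    return 1.0 if winner == preferred else 0.0


def is_nash(votes, preferences):
    # No voter can strictly gain by unilaterally switching their vote.
    for i, pref in enumerate(preferences):
        other = "B" if votes[i] == "A" else "A"
        deviated = votes[:i] + (other,) + votes[i + 1:]
        if payoff(deviated, pref) > payoff(votes, pref):
            return False
    return True


def sincere_vote_is_weakly_dominant(n):
    # For a voter who prefers A: voting A is never worse than voting B,
    # whatever the other n - 1 voters do.
    return all(
        payoff(("A",) + others, "A") >= payoff(("B",) + others, "A")
        for others in product("AB", repeat=n - 1)
    )


everyone_prefers_a = ("A", "A", "A")
print(sincere_vote_is_weakly_dominant(3))
print(is_nash(("B", "B", "B"), everyone_prefers_a))
print(is_nash(("A", "A", "A"), everyone_prefers_a))
```

With only two voters the paradoxical profile stops being an equilibrium in this sketch, because a single switch creates a tie; this is why the condition n ≥ 3 matters.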
A simple model of community enforcement
Here is the model for my previous blog post on (anonymous) community enforcement. I would call it a simplified symmetric (single-population) version of the model in the paper by Michihiro Kandori entitled “Social norms and community enforcement” in the Review of Economic Studies 59.1 (1992): 63-80. The point of this blog post is to demonstrate that what I have claimed in the previous post can be made logically coherent. I can provide a reasonable and simple artificial world in which we obtain cooperative behavior under the fear of triggering a tipping point as a subgame perfect Nash equilibrium, meaning a self-enforcing situation that is even self-enforcing when the tipping point has been triggered and there is no way back.
There are $n$ people involved. These are those who are interested in going up the mountain, the people on the train, or the users of the communal kitchen. Time is discrete and runs from time points $t = 0, 1, 2, \ldots$ until infinity. At every point in time one person is randomly drawn to undertake the activity (go up the mountain, use the bathroom, or the kitchen) so that each person has a probability of $\frac{1}{n}$ of being drawn. The drawn person first observes (and only that) the state of the resource (the amount of rubbish on the mountain, or the state of uncleanliness of the bathroom or kitchen). Then this person (after using the resource) decides whether to clean up ($C$ – for cooperate or clean up) after themselves or not ($D$ – for defect, to use the language of the well-known prisoners’ dilemma game).
The instantaneous utility that the drawn person then receives shall be given by $\alpha^k$, where $k$ is the state of the resource – let us assume here that it is simply equal to the number of people who have been drawn before this person and who have chosen action $D$, that is not to clean up after themselves. This is the instantaneous utility this person receives when this person chooses $C$. When this person chooses $D$ they get the same utility, plus a small but positive term $\beta$ for not having to clean up. Let us assume that $0 < \alpha < 1$, so this person receives less payoff the worse, that is the higher, the state of the resource is. As a function of $k$, the function $\alpha^k$ starts at $1$ for $k = 0$ and then exponentially decays to the limit value of $0$ when $k$ tends to infinity. When a person is not drawn to use the resource at some point in time this person receives an instantaneous utility of $0$. Every person discounts the future exponentially with a discount rate $\delta \in (0, 1)$. This means that they evaluate streams of utils $u_0, u_1, u_2, \ldots$ with the net present value $(1 - \delta) \sum_{t=0}^{\infty} \delta^t u_t$, where the term $(1 - \delta)$ is a convenient normalization.
For a well-defined game theoretic model, we need to identify players, their information, their strategies, and their payoffs. We have players and what they know and we have their payoffs. We have not quite yet defined their possible strategies, but we have specified their actions. To conclude the model we, thus, only have to define players’ possible strategies. These are all possible functions from the set of possible values of $k$, that is the set $\{0, 1, 2, \ldots\}$, to the set of actions $\{C, D\}$. In principle, we should allow a bit more, as our players should probably remember what the state was at previous times when they were using the resource and also what they themselves did at these points in time, but this does not add or change anything of interest in our present analysis.
My claim then was that, at least for certain parameter ranges (for $n$, $\alpha$, $\beta$, and $\delta$), the following strategy is a subgame perfect Nash equilibrium: Play $C$ if $k = 0$ and play $D$ otherwise. This kind of strategy is often referred to, in the repeated game literature, as a “grim trigger” strategy. In order to see this, we need to check two things. First, suppose everyone uses this strategy, which means that the play path has everyone cooperating (keeping the resource clean): is it best for a drawn player to also do so? Second, suppose the “trigger” has been released by someone playing $D$, that is by someone not cleaning up after themselves: is it best for a drawn person to then also play $D$ (to also not clean up after themselves)?
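To make the strategy concrete, here is a minimal Python sketch of how the state of the resource evolves when everyone follows the grim trigger strategy, with an optional one-off deviation. It assumes that the state counts the number of past users who did not clean up; the function names, the `deviate_at` parameter, and the random seed are hypothetical and only serve the illustration.

```python
import random


def grim_trigger(state):
    # Clean up ("C") only if the resource is perfectly clean, otherwise do not ("D").
    return "C" if state == 0 else "D"


def simulate(periods, n, deviate_at=None, seed=0):
    # Returns a list of (time, drawn player, observed state, action).
    rng = random.Random(seed)
    state = 0
    path = []
    for t in range(periods):
        player = rng.randrange(n)  # each of the n people is drawn with probability 1/n
        action = "D" if t == deviate_at else grim_trigger(state)
        path.append((t, player, state, action))
        if action == "D":
            state += 1  # one more person who has not cleaned up
    return path


for row in simulate(periods=8, n=5, deviate_at=3):
    print(row)
```

Without a deviation the observed state stays at zero forever; after a single deviation every later user observes a positive state and the state rises by one in each period.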
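So, suppose first that everyone uses this strategy. Then a randomly drawn player at some point in time, that we can, without loss of generality, call time $t = 0$, finds the following (expected) payoff consequences for their two possible choices $C$ and $D$ for all time periods from then on:
$$\begin{array}{c|cccc} \text{time } t & 0 & 1 & 2 & \cdots \\ \hline C & 1 & \frac{1}{n} & \frac{1}{n} & \cdots \\ D & 1 + \beta & \frac{1}{n}(\alpha + \beta) & \frac{1}{n}(\alpha^2 + \beta) & \cdots \end{array}$$
Recall that $n$ is the number of people, $\alpha^k$ with $0 < \alpha < 1$ is the instantaneous utility in state $k$, $\beta > 0$ is the small extra utility from not cleaning up, and $\delta$ is the discount rate.
The net present value for choosing $C$ at this point in time is then
$$(1 - \delta) \left[ 1 + \frac{1}{n} \sum_{t=1}^{\infty} \delta^t \right] = (1 - \delta) + \frac{\delta}{n}.$$
For choosing $D$ at this point in time it is
$$(1 - \delta) \left[ 1 + \beta + \frac{1}{n} \sum_{t=1}^{\infty} \delta^t \left( \alpha^t + \beta \right) \right] = (1 - \delta)(1 + \beta) + \frac{\delta}{n} \left[ \frac{(1 - \delta) \alpha}{1 - \alpha \delta} + \beta \right].$$
The grim trigger strategy is a Nash equilibrium if and only if such a player would prefer $C$ over $D$,
that is if and only if
$$(1 - \delta) + \frac{\delta}{n} \geq (1 - \delta)(1 + \beta) + \frac{\delta}{n} \left[ \frac{(1 - \delta) \alpha}{1 - \alpha \delta} + \beta \right].$$
According to my calculations, this is equivalent to
$$\frac{\delta}{n} \left( \frac{1 - \beta}{1 - \delta} - \frac{\alpha}{1 - \alpha \delta} \right) \geq \beta.$$
It is not straightforward to derive nice bounds for $\delta$ (as a function of the other parameters) so that this inequality is satisfied. But we can at least say that, if people are sufficiently patient, that is for $\delta$ close to $1$, the inequality is satisfied provided $\beta < 1$ as well, which I assumed anyway – I stated that I wanted $\beta$ positive and small.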
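The comparison of the two net present values can be checked numerically. The sketch below is written under stated assumptions: instantaneous utility $\alpha^k$ in state $k$ (with $0 < \alpha < 1$), an extra $\beta$ for not cleaning up, utility $0$ when not drawn, $n$ people, and discount rate $\delta$. The parameter values, function names, and the truncation horizon of 2000 periods are illustrative choices only.

```python
def npv(stream, delta):
    # Normalized net present value (1 - delta) * sum_t delta^t * u_t of a finite stream.
    return (1 - delta) * sum(delta ** t * u for t, u in enumerate(stream))


def cooperate_stream(n, horizon):
    # Clean up at time 0: utility 1 now, then expected utility 1/n in every later period.
    return [1.0] + [1.0 / n] * (horizon - 1)


def defect_stream(n, alpha, beta, horizon):
    # Do not clean up at time 0: utility 1 + beta now. Afterwards the state at time t is t,
    # nobody cleans up, and the player is drawn with probability 1/n.
    return [1.0 + beta] + [(alpha ** t + beta) / n for t in range(1, horizon)]


def cooperation_condition(n, alpha, beta, delta):
    # Closed-form version of "cleaning up is at least as good as not cleaning up".
    return (delta / n) * ((1 - beta) / (1 - delta) - alpha / (1 - alpha * delta)) >= beta


n, alpha, beta = 10, 0.5, 0.1  # illustrative values only
for delta in (0.3, 0.9):
    v_c = npv(cooperate_stream(n, 2000), delta)
    v_d = npv(defect_stream(n, alpha, beta, 2000), delta)
    print(delta, round(v_c, 4), round(v_d, 4), cooperation_condition(n, alpha, beta, delta))
```

For these illustrative values the impatient case ($\delta = 0.3$) favors not cleaning up, while the patient case ($\delta = 0.9$) favors cleaning up, and the closed-form condition agrees with the direct comparison in both cases.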
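For the second part of the argument, suppose that the “trigger” has been released and that everyone is playing $D$. Suppose the drawn person at some given point in time faces a state of uncleanliness of $k \geq 1$. We can again reset the clock to zero without loss of generality. Then, the (expected) consequences of the two possible actions for this person are (with $n$ the number of people, $\alpha^k$ the instantaneous utility in state $k$, $\beta$ the extra utility from not cleaning up, and $\delta$ the discount rate):
$$\begin{array}{c|cccc} \text{time } t & 0 & 1 & 2 & \cdots \\ \hline C & \alpha^k & \frac{1}{n}(\alpha^{k} + \beta) & \frac{1}{n}(\alpha^{k+1} + \beta) & \cdots \\ D & \alpha^k + \beta & \frac{1}{n}(\alpha^{k+1} + \beta) & \frac{1}{n}(\alpha^{k+2} + \beta) & \cdots \end{array}$$
The net present value for choosing $C$ at this point in time is then
$$(1 - \delta) \left[ \alpha^k + \frac{1}{n} \sum_{t=1}^{\infty} \delta^t \left( \alpha^{k+t-1} + \beta \right) \right].$$
For choosing $D$ at this point in time it is
$$(1 - \delta) \left[ \alpha^k + \beta + \frac{1}{n} \sum_{t=1}^{\infty} \delta^t \left( \alpha^{k+t} + \beta \right) \right].$$
It is in the best interest of this person to choose $D$ rather than $C$ if and only if the latter is greater than or equal to the former, and this is the case, according to my calculations, if and only if
$$\delta \leq \frac{n \beta}{\alpha^k (1 - \alpha) + n \beta \alpha}.$$
The right hand side is lowest for $k = 1$ (among all positive integer values for $k$), when it is
$$\frac{n \beta}{\alpha (1 - \alpha) + n \beta \alpha}.$$
Rearranging, one can see that the right hand side of this inequality is greater than or equal to one, meaning that there is in fact no restriction on $\delta$, if and only if
$$n \beta \geq \alpha.$$
Recall that $0 < \alpha < 1$. Then finally, this last inequality is satisfied if, for instance, $n$ is sufficiently large or $\alpha$ is positive but very small.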
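The second check can be sketched numerically in the same way. The code below assumes, as an illustration, instantaneous utility $\alpha^k$ in state $k \geq 1$, an extra $\beta$ for not cleaning up, utility $0$ when not drawn, $n$ people, and discount rate $\delta$. The bound on $\delta$ used here, $n\beta / (\alpha^k(1-\alpha) + n\beta\alpha)$, follows from these assumptions; all names, parameter values, and the truncation horizon are hypothetical.

```python
def npv(stream, delta):
    # Normalized net present value (1 - delta) * sum_t delta^t * u_t of a finite stream.
    return (1 - delta) * sum(delta ** t * u for t, u in enumerate(stream))


def punishment_streams(n, alpha, beta, k, horizon=2000):
    # After the trigger everyone else plays D. Cleaning up once (C) keeps the state
    # one step lower in every later period than not cleaning up (D).
    clean = [alpha ** k] + [(alpha ** (k + t - 1) + beta) / n for t in range(1, horizon)]
    defect = [alpha ** k + beta] + [(alpha ** (k + t) + beta) / n for t in range(1, horizon)]
    return clean, defect


def delta_bound(n, alpha, beta, k):
    # D is at least as good as C in state k if and only if delta <= this bound.
    return n * beta / (alpha ** k * (1 - alpha) + n * beta * alpha)


for n, alpha, beta, delta in ((10, 0.5, 0.1, 0.9), (2, 0.9, 0.05, 0.9)):
    clean, defect = punishment_streams(n, alpha, beta, k=1)
    d_is_best = npv(defect, delta) >= npv(clean, delta)
    print(n, alpha, beta, d_is_best, delta <= delta_bound(n, alpha, beta, 1), n * beta >= alpha)
```

In the first parameter set $n\beta \geq \alpha$ holds, so not cleaning up is a best response in every dirty state for any discount rate; in the second it fails and a patient player would rather clean up, so the punishment would not be self-enforcing there.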
All this together proves that, in the model given here, the strategy of cleaning up after yourself provided the resource is clean before you used it, and not cleaning up if the resource is not clean before you used it, is a subgame perfect Nash equilibrium: It is self-enforcing and the implicit threat of a tipping point in behavior is also self-enforcing. The model is very specific and many other versions would work just as well to make the same point.
Please leave the bathroom as you would like to find it
Many actions that we take affect other people who are not involved in the decision-making process. In economics, these effects are commonly referred to as “externalities” and the presence of externalities is one of the main concerns that may render free markets inefficient. “Inefficient” means that the ultimate outcome of people ignoring the externalities that they impose on other people is such that there is an alternative outcome that would be better (or at least as good) for all people! The presence of externalities is the main problem behind climate change and also at least one reason why we still have a problem with Covid-19. When making decisions about holiday planning, car driving, air conditioning, car purchases, et cetera, people often ignore the effect their actions have on the environment and, thus, on all others. People make vaccination decisions weighing their own subjective assessment of the risks for themselves, without necessarily considering that with a vaccination they would also increase the protection from Covid-19 for everyone around them.
There are a variety of measures that governments or other organized groups of people can take to reduce the harm caused by people ignoring these externalities. One can, for instance, debate the (higher) taxation of fossil fuels or laws to force everyone to vaccinate. Often, such measures are probably necessary. There are (possibly rare) cases, however, in which even in fairly anonymous societies, the problem sorts itself out. It can do so through a mechanism of community enforcement. In this post I will describe an argument derived from a 1992 paper by Michihiro Kandori entitled “Social norms and community enforcement” in the Review of Economic Studies 59.1 (1992): 63-80. I will use the examples of mountain tops on which people do not leave their rubbish, bathrooms on trains that remain reasonably clean despite heavy usage, and communal kitchens (such as the one in my department) that despite a lack of regular professional cleaning service and despite a fair number of people using them, remain reasonably clean and usable.
I would first like to stress that in all three examples people are unlikely to observe your actions, so no one could punish you for bad behavior directly. When you are having your well-earned lunch or snack at the mountain top you are quite possibly alone at that moment. If you decide to leave some rubbish behind no one would see you doing it. I also hope that nobody can see what you do in the train bathrooms. And yes, occasionally you may not be alone in the department kitchen when you are making your coffee or warming up your lunch, but you are also often by yourself and unobserved. [It is, by the way, also not immediately clear what would happen if someone did observe your lack of adherence to the social norm of what is thought of as decent behavior (be it on the mountain top, the bathroom on the train, or the departmental kitchen). I often find that misbehavior in public may perhaps induce a fair bit of stern staring, but nothing much more than that. Nobody seems to want to engage in an altercation. An interesting phenomenon in its own right.] So why do people not leave rubbish on the mountain top, why do they clean up after themselves after using a bathroom, why do people wash, dry, and put away their dishes in the communal kitchen?
First, you might say, why wouldn’t you? Well, I guess the idea is that you would derive some benefit from not having to carry rubbish back down the mountain (after all it weighs something, also you might not have a good bag for your rubbish and it might soil all the other stuff you have in your backpack). You probably have to undertake some slightly unpleasant cleaning effort to keep the bathroom or the kitchen in a reasonable state. In fact, in your kitchen at home you might leave dirty dishes in the kitchen for quite some time, cleaning them later, while in a communal kitchen you probably do it (if at all) right away.
Then you might say, that ok, yes, it is a bit annoying having to do these things, but it is not too bad and anyway, you are a moral person. Maybe. I also would like to think that I am a moral person, but perhaps there is a more tangible reason behind our supposedly moral stance. [I generally don’t believe that people make all these decisions always so consciously. They may simply follow some more or less automatized protocols (perhaps as part of how they were raised as a child and now not often questioned). Then I will here provide a possible reason why such behavior might in fact be in your own self-interest despite the effort that is involved.]
Then you might say, and now you are on to something, that you have an interest in keeping the place clean, because you might want to use it yourself again. True, if it were your own mountain you would probably keep it clean. Perhaps you would not tidy up your kitchen immediately, but you would probably tidy it up at some point every day. But then you do not own the mountain and you are just one of many users. Wouldn’t this fact dilute your incentives to keep the place clean? Well, yes and no.
In fact, it seems quite plausible (and I have often observed this) that people do not clean up (much) after themselves if, for instance, the bathroom on the train is already in a bad state, even if they think that they might need to use it again. But they might well do so if the bathroom was clean when they started to use it. How can this be rationalized?
Let me sketch the model here (I will try and describe it in full detail in another post). Imagine you have a largish number of users of a place (such as the mountain top, the train bathroom, or the communal kitchen). Imagine that everyone uses this place infrequently but recurrently at random points in time. So, everyone always thinks that they may use this place again at some point in the future (I guess this works less well for the example of train bathrooms towards the end of the train ride – but then at that stage these bathrooms often are quite dirty). When they use it people can either be very clean (or clean up after themselves) or they can litter or soil the place. The instantaneous payoffs are such that people would (at that moment) prefer to litter or soil rather than clean. Finally, assume that nobody observes any actions of any other people, but everybody observes the state of the place when they use it (how much rubbish there is on the mountain, how clean the bathroom or kitchen is).
Then the following strategy, if employed by all people, can be made to be a subgame perfect Nash equilibrium of this symmetric stochastic repeated game, at least under some plausible conditions. [Subgame perfect Nash equilibrium means that this strategy is self-enforcing (everyone finds it in their interest to adhere to it when everyone else does) and does not involve a non-credible threat (any threats that are used to incentivize people to adhere to the strategy are also self-enforcing when they are supposed to be employed).] As long as the place is perfectly clean, keep the place clean (just as in the often-advertised statement “please leave the bathroom as you would like to find it”). If the bathroom is not perfectly clean, regardless of how bad it is, do not clean up after yourself.
If everyone follows this strategy, then the place would stay perfectly clean throughout. If, for some reason, somebody does not follow this strategy and does not clean up after themselves, this is a “tipping point” and the avalanche of dirt starts rolling: the place will just get dirtier and dirtier from then on. In reality, it may need more than one piece of rubbish on the mountain or more than just one sheet of loo paper on the floor to trigger the “tipping point”, and one could probably adapt the model (that you can find here) so that this would be the case. In any case, we do get that people will behave very differently when the place is already a mess than when it is clean. It is, at least partly, the fear of triggering the tipping point that incentivizes people to behave and to internalize the externality that bad behavior would impose on others.