# Home

• ## What game theory can and cannot do

Game theory offers a formal language to study strategic interaction. Is it possible to use game theory to predict the outcome of every possible strategic interaction in the real world? Of course not. There are at least two problems. One is that game theory, when applied to some real-life strategic interaction, needs to be empirically well informed. The game theorist needs to know who the relevant decision makers (the players) are, what they can do (their strategies), and how they care about the various outcomes (their payoff functions). The game theorist also needs to know what the relevant decision makers know about each other, what they know about each other’s motives, and what they know about what others know (the players’ information). Only if the game theorist knows all this, or has a good first-order approximation of all this, can they begin to try to come up with a prediction for the strategic interaction at hand.

For instance, it may be possible for a very well-informed game theorist to use game theory to predict the outcome of the war in Ukraine, but it is certainly not possible for me, as I wouldn’t even know where to begin modeling this conflict as an empirically reasonable game. If you asked me what I believe the outcome of this conflict will be, I would first ask you questions. And as you would probably not be able to answer most of them, I would not be able to give you a prediction. For a good game theoretic prediction, you would, therefore, need a game theorist teaming up with someone who has the appropriate empirical knowledge. That is why, in this blog, I tend to write about real-life strategic interaction, typically of a smallish nature, that I have personal experience with, so that I can be reasonably confident that I will make the appropriate assumptions when I build my game theoretic model. After all, I am no expert on any particular real-life strategic interaction; I am an expert (if at all) on how to formally model strategic interaction and how to mathematically “solve” these models.

But there is also a second problem. Suppose the game theorist does have a good idea of what the right game is for the real-life situation at hand. The game theorist may still struggle to come up with good predictions. Consider tic-tac-toe, the game with crosses and circles. There is a 3 by 3 grid, and the players take turns placing their symbols in it, one player crosses and the other circles. The first player to place three of their symbols in a row, horizontally, vertically, or diagonally, wins (and the other player loses). If neither player manages to do so, the game is a draw. Suppose both players want to win. With this knowledge, the game theorist has, one would think, all they need to write down the appropriate game. Ok, one could argue that the game theorist still doesn’t necessarily know how much the players understand the game, in which case what I will write in a moment is just a version of the first problem. In fact, this is the problem that I wanted to come to. A game theorist would be able to come up with predictions for this game. It is a relatively simple game, with the feature that both players can, in principle, at least guarantee themselves a draw. So Nash equilibrium, for instance, would predict that every such game ends in a draw.
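The claim that both players can guarantee at least a draw is easy to verify by brute force. Here is a minimal minimax sketch (my own illustration, not from the post) that computes the game value of tic-tac-toe from the empty board:

```python
from functools import lru_cache

# All eight winning lines on a 3x3 board indexed 0..8.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None

@lru_cache(maxsize=None)
def value(board, player):
    """Game value for X under optimal play: 1 X wins, -1 O wins, 0 draw."""
    w = winner(board)
    if w is not None:
        return 1 if w == "X" else -1
    if "." not in board:
        return 0
    nxt = "O" if player == "X" else "X"
    vals = [value(board[:i] + player + board[i + 1:], nxt)
            for i, s in enumerate(board) if s == "."]
    return max(vals) if player == "X" else min(vals)

print(value("." * 9, "X"))  # 0: with optimal play, tic-tac-toe is a draw
```

The value of the empty board is 0, a draw: either player, playing optimally, can prevent the other from winning, which is exactly the Nash equilibrium outcome described above.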

I want to argue that, first, this is, of course, empirically wrong, yet, second, this is still very useful knowledge and helps us explain something about the world.

Ok, first, of course not every real-life tic-tac-toe game ends in a draw. I have often seen kids play this game and observed that sometimes one child and sometimes another wins. I have also involuntarily watched a grown man one row in front of me on an airplane repeatedly lose at tic-tac-toe against the computer, on the easy setting of the tic-tac-toe program on the back of the seat in front of him.

As a game theorist, you would really need to know a lot more than the players, strategies, and payoff functions to be able to predict all these outcomes correctly every single time. You would probably need to know what Jeeves (from the P.G. Wodehouse books – if you don’t know about these you are missing out) would call the “psychology of the individual.” I like this term because it stresses the word “individual.” It would not suffice to know something general about psychology; you need to know the individual, as obviously not everyone plays the game in the same way. In fact, you need to know even more: how the players feel at the moment, how motivated they are, what they know and have learned about the game so far, and probably much more, to have any hope of coming up with a good prediction of their behavior every single time.

I believe that there are cases of real-life strategic interaction, such as some random kids playing tic-tac-toe, that are essentially unpredictable. This does not discourage me from using game theory. Why not? I still think the Nash equilibrium prediction (other ideas, based on the elimination of dominated strategies, would lead to the same prediction) is useful. It explains why we do not have high-prize competitive tic-tac-toe tournaments. If we had such tournaments, the incentives to win, given the high prizes, would be very strong. People would have a strong incentive to learn to play this game well. Such learning would eventually, at least in simple games such as tic-tac-toe, lead to Nash equilibrium behavior. As this behavior is ultimately always the same, it would be very boring. Nobody would come and watch a tic-tac-toe tournament where every game ends in a draw and the winner has to be decided by coin flip every single time.

While the Nash equilibrium prediction would, thus, be empirically wrong in many real-life settings and probably also in many potential lab experiments, it is still useful to understand something about our world.

• ## Strategic sports

Not all sports have strategic elements. There is not much strategizing in the 100-meter dash. You just run as fast as you can. That’s all there is to it. I have also never heard of a 100-meter racer employing a strategy coach. Mental coaching, yes, physical coaching, of course, yes, but strategic coaching, no. It is just not necessary. All they would say is: “Try to run as fast as possible” and you don’t need to pay someone to give you this kind of advice. There is probably also not much strategic concern in the long jump, the high jump, pole vaulting, and javelin. I guess there is little strategic thinking in most of athletics. But there are also lots of sports, even within or close to athletics, that do have a substantial element of strategic consideration.

Take the so-called “sprint” race of track cycling. The main rules are as simple as those of the 100-meter dash. To quote from a reliable source, “two riders race off over three laps to see who is fastest.”

The first time I watched such a race at some Olympics, I expected, as I now know somewhat naively, that the race would simply be a cycling version of the 100-meter dash: the two cyclists would come out flying and race as fast as possible until one of them crosses the finish line. This is not what happens. What happens is that at the crack of the starting pistol essentially nothing happens. The two racers amble off, going almost as slowly as possible. You can see this for yourself. The front “racer” meanders around, sometimes veering left (or down the track, as the track is slanted) and sometimes veering right (or up the track). The other rider carefully mimics these movements. Sometimes they almost come to a standstill. This goes on for quite some time, and all the while the front “racer” constantly turns their head to check up on their opponent. At some ex-ante unclear point in time, one of the two finally and suddenly starts to pedal as hard as possible, with the other racer trying to catch this moment and start pedaling as hard as possible as well. Finally, the race deserves its “sprint” name.

Why don’t we see what I expected to see, both cyclists racing off as fast as possible from the start? Well, the reason is that this is simply, in the language of game theory, not a Nash equilibrium. When your opponent races off as fast as possible, you do not want to do the same. What you want to do – this is what game theorists would call best response behavior – is to go fast, yes, but behind the other cyclist, in their slipstream. Because of the physical advantage that this gives you, you do not have to exert quite as much effort as the front cyclist does, and after you have been in the slipstream for about two of the three laps, you can come out of it with an even higher speed and overtake your opponent just before the finish line. More often than not, with that strategy, and if your opponent just races as fast as possible, you will come out on top.

Because of this, nobody just races off like that. I wonder if the current way the race pans out is what the original inventors of the sport had in mind. I wouldn’t be surprised if the original inventors were quite as naïve as I was in thinking that the two cyclists would simply race as fast as possible. I could imagine that, for a while, the cyclists even did this. Imagine two very unevenly matched cyclists. The stronger of the two could then probably win simply by going as fast as possible. The slipstream effect is not THAT strong. It only becomes an issue, I would think, when the two cyclists are pretty similar in their athletic ability, and, thus, only when the sport becomes quite competitive. I couldn’t find any good source for the history of this event, but there are some indications that it took a while for the currently observed behavior to unfold. First, the sport has seen occasional rule changes: at some point, rules were added that cyclists are not allowed to stop completely, go backward, or put their feet on the floor. These are indications that a Nash equilibrium had gradually established itself (perhaps because of a higher degree of competitiveness and a certain (slow?) learning) in which the competitors did exactly that. The rules were then introduced because otherwise the sport would have become too boring. Perhaps there were races that, instead of the minute or so a race should last, took hours to complete, because neither racer was willing to start the race properly, thinking they would lose the advantage by doing so.

Second, another sprint event, the individual pursuit, was introduced – quite a bit later, if I understand this correctly – which can also be seen as a rule change, in this case, however, with both kinds of races remaining (somewhat) popular. The individual pursuit is very much like the sprint race, except that the two cyclists start from opposite positions on the track (half a lap apart). Slipstream effects cannot come into play now.

In any case, the sprint race is actually quite interesting to watch, mainly because of its interesting strategic considerations. I guess what we are seeing is pretty close to Nash equilibrium behavior. Moreover, this Nash equilibrium behavior is quite complex. As it also probably depends strongly on the relative strengths of the two competitors, it is also somewhat varied. You don’t get the same predictable outcome in every race. All this, I believe, explains why the track cycling sprint has remained a (reasonably) popular and Olympic event, despite, or perhaps even because of, its strong strategic elements, which seem in stark contrast to the perhaps originally intended purely athletic nature of the competition.

• ## Winnie-the-Pooh’s research philosophy

This will sound like I am bragging – and I guess I am – but I follow the research methodology of Winnie-the-Pooh. Here is the excerpt from one of A. A. Milne’s Winnie-the-Pooh books that I am referring to:

“Hallo, Pooh,” said Rabbit.

“Hallo, Rabbit,” said Pooh dreamily.

“Did you make that song up?”

“Well, I sort of made it up,” said Pooh. “It isn’t Brain,” he went on humbly, “because You Know Why, Rabbit; but it comes to me sometimes.”

“Ah!” said Rabbit, who never let things come to him, but always went and fetched them.

In the end, I came away with two feelings: one, that the authors did the best they could given the data available to them; and two, that it must have been a lot of hard (and uninteresting) work. Mind you, I was also still left somewhat unconvinced about the actual takeaway, but this is not what I want to talk about here.

While I applauded the work, and I think it could hardly have been done better, it was not a paper that I would say I wished I had done. Well, let me clarify this. There may be people who enjoy running marathons, and those are the people who should certainly run marathons. Many more people, however, would probably love to have run a marathon while not enjoying the actual process of running it. I felt a bit of that, of course. I mean, I would be reasonably proud to have written the paper, as it was well done – as well done as it could have been. But, actually, I probably wouldn’t even have enjoyed having run this marathon, because it is a marathon through mud and dirt along an unclear path, and I would remember all this pain for years to come. So no, I probably would not even have wanted to have written the paper. I do not envy the authors their achievement. They have worked unpleasantly hard for it.

I much prefer Pooh’s method of letting things come to him instead of going to get them. I am lucky to be (mostly) a game theorist. If you are working empirically, it is hard to let things like the appropriate data just come to you. Nothing much will come, I fear, and you will eventually have to get up from your armchair and go and fetch it. But theory, trying to come up with an explanation for something, might come to you in the right circumstances even while you are sitting in your armchair.

Of course, just sitting in your armchair day in and day out will, most likely, not work. To have interesting and novel things come to you, you probably need to immerse yourself in something. You can’t get something out of nothing. You can’t, for instance, come up with something interesting to say about the behavior of monks in monasteries if you don’t even know that there are such things as monasteries. So, you probably do need to read, perhaps even empirical work, so that you can learn what is out there that you could explain. You could also read (or watch) fiction – for instance, the kind of fiction that offers pretty good and realistic depictions of what happens in the world. Having said that, even abstract art can do the same for you. Or you could go and observe the world, possibly while doing something pleasant yourself in it. Or a bit of all of the above.

Once you know what it is that you are after, game theory work typically eventually reduces to a math problem. You then often spend many hours or days trying to make some headway with some math problem. That is when Pooh’s statement that “[i]t isn’t Brain” seems quite apt. I often feel quite stupid during those hours and days. But, interestingly, after tackling some math problem for hours or days, and getting a bit frustrated, you might just wake up at 3 am when suddenly a new way of tackling the problem or even a solution presents itself to you. I don’t completely understand how it works, but I believe that without the seemingly pointless hours or days of tackling the problem, nothing would suddenly come to you in the small hours of the morning.

I do believe that you need both kinds of researchers: Rabbits and Poohs. Sometimes, it makes sense to organize a comprehensive and labor-intensive search, such as Rabbit does to try to find “Small”, one of Rabbit’s many “Friends and Relations”. Sometimes it makes sense to think more deeply about a problem and wait (with some perhaps unorthodox and seemingly unrelated work and activities involved) until a solution comes to you.

Having said that, in the case of finding “Small,” it is Pooh (and not one of the helpers that Rabbit has organized) who does so, by accident: he falls into the right pit in the ground (the one “Small” is in) by following his stomach – he was hungry and went to this place because he thought there was a pot of honey there. I guess this was just luck, wasn’t it?

• ## The law of how late people choose to be

An organizer of a small research workshop, observing that all the locals were late, once remarked that “the closer people are to the venue the later they come.” If I remember my school days correctly, the same could have been said of my classmates. Of course, this is not a precise law, as people differ in how much they dislike being late (or early) to things, but on the whole, there could well be some empirical truth in this statement. In this post, I will sketch a little decision theoretic model that delivers this law.

We want to model a person deciding when to leave home to go to work (or school or a conference venue or whatever their destination is). Let me phrase this as the problem of choosing an intended delay in getting there. Call this chosen intended delay $\displaystyle d$, which can be any real number. A negative delay means you are early. But $\displaystyle d$ is only the intended delay; the actual delay depends on traffic and such things. Call the actual delay $\displaystyle d + X$, where $\displaystyle X$ is a random variable with zero mean and some non-negative variance $\displaystyle \sigma^2$. The variance is my attempt to capture the random traffic conditions. A person who lives far from their destination quite possibly faces a higher variance in their travel time than a person who lives closer. The law I am looking for would then state that the higher the variance a person faces for their travel time, the earlier this person will be on average.

The decision maker chooses the intended delay, but what is their goal? I think it makes sense to assume that they prefer not to be late, and in such a way that the later they are, the worse it is. But it is probably also not great to be too early, as they could have used the waiting time better at home before leaving. One option would be to consider preferences as captured by the utility function (the negative of the quadratic loss function often used in statistics)

$\displaystyle v(d) = -(d+x)^2.$

This utility function captures both ideas: ideally, people would show up exactly on time, that is, the realized actual delay $\displaystyle d+x$ is zero, giving a utility of zero; they dislike being early as well as being late, as in both cases the utility is negative. Also, the further the realized actual delay is from zero (in either direction), the lower this person’s utility.

However, a person with this utility function would choose to have zero expected delay regardless of the variance. To do so they would choose an intended delay $\displaystyle d$ of zero. This means that everybody will arrive on time on average, and those with a higher variance will be more random in their delay, sometimes being quite early, sometimes quite late. There is no law that links the variance of travel times to a person’s average delay.

This is because under the given utility function the decision maker cares equally about being early and being late, the two concerns exactly offsetting each other. This may not be such a reasonable assumption for most cases. In many cases people probably worry more about being late than about being early. A simple way to accommodate such preferences is to consider the utility function

$\displaystyle u(d) = -(d+x)^2 e^{\frac{d}{\sigma}}.$

The multiplicative term, the function $\displaystyle e^{\frac{d}{\sigma}},$ is always positive, but is smaller for small (such as negative) delays and larger for large (such as positive) delays. Dividing by $\displaystyle \sigma$ in the exponent is a kind of normalization that makes my life easier below. Different such positive and increasing functions would lead to different, yet qualitatively similar, laws in the end.

So, suppose our decision maker behaves as if they had such preferences as captured by this utility function. What would be their optimal choice? Their optimal intended delay would be the one that maximizes their expected utility. This problem can be written as

$\displaystyle \max_{d} - \mathbb{E} \left[(d+X)^2 e^{\frac{d}{\sigma}}\right],$

where $\displaystyle \mathbb{E}$ denotes the expectation with respect to the random additional delay $\displaystyle X.$ Equivalently, the problem can be written as

$\displaystyle \min_{d} \mathbb{E} \left[(d+X)^2 e^{\frac{d}{\sigma}}\right].$

We can solve this by differentiating (finding the first-order condition, using the product rule) with respect to $\displaystyle d.$ We obtain

$\displaystyle \mathbb{E} \left[2(d+X) e^{\frac{d}{\sigma}} + (d+X)^2 \frac{1}{\sigma} e^{\frac{d}{\sigma}} \right] = 0.$

Dividing by $\displaystyle e^{\frac{d}{\sigma}},$ multiplying by $\displaystyle \sigma,$ noting that the expectation of a sum is the sum of expectations, and recalling that $\displaystyle \mathbb{E} X = 0,$ we get

$\displaystyle 2 \sigma d + \mathbb{E} \left[(d+X)^2\right] = 0.$

By the fact that $\mathbb{E} \left[(d+X)^2\right] = \mathbb{E} \left[d^2+2dX+X^2\right] = \mathbb{E} \left[d^2+X^2\right],$ and noting that $\mathbb{E} \left[X^2\right] =\sigma^2,$ the variance of $X,$ we get

$\displaystyle 2 \sigma d + d^2 + \sigma^2 = 0.$

Note that $\displaystyle 2 \sigma d + d^2 + \sigma^2 = (d+\sigma)^2,$ and, thus, the optimal intended delay is simply

$d= - \sigma.$

In words, the decision maker would optimally aim to arrive one standard deviation early, and we have our law of how late people are as a function of how far they start from their destination, or, more correctly, as a function of how much variance they face, or at least perceive, in their travel time.

Our very specific law, finally, states that people aim to be one standard deviation (of their random travel time) early! 😉
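As a numerical sanity check (my own, not part of the original post): for any zero-mean $X$ with variance $\sigma^2$ we have $\mathbb{E}[(d+X)^2] = d^2 + \sigma^2$, so the expected loss has a simple closed form, and the first-order condition can be verified to hold at $d = -\sigma$ under the asymmetric loss and at $d = 0$ under the symmetric quadratic loss:

```python
import numpy as np

# For any zero-mean X with variance sigma^2, E[(d+X)^2] = d^2 + sigma^2,
# so the expected loss E[(d+X)^2 * exp(d/sigma)] has this closed form:
def expected_loss(d, sigma):
    return (d**2 + sigma**2) * np.exp(d / sigma)

def num_deriv(f, d, h=1e-6):
    # central-difference numerical derivative
    return (f(d + h) - f(d - h)) / (2 * h)

sigma = 2.0

# First-order condition of the asymmetric loss holds at d = -sigma.
print(num_deriv(lambda d: expected_loss(d, sigma), -sigma))  # ~ 0

# Symmetric quadratic loss E[(d+X)^2] = d^2 + sigma^2 is stationary
# at d = 0, independent of sigma.
print(num_deriv(lambda d: d**2 + sigma**2, 0.0))  # ~ 0
```

The value of $\sigma$ here is arbitrary; the same check goes through for any positive $\sigma$.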

• ## Recruiting when not all applicants want the job

In the German-speaking world, recruiting professors at universities comes with an interesting complication. Many applicants are actually not interested in the job itself. They want to be offered the job only to use the offer to renegotiate their salary and general working conditions at their current university. In fact, at most universities, you can only get a pay rise if you have a job offer from another university. This creates a problem for the recruiting university for two reasons. It is not that easy to detect whether applicants are only interested in renegotiating at home, and, in any case, university law does not allow discarding applicants just because the recruitment committee believes that they would not accept the job. In this post, I will explain how recruiters can circumvent these problems with a little bit of game theory, more specifically, the theory of screening based on costly signaling.

Why is it important for the recruiting university to identify and discard applicants who would not accept the job in the end? Well, this is because the recruiting process is long and labor-intensive. It can easily take more than a year. It involves first meetings about the desired professorial profile, posting adverts, reading CVs and the applicants’ work, asking external reviewers to rank the candidates and giving them time to do so, inviting a list of candidates for hearings, holding and evaluating these hearings, and long negotiations about the salary and general working conditions, before, finally, an applicant accepts or, more problematically, rejects the offer. Moreover, for whatever reasons, most universities only allow a final shortlist of three applicants, ranked 1, 2, and 3, who will be offered the job sequentially. And, finally, on top of that, it is often the case that if none of the three candidates accepts the job, the university authorities reconsider whether the professorship should be filled at all. The recruitment committee, thus, often has only one chance to fill the position. So, if all three applicants on the list turn down the offer, it is a small disaster.

What can recruiters do to prevent this disaster? The starting point for solving our problem is the typically reasonable assumption that applicants who would accept the job offer tend to be more eager to get an offer than those who would just use it to renegotiate at home. And eager applicants would probably be prepared to do more hard work to get a job offer than the not-so-eager ones. But what work could this be? It has to be relevant for the job; otherwise, university law will not allow it. You can’t ask applicants to run 10 marathons as a requirement for becoming a professor of macroeconomic theory. You could ask them to present 10 of their research papers, but that they would love to do anyway, whether they want the job or not. What you can do is ask them to give a teaching presentation, and not just any presentation, but one on a topic of your choosing. This is already not so pleasant for the applicants, as it requires extra work that they cannot easily use for other purposes. On top of that, you can require applicants to explain what they would do if they were offered the job at this university, who of the people here they could see themselves working with, and how they would interpret and possibly change the current bachelor, master, and PhD curricula at this university. All this is unpleasant work for anyone, as it requires reading many pages of often slightly obtusely written curricula and often far from perfectly designed university websites. Moreover, it is work that the applicants cannot easily reuse for other purposes. But the eager ones would probably do it and would try to do it well. The not-so-eager ones will either not do it (and, when they learn what they would have to do in the hearing, withdraw their application) or do it badly.

By asking applicants to do all this work, we have created what game theorists call a signaling game. Only the applicants themselves know their own eagerness. It is what game theorists call private information. Applicants can, however, signal their eagerness by doing all this work and doing it well. The recruiters’ aim is to construct this signaling game such that it has a so-called separating equilibrium: the eager applicants choose to do all this work and choose to do it well, while the non-eager ones either do not do it or do it badly. By creating a signaling game with a separating equilibrium, the recruitment committee is, thus, able to screen their applicants along this eagerness dimension.

A badly designed signaling game, from the recruitment committee’s point of view, has a pooling equilibrium, in which eager and non-eager applicants perform equally well. This would be the case if we only ask applicants to tell us about their research. As every researcher is happy to do that, we would in that case not be able to infer their eagerness for the job from their performance in the hearing.
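The contrast between the two designs can be made concrete with a toy calculation (all numbers made up purely for illustration): an applicant performs a task if and only if the value they attach to receiving the offer exceeds the cost of preparing it.

```python
# Toy screening sketch with made-up payoffs (not from the post):
# an applicant prepares a task iff the benefit of the offer exceeds the cost.
def prepares(benefit_of_offer, prep_cost):
    return benefit_of_offer > prep_cost

value_eager = 10       # assumed value of the job to an eager applicant
value_renegotiate = 3  # assumed value of a mere outside offer
costly_task = 5        # tailored teaching demo + curriculum homework
cheap_task = 0         # "tell us about your research"

# The costly task separates the two types...
assert prepares(value_eager, costly_task)
assert not prepares(value_renegotiate, costly_task)

# ...while the cheap task pools them: both types happily perform it.
assert prepares(value_eager, cheap_task)
assert prepares(value_renegotiate, cheap_task)
print("separating with the costly task, pooling with the cheap task")
```

The design question is thus to pick a task whose cost lands between the two types’ valuations; if the cost is below both (or above both), the equilibrium pools and the hearing reveals nothing about eagerness.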

Of course, even the best designed signaling game has its limitations. There is only so much we can reasonably demand from applicants and it could be that some applicants that are only interested in renegotiating at home are still sufficiently eager (to get their pay rise at home) that they do all this work regardless. But it typically helps at least to some extent.

• ## Non-secret voting

Occasionally, I discuss strategic voting in some of my game theory classes. In particular, in past classes I have highlighted the trivial point that if people have only two choices (or candidates) to vote for, then it is a (weakly) dominant strategy to vote for your more preferred choice: voting for your more preferred choice can never lead to a worse outcome for you than voting for your less preferred choice. So, everyone will vote for their favorite choice and the vote winner reflects the majority preference. I regarded it as a bit of an embarrassment that such a voting game also has other equilibria, in which some people vote for their less preferred choice. For instance, if every one of the n-1 people other than you (with n ≥ 3) votes for choice A and you prefer choice B, it is immaterial whether you vote for A or B, because your vote simply does not matter. One can, in theory, even have the paradoxical situation that everyone prefers A over B and yet everyone votes for B. This is an equilibrium! Just not a very plausible one, I would have argued. I have recently learnt not to discard the weakly dominated equilibria of such voting games too quickly.
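Both claims are easy to verify by enumeration. The following sketch (my own illustration, with n = 3 voters who all strictly prefer A) checks that everyone voting for B is nevertheless a Nash equilibrium, and that voting for your preferred choice is weakly dominant:

```python
from itertools import product

# Majority vote between A and B with n = 3 voters who all strictly prefer A.
n = 3

def winner(profile):
    """Return the majority winner of a tuple of 'A'/'B' votes."""
    return "A" if profile.count("A") > n / 2 else "B"

def utility(profile):
    # utility 1 if the winner is your preferred choice (A), else 0
    return 1 if winner(profile) == "A" else 0

# 1) "Everyone votes B" is a Nash equilibrium: no single deviation helps,
# because one vote for A cannot overturn a 0-3 outcome.
all_b = ("B",) * n
for i in range(n):
    deviation = all_b[:i] + ("A",) + all_b[i + 1:]
    assert utility(deviation) <= utility(all_b)

# 2) Voting A weakly dominates voting B for an A-preferrer: against every
# profile of the other voters, A does at least as well as B.
for others in product("AB", repeat=n - 1):
    assert utility(("A",) + others) >= utility(("B",) + others)
print("checks passed")
```

The paradox is visible in the numbers: in the all-B profile everyone gets utility 0, yet no voter can unilaterally do better, which is exactly why it is a (weakly dominated) equilibrium.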

• ## A simple model of community enforcement

Here is the model for my previous blog post on (anonymous) community enforcement. I would call it a simplified symmetric (single-population) version of the model in the paper by Michihiro Kandori entitled “Social norms and community enforcement,” Review of Economic Studies 59.1 (1992): 63-80. The point of this blog post is to demonstrate that what I claimed in the previous post can be made logically coherent: there is a reasonable and simple artificial world in which we obtain cooperative behavior, under the fear of triggering a tipping point, as a subgame perfect Nash equilibrium, meaning a self-enforcing situation that remains self-enforcing even when the tipping point has been triggered and there is no way back.

There are $n$ people involved. These are the people interested in going up the mountain, the people on the train, or the users of the communal kitchen. Time is discrete and runs over time points $0, 1, 2, \ldots$ to infinity. At every point in time $t$ one person is randomly drawn to undertake the activity (go up the mountain, use the bathroom, or use the kitchen), so that each person has a $\frac{1}{n}$ probability of being drawn. The drawn person first observes (and observes only) the state $x \in \{0,1,2,...\}$ of the resource (the amount of rubbish on the mountain, or the state of uncleanliness of the bathroom or kitchen). Then this person (after using the resource) decides whether to clean up after themselves ($C$ for cooperate, or clean up) or not ($D$ for defect, to use the language of the well-known prisoners’ dilemma game).

The instantaneous utility that the drawn person receives shall be given by $\frac{1}{\lambda^x}$, where $x$ is the state of the resource – let us assume that it is simply equal to the number of people who were drawn before this person and chose action $D$, that is, did not clean up after themselves. This is the instantaneous utility this person receives when they choose $C$. When they choose $D$ they get the same utility plus a small but positive term $d$ for not having to clean up. Let us assume that $\lambda > 1$, so a person receives a lower payoff the worse, that is, the higher, the state of the resource $x$ is. As a function of $x$, the function $\frac{1}{\lambda^x}$ starts at $1$ for $x=0$ and then decays exponentially to the limit value of $0$ as $x$ tends to infinity. When a person is not drawn to use the resource at some point in time, this person receives an instantaneous utility of $0$. Every person discounts the future exponentially with a discount factor $\delta \in [0,1)$. This means that they evaluate streams of utils $u_t$ by the net present value $(1-\delta) \sum_{t=0}^{\infty} u_t \delta^t$, where the $(1-\delta)$ term is a convenient normalization.

For a well-defined game theoretic model, we need to identify players, their information, their strategies, and their payoffs. We have players and what they know and we have their payoffs. We have not quite yet defined their possible strategies, but we have specified their actions. To conclude the model we, thus, only have to define players’ possible strategies. These are all possible functions from the set of possible values of $x$, that is the set $\{0,1,2,...\}$, to the set of actions $\{C,D\}$. In principle, we should allow a bit more, as our players should probably remember what the state was at previous times when they were using the resource and also what they themselves did at these points in time, but this does not add or change anything of interest in our present analysis.

My claim then was that, at least for certain parameter ranges (for $\lambda$, $\delta$, $n$, and $d$), the following strategy is a subgame perfect Nash equilibrium: Play $C$ if $x=0$ and play $D$ otherwise. This kind of strategy is often referred to, in the repeated game literature, as a “grim trigger” strategy. In order to see this, we need to check two things. First, suppose everyone uses this strategy, so that along the play path everyone cooperates (keeps the resource clean); is it then best for a drawn player to also do so? Second, suppose the “trigger” has been pulled by someone playing $D$, that is, by someone not cleaning up after themselves; is it then best for a drawn person to also play $D$ (also not to clean up after themselves)?

So, suppose first that everyone uses this strategy. Then a randomly drawn player at some point in time, that we can, without loss of generality, call time $0$, finds the following payoff consequences for their two possible choices and for all time periods from then on:

$\begin{array}{ccccc} \mbox{time} & 0 & 1 & 2 & ... \\ C & 1 & \frac{1}{n} 1 & \frac{1}{n} 1 & ... \\ D & 1 + d & \frac{1}{n} \left(\frac{1}{\lambda} + d\right) & \frac{1}{n} \left(\frac{1}{\lambda^2} + d\right) & ... \end{array}$

The net present value for choosing $C$ at this point in time is then
$(1-\delta) \left( 1 + \frac{1}{n} \sum_{t=1}^{\infty} \delta^t \right)$.
For choosing $D$ at this point in time it is
$(1-\delta) \left( 1 + d + \frac{1}{n} \sum_{t=1}^{\infty} \left(\frac{1}{\lambda^t} + d\right) \delta^t \right)$.
The grim trigger strategy is a Nash equilibrium if and only if such a player would prefer
$C$ over $D$,
that is if and only if
$(1-\delta) \left( 1 + \frac{1}{n} \sum_{t=1}^{\infty} \delta^t \right) \ge (1-\delta) \left( 1 + d + \frac{1}{n} \sum_{t=1}^{\infty} \left(\frac{1}{\lambda^t} + d\right) \delta^t \right)$.
According to my calculations, this is equivalent to
$\delta \left( (\lambda-\delta) (1-d) - (1-\delta) \right) \ge d n (\lambda - \delta) (1-\delta)$.
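These net present values involve only geometric sums, so the comparison is easy to check numerically. A minimal sketch (my own, with illustrative parameter values):

```python
# On the cooperative path: NPVs of playing C or D at time 0, evaluated via
# the geometric sums sum_{t>=1} delta**t = delta/(1-delta) and
# sum_{t>=1} (delta/lam)**t = delta/(lam-delta).

def npv_C(lam, delta, n, d):
    return (1 - delta) * (1 + (1 / n) * delta / (1 - delta))

def npv_D(lam, delta, n, d):
    future = delta / (lam - delta) + d * delta / (1 - delta)
    return (1 - delta) * (1 + d + (1 / n) * future)

lam, n, d = 2.0, 10, 0.05
print(npv_C(lam, 0.95, n, d) >= npv_D(lam, 0.95, n, d))  # True: patient players cooperate
```

For sufficiently impatient players (small $\delta$) the comparison flips with these parameter values, so some restriction on $\delta$ is indeed needed.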

It is not straightforward to derive nice bounds for $\delta$ (as a function of the other parameters) so that this inequality is satisfied. But we can at least say that, if people are sufficiently patient, that is for $\delta$ close to $1$, the inequality is satisfied provided $d < 1$ as well, which I assumed anyway – I stated that I wanted $d$ positive and small.

For the second part of the argument, suppose that the “trigger” has been pulled and that everyone is playing $D$. Suppose the drawn person at some given point in time faces a state of uncleanliness of $x>0$. We can again reset the clock to zero without loss of generality. Then, the consequences of the two possible actions for this person are:

$\begin{array}{ccccc} \mbox{time} & 0 & 1 & 2 & ... \\ C & \frac{1}{\lambda^x} & \frac{1}{n} \left(\frac{1}{\lambda^x} + d\right) & \frac{1}{n} \left(\frac{1}{\lambda^{x+1}} + d \right) & ... \\ D & \frac{1}{\lambda^x} + d & \frac{1}{n} \left(\frac{1}{\lambda^{x+1}} + d\right) & \frac{1}{n} \left(\frac{1}{\lambda^{x+2}} + d \right) & ... \end{array}$

The net present value for choosing $C$ at this point in time is then
$(1-\delta) \left( \frac{1}{\lambda^x} + \frac{1}{n} \sum_{t=1}^{\infty} \left(\frac{1}{\lambda^{x+t-1}} + d\right) \delta^t \right)$.
For choosing $D$ at this point in time it is
$(1-\delta) \left( \frac{1}{\lambda^x} + d + \frac{1}{n} \sum_{t=1}^{\infty} \left(\frac{1}{\lambda^{x+t}} + d\right) \delta^t \right)$.

It is in the best interest of this person to choose $D$ rather than $C$ if and only if the latter is greater than or equal to the former, and this is the case, according to my calculations, if and only if
$\delta \le \frac{d n \lambda}{\lambda^{-x} (\lambda -1) + d n}.$
The right hand side is lowest for $x=1$ (among all integer values for $x > 0$), when it is
$\delta \le \frac{d n \lambda}{\lambda^{-1} (\lambda -1) + d n}.$
Rearranging, one can see that the right hand side of this inequality is greater than or equal to one, meaning that there is in fact no restriction on $\delta$, if and only if
$\lambda \ge \frac{1}{d n}.$
Recall that $\lambda > 1$. Rewriting this last inequality as $d \ge \frac{1}{\lambda n}$, we see that it is satisfied if, for instance, $n$ is sufficiently large; note that $d$, while assumed positive and small, must therefore not be too small – it must be at least $\frac{1}{\lambda n}$.
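The threshold on $\delta$ can likewise be checked against a direct comparison of the two off-path net present values. A sketch (my own, with illustrative parameter values):

```python
# Off the cooperative path (state x > 0, everyone playing D): NPVs of the
# two actions at state x, using sum_{t>=1} (delta/lam)**t = delta/(lam-delta).

def npv_C_off(lam, delta, n, d, x):
    s = lam ** (-x)
    future = lam * s * delta / (lam - delta) + d * delta / (1 - delta)
    return (1 - delta) * (s + future / n)

def npv_D_off(lam, delta, n, d, x):
    s = lam ** (-x)
    future = s * delta / (lam - delta) + d * delta / (1 - delta)
    return (1 - delta) * (s + d + future / n)

def threshold(lam, n, d, x):
    # the bound delta <= d*n*lam / (lam**(-x) * (lam - 1) + d*n) from above
    return d * n * lam / (lam ** (-x) * (lam - 1) + d * n)

lam, n, d, x = 2.0, 10, 0.01, 1
print(threshold(lam, n, d, x))   # about 1/3 for these parameters
print(npv_D_off(lam, 0.3, n, d, x) >= npv_C_off(lam, 0.3, n, d, x))  # True
```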

All this together proves that, in the model given here, the strategy of cleaning up after yourself provided the resource was clean before you used it, and not cleaning up if it was not, is a subgame perfect Nash equilibrium: It is self-enforcing and the implicit threat of a tipping point in behavior is also self-enforcing. The model is very specific, and many other versions would work just as well to make the same point.

• ## Please leave the bathroom as you would like to find it

Many actions that we take affect other people who are not involved in the decision-making process. In economics, these effects are commonly referred to as “externalities” and the presence of externalities is one of the main concerns that may render free markets inefficient. “Inefficient” means that the ultimate outcome of people ignoring the externalities that they cause on other people is such that there is an alternative outcome that would be better (or at least as good) for all people! The presence of externalities is the main problem behind climate change and also at least one reason why we still have a problem with Covid-19. People, when making their holiday planning, car driving, air conditioning, car purchasing, et cetera decisions, often ignore the effect their actions have on the environment and, thus, on all others. People make vaccination decisions weighing their own subjective assessment of the risks for themselves, without necessarily considering that with a vaccination they would also increase the protection from Covid-19 for everyone around them.

There are a variety of measures that governments or other organized groups of people can take to reduce the harm caused by people ignoring these externalities. One can, for instance, debate the (higher) taxation of fossil fuels or laws to force everyone to vaccinate. Often, such measures are probably necessary. There are (possibly rare) cases, however, in which even in fairly anonymous societies, the problem sorts itself out. It can do so through a mechanism of community enforcement. In this post I will describe an argument derived from Michihiro Kandori’s paper “Social norms and community enforcement”, Review of Economic Studies 59.1 (1992): 63–80. I will use the examples of mountain tops on which people do not leave their rubbish, bathrooms on trains that remain reasonably clean despite heavy usage, and communal kitchens (such as the one in my department) that despite a lack of regular professional cleaning service and despite a fair number of people using them, remain reasonably clean and usable.

I would first like to stress that in all three examples people are unlikely to observe your actions, so no one could punish you for bad behavior directly. When you are having your well-earned lunch or snack at the mountain top you are quite possibly alone at that moment. If you decide to leave some rubbish behind no one would see you doing it. I also hope that nobody can see what you do in the train bathrooms. And yes, occasionally you may not be alone in the department kitchen when you are making your coffee or warming up your lunch, but you are also often by yourself and unobserved. [It is, by the way, also not immediately clear what would happen if someone did observe your lack of adherence to the social norm of what is thought of as decent behavior (be it on the mountain top, the bathroom on the train, or the departmental kitchen). I often find that misbehavior in public may perhaps induce a fair bit of stern staring, but nothing much more than that. Nobody seems to want to engage in an altercation. An interesting phenomenon in its own right.] So why do people not leave rubbish on the mountain top, why do they clean up after themselves after using a bathroom, why do people wash, dry, and put away their dishes in the communal kitchen?

First, you might say, why wouldn’t you? Well, I guess the idea is that you would derive some benefit from not having to carry rubbish back down the mountain (after all it weighs something, also you might not have a good bag for your rubbish and it might soil all the other stuff you have in your backpack). You probably have to undertake some slightly unpleasant cleaning effort to keep the bathroom or the kitchen in a reasonable state. In fact, in your kitchen at home you might leave dirty dishes in the kitchen for quite some time, cleaning them later, while in a communal kitchen you probably do it (if at all) right away.

Then you might say that, ok, yes, it is a bit annoying having to do these things, but it is not too bad and anyway, you are a moral person. Maybe. I also would like to think that I am a moral person, but perhaps there is a more tangible reason behind our supposedly moral stance. [I generally don’t believe that people make all these decisions always so consciously. They may simply follow some more or less automatized protocols (perhaps as part of how they were raised as a child and now not often questioned). Still, I will here provide a possible reason why such behavior might in fact be in your own self-interest despite the effort that is involved.]

Then you might say, and now you are on to something, that you have an interest in keeping the place clean, because you might want to use it yourself again. True, if it were your own mountain you would probably keep it clean. Perhaps you would not tidy up your kitchen immediately, but you would probably tidy it up at some point every day. But then you do not own the mountain and you are just one of many users. Wouldn’t this fact dilute your incentives to keep the place clean? Well, yes and no.

In fact, it seems quite plausible (and I have often observed this) that people do not clean up (much) after themselves if, for instance, the bathroom on the train is already in a bad state, even if they think that they might need to use it again. But they might well do so if the bathroom was clean when they started to use it. How can this be rationalized?

Let me sketch the model here (I will try and describe it in full detail in another post). Imagine you have a largish number of users of a place (such as the mountain top, the train bathroom, or the communal kitchen). Imagine that everyone uses this place infrequently but recurrently at random points in time. So, everyone always thinks that they may use this place again at some point in the future (I guess this works less well for the example of train bathrooms towards the end of the train ride – but then at that stage these bathrooms often are quite dirty). When they use it people can either be very clean (or clean up after themselves) or they can litter or soil the place. The instantaneous payoffs are such that people would (at that moment) prefer to litter or soil rather than clean. Finally, assume that nobody observes any actions of any other people, but everybody observes the state of the place when they use it (how much rubbish there is on the mountain, how clean the bathroom or kitchen is).

Then the following strategy, if employed by all people, can be made to be a subgame perfect Nash equilibrium of this symmetric stochastic repeated game, at least under some plausible conditions. [Subgame perfect Nash equilibrium means that this strategy is self-enforcing (everyone finds it in their interest to adhere to it when everyone else does) and does not involve a non-credible threat (any threats that are used to incentivize people to adhere to the strategy are also self-enforcing when they are supposed to be employed).] As long as the place is perfectly clean, keep the place clean (just as in the often-advertised statement “please leave the bathroom as you would like to find it”). If the bathroom is not perfectly clean, regardless of how bad it is, do not clean up after yourself.

If everyone follows this strategy, then the place would stay perfectly clean throughout. If, for some reason, somebody does not follow this strategy and does not clean up after themselves, this is a “tipping point” and the avalanche of dirt starts rolling: the place will just get dirtier and dirtier from then on. In reality, it may need more than one piece of rubbish on the mountain or more than just one sheet of loo paper on the floor to trigger the “tipping point”, and one could probably adapt the model (that you can find here) so that this would be the case. In any case we do get that people will behave very differently depending on whether the place is already a mess or still clean, and it is, at least partly, the fear of triggering the tipping point that incentivizes people to behave and to internalize the externality that bad behavior would impose on others.
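The tipping-point dynamic sketched above can be illustrated with a tiny simulation (my own sketch, not from the post):

```python
# In each period one user is drawn and follows the grim-trigger rule:
# play C (clean up) if the state of the place is 0, play D (litter)
# otherwise. Since everyone uses the same rule, it does not matter who is
# drawn; playing D raises the state of uncleanliness by one.

def simulate(periods, deviate_at=None):
    state, path = 0, []
    for t in range(periods):
        plays_D = state > 0 or t == deviate_at
        if plays_D:
            state += 1
        path.append(state)
    return path

print(simulate(8))                # stays perfectly clean: [0, 0, 0, 0, 0, 0, 0, 0]
print(simulate(8, deviate_at=2))  # one deviation tips it: [0, 0, 1, 2, 3, 4, 5, 6]
```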

• ## Allocating fruit among household members

This story is loosely based on real-life events. In the summer, members of my wife’s family all come together to stay with the “grandparents” in one large household. There are typically around 12 of us and we have many small problems. One of these is the allocation of fruit (and other snack items – but I will concentrate on the fruit problem as it is the easiest to describe) among household members. Mind you, this is not a problem that we talk about (much) in the household. Many of us perhaps do not even recognize it as a problem, but it is a problem and one that we solve inefficiently.

Consider peaches. Almost all household members like peaches, but we all vary in our exact preferences. Some of us would happily eat many peaches at various points in time, some would rather eat just one at a specific time of day. Some prefer peaches when they are still quite hard, some prefer them when they are essentially mush. [Some even like them slightly moldy, at least that’s what I infer from fruit choice observations that I made.] So, we all have different bliss points as to the optimal stage in the ripening process when a peach should be eaten. I could probably show you a graph with peach ripening stage on the x-axis and derived pleasure from eating a peach on the y-axis and you would see a nice concave (for the mathematical economist) curve that initially slopes upward, reaches a peak at some point, the bliss point, and then bends down again. Each family member would have a somewhat different curve with different bliss points and different up- and downward curves before and after the bliss point, respectively.

Having drawn such a picture, one could go on to mark the level of derived pleasure from eating a peach, at which you would rather eat a different fruit, an apple, say, or even just nothing (rather than a moldy peach, for instance). Again, all this is different for all household members and would also depend on the ever-varying quality of apples (and on whichever aspects of apple quality matter to the household member we are talking about). To make the fruit allocation problem even more complicated, we all seem to have variable preferences. Some days, when the weather is bad for instance, just don’t seem to be peach days, at least for some of us. Or you may have eaten or drunk something else today that does not go well with peaches. On top of that, peach quality seems to vary from batch to batch and so our preferences change accordingly as well.

Why am I talking so much about preferences? There are two reasons. One, we all want to be kind and let someone else eat a peach instead of us if they also want it. Second, if we think of the problem from the social (household) planner’s point of view, that person would like to allocate peaches to individuals in some sort of efficient way, give more peaches to people who like them more, perhaps also keep fairness in mind, etc. The problem now is that none of us perfectly know other household members’ personal peach preferences on any given day. This is both a problem for the social planner as well as for each of us when we consider eating a peach. [There is also the additional problem sometimes that the number of available peaches is not even always clear. Sometimes extra peaches are “hidden” in the larder, so there are more than you think. Sometimes some peaches are actually reserved to be made into a cake – and this fact may not be known to everyone – although sometimes the social planner labels peaches accordingly.]

So how do we solve this allocation problem, why is it inefficient, and what would other mechanisms look like? For us, it all begins with the purchasing decision (with nowadays online delivery – this also impacts peach batch quality, by the way) made once or twice a week by the head of household, in her capacity as household chief procurement officer (or CPO). The CPO makes these decisions, considering an amazing number of factors based on an incredibly high degree of empathy towards all household members. Yet, even the CPO is not fully aware of all aspects of the daily changing peach preference profile in the household. At the end of the day the CPO settles on some quantity of peaches and this is where our problem now begins.

There are many mechanisms that we could use to tackle our fruit allocation problem. Let me first unashamedly tell you that we do not use a market mechanism. What would a market mechanism look like? Well, we would all be given a share of all fruit and snack items that were purchased. We would all have something called “money” that we all accept in exchange for the various fruit and snack items. We would meet regularly in a “market”, the kitchen for instance, where we all set up shop and trade among ourselves. Market prices would, we would hope, adapt on a daily and perhaps even hourly basis so that supply meets demand, so that no peach is left uneaten and no one would want to eat an additional peach at these prices. As we can assume that there are no worrying externalities when it comes to fruit and snack choices (except perhaps that parents sometimes worry about the kids’ sweet choices) such a market mechanism is expected to deliver a so-called Pareto-optimal allocation of fruit and snack items, one such that if we wanted to improve the material (fruit and snack) well-being of one individual by adjusting the allocation, we would have to reduce the material well-being of another. We could also aim for a reasonable degree of fairness by adjusting the initial allocation (before trade happens) of fruits and snacks.

But, surprisingly, this is not what we do. Our system instead is as follows. All the peaches (with some of the qualifications pointed out above) are simply displayed in the kitchen and anyone is free to take one at any time. I am pretty sure that this leads to an inefficient (not Pareto-efficient) allocation, at least in our case. Not for the reason you might think of at first. True, in a world full of people who are interested only in their immediate material (fruit and snack) well-being, like a world of small children perhaps, we might find that all the peaches are eaten by the first person who spots them. You might say that, while this may not be fair, this is ok from an efficiency point of view. But not necessarily. Imagine one person eating all the peaches, because he or she spotted them first, and another eating all the chocolates, because he or she spotted them first. They might have both been better off had they traded some of their peaches against chocolates and vice versa. But, in any case, this is not the problem we have. Our problem is that people are too altruistic, I believe, and too careful not to eat a peach that somebody else might also like, perhaps at a later stage in the ripening process. The sad result is that many peaches simply remain un-eaten (and thrown away) at the end of the day. Well, sometimes, to be fair, they are rescued and baked into a cake (at a point where I am happy not to know how advanced those peaches already were in their ripening process). But even in this latter case, some of us, perhaps all of us, might have preferred a peak peach over a post-peak peach cake.

You might say that communication should solve the problem. Maybe. However, we have many other problems (and not only problems) to discuss and the peach problem does not seem high up on the list. Also, even if we did discuss it, most of us would probably find it hard to articulate our exact peach preferences (we often don’t seem to have a good prediction of our future preferences ourselves). Moreover, we only come together for about a month in the summer every year, and for such a short time, it may be inefficient to spend hours solving the inefficiency in our peach allocation. So, as much as it pains me, as a trained economist, to accept inefficiencies, I guess I will just have to accept it.

• ## Estimating the proportion of Corona cases

This short note makes one simple point. If you are interested in estimating the proportion of Corona infected people in some country or region, there is a simple and better (more precise) estimate than the one you obtain by computing the sample proportion. You can also read this in German here (and here).

Setup

Consider taking a (completely) random sample of $n$ individuals in some population in order to estimate the proportion of people in this population who have the Corona virus. Let $p$ denote this true proportion. I here assume that we already know, through potentially non-random medical testing, that there is a certain fraction $q$ of the population who definitely have the virus (or have had it). I will refer to these people as those that were declared to have the virus. I assume that whatever medical test was used to obtain this number was perfect, at least in one direction: anyone who has been declared to have the virus this way also actually has it. Thus, necessarily, $q \le p$, and we can write $p=\mu q,$ where we interpret $\mu \ge 1$ as the multiplier or ratio of actual virus cases relative to the declared virus cases. I am here interested in estimating $\mu$ from the random sample knowing $q.$ If we have an estimate for $\mu$ we get one for $p$ by multiplying the $\mu$-estimate by $q$.

When we take the random sample, we collect two pieces of information from each person. One, we check (again, for the sake of simplicity, with a perfect medical test) whether or not they have the virus. Two, we ask them (and the subject answers truthfully) whether they have already been declared as having the virus. I will call $X$ the total number of virus cases in the sample and $Y \le X$ the total number of already declared virus cases in the sample.

Estimator

Many people would probably be tempted to use $\hat{p}=\frac{X}{n}$ as the standard estimator for $p,$ and, thus, indirectly $\hat{\mu}_S=\frac{X}{qn}$ as the standard estimator for $\mu$. It turns out that there is a better estimator that uses all available information. Let me call it the alternative estimator $\hat{\mu}_A$. It is given by
$\hat{\mu}_A=1+\frac{X-Y}{qn}.$
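A quick toy computation (made-up numbers, just to illustrate the two formulas):

```python
# The two estimators on a toy example: a sample of n = 1000 people from a
# population in which a share q = 0.01 has already been declared infected.

def mu_standard(X, n, q):
    return X / (q * n)             # scales the sample proportion X/n by 1/q

def mu_alternative(X, Y, n, q):
    return 1 + (X - Y) / (q * n)   # uses only the undeclared cases X - Y

n, q, X, Y = 1000, 0.01, 25, 12
print(mu_standard(X, n, q))        # 2.5
print(mu_alternative(X, Y, n, q))  # 2.3
```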

In the Appendix below I derive (in a few simple steps) this estimator as an approximation of the maximum-likelihood estimator for the present problem. It therefore (approximately) inherits the nice properties that maximum likelihood estimators have. But even if you are a maximum likelihood skeptic, we can directly compare the precision (for all sample sizes) of the two estimators by looking at their variances.

First note that, like the standard estimator, the alternative estimator is unbiased as
$\mathbb{E}\left[\hat{\mu}_A\right]=1+\frac{\mu qn - qn}{qn} = \mu.$

The variances of the two estimators are
$\mathbb{V}\left[\hat{\mu}_S\right] = \frac{\mu(1-\mu q)}{qn} \approx \frac{\mu}{qn},$
and, as $X-Y$ is binomially distributed with number of trials $n$ and success probability $\mu q \left(1-\frac{1}{\mu}\right),$
$\mathbb{V}\left[\hat{\mu}_A\right] = \frac{(\mu-1)(1-q(\mu-1))}{qn} \approx \frac{\mu-1}{qn},$
where the approximation is good when $q$ and $\mu$ are sufficiently small.

Under these approximations, the ratio of the two variances is given by
$\frac{\mathbb{V}\left[\hat{\mu}_A\right]}{\mathbb{V}\left[\hat{\mu}_S\right]} = \frac{\mu-1}{\mu} < 1.$

Thus, especially if $\mu$ is not much larger than 1, the alternative estimator is quite a bit more precise. Note also that the alternative estimator can never be below 1.
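A small Monte Carlo check (my own sketch, with made-up parameters $\mu=2$, $q=0.02$, $n=500$) confirms both the unbiasedness of the alternative estimator and the variance ranking:

```python
import random
from statistics import fmean, pvariance

# Simulate repeated random samples and compare the two estimators. One
# uniform draw per person: infected with probability p = mu*q; the declared
# cases (probability q) are then exactly a 1/mu fraction of the infected.

def draw_sample(n, mu, q, rng):
    X = Y = 0
    for _ in range(n):
        u = rng.random()
        if u < mu * q:
            X += 1
            if u < q:
                Y += 1
    return X, Y

def monte_carlo(mu=2.0, q=0.02, n=500, reps=4000, seed=1):
    rng = random.Random(seed)
    standard, alternative = [], []
    for _ in range(reps):
        X, Y = draw_sample(n, mu, q, rng)
        standard.append(X / (q * n))
        alternative.append(1 + (X - Y) / (q * n))
    return fmean(alternative), pvariance(standard), pvariance(alternative)

mean_alt, var_std, var_alt = monte_carlo()
print(abs(mean_alt - 2.0) < 0.05)  # True: sample mean close to mu = 2
print(var_alt < var_std)           # True: alternative estimator more precise
```

The sample variances should come out near the theoretical values $\mu(1-\mu q)/(qn) = 0.192$ and $(\mu-1)(1-q(\mu-1))/(qn) = 0.098$ for these parameters.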

Austrian Corona cases

In Austria, from the 1st to the 6th of April, a random sample of $n=1544$ people was tested for the Corona virus. I will here ignore the disturbing sample selection problem that actually 2000 people were supposed to participate and 456 did not. Of those who participated, the number of cases found, $X,$ was 5, and the number of already declared cases among them, $Y,$ was either 2 or 3. There was some weighting in these numbers which I am not fully informed about. I will ignore these issues here, but will at least look at both cases for $Y.$ On the same day, the proportion of declared cases was $q=1/758$ (11,383 declared cases among 8,636,364 people in Austria).

Using the (here also easily applicable) Clopper-Pearson method to compute 95% confidence bounds, we get the following estimates and bounds derived from the two different estimators.

$\begin{array}{c|ccc} & \hat{\mu}_S & \hat{\mu}_A (Y=3) & \hat{\mu}_A (Y=2) \\ \hline \mbox{estimate } \mu & 2.46 & 1.98 & 2.47 \\ \mbox{lower bound } \mu & 0.87 & 1.12 & 1.30 \\ \mbox{upper bound } \mu & 5.72 & 4.54 & 5.30 \\ \mbox{lower bound cases } & 9866 & 12738 & 14845 \\ \mbox{estimated cases } & 27968 & 22570 & 28164 \\ \mbox{upper bound cases } & 65126 & 51726 & 60331 \\ \end{array}$

As you can see, the confidence bounds are much narrower for the alternative estimator than for the standard estimator.
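The point estimates in the table are straightforward to reproduce (a sketch; the confidence bounds would additionally require a Clopper-Pearson routine, which I omit here):

```python
# Point estimates for the Austrian sample: n = 1544 tested, X = 5 cases
# found, Y = 2 or 3 already declared; 11383 declared cases in a population
# of 8636364.

n, X = 1544, 5
declared, population = 11383, 8636364
q = declared / population                 # about 1/758

mu_s = X / (q * n)                        # standard estimator
mu_a3 = 1 + (X - 3) / (q * n)             # alternative estimator, Y = 3
mu_a2 = 1 + (X - 2) / (q * n)             # alternative estimator, Y = 2

for label, mu in [("standard", mu_s), ("alternative, Y=3", mu_a3),
                  ("alternative, Y=2", mu_a2)]:
    print(f"{label}: mu = {mu:.2f}, estimated cases = {mu * declared:.0f}")
```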

A Thought

If we could assume, which sadly we often probably cannot, that the proportionality factor $\mu$ is the same in all regions of interest, while $q$ is observably not, then one could take a specific random sample that would even be much better than a random sample of all people. In Austria, for instance, the $q$ for Landeck in Tirol is about $q_L=1/50,$ while in Neusiedl am See in Burgenland it is about $q_N=1/1000.$

Then a random sample of people in Landeck would produce a much more precise estimate for $\mu$ than a random sample of people in Neusiedl. The variance for the Neusiedl estimator would be 20 (the ratio of $q_L/q_N$) times as large as that for Landeck.

Another Thought

Of course, there is nothing specific about the setup here that makes it only applicable to counting virus cases. This estimator could be used in all cases in which we are interested in the true proportion of some attribute A in some population, when we know that attribute B can occur only among those who have attribute A, and we know how many B’s there are. Looking at it like that I am sure this estimator is known. So I am here just reminding you all about it.

Appendix

We here derive the alternative estimator as an approximation to the maximum likelihood estimator. Taking a truly random sample, we know that $X$ is binomially distributed with number of trials $n$ and success probability $\mu q.$ Conditional on $X$ we know that $Y$ is binomially distributed with number of trials $X$ and success probability $\frac{1}{\mu}.$ The likelihood function is, therefore, given by
$\mathcal{L}(\mu;X,Y) = {n \choose X} (\mu q)^X\left(1-\mu q\right)^{(n-X)} {X \choose Y} \left(\frac{1}{\mu}\right)^Y \left(1-\frac{1}{\mu}\right)^{(X-Y)}.$

The log-likelihood function is then proportional to
$\ell(\mu;X,Y)=X \ln(\mu q) + (n-X) \ln\left(1-\mu q\right) - Y\ln(\mu) + (X-Y) \ln\left(1-\frac{1}{\mu}\right).$

The maximum likelihood estimator, thus, has to satisfy
$\begin{array}{lll} \frac{X}{\mu q}q + \frac{n-X}{1-\mu q} (-q) - \frac{Y}{\mu} + \frac{X-Y}{1-\frac{1}{\mu}} \frac{1}{\mu^2} & = & 0 \\ \frac{X-Y}{\mu} - \frac{q(n-X)}{1-\mu q} + \frac{X-Y}{\mu(\mu-1)} & = & 0. \end{array}$

If $\mu q$ is small, we can approximate $1-\mu q$ by 1. We then get
$\mu=1+\frac{X-Y}{q(n-X)}.$
If $X$ is, in expectation, much smaller than $n,$ we can approximate this further to get
$\hat{\mu}_A=1+\frac{X-Y}{qn}.$
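As a check on these two approximation steps, one can also solve the exact first-order condition numerically. Combining the two $(X-Y)$ terms, the condition simplifies to $\frac{X-Y}{\mu-1} = \frac{q(n-X)}{1-\mu q}$, whose left-minus-right side is decreasing in $\mu$ on $(1, 1/q)$, so bisection applies. A sketch (my own) using the Austrian numbers from above:

```python
# Exact maximum likelihood estimate via bisection on the simplified
# first-order condition (X-Y)/(mu-1) = q(n-X)/(1-mu*q), compared with the
# closed-form approximation 1 + (X-Y)/(q*n).

n, X, Y = 1544, 5, 3
q = 11383 / 8636364

def score(mu):
    return (X - Y) / (mu - 1) - q * (n - X) / (1 - mu * q)

lo, hi = 1 + 1e-9, 1 / q - 1e-9
for _ in range(200):              # score is decreasing in mu on (1, 1/q)
    mid = (lo + hi) / 2
    lo, hi = (mid, hi) if score(mid) > 0 else (lo, mid)

mu_mle = (lo + hi) / 2
mu_approx = 1 + (X - Y) / (q * n)
print(round(mu_mle, 4), round(mu_approx, 4))   # about 1.9834 vs 1.9828
```

The two agree to about one part in a thousand here, which is what the smallness of $\mu q$ and of $X/n$ suggests.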