In the previous pieces I described a theory about intelligence: the statements theory. Its starting point is that behaviour is based on a collection of statements. What is the use of a theory about the stacking of statements?
In the following I describe the prisoner’s dilemma, a situation from game theory. It is a two-player game in which each participant has the option to cooperate or to defect. The best joint result is obtained when both players cooperate. Experiments show that in most cases players don’t act that way. Game theory, taking the payoff matrix as the description of the game, does not explain why players take the less satisfying option. My description of the game from the viewpoint of the statements theory makes it plausible why players come to play less and less cooperatively.
When played as a game, the prisoner’s dilemma can be played multiple times in sequence. See also: Wikipedia, Prisoner’s dilemma.
The payoff matrix
Both players must make their selection simultaneously: cooperative (C) or non-cooperative (NC). The matrix above shows the payoffs of the game.
When both play C, each gets a reward of 3. When both play NC, each gets a reward of 1. When one plays C and the other plays NC, the cooperative player gets nothing and the “traitor” gets 5.
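The payoffs just listed can also be written down as a small lookup table. This is only a minimal Python sketch; the names `PAYOFF` and `play` are my own choices for illustration.

```python
# Payoffs as (player 1, player 2), copied from the description above.
PAYOFF = {
    ("C", "C"): (3, 3),     # both cooperate
    ("C", "NC"): (0, 5),    # player 1 cooperates, player 2 is the "traitor"
    ("NC", "C"): (5, 0),    # player 1 is the "traitor"
    ("NC", "NC"): (1, 1),   # neither cooperates
}


def play(move_1, move_2):
    """Return the payoffs (player 1, player 2) of one simultaneous game."""
    return PAYOFF[(move_1, move_2)]


# List every outcome together with the joint payoff of both players.
for moves, (pay_1, pay_2) in PAYOFF.items():
    print(moves, "-> me", pay_1, "he", pay_2, "joint", pay_1 + pay_2)
```

The joint payoff is highest for (C, C), while the highest payoff for a single player sits in the mixed outcomes.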
When the game is iterated (played multiple times), the highest joint payoffs are earned when both players cooperate: both get 3 per game. Nevertheless, players seldom cooperate in experiments. Furthermore, it has been shown that players become less cooperative as play is repeated. You would expect subjects to gain insight into the game and to see that cooperation is the better strategy. Why subjects fail to do so is not quite clear.
Davis (1970) writes that it is surprising to see that players tend to cooperate less and less during a run of games.
Morton D. Davis (1970), Game Theory: A Nontechnical Introduction.
Let’s take a look at the game from the players’ point of view.
Player 1 plays based on his experiences. Every completed game is an experience, a statement. The total of statements guides his selections. Player 2 plays tit for tat, a successful strategy: he cooperates in the first game, and in each following game he selects the option his opponent took in the game before. For instance, when player 1 selects C (cooperate), player 2 will select C in the next game. When player 1 selects NC (non-cooperative), player 2 will select NC in the next game. Remember that they have to make their selections simultaneously.
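Player 2's rule is simple enough to write as one small function. The sketch below is my own illustration; the name `tit_for_tat` and the list representation of the history are assumptions.

```python
def tit_for_tat(opponent_history):
    """Cooperate in the first game, then copy the opponent's previous move."""
    if not opponent_history:
        return "C"
    return opponent_history[-1]


# Player 1's selections in the four games walked through below.
player_1 = ["C", "NC", "NC", "C"]

# Player 2 only sees the games that are already finished, hence player_1[:i].
player_2 = [tit_for_tat(player_1[:i]) for i in range(len(player_1))]
print(player_2)  # ['C', 'C', 'NC', 'NC']
```

Because the selections are simultaneous, player 2 is always one game behind player 1.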
The first four games
Game 1. I am player 1 and I select C. My opponent, player 2, selects C as well. Of course I have no influence on the choice of my opponent. All I see is the result, which is that cooperation pays me 3 and pays him 3: (C, me 3, he 3).
Game 2. I want to try NC. My opponent plays what I played in the game before, but that is something I don’t know. So he plays C now. The result is that my NC selection pays me 5 and pays him 0: (NC, me 5, he 0).
Game 3. My experience so far is that NC is beneficial compared to C, so I select NC again. Player 2 selects NC, which is what I did in the game before (but I don’t notice that). The result is (NC, me 1, he 1).
Game 4. The last 3 games resulted in an average payoff of 3 when I played C and 3 when I played NC, (5 + 1) / 2. I still have no preference and decide to play C one more time. Player 2 plays NC. Result: (C, me 0, he 5).
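The four games above can be replayed with a short simulation. This is a sketch under my own assumptions about how to turn the description into rules: player 1 tries each option once, then picks the option with the higher average payoff so far, and plays C on a tie (as in game 4). The names `experience_move` and `simulate` are hypothetical.

```python
from fractions import Fraction

# Payoffs as (player 1, player 2).
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "NC"): (0, 5),
    ("NC", "C"): (5, 0),
    ("NC", "NC"): (1, 1),
}


def experience_move(payoffs):
    """Pick a move from player 1's own payoff history per option."""
    # Try every option once before comparing (games 1 and 2).
    for option in ("C", "NC"):
        if not payoffs[option]:
            return option
    average = {o: Fraction(sum(p), len(p)) for o, p in payoffs.items()}
    # Assumed tie-break: play C when both averages are equal (game 4).
    return "C" if average["C"] >= average["NC"] else "NC"


def simulate(n_games):
    """Play experience-based player 1 against tit-for-tat player 2."""
    payoffs = {"C": [], "NC": []}
    previous_move_1 = None
    log = []
    for game in range(1, n_games + 1):
        move_1 = experience_move(payoffs)
        # Tit for tat: cooperate first, then copy player 1's previous move.
        move_2 = "C" if previous_move_1 is None else previous_move_1
        pay_1, pay_2 = PAYOFF[(move_1, move_2)]
        payoffs[move_1].append(pay_1)
        log.append((game, move_1, pay_1, pay_2))
        previous_move_1 = move_1
    return log


for game, move, me, he in simulate(4):
    print(f"Game {game}: ({move}, me {me}, he {he})")
```

Calling `simulate` with a larger number of games continues the run beyond game 4 under the same assumed rules.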
The funny thing is that we now have all four possible outcomes, as stated in the matrix above. My (player 1’s) information is complete; you would expect me to have enough statements to play rationally, to follow the best strategy.
Figure: the successive games in iterated play.
Game 5. According to my experience, C pays me 1½ on average = (3 + 0) / 2. NC pays me 3 on average = (5 + 1) / 2. So I select NC. And I will keep doing so for 14 games in a row, games 5 to 18!
My earnings are low. Game 5 still pays me 5, because player 2 answers my C from game 4, but from game 6 on every game pays me 1 and my average NC-payoff drops after each game, while I keep selecting NC. Only after game 18 has the average NC-payoff sunk to 1½.
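The arithmetic behind this long run of NC games can be checked by hand or with a few lines of Python. The sketch below only tracks my average NC-payoff; the variable names are my own, and `Fraction` is used so that the comparison with 1½ is exact.

```python
from fractions import Fraction

# NC payoffs before game 5: game 2 paid 5 and game 3 paid 1.
nc_total, nc_games = 5 + 1, 2
# C payoffs before game 5: game 1 paid 3 and game 4 paid 0.
c_average = Fraction(3 + 0, 2)

game = 5
while Fraction(nc_total, nc_games) > c_average:
    # Game 5 pays 5, because tit for tat still answers the C of game 4.
    # From game 6 on tit for tat answers NC with NC, so NC pays 1.
    nc_total += 5 if game == 5 else 1
    nc_games += 1
    print(game, Fraction(nc_total, nc_games))
    game += 1

print("NC is no longer better than C from game", game)
```

The loop stops once the NC average is no longer above the C average, which is the first moment the experience-based player reconsiders C.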
Game 19. I play C now and I instantly get punished for that, because player 2 plays NC now. My average C-payoff is lowered to 1. The NC-payoff is still 1½. If I stick to NC it will descend further in the long run, but it will never get below the average C-payoff of 1, because no NC game ever pays me less than 1. So I will never select C again.
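A short worked check of this claim. It assumes, following the tit for tat rule, that game 20 pays me 5 because player 2 answers my C from game 19. After game 20 my NC total is then 24 + 5 = 29 over 16 + 1 = 17 games. If I play NC for k further games, each paying 1, my average NC-payoff is:

```latex
\[
  \overline{NC}(k) \;=\; \frac{29 + k}{17 + k}
  \;=\; 1 + \frac{12}{17 + k} \;>\; 1
  \qquad \text{for every } k \ge 0 .
\]
```

The average approaches 1 from above but never reaches it, so it stays above my average C-payoff of 1.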
The runs of consecutive games in which player 1 plays NC get longer and longer, until the last run never ends. What we see is that, using these strategies, the players become less and less cooperative, even though this is not expected based on the payoff matrix.
What if player 2 plays based on his experiences too? Then the simulation shows that they will meet each other playing cooperatively after a while, which happens in game 11. However, this is a delicate balance. After 10 more games C as well as NC pay off 1½. What to do now? Play C (choosing C when C >= NC) or play NC (choosing C only when C > NC)? Both players are in two minds. If player 2 were to start cooperating one game later than player 1 (game 12 instead of 11), they would never again meet each other in cooperation, which means they would stick to playing NC. So with this based-on-experience strategy on both sides, too, there is a tendency towards non-cooperative play.
Conclusion
Based on the payoff matrix, game theory can derive playing strategies. This approach does not explain the phenomenon of more and more non-cooperative behaviour.
The approach of stacking statements does explain why subjects do not follow optimal strategies, in this case why they cooperate less than you would logically expect.