Prisoners' dilemma and Nash equilibrium (video) | Khan Academy
Description. This video explains how dominant strategies work and how to reach a Nash equilibrium. We start by analysing dominant... In game theory, there are two kinds of strategic dominance: a strictly dominant strategy is that strategy that always provides greater utility to a... In game theory, strategic dominance (commonly called simply dominance) occurs when one... Strictly dominated strategies cannot be part of a Nash equilibrium, and as such, it is irrational for any player to play them. On the other hand, weakly...
You just need to have one of them for it to not be a Nash equilibrium. Because Bill can gain by a change of strategy holding Al's strategy constant, so holding Al's strategy at confession, this is not a Nash equilibrium. So this is not Nash, because this movement to a more favorable state for Bill can occur, holding Al constant.
Now, let's go to state 3. Let's think about this. So if we're in state 3, so this is Bill confessing and Al denying-- so let's first think about Al's point of view. If we assume Bill is constant in his confession, can Al improve his scenario?
He can go from denying, which is what he would have to be doing in state 3, to confessing. So he could move in this direction right over here. And that by itself is enough evidence that this is not a Nash equilibrium.
We don't even have to think about Bill. There's actually nothing that Bill could do in this scenario, holding Al constant, that could improve things. Bill would not want to go from here to here. But just by the fact that Al could go from here to here, holding Bill constant, tells you that this is not a Nash equilibrium.
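The state-by-state check described here can be sketched in a few lines of Python. This is only an illustration, assuming the sentences quoted in the video (2 years each if both deny, 3 years each if both confess, 1 year for a lone confessor and 10 for the one who denies); the names `YEARS`, `profitable_deviations` and `is_nash` are made up for this sketch.

```python
# Sentences in years as (Al, Bill); fewer years is better.
# Keys are (Al's choice, Bill's choice). Values are the ones quoted in the video.
YEARS = {
    ("deny", "deny"): (2, 2),
    ("confess", "deny"): (1, 10),
    ("deny", "confess"): (10, 1),
    ("confess", "confess"): (3, 3),
}
CHOICES = ("confess", "deny")


def profitable_deviations(al, bill):
    """Return the players who can shorten their own sentence by switching alone."""
    al_now, bill_now = YEARS[(al, bill)]
    movers = []
    # Hold Bill constant and let Al try his other choice.
    if any(YEARS[(alt, bill)][0] < al_now for alt in CHOICES if alt != al):
        movers.append("Al")
    # Hold Al constant and let Bill try his other choice.
    if any(YEARS[(al, alt)][1] < bill_now for alt in CHOICES if alt != bill):
        movers.append("Bill")
    return movers


def is_nash(al, bill):
    """A state is a Nash equilibrium when nobody gains by deviating alone."""
    return not profitable_deviations(al, bill)


nash_states = [state for state in YEARS if is_nash(*state)]

for state in YEARS:
    print(state, "deviators:", profitable_deviations(*state))
print("Nash states:", nash_states)
```

One deviator is enough to rule a state out, which is why the function only needs to report who can move, not by how much.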
Now let's go to scenario 4. And you know where this is going, because you watched the last video. But I'm going through it in a little bit more detail.
More on Nash equilibrium
In state 4, they are both confessing. Now let's look at it from Al's point of view. And we're going to hold Bill constant. We're going to hold Bill unchanged.
So we're going to have to stay in this column.
We're going to assume that Bill's confessing. From Al's point of view, if we are in state 4, can he change his strategy to get a better outcome? Well, the only thing he could do is go from a confession to a denial. But that's not going to do any good. He's going to go from three years to 10 years. So Al cannot gain by a change of strategy, as long as all the other participants remain unchanged.
If you deny and the other confesses, now it switches around. You will get 10 years, because you're not cooperating.
And the other, your co-conspirator, will get a reduced sentence-- will get the one year.
So this is like telling Al, look, if you deny that you were the armed robber and Bill snitches you out, then you're going to get 10 years in prison. And Bill's only going to get one year in prison. And if both of you essentially confess, you will both get three years. So this scenario is called the prisoner's dilemma. Because we'll see in a second there is a globally optimal scenario for them where they both deny and they both get two years.
But we'll see, based on their incentives, assuming they don't have any unusual loyalty to each other-- and these are hardened criminals here. They're not brothers or related to each other in any way. They don't have any kind of loyalty pact. We'll see that they will rationally pick, or they might rationally pick, a non-optimal scenario. And to understand that I'm going to draw something called a payoff matrix.
So let me do it right here for Bill. So Bill has two options. He can confess to the armed robbery or he can deny that he had anything-- that he knows anything about the armed robbery. And Al has the same two options. Al can confess and Al can deny.
And since it's called a payoff matrix, let me draw some grids here. Let me draw some grids and let's think about all of the different scenarios and what the payoffs would be. If Al confesses and Bill confesses then we're in scenario four. They both get three years in jail.
So they both will get three for Al and three for Bill. Now, if Al confesses and Bill denies, then we are in scenario two from Al's point of view. Al is only going to get one year. But Bill is going to get 10 years. Now, if the opposite thing happens, if Bill confesses and Al denies, then it goes the other way around. Al's going to get 10 years for not cooperating. And Bill's going to have a reduced sentence of one year for cooperating.
And then if they both deny, they're in scenario one, where they're both just going to get their time for the drug dealing. So Al will get two years, and Bill will get two years. Now, I alluded to this earlier in the video. What is the globally optimal scenario for them?
Well, it's this scenario, where they both deny having anything to do with the armed robbery. Then they both get two years. But what we'll see is that it's actually somewhat rational, assuming that they don't have any strong loyalties to each other or a strong level of trust in the other party, for them not to go there. And it's actually rational for both of them to confess.
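To see why confessing is rational even though mutual denial is better for the pair, here is a small Python sketch using the sentences from the payoff matrix above. It assumes each prisoner cares only about his own years in prison; the helper names (`al_view`, `bill_view`, `strictly_better`) are hypothetical and exist only for this example.

```python
# Sentences in years as (Al, Bill), keyed by (Al's choice, Bill's choice).
YEARS = {
    ("deny", "deny"): (2, 2),
    ("confess", "deny"): (1, 10),
    ("deny", "confess"): (10, 1),
    ("confess", "confess"): (3, 3),
}
CHOICES = ("confess", "deny")


def al_view(own, other):
    """Al's years when Al plays `own` and Bill plays `other`."""
    return YEARS[(own, other)][0]


def bill_view(own, other):
    """Bill's years when Bill plays `own` and Al plays `other`."""
    return YEARS[(other, own)][1]


def strictly_better(view, mine, alternative):
    """True if `mine` means strictly fewer years than `alternative`
    against every choice the other prisoner could make."""
    return all(view(mine, opp) < view(alternative, opp) for opp in CHOICES)


# The scenario with the fewest total years for the pair.
best_for_pair = min(YEARS, key=lambda state: sum(YEARS[state]))

print("Best for the pair:", best_for_pair)
print("Confess beats deny for Al:", strictly_better(al_view, "confess", "deny"))
print("Confess beats deny for Bill:", strictly_better(bill_view, "confess", "deny"))
```

Under these numbers, confessing costs each prisoner fewer years whatever the other one does, so both end up at three years each even though the pair's total is lowest when both deny.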
Some strategies—that were not dominated before—may be dominated in the smaller game. The first step is repeated, creating a new even smaller game, and so on.
The process stops when no dominated strategy is found for any player. This process is valid since it is assumed that rationality among players is common knowledge, that is, each player knows that the rest of the players are rational, and each player knows that the rest of the players know that he knows that the rest of the players are rational, and so on ad infinitum (see Aumann). There are two versions of this process.
One version involves only eliminating strictly dominated strategies. If, after completing this process, there is only one strategy for each player remaining, that strategy set is the unique Nash equilibrium. C is strictly dominated by A for Player 1.
Therefore Player 1 will never play strategy C. Player 2 knows this.
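The elimination procedure can be sketched in Python as follows. The payoff table for the A/B/C example is not reproduced in this text, so the sketch is run instead on the prisoner's dilemma numbers from the transcript, with utility taken as the negative of years in prison (an assumption made for this example). It considers only domination by pure strategies, and `eliminate_strictly_dominated` is a hypothetical helper name.

```python
def eliminate_strictly_dominated(payoffs, rows, cols):
    """Repeatedly delete pure strategies that are strictly dominated by
    another surviving pure strategy.

    payoffs[(r, c)] = (row player's utility, column player's utility);
    higher utility is better.
    """
    rows, cols = list(rows), list(cols)
    changed = True
    while changed:
        changed = False
        # Row player: r is dominated if some other surviving row o does
        # strictly better against every surviving column.
        for r in list(rows):
            if any(all(payoffs[(o, c)][0] > payoffs[(r, c)][0] for c in cols)
                   for o in rows if o != r):
                rows.remove(r)
                changed = True
        # Column player: same test on the second payoff component.
        for c in list(cols):
            if any(all(payoffs[(r, o)][1] > payoffs[(r, c)][1] for r in rows)
                   for o in cols if o != c):
                cols.remove(c)
                changed = True
    return rows, cols


# Prisoner's dilemma with utility = -(years); rows are Al, columns are Bill.
PAYOFFS = {
    ("deny", "deny"): (-2, -2),
    ("confess", "deny"): (-1, -10),
    ("deny", "confess"): (-10, -1),
    ("confess", "confess"): (-3, -3),
}

surviving = eliminate_strictly_dominated(
    PAYOFFS, ["confess", "deny"], ["confess", "deny"]
)
print(surviving)
```

Here a single strategy survives for each player, so by the statement above the surviving pair is the unique Nash equilibrium of this game.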