PLoS ONE
Home Dynamic aspiration based on Win-Stay-Lose-Learn rule in spatial prisoner’s dilemma game
Dynamic aspiration based on Win-Stay-Lose-Learn rule in spatial prisoner’s dilemma game
Dynamic aspiration based on Win-Stay-Lose-Learn rule in spatial prisoner’s dilemma game

Competing Interests: The authors have declared that no competing interests exist.

Article Type: research-article Article History
Abstract

Prisoner’s dilemma game is the most commonly used model of spatial evolutionary game which is considered as a paradigm to portray competition among selfish individuals. In recent years, Win-Stay-Lose-Learn, a strategy updating rule base on aspiration, has been proved to be an effective model to promote cooperation in spatial prisoner’s dilemma game, which leads aspiration to receive lots of attention. In this paper, according to Expected Value Theory and Achievement Motivation Theory, we propose a dynamic aspiration model based on Win-Stay-Lose-Learn rule in which individual’s aspiration is inspired by its payoff. It is found that dynamic aspiration has a significant impact on the evolution process, and different initial aspirations lead to different results, which are called Stable Coexistence under Low Aspiration, Dependent Coexistence under Moderate aspiration and Defection Explosion under High Aspiration respectively. Furthermore, a deep analysis is performed on the local structures which cause defectors’ re-expansion, the concept of END- and EXP-periods are used to justify the mechanism of network reciprocity in view of time-evolution, typical feature nodes for defectors’ re-expansion called Infectors, Infected nodes and High-risk cooperators respectively are found. Compared to fixed aspiration model, dynamic aspiration introduces a more satisfactory explanation on population evolution laws and can promote deeper comprehension for the principle of prisoner’s dilemma.

Shi,Wei,Feng,Li,Zheng,and Tanimoto: Dynamic aspiration based on Win-Stay-Lose-Learn rule in spatial prisoner’s dilemma game

Introduction

The emergence and stability of cooperative behavior among selfish individuals is a challenging problem in biology, sociology and economics [1]. The prisoner’s dilemma(PD) game is considered as a paradigm to portray competition among selfish individuals [26]. For general parameter settings, defection is favoured by evolutionary selection, but we can easily observe numerous cooperation phenomenon in various scenarios, e.g., animals will cooperate to obtain food instead of preying alone [7]; companies will set appropriate commodity prices instead of maliciously cutting prices [8]; humans will choose to obey the order instead of jumping in line, etc [9]. Evolutionary game theory provides a practical framework to explain how the cooperation forms [1014]. Besides, five representative mechanisms considered as promoting cooperation have been investigated: kin selection, direct and indirect reciprocity, network reciprocity and group selection [15].

Since the pioneering work of Nowak and May [16], spatial games were proposed and have attracted ample attention of researchers, in which players are located on the spatially structured network and only interact with their neighbors. Since then, numerous studies have emerged to propose various mechanisms which explain the emergence and stability of cooperative behavior, such as punishment [1720], migration [2123], game organizers [24, 25], teaching ability [2628], and so on. In recent years, aspiration, a parameter representing individual’s expectation, has attracted many researchers’ attention [2932]. Win-Stay-Lose-Learn is a representative model based on aspiration, in which one will try to change its strategy only when its payoff is lower than aspiration [3335]. Liu and Chen investigated the Win-Stay-Lose-Learn rules in spatial prisoner’s dilemma game [33]; Chu and Liu added the voluntary participation into the Win-Stay-Lose-Learn rules [34]; Fu studied the stochastic Win-Stay-Lose-Learn rules in the spatial public goods game [35].

These research held the assumption that an individual player’s aspiration is fixed. However, according to Expected Value Theory and Achievement Motivation Theory proposed by Atkinson [36], one’s aspiration will be influenced by its previous payoff, and individuals tend to lower their aspirations in interactions in a crisis [37]. If one’s payoff is higher than its aspiration, the aspiration tends to be increased, otherwise decreased, some researchers also paid attention to this and had some related researches. [3739]. In this paper, a dynamic aspiration model is introduced based on Win-Stay-Lose-Learn rules, and the principle of defection’s expansion or cooperation’s survival under the dynamic aspiration model is investigated. The rest of our paper is organized as follows. First the detailed model of the dynamic aspiration based on Win-Stay-Lose-Learn rules is shown. Then the main results under our model is provided by four parts: Overview, Stable Coexistence under Low Aspiration, Dependent Coexistence under Moderate aspiration and Defection Explosion under High Aspiration. The concept of enduring (END) and expanding (EXP) periods [4042] are also used to justify the mechanism of network reciprocity in view of time evolution and find typical feature nodes called Infectors, Infected nodes and High-risk cooperators respectively. Finally the wider implications of our work and the direction of the future research are discussed.

Model

Our model is described as follows. We use the L × L square lattice with periodic boundary conditions. Each node represents a player who has one of the following two strategies: cooperation(C), or defection(D). The strategy of node i is represented as si, and node i’s aspiration is represented as Ai. At the beginning of the evolutionary process, each node is given [an] initial si and Ai. The evolutionary process is performed step by step until the network is stable. In each step, players synchronously update their strategies and aspirations as follows:

    (a)Rule of game: Each node i plays the prison’s dilemma game with its four neighbors and gets the payoff Pi=jΩiPij, where Ωi denotes the neighbors of node i. Pij is i’s payoff when i plays the game with j and it is got by Table 1, and if both of their strategies are C or D, they will get the reward R or punishment P. If i’s strategy is C and j’s strategy is D, i will get the sucker’s payoff S and the j will get the temptation value T, vice versa. In prison’s dilemma game, the above parameters meet T > R > P > S.

Table 1
Payoff matrix of prison’s dilemma game.
alternatives alternatives
alternatives RS
alternatives TP

Without loss of generality, we set R = 1 and P = 0. And to ensure single parameter, there are some typical representative sub-classes of PD game, e.g., Donor & Recipient (D & R) game which assumes T + S = 1 [4349] and boundary game which assumes t = b and S = 0 [16, 33]. In this paper, the boundary game is used as what we mainly study is the impact of T on evolution process. Thought it has S < 0 in PD games, our experiment shows that the Monte Carlo simulation result is almost the same as S = 0 when S is close to 0(for instance, S = −0.01), so it is assumed that S = 0 in boundary games.

    (b)Rule of strategy’s update: Each node i chooses one of its four neighbors j randomly with equal probability. If i’s payoff is lower than its aspiration Ai, i will be dissatisfied and choose to adopt j’s strategy with the probability:

    where K stands for the amplitude of noise [6]. Without loss of generality, we use K = 0.1 in our model [50, 51]. If i’s payoff is higher than or equal to its aspiration, i will be satisfied and not change its strategy.

    (c)Rule of aspiration’s update: Each node i updates its aspiration by the formula:

    where Ai(0) = A is given for all the nodes initially. Based on Achievement Motivation Theory [36], in a representative model, one’s aspiration is changed concerning its payoff linearly. So we introduce the dynamic aspiration by the parameter a, where a ∈ [0, 1] stands for the sensitivity of aspiration. Higher a means individual’s aspiration is changed more drastically by its payoff. Previous work with fixed aspiration is equivalent to the case a = 0, and a = 1 means that individual’s aspiration totally depends on its payoff from the last step. Considering that one’s aspiration should not be changed too drastically, we set a = 0.01 in our model.

The step (a)-(c) will repeat 100,000 times in one simulation. The fraction of cooperators and defectors at step t are denoted as rCt and rDt respectively. And for each parameter, we perform 20 independent simulations to get rC, the fraction of cooperators when stable, which is thought as the main index to measure a network’s cooperation level.

Results

0.1 Overview

Our experiment is performed on the 100 × 100 square lattice with periodic boundary conditions. In initial, cooperators and defectors are distributed uniformly at random occupying half of the square lattice respectively, and all the nodes are given the same initial aspiration A. As the main parameters, we consider the initial aspiration level A and the temptation to defect b. Fig 1 presents the fraction rC of cooperators when stable as a function of the b for different aspiration levels. It is found that the aspiration level has a significant influence on rC. Three different phases, Stable Coexistence under Low Aspiration, Dependent Coexistence under Moderate aspiration and Defection Explosion under High Aspiration could be observed for different values of A. For small values of A (A = 0 and A = 0.8), rC is equal/close to 0.5 no matter what value of b is. For large values of A (A ≥ 2.4), cooperators could hardly survive for any b > 1.0. An interesting phenomenon is discovered for moderate values of A (A = 1.6) that when b is lower than 1.6, cooperators could not survive and rC almost equals 0, but when b is higher than 1.6, cooperation could survive and rC is bigger than 0.2. According to common sense, the higher b is, the harder cooperators survive, which is inconsistent with our experimental results. Above unusual phenomenon is the focus in our paper.

Average fractions of cooperation when stable as a function of b for different values of the initial aspiration A, as obtained by means of simulations on square lattices.
Fig 1

Average fractions of cooperation when stable as a function of b for different values of the initial aspiration A, as obtained by means of simulations on square lattices.

0.2 Stable Coexistence under low aspiration (A ≤ 1.0)

For small values of A, individual’s aspiration is easy to be satisfied so cooperators and defectors can coexist. For A = 0, all the nodes are satisfied and never change their strategies, so rC will always keep 0.5. And for A = 0.8, there are only a few nodes whose four neighbors are all defectors will be dissatisfied: if it is a cooperator, it will change its strategy to D; if it is a defector, it will be always dissatisfied but can’t change its strategy since all his neighbors’ strategies are the same. At the beginning, r0=rD0=0.5, so the frequency of cooperators with four neighbors being all defectors could be calculated approximately:

which is consistent with the result rC=0.47 in the simulation experiment and rCrC0-r. In fact, the above result holds for all A ∈ (0, 1], where the proportion of cooperators dissatisfied is only rD04 and other cooperators will always be satisfied and survive. To conclude, for small values of A, initial cooperators could coexist with defectors and neither of them could expand. Low aspiration means both strategies and aspirations are long-term stable. This case is called Stable Coexistence under Low Aspiration.

0.3 Dependent Coexistence under moderate aspiration (1.0 < A ≤ 2.0)

For moderate values of A, cooperators can’t survive for small values of b. Fig 2 shows the spatial distributions of strategies and aspirations at different time steps t for A = 1.6 and b = 1.2. The evolution process can be divided into the following stages:

    At first, every node with strictly less than two C neighbors is dissatisfied. During this phase, defectors expand quickly and cooperators try to endure defectors’ invasion, which is the so-called END period [4042]. rC decreases and the aspirations of all the nodes are lower than 2.0.

    When t = 10, some cooperators still survive by forming some clusters in the END period. The defectors neighboring with the clusters are dissatisfied and have lower payoffs than their C neighbors, so these clusters could expand by converting the neighboring defectors to cooperators, which is called the EXP period [4042]. Dissatisfied defectors may evolve into cooperators gradually. Meanwhile, the aspirations of these cooperators gradually rise up.

    When t = 100, cooperators have expanded fully during the EXP period and rC reaches a maximum. However, during the process of cooperation’s expansion, cooperations’ aspirations rise up gradually. The aspirations of the nodes on the boundary of the cooperators’ clusters are higher than 2.0 now. These cooperators are no longer satisfied if they neighbor with two defectors so they will re-evolve into defectors. This leads to a repeated evolution process on the boundaries of the cooperators’ clusters where nodes evolve into cooperators and defectors repeatedly. At the same time, nodes inside the cooperators’ clusters are still satisfied so they keep their strategies and their aspirations rise up gradually.

    When t = 200, defectors gradually penetrated into the cooperators’ clusters. These cooperators’ aspirations are now close to 4.0 so once they neighbor with a defector, they will be dissatisfied and also change into defectors gradually. As a result, chain phenomenon happens that defectors almost occupy the entire network and the cooperators almost disappear rapidly.

Snapshots of typical distributions of strategies and aspirations at different time steps t for A = 1.6 and b = 1.2.
Fig 2

Snapshots of typical distributions of strategies and aspirations at different time steps t for A = 1.6 and b = 1.2.

(a) represents strategies, where cooperators are depicted white and defectors are depicted black. (b) represents aspirations. The steps of them are t = 0, 10, 100, 200, 500 and 1000 respectively.

One can see that although cooperators can survive in the END period and expand in the EXP period by forming clusters, defectors finally occupy the network when it is stable. The network reciprocity is undermined by dynamic aspirations. In dynamic aspiration models, cooperators’ aspirations will become too high to endure defectors’ re-expansion because of the long-term satisfaction. which is different from the fixed aspiration model. Fig 3 shows the probability that cooperators could survive as a function of the cooperators’ initial proportion rC0 for A = 1.6 and b = 1.2. For every different rC0, 100 independent experiments is performed. Under these parameters, cooperators are possible to survive only when rC0>0.95, and sure to survive only when rC0>0.99. In other words, defectors could expand and occupy the network finally even if they are very few initially. This result is different from the result of fixed aspiration model and it may be caused by some special local structures which could lead to the high aspirations of cooperators.

The probability that cooperators can survive as a function of the cooperators’ initial proportion rC0 for A = 1.6 and b = 1.2.
Fig 3

The probability that cooperators can survive as a function of the cooperators’ initial proportion rC0 for A = 1.6 and b = 1.2.

Cooperators are easier to survive when rC0 is higher. They are possible to survive only when rC0>0.95, and sure to survive only when rC0>0.99.

Fig 4 shows all possible local structures in the network for A = 1.6 and b = 1.2. When a node has two or less D neighbors, it is satisfied and won’t change its strategy and its aspiration will keep lower than its payoff. When a node has four D neighbors, it is always dissatisfied. But since all its neighbors are defectors, it can only be a defector by copying neighbors’ strategies and won’t change its strategy any more, and its aspiration will keep decreasing.

The local structures of strategies for A = 1.6 and b = 1.2.
Fig 4

The local structures of strategies for A = 1.6 and b = 1.2.

Each square corresponds to a single player, where cooperators are depicted blue and defectors are depicted red. Value denoted in the center square is the individual’s payoff. Smiling face represents satisfaction while crying face represents dissatisfaction.

Now we consider the structure that a node has three D neighbors and one C neighbor. Fig 5 shows the detailed principle for defectors’ expanding under this initial structure. The evolution process can be divided into the following stages:

    When t = 0, the only one node dissatisfied is node X because its aspiration is 1.0. Since it has three D neighbors and one C neighbor, X will evolve into cooperator and defector repeatedly. We call such node as an Infector. As a result, node Y’s payoff is sometimes 4.8 and sometimes 3.6, which we called an Infected node, and Y’s aspiration at step t can be got by the recursive equation:

    With t growing, we can easily prove that AY will be higher than 3.6. Next time when X evolves into a defector, PY = 3.6 < AY, so Y is dissatisfied. Y has three C neighbors, so it will be easy to evolve into a cooperator. But when Y evolves into cooperator, its payoff will decrease and it is still dissatisfied, so Y will also evolve into cooperator and defector repeatedly. In other words, the Infected node Y also becomes an Infector. For the same reason, node Z’s payoff is sometimes 4.0 and sometimes 3.0, Z becomes an Infected node and Z’s aspiration at step t can be got by the recursive equation:

    With t further growing, we can easily prove that AZ will be higher than 3.0. Next time when Y evolves into a defector, PZ = 3.0 < AZ, so Z is dissatisfied and may evolve into a defector in a few steps. Now Z’s other neighbors’ aspirations are all near to 4.0, which we call High-risk cooperator. Once Z evolves into a defector, their payoffs decrease to 3.0 so they are dissatisfied and may evolve into defectors, too.

    Furthermore, almost every node’s aspiration in the network is near to 4.0 because their payoffs have been keeping 4.0 for a long time. In other word, all the cooperators in the network have became High-risk cooperators. As a result, for each cooperator i, once one of i’s neighbors evolves into a defector, i may evolve into a defector soon, which is a chain phenomenon and causes defectors’ expanding.

The detailed principle for defectors’ expanding for A = 1.6 and b = 1.2.
Fig 5

The detailed principle for defectors’ expanding for A = 1.6 and b = 1.2.

A node is surrounded by three defectors and one cooperator initially. Smiling face represents satisfaction while crying face represents dissatisfaction.

Fig 6 shows the spatial distributions of strategies and aspirations at different time steps t for A = 1.6, b = 1.2 with the above initial structure, from which we can also get the expansion trajectory by the aspiration distribution.

Snapshots of typical distributions of strategies and aspirations at different time steps t under the initial structure shown in Fig 4 for A = 1.6 and b = 1.2.
Fig 6

Snapshots of typical distributions of strategies and aspirations at different time steps t under the initial structure shown in Fig 4 for A = 1.6 and b = 1.2.

(a) represents strategies, where cooperators are depicted white and defectors are depicted black. (b) represents aspirations. The steps of them are t = 0, 10, 100, 200, 500 and 1000 respectively.

In the network with random setup, during the END and EXP period, cooperators will survive and expand by the mechanism of network reciprocity. But once there is at least one Infector who has three D neighbors and one C neighbor initially, defectors will re-expand to the whole network. If rD0>0.05, such an Infector almost certainly exists, so defectors almost certainly expand to the whole network. Cooperators can survive only when there is no Infector initially in the network.

However, for large values of b, cooperators can partially survive. Fig 7 shows all possible local structures in the network for A = 1.6 and b = 1.7. Compared to Fig 4, if a cooperator has three D neighbors and one C neighbor, it will be dissatisfied and may evolve into a defector. But once it evolves into a defector, it becomes satisfied and doesn’t change any longer, and the network becomes stable. In other word, there is no Infector existing in the network. Fig 8 shows the spatial distributions of strategies and aspirations at different time steps t for A = 1.6, b = 1.7 with the above initial structure. Cooperators can survive by forming clusters, but as the values of b is higher, these clusters couldn’t expand. The above evolution process only goes through the END period and then the network has been stable.

The local structures of strategies for A = 1.6 and b = 1.7.
Fig 7

The local structures of strategies for A = 1.6 and b = 1.7.

Each square corresponds to a single player, where cooperators are depicted blue and defectors are depicted red. Value denoted in the center square is the individual’s payoff. Smiling face represents satisfaction while crying face represents dissatisfaction.

Snapshots of typical distributions of strategies and aspirations at different time steps t for A = 1.6 and b = 1.7.
Fig 8

Snapshots of typical distributions of strategies and aspirations at different time steps t for A = 1.6 and b = 1.7.

(a) represents strategies, where cooperators are depicted white and defectors are depicted black. (b) represents aspirations. The steps of them are t = 0, 10, 100, 200, 500 and 1000 respectively.

From the above we know the main difference between b < 1.6 and b ≥ 1.6 for A = 1.6 is whether a defector who has three D neighbors and one C neighbor is satisfied, more simply, whether an Infector exists. For b < 1.6, P = b < A, the defector is dissatisfied and will evolve into cooperator and defector repeatedly and it is an Infector, which causes the chain phenomenon leading to defectors’ expansion shown in Fig 5. But for b ≥ 1.6, P = bA, the defector is satisfied and the network will be stable soon so that cooperators can survive in the end. In fact, for all the A ≤ 2.0, the conclusion is the same and phase transition happens in b = A because A is the highest values of b which allows Infectors to exist.

To conclude, for moderate values of A, cooperators will survive and expand in the early stages of evolution when b is lower than A, which are END and EXP periods respectively. But according to our results, the existence of Infectors may lead to defectors’ re-expansion. The core reason for this phenomenon is that the cooperators increase their aspirations excessively and become the so-called High-risk cooperators, which needs to be vigilant in the evolution process of cooperation.

0.4 Defection Explosion under High Aspiration (A > 2.0)

For A = 2.4, cooperators can survive only when rC0 is high enough for any b > 1, on the contrary, only a few defectors initially can lead cooperators to die out. Fig 9 shows all possible local structures in the network for A = 2.4 and b = 1.7. If a cooperator has two D neighbors, it is dissatisfied and evolves into a defector soon, and then it is satisfied. If a cooperator or defector X has three D neighbors, it is dissatisfied and evolves into cooperator and defector repeatedly. X’s payoff is 1.0 when it is a cooperator and 1.7 when it is a defector. X’s aspiration at step t can be got by the recursive equation:

With t growing, we can easily prove that AX will be lower than 1.7. And next time when X evolves into a defector, it will be satisfied and don’t change its strategy any more. If a cooperator or defector has four D neighbors, it will be never satisfied, but it can only be a defector since all its neighbors are defectors. All the simple local structures can’t lead defectors to expand.

The local structures of strategies for A = 2.4 and b = 1.7.
Fig 9

The local structures of strategies for A = 2.4 and b = 1.7.

Each square corresponds to a single player, where cooperators are depicted blue and defectors are depicted red. Value denoted in the center square is the individual’s payoff. Smiling face represents satisfaction while crying face represents dissatisfaction.

Fig 10 shows the structure which causes the defectors’ expansion. In initial, nodes X1 and X2 are dissatisfied and may evolve into defectors. Once X1 evolves into a defector, nodes Y1 and Y2 become dissatisfied and may also evolve into defectors, so do the other five nodes. All the nine nodes are dissatisfied and evolve into cooperators and defectors repeatedly. In other word, they are all Infectors. As a result, colored cooperators’ aspiration will be higher than 3.0 as time goes so they are High-risk cooperators. Next time when one of the nodes evolves into a defector, the cooperator will be dissatisfied and evolve into a defector. Since the other nodes’ aspirations are close to 4.0 and have became High-risk cooperators, chain phenomenon occurs and defectors will occupy the whole network.

The detailed principle for defectors’ expanding for A = 2.4, b = 1.7.
Fig 10

The detailed principle for defectors’ expanding for A = 2.4, b = 1.7.

The initial local structure is shown in (a). Smiling face represents satisfaction while crying face represents dissatisfaction.

When b is lower, defectors’ expansion requires more strict requirement. When b = 1.2, for the same structure shown in Fig 10, defectors can’t expand. We find that before cooperators’ aspirations are higher than 3.0, the nine nodes will be all satisfied in a step so that the network becomes stable with high probability. The lower b makes the network stable soon if there are only nine nodes participating in the evolution. Fig 11 shows the initial structure which causes the defectors’ expansion, and similar to Fig 10, sixteen nodes participate in the evolution. More Infectors make the evolutionary process last longer, so the colored nodes have enough time to increase their aspirations to higher than 3.0 and all of them will become High-risk cooperators. Fig 12 shows the spatial distributions of strategies and aspirations at different time steps t for A = 2.4, b = 1.7 with the initial structure shown in Fig 10. The situation of A = 2.4, b = 1.2 with the initial structure shown in Fig 11 is almost the same. In fact, the above conclusion is suitable for all the 2.0 < A ≤ 3.0. Defectors’ expansion requires more defectors’ gathering when b is lower or A is higher, vice versa.

The initial structure that causes defectors’ expansion for A = 2.4, b = 1.2.
Fig 11

The initial structure that causes defectors’ expansion for A = 2.4, b = 1.2.

Smiling face represents satisfaction while crying face represents dissatisfaction.

Snapshots of typical distributions of strategies and aspirations at different time steps t for A = 2.4 and b = 1.7 with the initial structure shown in Fig 10.
Fig 12

Snapshots of typical distributions of strategies and aspirations at different time steps t for A = 2.4 and b = 1.7 with the initial structure shown in Fig 10.

(a) represents strategies, where cooperators are depicted white and defectors are depicted black. (b) represents aspirations. The steps of them are t = 0, 10, 100, 200, 500 and 1000 respectively.

For A > 3.0, the nodes which have at least one D neighbor are dissatisfied, in other word, all the cooperators are High-risk cooperators initially. so once there are at least one defector in the network, defectors will expand to the whole network soon for any b > 1. The higher b is, the faster defectors’ expansion will be. Fig 13 shows the spatial distributions of strategies and aspirations at different time steps t for A = 3.2, b = 1.2 with only one defector initially. Cooperators are almost impossible to survive under high aspirations.

Snapshots of typical distributions of strategies and aspirations at different time steps t for A = 3.2 and b = 1.2 with only one defector initially.
Fig 13

Snapshots of typical distributions of strategies and aspirations at different time steps t for A = 3.2 and b = 1.2 with only one defector initially.

(a) represents strategies, where cooperators are depicted white and defectors are depicted black. (b) represents aspirations. The steps of them are t = 0, 10, 100, 200, 500 and 1000 respectively.

In dynamic aspiration model, three different phases could be observed. The phase under low aspiration is similar to the fixed aspiration model because most nodes are always satisfied and their aspirations are changed in a small range. However, dynamic aspiration model plays a critical role under moderate aspiration and high aspiration, where some nodes called Infector are dissatisfied no matter they are cooperators or defectors and their strategies are changed repeatedly. As a result, their neighbors’ payoff are changed and aspirations will be influenced by the evolution process, and these neighbors act as Infected nodes. Their aspirations become higher gradually but their payoffs changed repeatedly, which results in their dissatisfaction and Infected nodes will become Infectors. Once a High-risk cooperator node becomes dissatisfied, chain phenomenon happens in High-risk cooperators and defectors will expand fast.

Conclusion

To conclude, the evolution process of the Win-Stay-Lose-Learn strategy updating rule on the prisoner’s dilemma game is studied in this paper. Based on the previous work, a dynamic aspiration model is proposed, in which players will not only change their strategies based on aspirations, but also change their aspirations due to their payoffs.

Three different phases are found. Cooperators and defectors can coexist for small values of A, which is called Stable Coexistence under Low Aspiration. Only a few cooperators will evolve into defectors then and the network will be stable immediately, which is not affected by the value of b. As a comparison, defectors will easily expand to the whole network for large values of A, which is called Defection Explosion under High Aspiration respectively. Two kinds of local structures which can lead to defectors’ expansion are found, depending on the values of b. The most interesting phenomenon is cooperators can survive for higher b(bA) and die out for lower b(b < A) when 1.0 < A ≤ 2.0, which is abnormal because higher b should have meant that it is harder for the cooperators to survive, and it is called Dependent Coexistence under Moderate aspiration. The local structure leading to the defectors’ expansion is that a cooperator is surrounded by one cooperator and three defectors. Dynamic aspiration plays an important role for the above results because a constantly changing individual(Infectors) may make its neighbors’ (Infected nodes) aspirations gradually rise up and they will become Infectors. At the same time, all the other cooperators’ aspirations gradually rise up and they become High-risk cooperators. When a High-risk cooperator neighbors with an Infector, it will become a defector soon and chain phenomenon happens.

Our work provides a new enlightening opinion for the Win-Stay-Lose-Learn strategy updating rule. Dynamic aspiration introduces a more satisfactory explanation on population evolution laws. Under the mechanism of network reciprocity, the defectors’ re-expansion is got attentions. How to avoid such unfavorable phenomenon under moderate aspirations is still a challenging problem. It is hoped that our work offers a valuable method that can help explore the principle behind prisoner’s dilemma better, especially when combining with other rules which use aspiration level for personal decision making such as myopic, other-regarding preference or Pavlov-rule [5257].

Acknowledgements

We thank Marco Antonio Amaral, Xin Wang and Yuanchen Guo for discussions and suggestions.

References

RAxelrod, WDHamilton. The evolution of cooperation. science. 1981;211(4489):13901396. 10.1126/science.7466396

ARapoport, AMChammah, CJOrwant. Prisoner’s dilemma: A study in conflict and cooperation. vol. 165 University of Michigan press; 1965.

RAxelrod. Effective choice in the prisoner’s dilemma. Journal of conflict resolution. 1980;24(1):325. 10.1177/002200278002400101

RAxelrod. More effective choice in the prisoner’s dilemma. Journal of conflict resolution. 1980;24(3):379403. 10.1177/002200278002400301

RAxelrod, et al The evolution of strategies in the iterated prisoner’s dilemma. The dynamics of norms. 1987; p. 116.

GSzabó, CTőke. Evolutionary prisoner’s dilemma game on a square lattice. Physical Review E. 1998;58(1):69 10.1103/PhysRevE.58.69

RSchuster, APerelberg. Why cooperate?: an economic perspective is not enough. Behavioural Processes. 2004;66(3):261277. 10.1016/j.beproc.2004.03.008

RSGibbons. Game theory for applied economists. Princeton University Press; 1992.

RBMyerson. Game Theory: Analysis of Conflict. Harvard University Press; 1991.

10 

JMSmith, JMMSmith. Evolution and the Theory of Games. Cambridge university press; 1982.

11 

RAxelrod, WDHamilton. The evolution of cooperation in biological systems. The evolution of cooperation. 1984; p. 88105.

12 

JWWeibull. Evolutionary game theory MIT Press. Cambridge, MA 1995;.

13 

JHofbauer, KSigmund, et al Evolutionary games and population dynamics. Cambridge university press; 1998.

14 

WSandholm. Ross Cressman, Evolutionary Dynamics and Extensive Form Games, MIT Press, Cambridge, MA (2003). International Review of Economics & Finance. 2006;15(1):136140.

15 

MANowak. Five rules for the evolution of cooperation. science. 2006;314(5805):15601563. 10.1126/science.1133755

16 

MANowak, RMMay. Evolutionary games and spatial chaos. Nature. 1992;359(6398):826829. 10.1038/359826a0

17 

BHerrmann, CThöni, SGächter. Antisocial punishment across societies. Science. 2008;319(5868):13621367. 10.1126/science.1153808

18 

DHelbing, ASzolnoki, MPerc, GSzabó. Punish, but not too hard: how costly punishment spreads in the spatial public goods game. New Journal of Physics. 2010;12(8):083005 10.1088/1367-2630/12/8/083005

19 

XChen, ASzolnoki, MPerc. Probabilistic sharing solves the problem of costly punishment. New Journal of Physics. 2014;16(8):083016 10.1088/1367-2630/16/8/083016

20 

XChen, TSasaki,ÅBrännström, UDieckmann. First carrot, then stick: how the adaptive hybridization of incentives promotes cooperation. Journal of the royal society interface. 2015;12(102):20140935 10.1098/rsif.2014.0935

21 

DHelbing, WYu. Migration as a mechanism to promote cooperation. Advances in Complex Systems. 2008;11(04):641652. 10.1142/S0219525908001866

22 

RCong, BWu, YQiu, LWang. Evolution of cooperation driven by reputation-based migration. PLoS One. 2012;7(5):e35776 10.1371/journal.pone.0035776

23 

GIchinose, MSaito, HSayama, DSWilson. Adaptive long-range migration promotes cooperation under tempting conditions. Scientific reports. 2013;3:2509 10.1038/srep02509

24 

ASzolnoki, MPerc. Conformity enhances network reciprocity in evolutionary social dilemmas. Journal of The Royal Society Interface. 2015;12(103):20141299 10.1098/rsif.2014.1299

25 

ASzolnoki, MPerc. Leaders should not be conformists in evolutionary social dilemmas. Scientific Reports. 2016;6:23633 10.1038/srep23633

26 

ASzolnoki, GSzabó. Cooperation enhanced by inhomogeneous activity of teaching for evolutionary Prisoner’s Dilemma games. EPL (Europhysics Letters). 2007;77(3):30004 10.1209/0295-5075/77/30004

27 

ASzolnoki, MPerc. Coevolution of teaching activity promotes cooperation. New Journal of Physics. 2008;10(4):043036 10.1088/1367-2630/10/4/043036

28 

GSzabó, ASzolnoki. Cooperation in spatial prisoner’s dilemma with two types of players for increasing number of neighbors. Physical Review E. 2009;79(1):016106 10.1103/PhysRevE.79.016106

29 

HXYang, ZRong, PMLu, YZZeng. Effects of aspiration on public cooperation in structured populations. Physica A Statal Mechanics & Its Applications. 2012;391(15):40434049. 10.1016/j.physa.2012.03.018

30 

TWu, FFu, LWang. Coevolutionary dynamics of aspiration and strategy in spatial repeated public goods games. New Journal of Physics. 2018;20(6). 10.1088/1367-2630/aac687

31 

CChu, CMu, JLiu, CLiu, SBoccaletti, LShi, et al Aspiration-based coevolution of node weights promotes cooperation in the spatial prisoner’s dilemma game. New Journal of Physics. 2019;. 10.1088/1367-2630/ab0999

32 

LZhang, CHuang, HLi, QDai. Aspiration-dependent strategy persistence promotes cooperation in spatial prisoner’s dilemma game. Epl. 2019;126(1):18001 10.1209/0295-5075/126/18001

33 

YLiu, XChen, LZhang, LWang, MPerc. Win-stay-lose-learn promotes cooperation in the spatial prisoner’s dilemma game. PloS one. 2012;7(2):e30689 10.1371/journal.pone.0030689

34 

CChu, JLiu, CShen, JJin, LShi. Win-stay-lose-learn promotes cooperation in the prisoner’s dilemma game with voluntary participation. Plos one. 2017;12(2):e0171680 10.1371/journal.pone.0171680

35 

MJFu, HXYang. Stochastic win-stay-lose-learn promotes cooperation in the spatial public goods game. International Journal of Modern Physics C. 2018;29(04):1850034 10.1142/S0129183118500341

36 

JWAtkinson, NTFeather. A theory of achievement motivation. Psychosomatics. 1974;8(4):247248.

37 

MAAmaral, LWardil, MPerc, JKda Silva. Stochastic win-stay-lose-shift strategy with dynamic aspirations in evolutionary social dilemmas. Physical Review E. 2016;94(3):032317 10.1103/PhysRevE.94.032317

38 

MPosch, APichler, KSigmund. The efficiency of adapting aspiration levels. Proceedings of the Royal Society of London Series B: Biological Sciences. 1999;266(1427):14271435. 10.1098/rspb.1999.0797

39 

MRArefin, JTanimoto. Evolution of cooperation in social dilemmas under the coexistence of aspiration and imitation mechanisms. Physical Review E. 2020;102(3):032120 10.1103/PhysRevE.102.032120

40 

ZWang, SKokubo, JTanimoto, EFukuda, KShigaki. Insight into the so-called spatial reciprocity. Physical Review E. 2013;88(4):042145 10.1103/PhysRevE.88.042145

41 

TOgasawara, JTanimoto, EFukuda, AHagishima, NIkegaya. Effect of a large gaming neighborhood and a strategy adaptation neighborhood for bolstering network reciprocity in a prisoner’s dilemma game. Journal of Statistical Mechanics: Theory and Experiment. 2014;2014(12):P12024 10.1088/1742-5468/2014/12/P12024

42 

KAKabir, JTanimoto, ZWang. Influence of bolstering network reciprocity in the evolutionary spatial prisoner’s dilemma game: A perspective. The European Physical Journal B. 2018;91(12):312 10.1140/epjb/e2018-90214-6

43 

Tanimoto J. Evolutionary Games With Sociophysics. Evolutionary Economics. 2019;.

44 

JTanimoto. Fundamentals of evolutionary game theory and its applications. Springer; 2015.

45 

JTanimoto, HSagara. Relationship between dilemma occurrence and the existence of a weakly dominant strategy in a two-player symmetric game. BioSystems. 2007;90(1):105114. 10.1016/j.biosystems.2006.07.005

46 

ZWang, SKokubo, MJusup, JTanimoto. Universal scaling for the dilemma strength in evolutionary games. Physics of life reviews. 2015;14:130. 10.1016/j.plrev.2015.04.033

47 

HIto, JTanimoto. Scaling the phase-planes of social dilemma strengths shows game-class changes in the five rules governing the evolution of cooperation. Royal Society open science. 2018;5(10):181085 10.1098/rsos.181085

48 

HIto, JTanimoto. Dynamic utility: the sixth reciprocity mechanism for the evolution of cooperation. Royal Society open science. 2020;7(8):200891 10.1098/rsos.200891

49 

MRArefin, KAKabir, MJusup, HIto, JTanimoto. Social efficiency deficit deciphers social dilemmas. Scientific reports. 2020;10(1):19. 10.1038/s41598-020-72971-y

50 

GSzabó, JVukov, ASzolnoki. Phase diagrams for an evolutionary prisoner’s dilemma game on two-dimensional lattices. Physical Review E. 2005;72(4):047107 10.1103/PhysRevE.72.047107

51 

MPerc. Coherence resonance in a spatial prisoner’s dilemma game. New Journal of Physics. 2006;8(2):22 10.1088/1367-2630/8/2/022

52 

MPosch. Win–stay, lose–shift strategies for repeated games memory length, aspiration levels and noise. Journal of theoretical biology. 1999;198(2):183195. 10.1006/jtbi.1999.0909

53 

HFort, SViola. Spatial patterns and scale freedom in Prisoner’s Dilemma cellular automata with Pavlovian strategies. Journal of Statistical Mechanics: Theory and Experiment. 2005;2005(01):P01010 10.1088/1742-5468/2005/01/P01010

54 

CTaylor, MANowak. Evolutionary game dynamics with non-uniform interaction rates. Theoretical population biology. 2006;69(3):243252. 10.1016/j.tpb.2005.06.009

55 

GSzabo, ASzolnoki, LCzako. Coexistence of fraternity and egoism for spatial social dilemmas. Journal of theoretical biology. 2013;317:126132. 10.1016/j.jtbi.2012.10.014

56 

TWu, FFu, LWang. Coevolutionary dynamics of aspiration and strategy in spatial repeated public goods games. New Journal of Physics. 2018;20(6):063007 10.1088/1367-2630/aac687

57 

CShen, CChu, LShi, MPerc, ZWang. Aspiration-based coevolution of link weight promotes cooperation in the spatial prisoner’s dilemma game. Royal Society open science. 2018;5(5):180199 10.1098/rsos.180199