Game Theory: The Mathematics of Strategic Interaction
Game theory is the mathematical study of strategic decision-making among rational agents. From Nash equilibrium to evolutionary dynamics, it provides the tools to analyze conflict, cooperation, auctions, voting, and any situation where outcomes depend on the choices of multiple decision-makers.
1. Normal Form Games
A normal form (or strategic form) game is the most fundamental representation in game theory. It captures a situation where players choose strategies simultaneously without observing the choices of others.
Formal Definition
A normal form game consists of three components: a finite set of players N (often N = {1, 2, ..., n}), a strategy set S_i for each player i in N, and a payoff function u_i mapping strategy profiles to real numbers for each player i. A strategy profile is a tuple (s_1, s_2, ..., s_n) where each s_i is drawn from S_i.
Components of a Normal Form Game
- Players: N = {1, 2, ..., n}
- Strategy sets: S_1, S_2, ..., S_n
- Payoff functions: u_i : S_1 x S_2 x ... x S_n to R
- Strategy profile: s = (s_1, ..., s_n) in S_1 x ... x S_n
Payoff Matrices
For two-player games, payoffs are conveniently represented as a matrix. Player 1 (the row player) chooses a row, Player 2 (the column player) chooses a column, and each cell contains the payoff pair (u_1, u_2). The convention is to list the row player's payoff first.
Generic 2x2 Payoff Matrix
| Strategy | Player 2: Left | Player 2: Right |
|---|---|---|
| Player 1: Up | (a, e) | (b, f) |
| Player 1: Down | (c, g) | (d, h) |
Each cell (x, y) gives payoff x to Player 1 and payoff y to Player 2.
Dominant Strategies
Strategy s_i strictly dominates strategy t_i if u_i(s_i, s_-i) > u_i(t_i, s_-i) for every strategy profile s_-i of the other players. A rational player will never play a strictly dominated strategy. Weak dominance allows equality against some profiles.
Key Insight
A dominant strategy is a best response regardless of what opponents do. When a dominant strategy exists, a rational player will always choose it, making the analysis straightforward. Many games lack dominant strategies, requiring the full machinery of Nash equilibrium.
Iterated Elimination of Dominated Strategies (IEDS)
IEDS is a solution procedure that repeatedly removes dominated strategies. In each round, any strictly dominated strategy is eliminated for any player. The process continues until no further elimination is possible. The surviving strategies are called rationalizable. If IEDS yields a unique outcome, the game is said to be dominance-solvable.
Order of elimination does not matter for strict dominance: the set of strategies surviving IEDS is the same regardless of which dominated strategies are removed first. This order-independence fails for weak dominance.
IEDS Procedure
- Identify any strictly dominated strategy for any player
- Eliminate that strategy and update the game
- Repeat until no strictly dominated strategies remain
- Surviving strategies are the rationalizable strategies
- If a unique profile survives, it is the unique Nash equilibrium
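The elimination loop above can be sketched directly for a two-player game given as a pair of payoff matrices. This is an illustrative implementation (the function name and matrix encoding are invented here, not taken from the text), using the Prisoner's Dilemma payoffs discussed later in this article:

```python
def ieds(A, B):
    """Iterated elimination of strictly dominated pure strategies for a
    two-player game. A[r][c] / B[r][c] are the row / column player's
    payoffs. Returns the surviving row and column strategy indices."""
    rows = list(range(len(A)))
    cols = list(range(len(A[0])))
    changed = True
    while changed:
        changed = False
        # Row r is strictly dominated if some row r2 beats it against
        # every surviving column.
        for r in rows[:]:
            if any(all(A[r2][c] > A[r][c] for c in cols)
                   for r2 in rows if r2 != r):
                rows.remove(r)
                changed = True
        # Column c is strictly dominated analogously for the column player.
        for c in cols[:]:
            if any(all(B[r][c2] > B[r][c] for r in rows)
                   for c2 in cols if c2 != c):
                cols.remove(c)
                changed = True
    return rows, cols

# Prisoner's Dilemma: strategy 0 = Cooperate, 1 = Defect.
A = [[-1, -3], [0, -2]]   # row player's payoffs
B = [[-1, 0], [-3, -2]]   # column player's payoffs
print(ieds(A, B))         # -> ([1], [1]): only (Defect, Defect) survives
```

Because elimination order does not matter under strict dominance, the scan order inside the loop is irrelevant to the result.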
2. Nash Equilibrium
Nash equilibrium is the central solution concept of non-cooperative game theory. Named after John Nash, who proved its existence in 1950, it describes a stable state of a game from which no player has a unilateral incentive to deviate.
Definition
A strategy profile s* = (s*_1, ..., s*_n) is a Nash equilibrium if for every player i and every alternative strategy t_i in S_i, the payoff of player i from the equilibrium strategy is at least as large as from any deviation:
u_i(s*_i, s*_-i) >= u_i(t_i, s*_-i) for all t_i in S_i
Each player i is playing a best response to the strategies of all other players. No player can gain by deviating unilaterally.
Best Response Correspondence
The best response of player i to the strategy profile s_-i of others is the set of strategies that maximize u_i given s_-i. A Nash equilibrium is precisely a profile where every player is playing a best response to the others simultaneously. The best response correspondence BR_i maps S_-i to subsets of S_i.
BR_i(s_-i) = argmax over s_i in S_i of u_i(s_i, s_-i)
Nash equilibrium: s* is NE if and only if s*_i is in BR_i(s*_-i) for all i.
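The best-response characterization gives a direct brute-force test for pure equilibria in a finite bimatrix game. A minimal sketch (function name and cell encoding are illustrative), applied to a 2x2 coordination game with two pure equilibria:

```python
def pure_nash(A, B):
    """Enumerate pure strategy Nash equilibria of a bimatrix game.
    Cell (r, c) is a NE if A[r][c] maximizes column c of A (row player
    best-responds) and B[r][c] maximizes row r of B (column player
    best-responds). Ties are allowed via >=."""
    eqs = []
    n_rows, n_cols = len(A), len(A[0])
    for r in range(n_rows):
        for c in range(n_cols):
            if (A[r][c] >= max(A[i][c] for i in range(n_rows)) and
                    B[r][c] >= max(B[r][j] for j in range(n_cols))):
                eqs.append((r, c))
    return eqs

# A 2x2 coordination game: both diagonal cells are equilibria.
A = [[2, 0], [0, 1]]
B = [[1, 0], [0, 2]]
print(pure_nash(A, B))   # -> [(0, 0), (1, 1)]
```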
Nash's Existence Theorem (1950)
Nash proved that every finite game has at least one Nash equilibrium, possibly in mixed strategies. The proof uses the Kakutani fixed-point theorem applied to the product of best-response correspondences. The key steps are:
- Extend strategies to mixed strategies (probability distributions over pure strategies)
- Show the expected payoff is linear (hence continuous) in each player's own mixing probabilities
- Show best responses form a convex-valued, upper hemicontinuous correspondence
- Apply Kakutani's theorem to guarantee a fixed point
- The fixed point is a Nash equilibrium in mixed strategies
Pure vs. Mixed Strategy Nash Equilibria
Pure Strategy NE
Each player selects a single deterministic strategy. Pure strategy Nash equilibria exist in many games but not all. They are easy to verify: check that no player benefits from switching to any other pure strategy.
Mixed Strategy NE
Each player randomizes over strategies according to a probability distribution. A player will mix only if every strategy in their support yields the same expected payoff. Nash's theorem guarantees existence in this broader class for all finite games.
Nash Equilibrium Properties
- Every finite game has at least one NE (Nash 1950)
- Every pure strategy NE is also a mixed strategy NE (degenerate distribution)
- NE does not require the outcome to be Pareto optimal or socially efficient
- Games may have multiple NE; refinements are used to select among them
- NE is a steady state: if players coordinate on it, no one deviates
3. Classic Games and Their Equilibria
Several canonical games have become essential benchmarks in game theory because they distill fundamental strategic tensions. Understanding these games builds intuition for equilibrium analysis.
The Prisoner's Dilemma
Two suspects are interrogated separately. Each can Cooperate (stay silent) or Defect (betray the other). The payoff structure creates a situation where individual rationality leads to mutual harm.
Prisoner's Dilemma Payoff Matrix
| Strategy | Player 2: Cooperate | Player 2: Defect |
|---|---|---|
| Player 1: Cooperate | (-1, -1) | (-3, 0) |
| Player 1: Defect | (0, -3) | (-2, -2) |
Payoffs represent years in prison (negative = bad). NE: (Defect, Defect) with payoff (-2, -2), though (Cooperate, Cooperate) with (-1, -1) is Pareto superior.
Defect is a strictly dominant strategy for both players: regardless of the opponent's choice, defecting yields a better outcome. The unique Nash equilibrium (Defect, Defect) is Pareto inferior — both players are worse off than if they had both cooperated. This tension between individual and collective rationality is the central message of the game.
Battle of the Sexes
A couple must decide between two activities — say, Opera (preferred by Player 1) and Football (preferred by Player 2) — but both prefer being together over being apart. This is a coordination game with two pure strategy Nash equilibria.
Battle of the Sexes Payoff Matrix
| Strategy | Player 2: Opera | Player 2: Football |
|---|---|---|
| Player 1: Opera | (2, 1) | (0, 0) |
| Player 1: Football | (0, 0) | (1, 2) |
Pure NE: (Opera, Opera) and (Football, Football). Mixed NE: Player 1 plays Opera with probability 2/3, Player 2 plays Opera with probability 1/3.
Stag Hunt
Two hunters can cooperate to hunt a stag (large payoff requiring coordination) or individually hunt a hare (smaller but guaranteed payoff). The Stag Hunt models trust and social cooperation: the cooperative outcome is Pareto superior but requires mutual trust.
Stag Hunt Payoff Matrix
| Strategy | Player 2: Stag | Player 2: Hare |
|---|---|---|
| Player 1: Stag | (4, 4) | (0, 3) |
| Player 1: Hare | (3, 0) | (3, 3) |
Two pure NE: (Stag, Stag) — payoff efficient — and (Hare, Hare) — risk dominant. (Stag, Stag) is Pareto superior; (Hare, Hare) is safer against an untrustworthy opponent.
Matching Pennies
Matching Pennies is the canonical zero-sum game. Player 1 wins if both pennies match; Player 2 wins if they differ. There is no pure strategy Nash equilibrium: whatever pure strategy one player uses, the other can exploit it. The unique Nash equilibrium is fully mixed.
Matching Pennies Payoff Matrix
| Strategy | Player 2: Heads | Player 2: Tails |
|---|---|---|
| Player 1: Heads | (1, -1) | (-1, 1) |
| Player 1: Tails | (-1, 1) | (1, -1) |
Unique NE: Both players mix 50/50 over Heads and Tails. Each player's expected payoff is 0.
4. Mixed Strategy Nash Equilibrium
A mixed strategy assigns a probability distribution over the pure strategies of a player. Mixed strategy Nash equilibria arise naturally in games without pure strategy equilibria and in games where players benefit from being unpredictable.
The Indifference Condition
The key to computing mixed strategy Nash equilibria is the indifference principle: a player will mix over a set of strategies only if all strategies in their mixing support yield the same expected payoff. If one strategy were strictly better, the player would deviate to playing it with probability 1.
Indifference Condition
In a mixed NE where player i mixes between strategies A and B with positive probability:
E[u_i(A, sigma_-i)] = E[u_i(B, sigma_-i)]
sigma_-i denotes the mixed strategy profile of all players other than i.
Computing Mixed Strategy NE: Step-by-Step
Consider Battle of the Sexes. Let Player 1 play Opera with probability p and Player 2 play Opera with probability q.
Computation for Battle of the Sexes
Step 1: Make Player 1 indifferent by choosing q (Player 2's probability).
E[u_1(Opera)] = 2q + 0(1-q) = 2q
E[u_1(Football)] = 0q + 1(1-q) = 1-q
Set equal: 2q = 1-q, so 3q = 1, q = 1/3
Step 2: Make Player 2 indifferent by choosing p (Player 1's probability).
E[u_2(Opera)] = 1p + 0(1-p) = p
E[u_2(Football)] = 0p + 2(1-p) = 2-2p
Set equal: p = 2-2p, so 3p = 2, p = 2/3
Mixed NE: Player 1 plays Opera with p=2/3, Player 2 plays Opera with q=1/3
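The two indifference equations solve in closed form for any 2x2 game with an interior mixed equilibrium. A sketch, assuming both denominators are nonzero (the function name is illustrative):

```python
def mixed_ne_2x2(A, B):
    """Interior mixed NE of a 2x2 bimatrix game via the indifference
    condition. Returns (p, q): the row player's probability on row 0
    and the column player's probability on column 0."""
    # q solves: q*A[0][0] + (1-q)*A[0][1] = q*A[1][0] + (1-q)*A[1][1]
    q = (A[1][1] - A[0][1]) / (A[0][0] - A[0][1] - A[1][0] + A[1][1])
    # p solves: p*B[0][0] + (1-p)*B[1][0] = p*B[0][1] + (1-p)*B[1][1]
    p = (B[1][1] - B[1][0]) / (B[0][0] - B[0][1] - B[1][0] + B[1][1])
    return p, q

# Battle of the Sexes with Opera as strategy 0 for both players.
A = [[2, 0], [0, 1]]   # row player (prefers Opera)
B = [[1, 0], [0, 2]]   # column player (prefers Football)
p, q = mixed_ne_2x2(A, B)
print(p, q)   # p = 2/3, q = 1/3, matching the step-by-step computation
```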
Interpretation of Mixed Strategies
Mixed strategies can be interpreted in several ways. The literal interpretation is that players actually randomize. A more compelling interpretation for one-shot games is epistemic: the mixing probability represents the opponent's beliefs about what the player will do, and the player is indifferent over what to play given those beliefs. In repeated games, mixing can represent the empirical frequency of play across repetitions.
Counterintuitive Feature
In a mixed NE, each player's mixing probability is determined by the other player's payoffs, not their own. Player 1's mixing probability makes Player 2 indifferent; Player 2's probability makes Player 1 indifferent. A change in your own payoffs changes your opponent's equilibrium mixing, not necessarily your own.
5. Extensive Form Games and Subgame Perfect Equilibrium
Extensive form games capture sequential decision-making by representing the game as a tree. Players move in some order, and later players may or may not observe earlier actions. This representation is essential for modeling negotiations, auctions, and dynamic interactions.
Game Trees
An extensive form game consists of a tree of nodes. Each non-terminal node is a decision node belonging to some player (or to Nature for random events). Edges represent possible actions. Terminal nodes carry payoff vectors for all players. The game begins at the root node.
Extensive Form Components
- Decision nodes: each assigned to one player
- Actions: edges emanating from each decision node
- Terminal nodes: end of play, payoffs realized
- Information sets: nodes a player cannot distinguish (captures imperfect information)
- Strategies: a complete contingent plan specifying an action at every information set
Information Sets and Perfect vs. Imperfect Information
An information set groups decision nodes that a player cannot tell apart when it is their turn to move. In a perfect information game, each information set contains exactly one node — every player knows the full history of play. In imperfect information games, some information sets have multiple nodes, meaning the player cannot observe some prior actions.
A strategy in an extensive form game specifies an action at every information set of a player — even those information sets that would not be reached under the prescribed strategy. This is important: a strategy is a complete contingent plan, not just a plan for what is likely to happen on the equilibrium path.
Subgame Perfect Equilibrium
A subgame of an extensive form game is a subtree that starts at a single decision node and includes all of its successors, with the requirement that any information set is either entirely in the subgame or entirely outside it. A subgame perfect equilibrium (SPE) is a Nash equilibrium that induces a Nash equilibrium in every subgame.
Why SPE Refines Nash Equilibrium
Nash equilibrium allows non-credible threats: a player may threaten an action off the equilibrium path that they would never rationally carry out. SPE rules out such threats by requiring rational play in every subgame, including those never reached in equilibrium. This eliminates Nash equilibria sustained by empty threats.
Backward Induction
Backward induction is the algorithm for computing SPE in finite games of perfect information. Starting from the terminal nodes, determine the optimal action at every final decision node. Then fold back: replace each subtree with the payoff the player at the root of that subtree would achieve under optimal play. Continue until the entire tree is reduced to a single payoff vector at the root.
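The fold-back procedure can be sketched as a short recursion over a perfect-information tree. The nested-dict encoding and the miniature two-stage game below are invented for illustration; the returned plan specifies an action at every decision node, as a complete strategy must:

```python
def backward_induction(node, history='root'):
    """Solve a finite perfect-information game tree by backward induction.
    A node is either a terminal payoff tuple or a dict
    {'player': i, 'actions': {name: child}}. Returns the SPE payoff and
    a complete contingent plan: the chosen action at every decision node."""
    if isinstance(node, tuple):
        return node, {}
    plan, best_action, best_payoff = {}, None, None
    for action, child in node['actions'].items():
        payoff, sub_plan = backward_induction(child, history + '/' + action)
        plan.update(sub_plan)
        if best_payoff is None or payoff[node['player']] > best_payoff[node['player']]:
            best_action, best_payoff = action, payoff
    plan[history] = best_action
    return best_payoff, plan

# A two-stage Centipede-style game; payoffs are (player 0, player 1).
game = {'player': 0, 'actions': {
    'Stop': (2, 0),
    'Continue': {'player': 1, 'actions': {
        'Stop': (1, 3),
        'Continue': (4, 2)}}}}
payoff, plan = backward_induction(game)
print(payoff, plan)   # (2, 0) {'root/Continue': 'Stop', 'root': 'Stop'}
```

Player 1 would Stop at the second node (3 > 2), so Player 0 Stops immediately, anticipating it.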
Centipede Game Insight
Backward induction sometimes yields counterintuitive results. In the Centipede Game, two players alternately can Continue (passing a growing pot) or Stop (taking most of the pot). Backward induction predicts the first player stops immediately, even though both could gain from cooperating through many rounds. This tension between theory and observed human behavior motivates behavioral game theory.
Ultimatum and Bargaining Games
In the Ultimatum Game, Player 1 proposes a split of a dollar; Player 2 accepts or rejects. Rejection gives both players zero. Backward induction predicts Player 1 offers the smallest positive amount and Player 2 accepts. Experimentally, offers below 30% are frequently rejected, demonstrating that real agents have preferences over fairness, not just money. The Rubinstein bargaining model extends this to alternating offers with discounting and obtains a unique SPE with an explicit formula for the equilibrium split as a function of the discount factors.
6. Repeated Games and the Folk Theorem
Repeated games model long-run interactions where a stage game is played multiple times by the same players. The shadow of future play can sustain cooperation that would be impossible in a one-shot game.
Finitely vs. Infinitely Repeated Games
In a finitely repeated game, backward induction applies: in the last period, no future punishment is possible, so players play the stage-game NE. Working backward, if the stage game has a unique NE, it is played in every period. This means cooperation cannot be sustained in the finitely repeated Prisoner's Dilemma by purely rational players — a sobering result called the end-game problem.
In an infinitely repeated game (or a game that ends with some probability each period), the future always matters. Players discount future payoffs by a factor delta in (0, 1) per period. High delta (patient players) makes cooperation sustainable.
Trigger Strategies
A trigger strategy specifies: cooperate as long as all players have cooperated in every prior period; if anyone has ever defected, defect forever. This is the grim trigger strategy. Under grim trigger, cooperation is sustainable when the discounted value of cooperating forever is at least the value of taking the one-period temptation gain and then facing permanent mutual defection.
Cooperation Condition in Infinitely Repeated PD
Payoffs: R = reward for mutual cooperation; T = temptation payoff from unilateral defection; P = mutual defection payoff (with T > R > P)
Value of cooperating: R / (1-delta)
Value of deviating: T + delta * P / (1-delta)
Condition: R / (1-delta) >= T + delta * P / (1-delta)
Simplifies to: delta >= (T - R) / (T - P)
Cooperation is SPE if and only if delta is large enough.
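The condition is easy to check numerically. A minimal sketch using the prison-sentence payoffs from the Prisoner's Dilemma matrix in Section 3 (T = 0, R = -1, P = -2), for which the threshold works out to delta* = 1/2:

```python
def grim_trigger_threshold(T, R, P):
    """Minimum discount factor sustaining cooperation under grim trigger
    (requires T > R > P): delta* = (T - R) / (T - P)."""
    return (T - R) / (T - P)

def cooperation_is_sustainable(delta, T, R, P):
    # Cooperate forever vs. deviate once, then mutual defection forever.
    coop = R / (1 - delta)
    deviate = T + delta * P / (1 - delta)
    return coop >= deviate

# Prisoner's Dilemma in prison years: T = 0, R = -1, P = -2.
print(grim_trigger_threshold(T=0, R=-1, P=-2))            # 0.5
print(cooperation_is_sustainable(0.6, T=0, R=-1, P=-2))   # True
print(cooperation_is_sustainable(0.4, T=0, R=-1, P=-2))   # False
```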
The Folk Theorem
The Folk Theorem is a family of results stating that essentially any individually rational and feasible payoff can be sustained as an SPE of the infinitely repeated game when players are sufficiently patient.
Folk Theorem (informal)
Let v*_i be the minmax payoff for player i — the lowest payoff that the other players can force upon i. A payoff vector v is individually rational if v_i is greater than v*_i for all i. For any individually rational and feasible payoff v, there exists a discount factor delta* less than 1 such that for all delta greater than delta*, the payoff v can be achieved as an SPE of the infinitely repeated game.
The Folk Theorem explains why long-term relationships sustain cooperation in markets, cartels, international agreements, and social norms. It also shows that repeated game theory has strong predictive power only when combined with equilibrium selection criteria, since the Folk Theorem admits a vast multiplicity of equilibria.
7. Bayesian Games and Incomplete Information
Bayesian games model situations where players have private information — their own characteristics (types) are known to themselves but not fully observed by others. Harsanyi's framework converts games of incomplete information into games of imperfect information by introducing a prior distribution over types.
Types and Beliefs
Each player i has a type theta_i drawn from a type space Theta_i. Types encode all private information — valuations, costs, capabilities. The joint type distribution P over all types is common knowledge (the prior). Player i knows only their own type and updates beliefs about others' types using Bayes' rule.
Bayesian Game Components
- Players: N = {1, ..., n}
- Type spaces: Theta_i for each player i
- Prior distribution: P over the product of all type spaces
- Action spaces: A_i for each player i
- Payoff functions: u_i depending on all actions and all types
- Strategies: sigma_i maps types to actions (sigma_i : Theta_i to A_i)
Bayesian Nash Equilibrium
A Bayesian Nash equilibrium (BNE) is a strategy profile sigma* such that for each player i and each type theta_i, the strategy sigma*_i(theta_i) maximizes player i's expected payoff, taking expectations over the types of the other players and using their equilibrium strategies.
BNE Optimality Condition
sigma*_i(theta_i) = argmax over a_i of E[u_i(a_i, sigma*_-i(theta_-i), theta_i, theta_-i)]
Each player maximizes expected payoff taking expectations over others' types and using their equilibrium strategies.
Signaling Games
Signaling games are a class of Bayesian games where one player (the Sender) has private information and can send a costly signal to another player (the Receiver) who then takes an action. The canonical example is Spence's education model (Nobel Prize 2001): workers have unobservable productivity; education is a costly signal that can separate high-productivity from low-productivity workers even if education has no direct effect on productivity.
Equilibria in signaling games can be separating (different types send different signals) or pooling (all types send the same signal). The Receiver updates beliefs using Bayes' rule after observing the signal and responds optimally. The Perfect Bayesian Equilibrium (PBE) solution concept requires consistent beliefs and sequential rationality at every information set.
Applications of Signaling
- Education as a signal of worker ability (Spence 1973)
- Dividend payments as a signal of firm profitability
- Warranties as a signal of product quality
- Costly advertising as a signal of repeat-purchase confidence
- Constitutional design and commitment to policy
8. Mechanism Design and Auction Theory
Mechanism design is sometimes called reverse game theory: instead of analyzing the equilibria of a given game, the designer chooses the rules of the game to achieve a desired outcome. The designer must account for the fact that agents have private information about their preferences (types) and will behave strategically.
Mechanisms and Revelation Principle
A mechanism is a message space M_i for each agent and an outcome function g mapping messages to outcomes. The Revelation Principle states that any outcome achievable by any mechanism can also be achieved by a direct revelation mechanism — where agents report their types directly — in which truth-telling is a dominant strategy (or at least a Bayesian Nash equilibrium). This dramatically simplifies mechanism design: we only need to consider incentive-compatible mechanisms.
Incentive Compatibility (IC)
A mechanism is dominant-strategy incentive-compatible if reporting the true type theta_i weakly dominates any misreport m_i, regardless of what others report. (A weaker Bayesian version requires only that truth-telling be optimal in expectation over others' types.) Under IC, the designer can assume all agents truthfully reveal their types without loss of generality.
VCG Mechanism
The Vickrey-Clarke-Groves (VCG) mechanism is the canonical dominant-strategy incentive-compatible mechanism for social welfare maximization. It selects the outcome maximizing total social welfare and charges each agent the externality they impose on others — the reduction in others' welfare caused by the agent's presence.
VCG Transfer for Agent i
t_i = max over outcomes x of [sum of v_j(x) for j not equal i] minus [sum of v_j(x*) for j not equal i]
x* is the welfare-maximizing outcome. The transfer makes each agent the residual claimant of social surplus, aligning individual and social incentives.
Auction Theory: First-Price vs. Second-Price
Auctions are a primary application of mechanism design. In a single-item auction with independent private values, two common formats are:
First-Price Sealed-Bid Auction
Each bidder submits a sealed bid. Highest bidder wins and pays their bid. Equilibrium involves bid-shading: each bidder bids below their valuation to balance the probability of winning against the surplus captured. With n symmetric bidders and uniform valuations on [0,1], each bids (n-1)/n times their valuation. Truth-telling is NOT a dominant strategy.
Second-Price Sealed-Bid Auction (Vickrey)
Each bidder submits a sealed bid. Highest bidder wins but pays the second-highest bid. Truth-telling (bidding your true valuation) is a weakly dominant strategy. This is the VCG mechanism for single-item allocation. The simplicity of dominant strategy truth-telling makes second-price auctions widely used in practice (Google AdWords, eBay).
Revenue Equivalence Theorem
Under standard assumptions (risk-neutral bidders, independent private values, symmetric bidders, zero expected payment for the lowest type), all auction formats that allocate the item to the highest-value bidder yield the same expected revenue to the seller. This includes first-price, second-price, and many other formats. Revenue differences arise when these assumptions fail (risk aversion, correlated values, asymmetric bidders).
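Revenue equivalence can be illustrated with a quick Monte Carlo sketch under the stated assumptions: independent values uniform on [0, 1], truthful bidding in the second-price auction, and the (n-1)/n shading rule from the first-price equilibrium above. The function name and parameters are illustrative:

```python
import random

def auction_revenues(n_bidders=3, trials=200_000, seed=1):
    """Monte Carlo average seller revenue of first-price (equilibrium
    bid shading) and second-price (truthful) auctions with independent
    private values uniform on [0, 1]."""
    rng = random.Random(seed)
    shade = (n_bidders - 1) / n_bidders     # first-price bid = shade * value
    fp = sp = 0.0
    for _ in range(trials):
        values = sorted(rng.random() for _ in range(n_bidders))
        fp += shade * values[-1]            # winner pays own shaded bid
        sp += values[-2]                    # winner pays second-highest value
    return fp / trials, sp / trials

fp, sp = auction_revenues()
print(fp, sp)   # both near (n-1)/(n+1) = 0.5 for n = 3
```

Both formats earn the seller roughly (n-1)/(n+1) in expectation, the expected second-highest valuation.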
9. Cooperative Game Theory
Cooperative game theory studies how groups of players can coordinate to achieve better outcomes, and how the resulting gains should be distributed among coalition members. Unlike non-cooperative theory, it takes coalition formation as a primitive and asks what outcomes are stable and fair.
Characteristic Function and Coalitions
A cooperative game in characteristic function form consists of a player set N and a characteristic function v assigning a value v(S) to every coalition S (any subset of N). The value v(S) represents the total payoff the coalition S can guarantee its members, regardless of what players outside S do.
Superadditivity
Most economically interesting games are superadditive: for any two disjoint coalitions S and T,
v(S union T) >= v(S) + v(T)
Larger coalitions are at least as productive as the sum of smaller ones. Superadditivity implies the grand coalition N is efficient to form.
The Core
The core of a cooperative game is the set of payoff allocations x = (x_1, ..., x_n) satisfying two conditions: efficiency (the sum of all x_i equals v(N)) and coalitional rationality (for every coalition S, the sum of x_i for members of S is at least v(S)). An allocation in the core cannot be blocked by any coalition that can do better on its own.
The core may be empty (no stable allocation exists), a single point, or an entire region. Convex games always have a non-empty core. Many matching markets and competitive equilibria correspond to core allocations.
The Shapley Value
The Shapley value is a unique allocation satisfying four axioms that together characterize a fair division of the coalition's surplus. It assigns each player their expected marginal contribution averaged over all possible orderings in which players join the grand coalition.
Shapley Value Formula
phi_i(v) = sum over all coalitions S containing i of: (|S| - 1)! (n - |S|)! / n! times [v(S) - v(S minus i)]
Equivalently: phi_i is the average over all orderings of the marginal contribution of player i when they join the coalition formed by all players listed before them.
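The ordering-average characterization translates directly into code. A brute-force sketch, feasible only for small n since it enumerates all n! orderings; the glove game used as an example is a standard illustration, not from the text:

```python
from itertools import permutations
from math import factorial

def shapley_values(players, v):
    """Exact Shapley values: average each player's marginal contribution
    over all n! join orders. v maps a frozenset of players to the
    coalition's worth."""
    phi = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            phi[p] += v(coalition | {p}) - v(coalition)
            coalition = coalition | {p}
    n_orders = factorial(len(players))
    return {p: total / n_orders for p, total in phi.items()}

# Glove game: 'L' owns a left glove, 'R1' and 'R2' right gloves;
# a coalition is worth 1 if it can pair a left with a right glove.
def v(S):
    return 1.0 if 'L' in S and ({'R1', 'R2'} & S) else 0.0

phi = shapley_values(['L', 'R1', 'R2'], v)
print(phi)   # 'L' gets 2/3; each right-glove holder gets 1/6
```

The scarce left glove captures most of the surplus, reflecting its larger average marginal contribution.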
Shapley Axioms
- Efficiency: the sum of phi_i over all i equals v(N) (total value is fully distributed)
- Symmetry: symmetric players receive equal Shapley values
- Dummy: a player contributing nothing to any coalition receives zero
- Additivity: phi_i(v + w) = phi_i(v) + phi_i(w) for any two games v and w
The Shapley value appears in many applications beyond game theory. SHAP values in machine learning interpret the contribution of each feature to a model's prediction using the Shapley axioms. In political science, the Shapley-Shubik index measures voting power. In cost sharing, the Shapley value distributes joint costs fairly according to the marginal contribution principle.
10. Evolutionary Game Theory
Evolutionary game theory, developed by Maynard Smith and Price in the 1970s, applies game-theoretic concepts to biological populations. Rather than assuming rational optimization, it asks which strategies survive the selection pressure of repeated interaction in a population.
Evolutionarily Stable Strategies (ESS)
A strategy s* is evolutionarily stable if a population playing s* cannot be invaded by a small group of mutants playing any alternative strategy m. Formally, s* is an ESS if for all m not equal to s*:
ESS Conditions
Condition 1 (Nash): u(s*, s*) >= u(m, s*)
Condition 2 (Stability): If u(s*, s*) = u(m, s*), then u(s*, m) > u(m, m)
Every ESS is a Nash equilibrium, but not every Nash equilibrium is an ESS.
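For finite strategy sets the two conditions can be checked mechanically. A sketch where u(x, y) denotes the payoff to an x-player matched against a y-player, applied to Hawk-Dove payoffs with V = 4 > C = 2 (so Hawk should be a pure ESS):

```python
def is_ess(s, strategies, u, eps=1e-9):
    """Check Maynard Smith's ESS conditions for strategy s against every
    alternative m, where u(x, y) is x's payoff when matched against y."""
    for m in strategies:
        if m == s:
            continue
        if u(s, s) < u(m, s) - eps:
            return False            # fails the Nash condition
        if abs(u(s, s) - u(m, s)) <= eps and u(s, m) <= u(m, m) + eps:
            return False            # fails the stability condition
    return True

# Hawk-Dove payoffs with V = 4 > C = 2.
V, C = 4.0, 2.0
pay = {('H', 'H'): (V - C) / 2, ('H', 'D'): V,
       ('D', 'H'): 0.0,         ('D', 'D'): V / 2}
u = lambda x, y: pay[(x, y)]
print(is_ess('H', ['H', 'D'], u), is_ess('D', ['H', 'D'], u))   # True False
```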
Hawk-Dove Game
The Hawk-Dove game models competition over a resource of value V against the cost C of injury from fighting. Hawks always fight; Doves always display and retreat when faced with a Hawk. When V is less than C, there is no pure ESS — the stable equilibrium is a mixed population with fraction V/C playing Hawk.
Hawk-Dove Payoff Matrix
| Strategy | Opponent: Hawk | Opponent: Dove |
|---|---|---|
| Hawk | (V-C)/2 | V |
| Dove | 0 | V/2 |
Entries show the row player's payoff (the game is symmetric). When V is less than C: ESS is a mixed population with fraction p = V/C playing Hawk. When V is greater than C: Hawk is a pure ESS.
Replicator Dynamics
Replicator dynamics describe how the frequency of strategies evolves over time in a large population. Strategies with above-average fitness grow in frequency; strategies with below-average fitness decline. The replicator equation is:
Replicator Equation
dx_i/dt = x_i times [u(e_i, x) - u(x, x)]
x_i is the frequency of strategy i, u(e_i, x) is its fitness against the current population x, and u(x, x) is the average population fitness. Every Nash equilibrium is a rest point of the replicator dynamics, though not every rest point is a Nash equilibrium (any monomorphic population is a rest point). Every ESS is an asymptotically stable rest point.
Replicator dynamics connect evolutionary game theory to dynamical systems. In the Hawk-Dove game with V less than C, the mixed ESS is a globally stable fixed point: any interior initial condition converges to the V/C mixing frequency. In the Prisoner's Dilemma, Defect is a globally stable fixed point under replicator dynamics — cooperation is washed out without additional mechanisms like kin selection, reciprocity, or spatial structure.
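A discrete-time Euler sketch of the replicator equation makes this concrete. Applied to Hawk-Dove with V = 2 and C = 4, the population converges to the predicted V/C = 0.5 fraction of Hawks (step size and horizon are chosen for illustration):

```python
def replicator_step(x, payoff, dt=0.01):
    """One Euler step of the replicator equation for a symmetric game.
    x: strategy frequencies; payoff[i][j]: payoff to i against j."""
    k = len(x)
    fitness = [sum(payoff[i][j] * x[j] for j in range(k)) for i in range(k)]
    avg = sum(x[i] * fitness[i] for i in range(k))
    return [x[i] + dt * x[i] * (fitness[i] - avg) for i in range(k)]

# Hawk-Dove with V = 2, C = 4: strategy 0 = Hawk, 1 = Dove.
V, C = 2.0, 4.0
payoff = [[(V - C) / 2, V],
          [0.0, V / 2]]
x = [0.1, 0.9]                  # start with 10% Hawks
for _ in range(5000):
    x = replicator_step(x, payoff)
print(x)   # approaches [0.5, 0.5], i.e. the V/C mixed ESS
```

Note the Euler update preserves the total frequency exactly, since the flow terms sum to zero.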
11. Zero-Sum Games and the Minimax Theorem
A zero-sum game is one where the payoffs sum to zero in every outcome: one player's gain is exactly the other's loss. Von Neumann's minimax theorem (1928) was the first major theorem in game theory, predating Nash by over two decades.
Minimax and Maximin
In a two-player zero-sum game, Player 1 (the maximizer) wants to maximize their payoff while Player 2 (the minimizer) wants to minimize it. Each player's cautious strategy is to optimize against the worst case:
Maximin (Player 1): max over sigma_1 of min over sigma_2 of u_1(sigma_1, sigma_2)
Minimax (Player 2): min over sigma_2 of max over sigma_1 of u_1(sigma_1, sigma_2)
Von Neumann's Minimax Theorem: maximin = minimax (in mixed strategies)
Saddle Points
A pure strategy Nash equilibrium in a zero-sum game corresponds to a saddle point of the payoff matrix: an entry that is simultaneously the minimum of its row (Player 2 cannot do better by switching columns) and the maximum of its column (Player 1 cannot do better by switching rows). Saddle points exist only in some games; when they do, no mixing is needed in equilibrium.
Linear Programming Duality
Finding the minimax strategy in a zero-sum game is equivalent to solving a linear program. The minimax theorem is equivalent to the LP duality theorem: the primal and dual programs have the same optimal value. This connection enables efficient computation of equilibria in zero-sum games via LP solvers, even for very large strategy spaces. Every Nash equilibrium in a two-player zero-sum game can be found by solving the corresponding LP.
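In place of an LP solver, a self-contained way to approximate minimax strategies is fictitious play, which Robinson (1951) showed converges in two-player zero-sum games. A sketch (the function name and seeding are illustrative), applied to Matching Pennies:

```python
def fictitious_play(A, rounds=20000):
    """Fictitious play for a two-player zero-sum game with row-player
    payoff matrix A. Each round both players best-respond to the
    opponent's empirical play so far; the empirical frequencies
    converge to minimax strategies. Returns the two empirical mixes."""
    m, n = len(A), len(A[0])
    row_counts, col_counts = [0] * m, [0] * n
    row_counts[0] += 1          # arbitrary seed plays
    col_counts[0] += 1
    for _ in range(rounds):
        # Row player (maximizer) best-responds to the column counts.
        r = max(range(m), key=lambda i: sum(A[i][j] * col_counts[j] for j in range(n)))
        # Column player (minimizer) best-responds to the row counts.
        c = min(range(n), key=lambda j: sum(A[i][j] * row_counts[i] for i in range(m)))
        row_counts[r] += 1
        col_counts[c] += 1
    total = rounds + 1
    return [x / total for x in row_counts], [x / total for x in col_counts]

# Matching Pennies: the unique minimax strategies are 50/50 for both.
A = [[1, -1], [-1, 1]]
rows, cols = fictitious_play(A)
print(rows, cols)   # both approach [0.5, 0.5]
```

Fictitious play is slower than an LP solve but needs no external dependencies; for large games the LP route is preferable.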
Applications of Minimax
- Military strategy and adversarial planning
- Minimax decision rule in statistical decision theory
- Game-playing AI: chess engines use minimax search with alpha-beta pruning; AlphaGo replaces exhaustive minimax with Monte Carlo tree search
- Adversarial machine learning (GAN training)
- Robust optimization against worst-case scenarios
12. Applications Across Disciplines
Game theory has transformed how researchers model strategic behavior across economics, political science, biology, and computer science. Below is a survey of major application domains.
Economics: Oligopoly and Industrial Organization
The Cournot and Bertrand models are the two canonical oligopoly games. In Cournot competition, firms simultaneously choose quantities; the Nash equilibrium price exceeds marginal cost, yielding supracompetitive profits that decline as the number of firms grows. In Bertrand competition, firms simultaneously choose prices; the Nash equilibrium drives prices to marginal cost even with only two firms — the Bertrand paradox. Differentiated products, capacity constraints, and repeated interaction modify these stark predictions.
Cournot Duopoly (Symmetric, Linear Demand)
Demand: P = a - b(q_1 + q_2); Marginal cost: c
Best response firm 1: q_1* = (a - c - b*q_2) / (2b)
Symmetric NE: q* = (a - c) / (3b) per firm
NE price: P* = (a + 2c) / 3; strictly above marginal cost c
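Iterating the best-response formula converges to the equilibrium quantities, since the composed best-response map is a contraction. A numerical sketch with illustrative parameters a = 10, b = 1, c = 1:

```python
def cournot_br(q_other, a, b, c):
    """Best response in the linear Cournot model: maximize
    (a - b*(q + q_other) - c) * q  =>  q = (a - c - b*q_other) / (2b),
    truncated at zero."""
    return max(0.0, (a - c - b * q_other) / (2 * b))

a, b, c = 10.0, 1.0, 1.0
q1 = q2 = 0.0
for _ in range(50):
    q1 = cournot_br(q2, a, b, c)
    q2 = cournot_br(q1, a, b, c)
print(q1, q2)              # each converges to (a - c) / (3b) = 3.0
print(a - b * (q1 + q2))   # price converges to (a + 2c) / 3 = 4.0
```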
Political Science: Voting and Elections
The median voter theorem (Hotelling-Downs model) shows that in a one-dimensional policy space with majority rule, both candidates converge to the position of the median voter as a Nash equilibrium. Arrow's Impossibility Theorem shows that, with three or more alternatives, no preference-aggregation rule satisfies all of a small set of desirable axioms simultaneously (Pareto efficiency, non-dictatorship, independence of irrelevant alternatives). The Gibbard-Satterthwaite theorem shows that any deterministic voting rule satisfying mild conditions is subject to strategic manipulation.
Biology: Cooperation and Evolution
Game theory explains altruism and cooperation in biological populations through several mechanisms: kin selection (Hamilton's rule — cooperation evolves when relatedness r times benefit B exceeds cost C), reciprocal altruism (tit-for-tat in repeated interactions), group selection, and network reciprocity (spatial structure allowing cooperators to cluster). The evolution of cooperation in the Prisoner's Dilemma remains an active research area linking game theory, evolutionary biology, and complex systems.
Computer Science: Network Design and AI
Algorithmic game theory studies the computational aspects of strategic interaction. The price of anarchy measures the efficiency loss from selfish routing in networks: the ratio of worst-case Nash equilibrium social cost to optimal social cost. In Braess's paradox, adding a road to a network can increase everyone's travel time. Mechanism design underlies internet advertising auctions (Google, Facebook), spectrum auctions, and matching platforms (residency matching, school choice). In AI, multi-agent reinforcement learning, adversarial training of neural networks, and game-solving algorithms (AlphaGo, Libratus for poker) all apply game-theoretic principles.
Price of Anarchy
Ratio of worst Nash equilibrium cost to optimal cost. Measures efficiency loss from decentralized decision-making in routing, resource allocation, and network design.
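The classic worst case for affine latencies is Pigou's two-link network, which is not described in the text above but is a standard illustration; the functions here are our own sketch:

```python
# Price of anarchy in the Pigou network: one unit of traffic, two
# parallel links with latency functions l1(x) = 1 and l2(x) = x.
# Selfish routing sends all traffic on link 2 (cost 1); the social
# optimum splits the flow.
def total_cost(x2):
    """Total latency when fraction x2 of traffic uses the x-latency link."""
    x1 = 1.0 - x2
    return x1 * 1.0 + x2 * x2   # x1 * l1 + x2 * l2(x2)

nash_cost = total_cost(1.0)     # all traffic on link 2
opt_cost = min(total_cost(i / 1000) for i in range(1001))  # grid search
poa = nash_cost / opt_cost      # 4/3, the tight bound for affine latencies
```

The optimum splits the flow in half (cost 3/4), so the price of anarchy is 4/3, which is the tight bound for affine latency functions.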
Matching Theory
Gale-Shapley algorithm finds stable matchings. Applied to medical residency (NRMP), school choice, kidney exchange, and labor markets (Roth and Shapley, Nobel 2012).
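Deferred acceptance fits in a short function. This is a minimal sketch of the proposer-optimal version; the names and preference lists are illustrative:

```python
# Proposer-optimal deferred acceptance (Gale-Shapley).
def gale_shapley(proposer_prefs, receiver_prefs):
    """Each prefs dict maps a name to an ordered list of preferred partners."""
    free = list(proposer_prefs)               # proposers not yet matched
    next_choice = {p: 0 for p in proposer_prefs}
    engaged = {}                              # receiver -> current proposer
    rank = {r: {p: i for i, p in enumerate(prefs)}
            for r, prefs in receiver_prefs.items()}
    while free:
        p = free.pop(0)
        r = proposer_prefs[p][next_choice[p]]  # p's next best receiver
        next_choice[p] += 1
        if r not in engaged:
            engaged[r] = p
        elif rank[r][p] < rank[r][engaged[r]]:  # r prefers the new proposer
            free.append(engaged[r])
            engaged[r] = p
        else:
            free.append(p)                      # rejected; try again later
    return {p: r for r, p in engaged.items()}

match = gale_shapley(
    {"a": ["X", "Y"], "b": ["X", "Y"]},
    {"X": ["b", "a"], "Y": ["a", "b"]},
)
```

In this tiny instance both proposers want X first; X prefers b, so a is bumped and settles for Y, giving the stable matching a-Y, b-X.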
Multi-Agent Learning
Reinforcement learning in multi-agent settings raises issues of convergence, equilibrium selection, and non-stationarity as all agents learn simultaneously.
13. Practice Problems with Solutions
Work through these problems to master game theory concepts.
Problem 1: Finding Pure Strategy Nash Equilibria
Consider the following payoff matrix. Find all pure strategy Nash equilibria.
| | L | C | R |
|---|---|---|---|
| T | (3, 2) | (1, 4) | (2, 1) |
| M | (2, 3) | (4, 2) | (1, 5) |
| B | (1, 4) | (3, 1) | (2, 3) |
Solution
Mark each player's best responses. For each column, identify the maximum payoff for Player 1. For each row, identify the maximum payoff for Player 2.
Player 1's best responses: Column L — row T (payoff 3); Column C — row M (payoff 4); Column R — rows T and B (payoff 2 each, while M gives only 1). Player 2's best responses: Row T — column C (payoff 4); Row M — column R (payoff 5); Row B — column L (payoff 4).
Nash equilibrium requires mutual best responses. Check (M, R): P1 best response to R includes T and B but not M (M gives 1). Not NE. Check (T, C): P2's best response to T is C. P1's best response to C is M (payoff 4), not T (payoff 1). Not NE. Check (M, C): P1's best response to C is M (payoff 4). P2's best response to M is R (payoff 5), not C. Not NE.
Continuing the search: (T, L): P1 best response to L is T (payoff 3). P2 best response to T is C (payoff 4), not L. Not NE. (B, R): P1 best response to R includes B (payoff 2). P2 best response to B is L (payoff 4), not R. Not NE.
This game has no pure strategy Nash equilibrium; since Nash's theorem guarantees an equilibrium exists, it must be in mixed strategies. The systematic check confirms that for every pure strategy profile, at least one player can profitably deviate.
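The best-response check in the solution above is easy to automate. This sketch enumerates all cells of a bimatrix game and reports those where both payoffs are column-/row-maxima; applied to Problem 1's matrix it finds no pure equilibrium:

```python
# Enumerate pure-strategy Nash equilibria of a bimatrix game by
# checking mutual best responses at every cell.
def pure_nash(payoffs_1, payoffs_2):
    """payoffs_i[r][c] = payoff to player i at (row r, column c)."""
    rows, cols = len(payoffs_1), len(payoffs_1[0])
    equilibria = []
    for r in range(rows):
        for c in range(cols):
            best_1 = payoffs_1[r][c] == max(payoffs_1[i][c] for i in range(rows))
            best_2 = payoffs_2[r][c] == max(payoffs_2[r][j] for j in range(cols))
            if best_1 and best_2:
                equilibria.append((r, c))
    return equilibria

# The Problem 1 matrix: rows T, M, B; columns L, C, R.
u1 = [[3, 1, 2], [2, 4, 1], [1, 3, 2]]
u2 = [[2, 4, 1], [3, 2, 5], [4, 1, 3]]
result = pure_nash(u1, u2)   # empty: no pure NE, as argued above
```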
Problem 2: Mixed Strategy Nash Equilibrium in Matching Pennies
In Matching Pennies, verify that mixing 50-50 over Heads and Tails for both players is a Nash equilibrium. Show that no pure strategy NE exists.
Solution
Pure strategy NE check: If Player 1 plays Heads, Player 2 prefers Tails (payoff 1 vs -1). If Player 2 plays Tails, Player 1 prefers Tails (payoff 1 vs -1). If Player 1 plays Tails, Player 2 prefers Heads (payoff 1 vs -1). If Player 2 plays Heads, Player 1 prefers Heads (payoff 1 vs -1). Each pure strategy leads the other player to deviate — no pure NE exists.
Mixed NE: Suppose Player 2 plays Heads with probability q. Player 1's expected payoffs: E[Heads] = 1 times q + (-1) times (1-q) = 2q - 1; E[Tails] = (-1) times q + 1 times (1-q) = 1 - 2q. Indifference: 2q - 1 = 1 - 2q, so 4q = 2, q = 1/2. By symmetry, Player 1 also mixes 1/2 and 1/2.
At (1/2, 1/2): each player's expected payoff is 0. No deviation is profitable because both pure strategies yield the same expected payoff (0). This is the unique NE.
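The indifference argument can be verified directly. A small sketch with Player 1's Matching Pennies payoffs (Player 1 wins on a match):

```python
# Player 1's expected payoffs when Player 2 plays Heads with probability q.
def expected_payoffs(q):
    e_heads = 1 * q + (-1) * (1 - q)   # 2q - 1
    e_tails = (-1) * q + 1 * (1 - q)   # 1 - 2q
    return e_heads, e_tails

# Indifference: 2q - 1 = 1 - 2q, so 4q = 2 and q = 1/2.
q_star = 0.5
eh, et = expected_payoffs(q_star)   # both equal 0 at the equilibrium mix
```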
Problem 3: Backward Induction in a Sequential Game
In a sequential game, Player 1 moves first and chooses In or Out. If Out, payoffs are (2, 2). If In, Player 2 chooses Fight or Accommodate. Fight yields (0, 0); Accommodate yields (3, 1). Find the subgame perfect equilibrium and identify any Nash equilibria that are not subgame perfect.
Solution
Step 1: Analyze Player 2's subgame (reached when Player 1 plays In). Player 2 chooses between Fight (payoff 0) and Accommodate (payoff 1). Rational Player 2 chooses Accommodate.
Step 2: Player 1 anticipates Accommodate after In. Player 1 chooses between Out (payoff 2) and In followed by Accommodate (payoff 3). Rational Player 1 chooses In.
Nash equilibrium (Out, Fight) also exists: if Player 1 plays Out, Player 2's strategy is never tested, so Fight can be part of a Nash equilibrium. But this threat is not credible: if Player 1 deviates to In, Player 2 strictly prefers Accommodate. SPE rules out this non-credible threat.
SPE: Player 1 plays In; Player 2 plays Accommodate. Payoffs: (3, 1). The Nash equilibrium (Out, Fight) is not subgame perfect because it relies on a non-credible threat to Fight off the equilibrium path.
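Backward induction is naturally recursive. This minimal sketch encodes the entry game as nested dictionaries (a representation we chose for illustration) and folds the tree from the leaves up:

```python
# Backward induction: a decision node is {"player": i, "moves": {...}},
# a leaf is a payoff tuple (u1, u2).
def backward_induct(node):
    """Return (payoffs, chosen actions) for the subgame rooted at node."""
    if isinstance(node, dict):
        player = node["player"]
        best = None
        for action, subtree in node["moves"].items():
            payoffs, plan = backward_induct(subtree)
            if best is None or payoffs[player - 1] > best[0][player - 1]:
                best = (payoffs, {**plan, player: action})
        return best
    return node, {}   # leaf: payoffs, no choices left

game = {"player": 1, "moves": {
    "Out": (2, 2),
    "In": {"player": 2, "moves": {"Fight": (0, 0), "Accommodate": (3, 1)}},
}}
payoffs, plan = backward_induct(game)
# SPE: Player 1 plays In, Player 2 plays Accommodate, payoffs (3, 1)
```

Because the solver resolves Player 2's node before Player 1's choice, the non-credible threat (Out, Fight) never appears: Fight is eliminated inside the subgame.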
Problem 4: Cooperation Condition in the Repeated Prisoner's Dilemma
Suppose T = 5, R = 3, P = 1 in the Prisoner's Dilemma (T = temptation payoff, R = reward for mutual cooperation, P = punishment for mutual defection). Find the minimum discount factor delta for which grim trigger sustains cooperation as an SPE.
Solution
Under grim trigger: Cooperating each period yields R = 3. Present value of cooperating forever: R / (1 - delta) = 3 / (1 - delta).
Deviating: Get T = 5 this period, then P = 1 every period after (grim trigger punishment). Present value of deviation: T + delta times P / (1 - delta) = 5 + delta / (1 - delta).
Cooperation condition: 3 / (1 - delta) >= 5 + delta / (1 - delta). Multiply both sides by (1 - delta): 3 >= 5(1 - delta) + delta = 5 - 4 times delta. So 4 times delta >= 2, hence delta >= 1/2.
Minimum discount factor: delta* = 1/2. Using the formula: delta* = (T - R) / (T - P) = (5-3)/(5-1) = 2/4 = 1/2. Players with delta at least 1/2 can sustain cooperation under grim trigger.
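The threshold formula and the discounted-sum comparison can both be checked numerically. A sketch with the Problem 4 payoffs (the helper names are ours):

```python
# Grim-trigger cooperation threshold delta* = (T - R)/(T - P), checked
# against truncated discounted payoff sums.
def delta_star(T, R, P):
    return (T - R) / (T - P)

def discounted(stream, delta, horizon=10_000):
    """Truncated present value of an infinite payoff stream stream(t)."""
    return sum(stream(t) * delta**t for t in range(horizon))

T, R, P = 5, 3, 1
d = delta_star(T, R, P)                              # 0.5
cooperate = discounted(lambda t: R, d)               # R/(1-d) = 6
deviate = discounted(lambda t: T if t == 0 else P, d)  # T + d*P/(1-d) = 6
```

At the threshold δ* = 1/2 the two streams are exactly equal (both worth 6), so any δ above 1/2 makes cooperation strictly preferred.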
Problem 5: Shapley Value Computation
Three players N = 1, 2, 3 face a cooperative game with characteristic function: v(1) = 0, v(2) = 0, v(3) = 0, v(1,2) = 4, v(1,3) = 3, v(2,3) = 5, v(1,2,3) = 9. Compute the Shapley value for each player.
Solution
There are 3! = 6 orderings. For each ordering, compute the marginal contribution of each player when they join:
Order 1-2-3: MC(1)=0, MC(2)=v(1,2)-v(1)=4, MC(3)=v(1,2,3)-v(1,2)=5
Order 1-3-2: MC(1)=0, MC(3)=v(1,3)-v(1)=3, MC(2)=v(1,2,3)-v(1,3)=6
Order 2-1-3: MC(2)=0, MC(1)=v(1,2)-v(2)=4, MC(3)=v(1,2,3)-v(1,2)=5
Order 2-3-1: MC(2)=0, MC(3)=v(2,3)-v(2)=5, MC(1)=v(1,2,3)-v(2,3)=4
Order 3-1-2: MC(3)=0, MC(1)=v(1,3)-v(3)=3, MC(2)=v(1,2,3)-v(1,3)=6
Order 3-2-1: MC(3)=0, MC(2)=v(2,3)-v(3)=5, MC(1)=v(1,2,3)-v(2,3)=4
Average marginal contributions:
Player 1: (0+0+4+4+3+4)/6 = 15/6 = 5/2
Player 2: (4+6+0+0+6+5)/6 = 21/6 = 7/2
Player 3: (5+3+5+5+0+0)/6 = 18/6 = 3
Shapley values: phi_1 = 5/2, phi_2 = 7/2, phi_3 = 3. Check: 5/2 + 7/2 + 3 = 9 = v(N). Efficiency holds.
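The ordering-by-ordering computation above translates directly into code. This sketch enumerates all 3! permutations with the characteristic function from Problem 5 (represented as a dict keyed by frozensets, a choice we made for illustration):

```python
# Shapley values by direct enumeration of player orderings.
from itertools import permutations

def shapley(players, v):
    """v maps a frozenset coalition to its worth; the empty set is worth 0."""
    totals = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for p in order:
            marginal = v[coalition | {p}] - v.get(coalition, 0)
            totals[p] += marginal
            coalition = coalition | {p}
    return {p: totals[p] / len(orders) for p in players}

v = {frozenset({1}): 0, frozenset({2}): 0, frozenset({3}): 0,
     frozenset({1, 2}): 4, frozenset({1, 3}): 3, frozenset({2, 3}): 5,
     frozenset({1, 2, 3}): 9}
phi = shapley([1, 2, 3], v)   # {1: 2.5, 2: 3.5, 3: 3.0}
```

Direct enumeration is exponential in the number of players; for large games the Shapley value is typically approximated by sampling random orderings.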
Problem 6: Iterated Elimination of Dominated Strategies
Apply IEDS to the following game and find the Nash equilibrium:
| | L | M | R |
|---|---|---|---|
| U | (4, 3) | (5, 1) | (6, 2) |
| D | (2, 1) | (8, 4) | (3, 6) |
Solution
Step 1: Check if R strictly dominates M for Player 2. Player 2's payoffs: when P1 plays U, R gives 2 and M gives 1 (R is better); when P1 plays D, R gives 6 and M gives 4 (R is better). R strictly dominates M for Player 2. Eliminate M.
Step 2: Reduced game has columns L and R only. Player 1's payoffs: row U gives (4, 6) and row D gives (2, 3) for columns (L, R). U strictly dominates D for Player 1 (4 > 2 and 6 > 3). Eliminate D.
Step 3: Only row U remains. Player 2 now chooses between L (payoff 3) and R (payoff 2). Player 2 chooses L.
IEDS yields unique outcome (U, L) with payoffs (4, 3). This is the unique Nash equilibrium. Verify: at (U, L), P1 gets 4; deviating to D gives 2. P2 gets 3; deviating to M gives 1, to R gives 2. Neither player can profitably deviate.
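The elimination procedure can be automated. This sketch (our own helper functions, checking strict dominance among pure strategies only) reproduces the three steps on the Problem 6 game:

```python
# Iterated elimination of strictly dominated pure strategies.
def dominated(payoff_rows):
    """Index of a strictly dominated payoff vector, or None."""
    n = len(payoff_rows)
    for i in range(n):
        for j in range(n):
            if i != j and all(a < b for a, b in zip(payoff_rows[i], payoff_rows[j])):
                return i
    return None

def ieds(u1, u2, rows, cols):
    while True:
        r = dominated([[u1[(x, c)] for c in cols] for x in rows])
        if r is not None:
            rows = rows[:r] + rows[r + 1:]   # drop dominated row
            continue
        c = dominated([[u2[(x, y)] for x in rows] for y in cols])
        if c is not None:
            cols = cols[:c] + cols[c + 1:]   # drop dominated column
            continue
        return rows, cols

u1 = {("U", "L"): 4, ("U", "M"): 5, ("U", "R"): 6,
      ("D", "L"): 2, ("D", "M"): 8, ("D", "R"): 3}
u2 = {("U", "L"): 3, ("U", "M"): 1, ("U", "R"): 2,
      ("D", "L"): 1, ("D", "M"): 4, ("D", "R"): 6}
rows, cols = ieds(u1, u2, ["U", "D"], ["L", "M", "R"])
# Eliminates M, then D, then R, leaving (U, L)
```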
Problem 7: ESS in the Hawk-Dove Game
In the Hawk-Dove game with resource value V = 4 and cost of injury C = 6, find all Nash equilibria and determine which are evolutionarily stable.
Solution
With V=4, C=6: Payoff matrix entries: Hawk vs Hawk = (V-C)/2 = -1; Hawk vs Dove = V = 4; Dove vs Hawk = 0; Dove vs Dove = V/2 = 2.
Pure NE check: (Hawk, Hawk) — payoff of Hawk vs Hawk is -1 while payoff of Dove vs Hawk is 0. Dove beats Hawk when facing Hawk, so (Hawk, Hawk) is NOT a NE. (Dove, Dove) — payoff of Hawk vs Dove is 4 while Dove vs Dove is 2. Hawk beats Dove when facing Dove, so (Dove, Dove) is NOT a NE.
Mixed NE: Let p = probability of Hawk. Indifference condition: u(Hawk) = -1 times p + 4 times (1-p) = 4 - 5p; u(Dove) = 0 times p + 2 times (1-p) = 2 - 2p. Set equal: 4 - 5p = 2 - 2p, so 2 = 3p, p = 2/3 = V/C.
ESS check for pure strategies: Pure Hawk fails because u(H,H) = -1 is less than u(D,H) = 0, violating ESS Condition 1. Pure Dove fails because u(D,D) = 2 is less than u(H,D) = 4, violating Condition 1. The mixed strategy p* = 2/3 Hawk satisfies both ESS conditions and is asymptotically stable under replicator dynamics.
The unique ESS is the mixed strategy p* = 2/3 Hawk, 1/3 Dove. In population terms: 2/3 of the population plays Hawk. Expected payoff at ESS: 4 - 5 times (2/3) = 4 - 10/3 = 2/3.
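The stability claim can be illustrated with discrete-time replicator dynamics. In this sketch we shift all payoffs by +2 so fitnesses stay positive (an assumption needed by the discrete map; the shift does not move the fixed points):

```python
# Discrete-time replicator dynamics for Hawk-Dove with V = 4, C = 6.
def replicator(p, steps=500):
    """p = fraction of Hawks; returns the fraction after `steps` updates."""
    for _ in range(steps):
        f_hawk = (-1 + 2) * p + (4 + 2) * (1 - p)   # shifted Hawk fitness
        f_dove = (0 + 2) * p + (2 + 2) * (1 - p)    # shifted Dove fitness
        f_bar = p * f_hawk + (1 - p) * f_dove       # mean fitness
        p = p * f_hawk / f_bar                      # replicator update
    return p

p_star = replicator(0.5)   # converges to V/C = 2/3 from any interior start
```

Starting from any interior population mix, the dynamics converge to p* = 2/3 Hawk, consistent with the ESS analysis above.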
Quick Reference: Key Concepts and Formulas
| Concept | Key Formula / Condition | Application |
|---|---|---|
| Nash Equilibrium | u_i(s*) >= u_i(t_i, s*_-i) for all i, t_i | Any finite game (Nash 1950) |
| Mixed NE (indifference) | E[u_i(A)] = E[u_i(B)] in support | Matching Pennies, Hawk-Dove |
| Subgame Perfect Equilibrium | NE in every subgame | Sequential games, bargaining |
| Grim Trigger Cooperation | delta* = (T-R)/(T-P) | Repeated Prisoner's Dilemma |
| Shapley Value | Average marginal contribution over orderings | Cost sharing, SHAP in ML |
| ESS Condition | u(s*, s*) >= u(m, s*) plus stability | Evolutionary biology, Hawk-Dove |
| Minimax Theorem | max min = min max (mixed strategies) | Zero-sum games, LP duality |
| Revenue Equivalence | Same expected revenue across auction formats | Mechanism design, auctions |
| Hawk-Dove ESS | p* = V/C (fraction playing Hawk) | Population dynamics, biology |
| Cournot NE (symmetric) | q* = (a-c)/(3b) per firm (duopoly) | Industrial organization |
Further Reading and Resources
Foundational Papers
- Nash, J. (1950). Equilibrium points in n-person games. PNAS 36(1).
- Von Neumann and Morgenstern (1944). Theory of Games and Economic Behavior.
- Maynard Smith and Price (1973). The logic of animal conflict. Nature.
- Harsanyi (1967-68). Games with incomplete information. Management Science.
Standard Graduate Textbooks
- Fudenberg and Tirole: Game Theory (MIT Press, 1991)
- Osborne and Rubinstein: A Course in Game Theory (free online)
- Myerson: Game Theory: Analysis of Conflict (Harvard)
- Mas-Colell, Whinston, Green: Microeconomic Theory (chapters 7-9)
Nobel Prizes in Game Theory
- 1994: Nash, Harsanyi, Selten (NE, incomplete information, refinements)
- 2005: Aumann, Schelling (repeated games, commitment)
- 2007: Hurwicz, Maskin, Myerson (mechanism design)
- 2012: Roth, Shapley (matching theory, market design)
- 2020: Milgrom, Wilson (auction theory)
Topics to Explore Next
- Correlated equilibrium (Aumann 1974)
- Global games and equilibrium selection under uncertainty
- Behavioral game theory (Camerer, Thaler)
- Algorithmic game theory and the price of anarchy
- Network games, social learning, and information cascades