Can Commitment Resolve Political Inertia? An Impossibility Theorem

Christian Roessler, Sandro Shelegia, Bruno Strulovici

July 27, 2014

Abstract

Dynamic collective decision making often entails inefficient policy choices and political inertia. This paper investigates whether long-term commitment can resolve this problem, and provides a mostly negative answer: Whenever a long-term commitment is preferred, by a majority of voters, to the majority choices in a dynamic voting equilibrium without commitment, a Condorcet cycle over all long-term policies arises. Thus, allowing for long-term commitment turns any inefficiency problem into one of indeterminacy. The result applies even when different decisions require different voting rules or, more generally, different winning coalitions, under a coherence condition which links the static problem of committing to a long-term policy to the dynamic problem of short-term choices. Violations of the coherence condition occur whenever initial decision makers are not perfect representatives of future decision makers. Thus, commitment has the power to improve outcomes only if there is time inconsistency through changing rules or preferences. Various examples illustrate the theorem, from the inadequate provision of a public good to the stability of unpopular dictatorships and failure to hire the best job candidate.

JEL: D70, H41, C70

1 Introduction

The inefficiency of dynamic equilibria, in the form of political inertia and other distortions, is a pervasive feature of policy analysis. It affects fiscal and monetary policy, the feasibility and gradualism of major reforms such as trade liberalization and privatization, the properties of stable institutions, voting rules, and constitutions, and the accessibility and composition of clubs and other organizations.
[1] In these circumstances, committing to a state-contingent policy at the outset can often improve efficiency and resolve political inertia. It is therefore of prime importance to understand when commitment should be considered, encouraged, and guaranteed by institutions.

Many studies of dynamic collective decision making have ignored the possibility of commitment on the ground that commitments are either rarely observed in practice or infeasible.[2] The infeasibility argument is based on the observation that policymakers often have opportunities to modify their policy, or are replaced by other policymakers who can ignore or undo the resolutions of their predecessors. However, commitment may be imposed in ways such that breaking it requires the consent of all or multiple parties with potentially conflicting interests. Most policies create winners and losers, and it is a rare situation in which all parties involved may benefit from reneging on a previous commitment. Parties in favor of the commitment may hold other parties accountable for breaking their promise. For instance, countries of a monetary union or military organization may take sanctions against any member violating the statute of the union or organization. Constitutions are one form of commitment that takes a supermajority to undo. Policymakers, such as central banks and other regulators, may also be concerned about their reputation, so that reneging on a previous commitment would entail an important political cost.[3] In general, sophisticated contracts and agreements can be written and enforced, and if commitments were highly valuable, institutions guaranteeing their enforcement could be strengthened or developed.

This paper provides an alternative explanation for the ineffectiveness and paucity of commitment to resolve political inefficiency. We address, specifically, the following puzzle: When a decisive group of voters prefers some plan of action to the outcome of short-term decision making (often, some status quo), why doesn't it adopt the better plan as a law or contingent measure?

We thank Wiola Dziuda, Georgy Egorov, Jeff Ely, Daniel Garcia, Michael Greinecker, Karl Schlag, Stephen Schmidt, Joel Watson, and various seminar audiences for their comments. Author affiliations, in order: California State University, East Bay; University of Vienna; Northwestern University.

[1] See, e.g., Kydland and Prescott (1977) and Battaglini and Coate (2008) for macroeconomic applications, Fernandez and Rodrik (1991), Strulovici (2010), Bai and Lagunoff (2011), and Acemoglu et al. (2012) for reform adoption, Barbera and Jackson (2004) for the stability of voting rules, and Roberts (1999) for the theory of clubs. Other examples of inefficiencies resulting from dynamic voting include Roberts (2007) and Penn (2009), who show that a static Condorcet winner may not be chosen in a dynamic voting game.

Our main result shows that, whenever committing to a state-contingent policy is socially preferred to the dynamic collective decision equilibrium without commitment, there must exist a Condorcet cycle among all such policies, assuming the relative power of the decision makers is fixed.[4] That cycle includes both the equilibrium policy and the policy which dominates it. As a result, there cannot exist any Condorcet winner among state-contingent policies.

[2] Among recent contributions on this topic, Gomes and Jehiel (2005) observe that long-term contracts are (so) rare in political contexts and argue that "[I]n the real world, legislators do not stay in office forever, and even if they do stay in office for a few legislatures, it is hardly conceivable that they could commit to some political actions to be taken next when there is some uncertainty as to whether they will still be in office." (Footnote 20). Acemoglu et al. (2012) motivate the assumption of a high discount factor by the observation that a new state, involving a different configuration of political power, can be changed immediately by those who have the power, and base their approach on the natural lack of commitment in dynamic decision-making problems. Some papers, such as Strulovici (2010) and Dziuda and Loeper (2014), observe that some commitment can improve upon the inefficiency arising in the equilibrium, but they do not consider the set of all possible commitments and the potential cycles among them.

[3] Many of the models cited above focus on Markov Perfect Equilibria, which de facto rule out reputation-based equilibria and the endogenous commitment that they imply. While this modeling choice can be motivated by tractability and other considerations, it cannot serve as a normative basis for rejecting commitment.

[4] Any pair of policies is associated with a set of winning coalitions that can decide on the relative social ranking of those two policies. The simplest example is the simple majority rule, but many applications involve other decision rules, as discussed extensively in the present paper. The pairwise comparisons and cycles analyzed here are similar to those defined by Acemoglu et al. (2012) in their Assumption 2, i). The objects being compared are quite different (state-contingent policies vs. per-period states), but can be brought closer in a specialized setting of the paper. See Section 5.1.

Thus, allowing commitment replaces a problem of inefficiency by one of indeterminacy. Intuitively, allowing for commitment may enrich the policy space to such an extent that it becomes impossible to agree on anything, since there is always a preferred alternative. This turns out to be the case if and only if the equilibrium is dominated in the social ranking by some other policy.[5]

To illustrate, consider the following example (revisited later in this paper). In Rome, there is broad agreement that the metro system is underdeveloped, but attempts at further development have been stalled.[6] Roman officials have explained the difficulties they face as follows:[7] Building a new metro line is likely to result in the discovery of antique ruins, which a blocking majority of citizens may find too valuable to destroy, causing construction to be abandoned. The decisive majority may shift as a result of a discovery, and this risk may be high enough for those citizens who value the metro but not additional ruins in the city to give up on supporting construction.[8]

Political inertia in the Roman metro problem could seemingly be resolved by an unconditional commitment to finishing the metro regardless of what is found underground.[9] As it turns out, such a commitment is majority-preferred to the status quo if and only if it is itself beaten by the state-contingent policy to start the metro line, but then abandon it, should valuable ruins be uncovered by construction. That policy is, in turn, dominated by the status quo, generating a Condorcet cycle.

We state this result in a setting where each decision of the dynamic game is binary and the horizon is of arbitrary but finite length, which guarantees the uniqueness of an equilibrium policy, computed by backward induction, as well as a well-defined collective decision in each period of the game.
Equilibrium uniqueness and determinacy of decisions in each period shut down any possible confusion regarding the source of indeterminacy: The potential cycles among state-contingent policies have nothing to do with possible cycles in any given period, since such cycles are ruled out. Equilibrium inefficiency is also unambiguous, since the equilibrium is uniquely defined.[10]

Otherwise, the analysis is completely general. All players' payoffs can depend in arbitrary ways on the history of decisions. Stochastic states are allowed and can follow any, possibly non-Markovian, process. Moreover, the power of each individual, as captured by the set of winning coalitions that determine the binary decision in any period, can depend in arbitrary ways on the current state.[11]

The paper first considers, for clarity, a setting in which all decisions (for both binary actions and the pairwise comparison of policies) are made according to the simple majority rule. The theorem is then extended to general voting rules under a novel coalitional coherence condition, which plays a key role in the analysis. The condition says the following: Consider two state-contingent policies that differ only in the binary decision in a given period and state (or set of states) arising in that period. Then, the social ranking between these two policies must be determined by the same set of winning coalitions as the one arising in the dynamic game when that binary decision is reached. Coalitional coherence is not only sufficient, but also necessary for the result: Without it, one may find a preference profile for which there is a policy that dominates the equilibrium in the social ranking and is a Condorcet winner among all policies.

The coherence condition has several normative and positive interpretations discussed in Section 4.3. For example, if some decisions concern only a minority of agents, or even a single individual, a form of liberalism would require that the social ranking of alternatives follow the preferences of this minority. This may of course lead to trade-offs between equilibrium efficiency and liberalism, and we in fact establish a link between the coherence condition and Amartya Sen's notion of liberalism (Sen (1970)). The coherence condition may also be justified based on fairness toward future decision makers. For example, coherence may prohibit current society members from committing to principles that are contrary to the interests of future society members.

[5] In important cases, the equilibrium is efficient and we provide conditions under which this happens. See Section 5.

[6] The Roman underground has two lines and 49 stations, which serve a metropolitan area of 3.4 million residents and 9 million annual visitors. Berlin is similar in size, but has 173 subway stations. Madrid, which is about one-and-a-half times as large as Rome, has 300. Even in Oslo, where less than a million people live, the subway has 105 stops.

[7] The chairman of Roma Metropolitane SpA, Enrico Testa, was quoted saying: "There are treasures that are underground that would stay buried forever, but as soon as we uncover them, our work gets blocked." (Kahn (2007))

[8] Clearly, we are oversimplifying reality. There is a difference between being unaware of ruins underground and destroying ruins that have been unearthed. These complications are somewhat unique to the Roman metro example, which is merely illustrative of the type of voting problem we have in mind.

[9] One could preserve the most valuable pieces; in fact, the argument does not rely on the destruction of any ruins, as long as preservation is costly.

[10] Although a detailed analysis of the more general setting is beyond the scope of the paper, the logic of our main result also applies in infinite horizon settings, particularly Markovian settings in which the state converges to absorbing states, from which one can proceed by backward induction. Appendix D provides an illustration of this point to reform adoption, based on Strulovici (2010). Acemoglu et al. (2014) consider a setting in which a finite number of events can occur. The analysis of the present paper could likely be extended to such a setting.

By contrast, when generations have time-inconsistent preferences, and present decision makers act as social planners aggregating those preferences, commitment can help. One instance is the guarantee of personal liberties in the US constitution. Individual rights were deemed so important that they should be upheld regardless of the views of future parliamentarians. As a simpler illustration, consider an individual whose preferences change in the course of alcohol consumption, due to addiction. He may want to commit to drinking a single glass (by emptying out the rest of the bottle), since his intoxicated self (the new winning coalition after he has started to drink) will not be able to stop. In this case, varying winning coalitions in the dynamic game are not respected in the decision to commit. This violates our coherence condition and makes it possible to improve outcomes, from an ex ante point of view, through commitment. Indeed, we establish (Theorem 3) that the coherence condition is necessary for our main result to hold, in the following sense: given any structure of (state-dependent) winning coalitions for the dynamic collective decision game and for the choices among commitments, one can find preferences that will violate the conclusion of the theorem if and only if the coherence condition is violated.

We discuss a number of applications including policy reforms, the stability of bad regimes, and hiring committees. For example, a government, A, may stay in power against the wishes of a decisive majority, who prefers B (see Appendix C). The problem may be that B is not attainable as an equilibrium: Once the current political power is replaced, the new leaders prefer to establish a third regime, C, that constituents deem worse than A. It seems a priori less clear, however, why opponents of A cannot come up with an agreement, backed by reputations or other commitment devices, guaranteeing that B will be implemented. Our theorem says, however, that if B is preferred to A by a decisive majority, then commitment to B must be dominated by commitment to another regime (D, perhaps, but not necessarily, the same as C), which is itself dominated in a chain of commitments leading us back to A. Since the rebels are unable to select a winner, A survives.

Even though we assume a finite horizon through most of the analysis, the logic of the theorem applies to infinite-horizon problems as well, provided that the equilibrium is unambiguously defined. We illustrate this extension for the problem of adopting a reform with uncertainty regarding its winners and losers (see Appendix D).

[11] The restriction to binary choices is only in form, not content, as long as larger one-period decision problems do not involve cycles. In that case, those larger decision problems can be broken down into sequences of binary decisions and fit into the framework of the paper.

2 Outline of the Arguments

To explain the logic of the theorem in the clearest possible way, it is useful to divide the discussion into two settings, with and without uncertainty. Most applications we have in mind involve uncertainty where political power shifts as a result of new information. However, uncertainty adds complexity that distracts from the basic logic. Therefore, we start without uncertainty.

2.1 Deterministic Setting

I. Simple Majority Rule

Consider a binary decision tree with two periods, as in Figure 1.

[Figure 1: A two-stage voting game. A first vote chooses between $a_1$ and $b_1$; a second vote then chooses between $a_2$ and $b_2$ (after $a_1$) or between $a'_2$ and $b'_2$ (after $b_1$). At solid nodes decisions are made according to voting majorities. Hollow nodes are terminal.]

Suppose that the voting equilibrium generates the decision sequence $a_1 a_2$. We refer to a complete plan of action, at all nodes where the population decides, as a policy or commitment. Because there is no uncertainty, each policy is outcome-equivalent to a path of the decision tree, and we identify policies with these paths for simplicity. There are four possible policies: committing to $a_1 a_2$, which is identical to what happens in equilibrium, as well as the alternatives $a_1 b_2$, $b_1 a_2$, and $b_1 b_2$.

If a majority of voters prefers a different policy over the equilibrium, that policy cannot start with $a_1$ since, conditional on $a_1$, equilibrium voting has revealed that the majority preferred $a_1 a_2$ to $a_1 b_2$. Therefore, the policy must start with $b_1$. Without loss, let the plan generate the sequence $b_1 b_2$. Since $b_1$ was not chosen in equilibrium, it must be that, following $b_1$, a majority prefers $a_2$ to $b_2$: otherwise, $b_1 b_2$ would be the continuation equilibrium following $b_1$, and since a majority prefers $b_1 b_2$ to $a_1 a_2$, it would have supported $b_1$ over $a_1$ initially. Writing $X \prec Y$ when a majority prefers $Y$ to $X$, the implied ranking of policies according to the simple majority rule is: $a_1 a_2 \prec b_1 b_2 \prec b_1 a_2$. But, since the majority could obtain $b_1 a_2$ if they started with $b_1$, and instead chose $a_1$ in equilibrium, it must be that $b_1 a_2 \prec a_1 a_2$, which completes a cycle.

This leads us to the following conclusion (generalized by Theorems 1 and 2): If a policy (e.g., $b_1 b_2$) is preferred by the majority over the equilibrium play without commitment ($a_1 a_2$), then there will be a preference cycle among policies.

II. Beyond the Simple Majority Rule: Coherence and Fairness

Actual decision procedures often deviate from majority voting. For instance, the binary decision $\{a_2, b_2\}$ might be made by a specialized committee, where only experts weigh in on the decision. The above argument goes through as long as the committee has the same power when deciding between policies as in the dynamic voting game.
This is our coalitional coherence assumption. Specifically, if the same experts decide between $a_2$ and $b_2$ in the dynamic game, and also between any two policies that differ only regarding the binary decision $\{a_2, b_2\}$ (say, the majority rule is used for all other pairwise comparisons), the same cycle will arise. It does not matter who makes the choice: The cycle arises because the choice is made consistently.

A violation of coalitional coherence (ignoring the committee when voting on policies) would distort the representativeness of the commitment. To make this point starker, suppose that the committee is in fact a minority that is much more concerned with the choice between $a_2$ and $b_2$ than the rest of the population. Then, if the dynamic game gives that minority control over $a_2$ versus $b_2$, it would seem fair that it still has that power in the decision on commitments.

Another interpretation of coalitional coherence is in terms of future generations: the minority may be voters in the second period, who are only born if $a_1$ was chosen in the first period (for example, $a_1$ incentivizes families to have more children). Perhaps $a_2$ and $b_2$ are alternative environmental policies that would affect lives in the next generation. Violating the coherence condition (by not taking the interests of the unborn into account) may be seen as unfair when, for instance, the current population opts to increase births and at the same time tolerate long-run deterioration in air quality. Fairness would be restored either by keeping the air clean until the next generation can vote, or by anticipating its preferences and letting them influence today's environmental standards (i.e., respecting coherence in committing to a long-term policy).

2.2 Voting with Uncertainty

Uncertainty adds conceptual and technical layers to the previous argument. The main difference is that a policy cannot be identified with a terminal outcome of the decision tree. A policy is now described by a dynamic state-contingent decision process. As a result, the equilibrium play can be dominated, so that the conditions of our theorem are activated, even when there is no social preference cycle over the possible outcomes. To illustrate, we investigate more closely the popular theory about the Roman metro system that was mentioned in the introduction. How does political inertia arise in the absence of a cycle over outcomes? Why will attempts to resolve it through commitment fail?

I. Uncertainty and Political Inertia

We return to the political inertia mentioned in the introduction, where Rome has the option to build a new metro line, but antique ruins may be found on the construction path. The discovery of valuable ruins has a probability $q < 1$. If no discovery is made, the metro line is completed. If a discovery is made, it triggers a vote on whether to preserve the ruins or to proceed with the project (and destroy the ruins, or make costly accommodations). Suppose for simplicity that all decisions are made according to the majority rule. (A more realistic portrayal of the Roman metro problem, based on other decision criteria, could be accommodated by the more general analysis that imposes only coalitional coherence.) Because metro construction is costly, some citizens may assign a negative value to this project.
Suppose that the population consists of three types, A, B, and C, who are equal in numbers. A wants to build the metro, but only if it will not be abandoned. B prefers to start the construction of the metro and then abandon it if the ruins are found. Finally, C does not want to build the metro line but, should construction be started anyway, prefers to abandon it later in order to preserve the ruins. The strategic situation is represented in the game tree of Figure 2, where Y and N represent yes and no in the initial vote to start the project, M is the decision to complete the metro line, and T is the decision to preserve the antique treasure.

[Figure 2: The Roman Metro Game. At the circular nodes, decisions are made according to voting majorities. The square node is a chance node. The initial vote is between Y and N. After N, the payoffs to (A, B, C) are (0, 0, 0). After Y, Nature determines whether an antiquity is found. If nothing is found, payoffs are (1, 2, -2). If an antiquity is found, a vote between M and T follows, with payoffs (1, 2, -2) after M and (-2, 3, 1) after T.]

In the abstract deterministic game, the equilibrium play could be dominated by a policy only if there was a preference cycle over policies. Since every policy is associated with a particular outcome in the deterministic setting, there had to be a cycle over outcomes as well. The Roman metro game has clearly ranked outcomes.[12] Antiquity is preferred by a majority (types B and C) to the metro, and the metro line is preferred by a majority (types A and B) to the status quo. As a result, if $q = 0$ (i.e., there is no possibility of finding ruins), the metro line has majority support; if $q = 1$, antiquity has majority support.

If finding ruins is truly a random event, voters base their preferences on expected payoffs, taking into account that a majority would vote in favor of preserving the antiquity if called on to decide. We solve the game by backward induction, using elimination of weakly dominated strategies as a refinement. If construction takes place and the ruins are discovered, a majority consisting of types B and C votes to preserve them at the expense of the metro. Anticipating this, A initially votes for the project if:

$$E(u_A) = -2q + 1 - q \geq 0 \iff q \leq \tfrac{1}{3}.$$

B votes for the project regardless of $q$, because B benefits whether or not an antiquity is found. C votes for the project if:

$$E(u_C) = q - 2(1 - q) \geq 0 \iff q \geq \tfrac{2}{3}.$$

Overall, there is a majority in favor of the project at the outset if $q \leq 1/3$ (in which case, it is supported by A and B) or $q \geq 2/3$ (then, the project is supported by B and C). But in case $1/3 < q < 2/3$, A and C join forces, so that a majority opposes the project.

For any $q$, the total expected utility from the project is $2q + 1 - q = 1 + q$. This is always positive, so a utilitarian social planner would start the project regardless of $q$, and stop if an antique treasure were found. By construction, total utility is aligned with majority preference, and therefore preserving the antiquity if discovered is in every sense a valuable option: a majority prefers it to the metro line, and moreover it offers higher total utility.[13] Paradoxically, when the option to abandon the metro line for something better is introduced, the majority shifts its support from the metro project to the inefficient status quo (at intermediate values of $q$).[14]

[12] There are four terminal nodes, but for simplicity we treat completion of the metro line as the same outcome regardless of whether anything was discovered or not.

II. Commitment

It is striking that a socially superior option (starting and finishing the metro) is defeated, even though it is always in the collective power of the voters to implement it. The problem is, of course, that voters expect a majority to force abandonment of the project in case an antiquity is discovered. Since reserving the option to do so leads to an undesirable outcome, it is natural to expect that a commitment to finishing the metro line would allow the project to go forward.

Since types A and B prefer having the metro over the status quo, there is scope for improvement by allowing voters to commit to destroying the ruins. In other words, A and B both favor the policy $YM$ to the status quo, irrespective of $q$. However, this is not a satisfactory way to think about voting with commitment: once we consider a particular policy, it seems natural to consider any available policy. There are now three alternatives: never start the project (denoted by $N$), start and finish it ($YM$), and start and stop when ruins are discovered ($YT$). Unless some policy can beat all possible policies, it is not clear how voters could agree on which policy to pit against the status quo. Since $YT$ has majority support from B and C over $YM$,[15] there is a Condorcet cycle: $YM$ is beaten by $YT$, which is itself beaten by $N$, which in turn is beaten by $YM$.
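The backward-induction argument and the resulting cycle can be checked numerically. The following is a minimal Python sketch, assuming the payoffs shown in Figure 2 and one voter per type; the function names (`expected_payoffs`, `beats`, `equilibrium`, `condorcet_cycle`) are hypothetical and introduced only for illustration. Exact fractions are used for $q$ so that comparisons are not affected by rounding.

```python
from fractions import Fraction
from itertools import permutations

# Terminal payoffs (u_A, u_B, u_C), as read off Figure 2.
STATUS_QUO = (0, 0, 0)
METRO = (1, 2, -2)
ANTIQUITY = (-2, 3, 1)


def expected_payoffs(policy, q):
    """Expected payoff of each type under policy 'N', 'YM' or 'YT'."""
    if policy == "N":
        return STATUS_QUO
    if policy == "YM":  # metro completed whether or not ruins are found
        return METRO
    # 'YT': ruins preserved with probability q, metro completed otherwise
    return tuple(q * t + (1 - q) * m for t, m in zip(ANTIQUITY, METRO))


def beats(x, y, q):
    """True if at least two of the three types strictly prefer x to y."""
    ux, uy = expected_payoffs(x, q), expected_payoffs(y, q)
    return sum(a > b for a, b in zip(ux, uy)) >= 2


def equilibrium(q):
    """Backward induction: second-stage vote first, then the initial vote."""
    # Vote held after a discovery: compare terminal payoffs of T and M.
    preserve = sum(t > m for t, m in zip(ANTIQUITY, METRO)) >= 2
    plan = "YT" if preserve else "YM"
    # Initial vote: start the project only if a majority prefers it to N.
    return plan if beats(plan, "N", q) else "N"


def condorcet_cycle(q):
    """Return (x, y, z) with x > y > z > x if such a cycle exists, else None."""
    for x, y, z in permutations(["N", "YM", "YT"]):
        if beats(x, y, q) and beats(y, z, q) and beats(z, x, q):
            return (x, y, z)
    return None


if __name__ == "__main__":
    for q in (Fraction(1, 5), Fraction(1, 2), Fraction(4, 5)):
        print(q, equilibrium(q), condorcet_cycle(q))
```

Under these assumptions, the status quo is the equilibrium exactly at the intermediate values of $q$ for which the three policies cycle; the knife-edge values $q = 1/3$ and $q = 2/3$, where a type is indifferent, are deliberately left out of the illustration.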
Thus, trying to commit, rather than facilitating a desirable project, leads to indecision. In fact, existence of the Condorcet cycle in static voting over policies is equivalent to the bad equilibrium property that produces the status quo in dynamic voting, while the majority prefers the metro line over the status quo. This leads us to the first statement of our result in this specific context.

Proposition 1. In the Roman metro game, a majority opposes the project if and only if there exists a Condorcet cycle over the set of all policies.

[13] Although the status quo is Pareto-efficient here, the outcome of the dynamic voting equilibrium is Pareto-dominated by another policy in other cases; see Example 1 in Section 5.

[14] This three-type example can easily be extended to an arbitrary number of types where a majority prefers metro over status quo, and antiquity over metro. It is crucial that the majority that favors metro over status quo can have a different composition from the majority that prefers antiquity over metro. If a majority of citizens individually preferred both metro and antiquity over status quo, the initial vote would clearly be in favor of the project. Changing voting blocs result directly from the admission of new voting members in Barbera et al. (2001) and Jack and Lagunoff (2006). This too can lead to inefficiency, when intrinsically less desirable newcomers are allowed in because of their expected voting behavior in the future.

[15] These two policies differ only if ruins are found, in which case a majority prefers the policy that preserves them.

Proof. If a Condorcet cycle exists, then it has to take the form that $YM$ is majority-preferred to $N$, $YT$ is majority-preferred to $YM$, and $N$ is majority-preferred to $YT$. The reason is that $YM$ delivers the metro line with certainty, and by assumption a majority prefers ending up with the metro line over nothing. Moreover, $YT$ is a lottery between the metro line and the antiquity, which is majority-preferred to $YM$ because a majority prefers ending up with the antiquity over the metro line. The only way to get a Condorcet cycle is then for $N$ to beat $YT$ in majority voting. But this is precisely the choice voters make in the Roman metro game when they forego the project. Conversely, if a majority opposes the project, then $N$ is majority-preferred to $YT$. Because $YT$ is majority-preferred to $YM$, and $YM$ is majority-preferred to $N$ from the assumptions, we have a Condorcet cycle.

Whenever a majority opposes construction, there is no Condorcet winner among the feasible policies, so one cannot guarantee a better outcome by committing. If there existed a Condorcet winner among the feasible policies, a majority would support the same actions in dynamic voting, i.e., even without commitment.

2.3 Relation to Agenda Setting

Our primary interest in this paper is collective decision making under uncertainty. In a deterministic setting, there is a formal correspondence between our study of the value of commitments and the existence of a Condorcet winner over simple alternatives. In such a setting (as in the abstract example of Section 2.1), every policy may be identified with a particular terminal node of the decision tree. We can identify those terminal nodes with alternatives of a static collective decision problem. Moreover, dynamic voting over binary choices amounts to a particular sequence, or agenda, to select one of these alternatives.
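The correspondence between the two-stage game of Section 2.1 and an agenda over its terminal nodes can be made concrete with a small Python sketch. The three voter rankings below are hypothetical, chosen only so that the premises of the Section 2.1 argument hold (the equilibrium path is a1a2 and a majority prefers b1b2 to it); the function names are likewise illustrative and not taken from the paper.

```python
# Hypothetical strict rankings (best first) of three voters over the four
# paths of the two-stage tree; these numbers are illustrative assumptions.
RANKINGS = [
    ["a1a2", "b1a2", "b1b2", "a1b2"],
    ["b1a2", "b1b2", "a1a2", "a1b2"],
    ["b1b2", "a1a2", "b1a2", "a1b2"],
]
PATHS = ["a1a2", "a1b2", "b1a2", "b1b2"]


def majority_prefers(x, y):
    """True if a majority of voters ranks path x above path y."""
    votes = sum(r.index(x) < r.index(y) for r in RANKINGS)
    return votes > len(RANKINGS) / 2


def vote(x, y):
    """Winner of a binary majority vote between two paths."""
    return x if majority_prefers(x, y) else y


def equilibrium_path():
    """Backward induction on the two-stage tree (the 'agenda')."""
    after_a1 = vote("a1a2", "a1b2")  # second-period vote following a1
    after_b1 = vote("b1a2", "b1b2")  # second-period vote following b1
    return vote(after_a1, after_b1)  # first-period vote over continuations


def condorcet_winner():
    """Path beating every other path in pairwise majority votes, or None."""
    for x in PATHS:
        if all(majority_prefers(x, y) for y in PATHS if y != x):
            return x
    return None


def cycle_through(x, y):
    """Look for a third path z such that x beats y, y beats z, z beats x."""
    if not majority_prefers(x, y):
        return None
    for z in PATHS:
        if z not in (x, y) and majority_prefers(y, z) and majority_prefers(z, x):
            return (x, y, z)
    return None


if __name__ == "__main__":
    print("equilibrium path:", equilibrium_path())
    print("Condorcet winner:", condorcet_winner())
    print("cycle:", cycle_through("b1b2", "a1a2"))
```

With these rankings, the agenda selects a1a2 even though a majority prefers b1b2, and the search returns the cycle described in Section 2.1, in which b1b2 beats a1a2, a1a2 beats b1a2, and b1a2 beats b1b2. Other preference profiles would of course behave differently; the sketch only illustrates one instance.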
It is well known[16] that, if the winner of a sequence of binary majority votes depends on the order in which alternatives are compared, then there is no Condorcet winner.[17] In the deterministic setting, our theorem is exactly equivalent to this statement.

With uncertainty, however, the formal similarity with agenda setting breaks down. Most obviously, (state-contingent) policies can no longer be identified with the terminal nodes of the decision tree: the commitment space is strictly larger, and policies cannot be viewed as being akin to static alternatives. In fact, a cycle over terminal node payoffs is no longer necessary for the equilibrium play to be dominated in the dynamic problem, as a policy now corresponds to a probability distribution over terminal nodes. In the Roman metro problem above, for instance, there is no cycle over outcomes. From a dynamic decision problem that has no cycle among terminal nodes, our theorem derives a cycle in a static decision problem (among policies). Therefore, our framework subsumes the agenda setting literature, in the deterministic case, but extends well beyond it.

The coalitional games of Section 4 extend the agenda setting problem in another direction: Our concept of coalitional coherence provides a novel approach for thinking about agenda setting when the rules governing collective decisions may depend on the type of decision made, on past decisions, or on exogenous shocks.

[16] See, e.g., Miller (1977).

[17] In a static choice problem, Zeckhauser (1969) and subsequently Shepsle (1970) study the existence of Condorcet winners in voting over certain alternatives and lotteries. Zeckhauser shows that, if all lotteries over certain alternatives are in the choice set, no Condorcet winner can be found, even if there is such a winner among certain alternatives. In a comment on Zeckhauser, Shepsle demonstrates that a lottery can be a Condorcet winner against certain alternatives that cycle.

3 Benchmark Setting: Simple Majority Rule

Consider a dynamic voting setting with $T$ periods and an odd number $N$ of voters. Each period starts with a publicly known state $\theta_t$ belonging to some space $\Theta_t$, which contains all the necessary information about past decisions and observations. In each period $t$, a collective decision must be made from some binary set $A(\theta_t) = \{a(\theta_t), \bar{a}(\theta_t)\}$. This choice, along with past choices and states, determines the distribution of the state at the next period. Formally, each $\Theta_t$ is associated with a sigma algebra $\Sigma_t$ to form a measurable space, and $\theta_{t+1}$ has a distribution $F_{t+1}(\cdot \mid a_t, \theta_t) \in \Delta(\Theta_{t+1})$ given by a conditional probability system $F_{t+1}(\cdot)$.[18]

For example, the state $\theta_t$ may represent the probability distribution of some unknown but payoff-relevant parameter $\hat{\theta}$, given the information accumulated until period $t$. The state $\theta_{t+1}$ then includes any new information accrued between periods $t$ and $t + 1$ about the value of $\hat{\theta}$, and such information may depend on the action taken in period $t$.[19] In the Roman metro example, if the city starts construction of a metro line in period 1, some ruins may be discovered with positive probability, which affects $\theta_2$. If the city does not undertake construction, nothing is learned and $\theta_2$ contains no further information about the existence of ruins. The state $\theta_t$ can also include a physical component, such as the current stage of a construction.
Let $\Theta = \bigcup_{t=1}^{T} \Theta_t$ denote the set of all possible states at the beginning of any period and $A = \bigcup_{\theta \in \Theta} A(\theta)$ denote the set of all possible actions. Each voter $i$ has a terminal payoff $u_i(\theta_{T+1})$, which depends on all past actions and shocks, as captured by the terminal state $\theta_{T+1}$. A policy $C : \Theta \to A$ maps, at each period $t$, each state $\theta_t$ into an action in $A(\theta_t)$. Similarly, a voting strategy for voter $i$ is defined by a policy $C_i$, which describes his voting decisions. Given a policy $C$ and a state $\theta_t$, voter $i$'s expected payoff, seen from period $t$, is $V_t^i(C \mid \theta_t) = E[u_i(\theta_{T+1}) \mid \theta_t, C]$.

18 See for example Durrett (1995) for a formal definition of these objects.
19 In this case, $\Theta_t = \Delta(\Theta)$ for all $t$'s, where $\Theta$ is the (finite, say) parameter space containing $\hat{\theta}$ and $\Delta(\Theta)$ is the set of distributions over that set. The sequence $\{\Sigma_t\}$ of sigma-algebras forms a filtration, i.e., is such that $\Sigma_{t'}$ is finer than $\Sigma_t$ for all $t' \ge t$.

Definition 1 (Voting Equilibrium). A profile $\{C_i\}_{i=1}^{N}$ of voting strategies forms a Voting Equilibrium in Weakly Undominated Strategies if the following conditions hold for each $\theta_t \in \Theta$:
The resulting collective decision $Z$ satisfies $Z(\theta_t) = a \in A(\theta_t)$ if and only if $|\{i : C_i(\theta_t) = a\}| \ge N/2$.
$C_i(\theta_t) = \arg\max_{a \in A(\theta_t)} V_t^i(a, Z \mid \theta_t)$.

The first condition describes simple majority voting: at each time, society picks the action that garners the most votes. The second condition corresponds to the elimination of weakly dominated strategies. In each period $t$, voter $i$, taking as given the continuation of the collective decision process from period $t+1$ onwards that will result from state $\theta_{t+1}$, votes for the action that maximizes his expected payoff as if he were pivotal.

We assume for simplicity 20 that for each period $t$, state $\theta_t$, and policy $C$, each voter has a strict preference for one of the two actions in $A(\theta_t)$. That is, we rule out situations in which $V_t^i(\underline{a}(\theta_t), C \mid \theta_t) = V_t^i(\bar{a}(\theta_t), C \mid \theta_t)$ for some $i$, where $(\underline{a}(\theta_t), C)$ denotes the policy equal to $C$ on $\Theta \setminus \{\theta_t\}$ and equal to $\underline{a}(\theta_t)$ at $\theta_t$, with a similar definition for $(\bar{a}(\theta_t), C)$. Because indifference is ruled out and the horizon is finite, this defines a unique voting equilibrium, by backward induction.

Proposition 2. There exists a unique voting equilibrium.

3.1 Commitment and Voting Cycles

Let $Z$ denote the equilibrium policy. Given two policies $Y$ and $Y'$, say that $Y$ dominates $Y'$, written $Y \succ Y'$, if there is a majority of voters for whom $V_1^i(Y \mid \theta_1) > V_1^i(Y' \mid \theta_1)$. A Condorcet cycle is a finite list of policies $Y_0, \dots, Y_K$ such that $Y_k \prec Y_{k+1}$ for all $k < K$, and $Y_K \prec Y_0$. A policy $X$ is a Condorcet winner if for any policy $Y$, either $X \succ Y$, or $X$ and $Y$ induce the same distribution over $\Theta_{T+1}$.

Theorem 1. The following statements hold:
i) If there exists $Y$ such that $Y \succ Z$, then there is a Condorcet cycle that includes $Y$ and $Z$.
ii) If there exists a policy $X$ that is a Condorcet winner among all policies, then $X$ and $Z$ induce the same distribution over $\Theta_{T+1}$.

Remark 1.
If we do not assume that voters have strict preferences, Part i) of the theorem still goes through with a weak Condorcet cycle: there is a finite list of policies $Y_0, \dots, Y_K$ such that $Y_k \preceq Y_{k+1}$ for all $k < K$, and $Y_K \prec Y_0$. Similarly, as the proof makes clear, the equilibrium $Z$ continues to be a Condorcet winner in the following sense: there does not exist another policy $Y$ such that $Z \prec Y$.

The proof, below, works as follows: if $Y$ is different from $Z$, then $Y$ takes an action somewhere that the majority opposes. This allows us to construct a sequence of policies, starting from $Y$, where we switch actions to those preferred by a majority, thus always defeating the previous policy, until we recover $Z$, which was defeated by the original alternative $Y$. The proof collects states according to the (finitely many) winning coalitions, because individual states may (and often do) have zero probability. Because we group states by all possible majorities, and then switch actions for each majority, the switch is guaranteed to have the support of that particular majority. Proceeding by backward induction on the decision tree, this sequence of transformations recovers the equilibrium policy $Z$. Thus, we get a Condorcet cycle if, initially, $Y \succ Z$.

20 Without this assumption, most of Theorem 1 still applies to weak Condorcet winners and cycles. See Remark 1.

By way of illustration, consider the Roman metro game, where parameter values are such that i) the equilibrium outcome, $Z$, is the status quo, and ii) a majority prefers commitment to building the metro line, $Y$, over the status quo. In order to uncover the cycle predicted by Theorem 1, consider the majority (consisting of types B and C, in the notation of Section 2) which favors preservation if an antiquity is discovered. That majority prefers, over commitment to building the metro line, the plan of action $Y^1$ in which, conditional on digging and finding ruins, the ruins are preserved. Thus $Y^1 \succ Y$. However, $Y^1$, in turn, is defeated by the status quo $Z$ because types A and C prefer not to start the project (the latter dislike the metro, the former dislike ending up with the ruins in the event of a discovery). We thus obtain the cycle $Z \prec Y \prec Y^1 \prec Z$.

Proof. Consider any policy $Y$. Let $\mathcal{S}$ denote the set of coalitions with at least $N/2$ voters. For each $\theta_t$, $a \in A(\theta_t)$ and policy $X$, let $S(a \mid \theta_t, X)$ denote the set of voters who strictly prefer $a$ to the other action in $A(\theta_t)$, given the current state $\theta_t$ and given that the continuation policy from $t+1$ onwards is $X$.
The set $\Theta_T$ can be partitioned into $A_T \cup \left(\bigcup_{S \in \mathcal{S}} B_T(S)\right)$, where $B_T(S) = \{\theta_T : Z_T(\theta_T) \neq Y_T(\theta_T) \text{ and } S(Z_T(\theta_T) \mid \theta_T, Z) = S\}$ and $A_T$ consists of all remaining states in $\Theta_T$. In words, $B_T(S)$ consists of all the states at the beginning of period $T$ for which the set of voters who strictly prefer the action prescribed by $Z$ over the one prescribed by $Y$ is equal to $S$. 21 $A_T$ consists of all the states for which $Y_T$ and $Z_T$ coincide. 22

21 In particular, $Y_T(\theta_T) \neq Z_T(\theta_T)$ for all those states.
22 Because $Z$ is the equilibrium policy, the set of voters who prefer $Y$ over $Z$ at time $T$ must always form a minority, so $A_T$ and $\bigcup_{S \in \mathcal{S}} B_T(S)$ exhaust all states in $\Theta_T$.

We will index the coalitions in $\mathcal{S}$ from $S_1$ to $S_{\bar{p}}$, where $\bar{p}$ is the cardinal of $\mathcal{S}$. Consider the sequence of policies $\{Y_T^p\}_{p=1}^{\bar{p}}$, defined iteratively as follows:
$Y_T^1$ is equal to $Y$ for all states except on $B_T(S_1)$, where it is equal to $Z$.
For each $p \in \{2, \dots, \bar{p}\}$, $Y_T^p$ is equal to $Y_T^{p-1}$ for all states except on $B_T(S_p)$, where it is equal to $Z$.

By construction, $Y_T^1 \succeq Y$ because the policies are the same except on a set of states where a majority of voters prefer $Z$ (and, hence, $Y_T^1$) to $Y$. Moreover, because voters are assumed to have strict preferences over actions, the social ranking is strict if and only if $B_T(S_1)$ is reached with positive probability under policy $Y$: $Y_T^1 \succ Y \Leftrightarrow \Pr(B_T(S_1) \mid Y) > 0$. If $\Pr(B_T(S_1) \mid Y) = 0$, $Y_T^1 = Y$ with probability 1. Therefore, either $Y$ and $Y_T^1$ coincide, or $Y_T^1 \succ Y$. Similarly, $Y_T^p \succeq Y_T^{p-1}$ for all $p \le \bar{p}$, and $Y_T^p \succ Y_T^{p-1}$ if and only if $Y_T^p \neq Y_T^{p-1}$ with positive probability. This shows that $Y_T^{\bar{p}} \succeq \cdots \succeq Y_T^1 \succeq Y$, and at least one inequality is strict if and only if the set of states in $\Theta_T$ over which $Z_T$ and $Y_T$ are different is reached with positive probability under $Y$. By construction, $Y_T^{\bar{p}}$ coincides with $Z$ on $\Theta_T$: $Y_T^{\bar{p}}(\theta_T) = Z(\theta_T)$ for all $\theta_T \in \Theta_T$.

We now extend the construction by backward induction to all periods from $t = T-1$ to $t = 1$. For period $t$, partition $\Theta_t$ into $A_t \cup \left(\bigcup_{S \in \mathcal{S}} B_t(S)\right)$, where $A_t$ consists of all $\theta_t$'s over which $Y_t$ and $Z_t$ coincide, and $B_t(S) = \{\theta_t : Z_t(\theta_t) \neq Y_t(\theta_t) \text{ and } S(Z_t(\theta_t) \mid \theta_t, Z) = S\}$. That is, $B_t(S)$ consists of all states in $\Theta_t$ for which the set of voters who strictly prefer the action prescribed by $Z$ over the one prescribed by $Y$, given that $Z$ is used for all subsequent periods, is equal to $S$. 23 For each $t$, $Y_t^p$ is defined inductively as follows, increasing $p$ within each period $t$, and then decreasing $t$:
For $p = 1$, $Y_t^1$ is equal to $Y_{t+1}^{\bar{p}}$ for all states, except on $B_t(S_1)$, where it is equal to $Z$.
For $p > 1$, $Y_t^p$ is equal to $Y_t^{p-1}$ for all states, except on $B_t(S_p)$, where it is equal to $Z$.

By construction, $Y_t^{p+1} \succeq Y_t^p$ for all $t$ and $p < \bar{p}$, and $Y_t^1 \succeq Y_{t+1}^{\bar{p}}$ for all $t$. Moreover, the inequality is strict if and only if the policies being compared are not equal with probability 1 on the set of states reached by either of them. Finally, observe that $Y_1^{\bar{p}} = Z$. Let $\{Y_k\}_{k=1}^{K}$, $K \ge 1$, denote the sequence of distinct policies obtained, starting from $Y$, by the previous construction, iterating from $t = T$ and $p = 1$ down to $t = 1$ and $p = \bar{p}$. 24 If $Y \neq Z$ with positive probability, then $K \ge 2$. Moreover,

$Y = Y_1 \prec Y_2 \prec \cdots \prec Y_K = Z$. (1)

Therefore, we get a voting cycle if $Z \prec Y$, which concludes the proof of part i).
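The chain (1) and the resulting cycle can be traced numerically on a stripped-down version of the Roman metro game. The sketch below is illustrative only: the payoff numbers, the discovery probability `P_RUINS`, and all function names are assumptions chosen to reproduce the preference pattern described in the text, not the parameter values of Section 2. It also keeps just two decision states (whether to dig, and what to do once ruins are found), so the grouping of states by coalition used in the proof is trivial here.

```python
# Illustrative sketch: payoffs, probability, and names are assumptions.
P_RUINS = 0.5  # assumed probability that digging uncovers ruins

# Assumed terminal payoffs for the three voter types.
PAYOFFS = {
    "A": {"status_quo": 0, "metro": 2, "ruins": -4},
    "B": {"status_quo": 0, "metro": 1, "ruins": 3},
    "C": {"status_quo": 0, "metro": -3, "ruins": -1},
}
VOTERS = sorted(PAYOFFS)


def outcome_distribution(policy):
    """Map (period-1 action, action if ruins are found) to outcome probabilities."""
    first, if_ruins = policy
    if first == "wait":
        return {"status_quo": 1.0}
    dist = {"metro": 1.0 - P_RUINS}          # no ruins found: the metro is built
    outcome = "ruins" if if_ruins == "preserve" else "metro"
    dist[outcome] = dist.get(outcome, 0.0) + P_RUINS
    return dist


def expected_payoff(voter, policy):
    dist = outcome_distribution(policy)
    return sum(prob * PAYOFFS[voter][o] for o, prob in dist.items())


def majority_prefers(x, y):
    """True if a strict majority of voters strictly prefers policy x to policy y."""
    votes = sum(expected_payoff(v, x) > expected_payoff(v, y) for v in VOTERS)
    return 2 * votes > len(VOTERS)


def voting_equilibrium():
    """Backward induction under simple majority, each voter voting as if pivotal."""
    votes = sum(PAYOFFS[v]["ruins"] > PAYOFFS[v]["metro"] for v in VOTERS)
    if_ruins = "preserve" if 2 * votes > len(VOTERS) else "build"
    dig = majority_prefers(("dig", if_ruins), ("wait", if_ruins))
    return ("dig" if dig else "wait", if_ruins)


def proof_chain(start, equilibrium):
    """Switch actions to the equilibrium ones, last period first, as in the proof."""
    chain, current = [start], list(start)
    for period in (1, 0):
        if current[period] != equilibrium[period]:
            current[period] = equilibrium[period]
            chain.append(tuple(current))
    return chain


Z = voting_equilibrium()
Y = ("dig", "build")                          # commitment to building the metro
chain = proof_chain(Y, Z)
print("equilibrium:", Z)
print("chain:", chain)
print("each step wins a majority:",
      all(majority_prefers(b, a) for a, b in zip(chain, chain[1:])))
print("Y defeats Z:", majority_prefers(Y, Z))
```

Under these assumed payoffs the equilibrium is to wait, every switch along the chain from $Y$ to $Z$ is supported by a majority, and $Y$ nevertheless defeats $Z$, which closes the cycle $Z \prec Y \prec Y^1 \prec Z$.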
Since $Z$ can never be defeated without creating a cycle, any Condorcet winner among all policies, if one exists, must induce the same distribution over $\Theta_{T+1}$ as $Z$, and ii) follows. As mentioned in Remark 1, the entire proof goes through if one drops the assumption that voters have strict preferences. The only difference is that (1) then holds only with weak inequalities.

23 Again, by definition of $Z$, there cannot be a majority who prefer $Y_t$ over $Z_t$, given the continuation policy $\{Z_{t'}\}_{t' > t}$, so $A_t \cup \left(\bigcup_{S \in \mathcal{S}} B_t(S)\right) = \Theta_t$.
24 We call two policies distinct if they induce different distributions over $\Theta_{T+1}$. Policies that differ only at states that are never reached are not distinct.

While the Roman metro game illustrated how a majority may favor clearly inefficient choices that commitment to future actions cannot solve, Theorem 1 says nothing directly about efficiency. It is silent on why a particular policy is majority-preferred to the equilibrium, if such is the case. In the metro game, majority preference follows total utility by assumption, creating a link between utilitarian inefficiency in equilibrium and a Condorcet cycle over policies. However, since there is, in general, no relationship between rankings by total utility and majority preference, there is none between efficiency in this sense and cycles either.

Matters are different if efficiency fails in the stronger Pareto sense. In our context, we say that a policy is strictly Pareto-dominated if there exists another policy that all voters like strictly better. If a Pareto improvement over the equilibrium exists, then commitment to it surely defeats the equilibrium policy in majority voting 25 and sets in motion the machinery of Theorem 1.

Corollary 1. If the equilibrium policy is strictly Pareto-dominated, then there is a Condorcet cycle over the set of all policies.

An immediate consequence of the theorem is the following equivalence: a Condorcet winner exists if and only if the policy $Z$ is undominated.

Proof. If the equilibrium is strictly Pareto-dominated, there exists a policy $Y$ that yields a higher expected payoff for all voters and hence is majority-preferred to the equilibrium policy, so Theorem 1 applies immediately.

4 General Case: State Dependent Rules and Coalitional Coherence

Theorem 1 concerns decisions made according to the simple majority rule. Actual procedures deviate in essential ways from majority voting in many settings. In the Roman metro problem, some stakeholders (archeologists) have a special say over preserving antiquities. In other settings, power might rest with institutions that can be swayed only by the preferences of a supermajority of ordinary citizens. This section shows that our main result still holds when the decision rule is extended beyond simple majority, under a coalitional coherence condition whose relevance is discussed in detail below. The formal environment is the same as before, except for the collective decision rule.
26 In each period $t$, given state $\theta_t$, action $\bar{a}(\theta_t)$ might, for instance, require a particular quorum or the approval of certain voters (veto power) to win over $\underline{a}(\theta_t)$. Moreover, the decision rule may depend on the current state. Some voters may be more influential than others, because they are regarded as experts on the issue under consideration, or because they have a greater stake, or simply because they have acquired more influence. Each state $\theta_t$ is associated with a set $\bar{S}(\theta_t)$ of coalitions which may impose $\bar{a}(\theta_t)$, in the sense that if all individuals in $S \in \bar{S}(\theta_t)$ support $\bar{a}(\theta_t)$, given state $\theta_t$, then $\bar{a}(\theta_t)$ is implemented in that period. Likewise, there is a set $\underline{S}(\theta_t)$ of coalitions which may impose $\underline{a}(\theta_t)$. 27 A coalition $S$ belonging to $W(\theta_t) = \bar{S}(\theta_t) \cup \underline{S}(\theta_t)$ will be called a winning coalition and, whenever necessary, we will specify which action(s) $S$ can impose. The collective decision is always well defined. 28

25 This is not necessarily true for the more general setting of the next section, unless a Pareto condition is imposed separately.
26 The number of voters need not be odd any more. We maintain the assumption that decisions are binary in each period to avoid the complications arising from coalition formation with more choices and equilibrium multiplicity.
27 $\underline{S}(\theta_t)$ contains all coalitions such that their complement is not in $\bar{S}(\theta_t)$.

We require the following monotonicity condition: For any ordered coalitions $S \subset S'$ and state $\theta_t$, $S \in \bar{S}(\theta_t) \Rightarrow S' \in \bar{S}(\theta_t)$. With this restriction, individuals have a clear dominant strategy, which is to support the action that they prefer, because they can never weaken the power of their preferred coalition by joining it.

A coalitional strategy $C_i$ for individual $i$ is, like the policies defined earlier, a map from each state $\theta_t$ to an action in $A(\theta_t)$. It describes which action, $\underline{a}(\theta_t)$ or $\bar{a}(\theta_t)$, $i$ supports. Given a coalitional strategy profile $C = (C_1, \dots, C_N)$, let $a(C, \theta_t)$ denote the action corresponding to the winning coalition: $a(C, \theta_t) = \bar{a}(\theta_t)$ if $\{i : C_i(\theta_t) = \bar{a}(\theta_t)\} \in \bar{S}(\theta_t)$ and $a(C, \theta_t) = \underline{a}(\theta_t)$ if $\{i : C_i(\theta_t) = \underline{a}(\theta_t)\} \in \underline{S}(\theta_t)$. 29 Given a policy $C$ and state $\theta_t$, $i$'s expected payoff, seen from period $t$, is $V_t^i(C \mid \theta_t) = E[u_i(\theta_{T+1}) \mid \theta_t, C]$.

Definition 2 (Coalitional Equilibrium). A profile $\{C_i\}_{i=1}^{N}$ of coalitional strategies forms a Coalitional Equilibrium in Weakly Undominated Strategies if the following conditions hold for all $\theta_t \in \Theta$:
The resulting policy $Z$ satisfies $Z(\theta_t) = a(C, \theta_t)$.
$C_i(\theta_t) = \arg\max_{a \in A(\theta_t)} V_t^i(a, Z \mid \theta_t)$.

The first condition describes coalitional power: in each period, society picks the action supported by the strongest coalition. The second condition corresponds to the elimination of weakly dominated strategies. In each period $t$, individual $i$, taking as given the continuation of the collective decision process from period $t+1$ onwards that will result from state $\theta_{t+1}$, joins the coalition whose preferred action maximizes his expected payoff, as if he were pivotal. We maintain the assumption of the previous section that each voter has, for any policy and state $\theta_t$, a strict preference for one of the two actions in $A(\theta_t)$.
Because indifference is ruled out and the horizon is finite, this defines a unique coalitional equilibrium, by backward induction.

Proposition 3. There exists a unique coalitional equilibrium.

28 For example, with simple majority voting and an even number of voters, $\bar{a}$ could require at least 50% of the votes. In that case, $W(\theta_t)$, but also $\bar{S}(\theta_t)$ and $\underline{S}(\theta_t)$, consists of all sets with at least $N/2$ voters.
29 As explained earlier, exactly one of these cases must occur: the coalition of individuals preferring $\bar{a}(\theta_t)$ can impose it if and only if its complement cannot impose $\underline{a}(\theta_t)$.

4.1 Commitment and Indeterminacy

Now suppose that this group is given a chance to collectively commit to a policy at the outset. The objective of this section is to determine whether a coalition can form to support some policy that cannot be challenged by any other policy. This question generalizes the concept of a Condorcet winner to more flexible power structures. To this end, one needs to specify, when any two policies $Y$ and $Y'$ are pitted against each other, which coalitions allow $Y$ (or $Y'$) to win. As soon as one departs from the simple majority rule of the previous section, it becomes difficult to take a stand on such a specification. For example, if a supermajority rule is used, then one of the two policies must serve as the status quo. In some cases, a natural status quo can be found. In general, however, the two policies being compared can both differ from the natural status quo, for example when each proposes a reform, but they disagree on timing. In such cases, it is unclear which action should be viewed as the status quo. Fortunately, it is not necessary to resolve this question for all pairs of policies. Instead, the following invariance assumption on winning coalitions will be maintained: 30

Definition 3 (Coalitional Coherence). The structure of winning coalitions determining the social ranking among commitment policies is coherent if, whenever $Y$ and $Y'$ differ only on a set $\Theta_t(Y, Y')$ of states corresponding to some fixed period $t$ and $S$ is a winning coalition imposing the action corresponding to $Y$ for all states in $\Theta_t(Y, Y')$, then $S$ is also a winning coalition of the social ranking imposing $Y$ over $Y'$.

This assumption is necessarily satisfied if the coalitional rule over policies is invariant (for example, using the supermajority rule in every period with, say, $\underline{a}(\theta_t)$ serving as the status quo), or if the same rule is used when comparing policies that only differ at $t$. Having specified, for each pair $(Y, Y')$, the set of coalitions for $Y$ such that $Y$ is chosen over $Y'$, say that a policy $Y$ is the coalitional Condorcet winner if there is no other policy $Y'$ preferred over $Y$ by a winning coalition. The pairwise comparisons define a social ordering of policies.
A coalitional Condorcet cycle can then be defined as usual, with the only difference that, for any pair $Y$, $Y'$, any suitable winning coalition, rather than the narrow simple majority criterion, determines which is socially preferred. Whenever policy $Y$ is supported by a winning coalition over $Y'$, abusing the notation of the previous section, we write $Y \succ Y'$. We now state a generalization of Theorem 1.

Theorem 2. Assume coalitional coherence. Then,
i) If there exists $Y$ such that $Y \succ Z$, then there is a coalitional Condorcet cycle that includes $Y$ and $Z$.
ii) If there exists a policy $X$ that is a coalitional Condorcet winner among all policies, then $X$ and $Z$ induce the same distribution over $\Theta_{T+1}$.

Proof. See Appendix.

Thus, if comparison of policies is based on the same criteria as pairwise voting, there is no hope of removing political inertia through commitment.

30 Remark: One could define coherence for each state $\theta_t$ rather than for a set of states $\Theta_t(Y, Y')$. Because individual states may have zero probability (e.g., with Gaussian uncertainty), it is more appropriate to define it for sets of states. See the proof of Theorem 2.
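The state-dependent decision rules of this section can be made concrete with a small sketch. The rule below (a quorum combined with veto players) and every name in it are hypothetical choices for illustration. The code builds the coalitions able to impose the "upper" action, derives the coalitions able to impose the "lower" action as those whose complement cannot impose the upper one (as in footnote 27), and checks two properties stated in the text: monotonicity, and the fact that exactly one action is imposed for any split of the voters.

```python
from itertools import combinations


def all_coalitions(voters):
    """Every subset of the voter set, as frozensets."""
    voters = sorted(voters)
    return [frozenset(c)
            for size in range(len(voters) + 1)
            for c in combinations(voters, size)]


def upper_coalitions(voters, quorum, veto=frozenset()):
    """Assumed rule: a coalition imposes the upper action if it has at least
    `quorum` members and contains every veto player."""
    return {S for S in all_coalitions(voters)
            if len(S) >= quorum and frozenset(veto) <= S}


def lower_coalitions(voters, upper):
    """Coalitions whose complement cannot impose the upper action."""
    everyone = frozenset(voters)
    return {S for S in all_coalitions(voters) if (everyone - S) not in upper}


def is_monotone(voters, family):
    """Adding members to a coalition in the family keeps it in the family."""
    return all(T in family
               for S in family
               for T in all_coalitions(voters) if S <= T)


def collective_action(supporters_of_upper, upper):
    """Action imposed when exactly `supporters_of_upper` back the upper action."""
    return "upper" if frozenset(supporters_of_upper) in upper else "lower"


voters = {1, 2, 3, 4, 5}
upper = upper_coalitions(voters, quorum=4, veto={1})   # voter 1 holds a veto
lower = lower_coalitions(voters, upper)

print(is_monotone(voters, upper), is_monotone(voters, lower))
print(collective_action({1, 2, 3, 4}, upper))   # quorum met, veto player on board
print(collective_action({2, 3, 4, 5}, upper))   # quorum met, veto player opposed
```

Because preferences are strict, the supporters of the lower action are the complement of the supporters of the upper action, so the two families never conflict: one and only one of the two groups is a winning coalition.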