A representation theorem for minmax regret policies

Artificial Intelligence 171 (2007) 19–24
Research note
www.elsevier.com/locate/artint

Sanjiang Li a,b

a State Key Laboratory of Intelligent Technology and Systems, Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
b Institut für Informatik, Albert-Ludwigs-Universität Freiburg, D-79110 Freiburg, Germany

Received 6 October 2005; received in revised form 29 October 2006; accepted 2 November 2006
Available online 15 December 2006

This work was partly supported by the Alexander von Humboldt Foundation, the National Natural Science Foundation of China (60305005, 60673105), and a Microsoft Research Professorship.
E-mail address: lisanjiang@tsinghua.edu.cn (S. Li).
0004-3702/$ - see front matter © 2006 Elsevier B.V. All rights reserved. doi:10.1016/j.artint.2006.11.001

Abstract

Decision making under uncertainty is one of the central tasks of artificial agents. Due to their simplicity and ease of specification, qualitative decision tools are popular in artificial intelligence. Brafman and Tennenholtz [R.I. Brafman, M. Tennenholtz, An axiomatic treatment of three qualitative decision criteria, J. ACM 47 (3) (2000) 452–482] model an agent's uncertain knowledge as her local state, which consists of states of the world that she deems possible. A policy determines for each local state a total preorder of the set of actions, which represents the agent's preference over these actions. It is known that a policy is maximin representable if and only if it is closed under unions and satisfies a certain acyclicity condition. In this paper we show that the above conditions, although necessary, are insufficient for minmax regret and competitive ratio policies. A complete characterization of these policies is obtained by introducing the best-equally strictness.
© 2006 Elsevier B.V. All rights reserved.

Keywords: Qualitative decision; Policy; Maximin; Minmax regret; Competitive ratio

1. Introduction

Decision making under uncertainty is one of the central tasks of artificial agents. Due to their simplicity and ease of specification, qualitative decision tools are popular in artificial intelligence (see e.g. [1–3,7]).

Brafman and Tennenholtz [2] defined a model of a situated agent, where an agent is described by the set of her local states and the set of actions. For the current purpose, we identify the agent's local state as the set of states of the world she deems possible. Therefore an agent can be defined as a pair $(S, A)$, where $S$ is the (finite) set of states of the world in which the agent is situated, and $A$ is the (finite) set of actions from which the agent can choose. The agent ranks the set of actions in a total preorder based on her state of information (i.e. her local state). This choice of ranking of actions is called a policy in this paper, which corresponds to the notion of generalized s-policy of [2].

Note that this naive description of policy is space-consuming. Brafman and Tennenholtz proposed an implicit way for specifying policies that uses value functions, where a value function assigns to each action-state pair a real value.

Many decision criteria can be defined using value functions. Of particular importance are the three qualitative ones: maximin, minmax regret, and competitive ratio. While maximin and minmax regret are well known in decision theory [6], competitive ratio is popular in theoretical computer science [5].

Brafman and Tennenholtz [2] carried out an axiomatic treatment of these three decision criteria. They gave representation theorems for maximin policies. As for minmax regret and competitive ratio, it is easy to see that (i) a policy is minmax regret representable iff it is competitive ratio representable; (ii) each minmax regret policy is maximin representable.

In this paper we first show by an example that, unlike what was claimed in [2, Theorem 5, p. 466], maximin policies are not necessarily minmax regret representable. Then we find a necessary and sufficient condition, called best-equally strictness, for a maximin policy to be minmax regret representable. Roughly speaking, this condition allows the agent to adopt a value function which has the same best value for all singleton local states.

The rest of this paper is structured as follows. Section 2 formalizes the three qualitative decision criteria. Section 3 gives an example that shows maximin policies are not necessarily minmax regret representable, followed by a complete characterization of minmax regret (competitive ratio) policies. Conclusions are given in Section 4.

2. Three qualitative decision criteria

A binary relation $\preceq$ is called a preorder if it is reflexive and transitive. A preorder $\preceq$ is total if $x \preceq y$ or $y \preceq x$ for all $x$ and $y$. For a total preorder $\preceq$, we define two associated relations $\prec$ and $\sim$ as follows:

\[
x \prec y \iff \neg (y \preceq x), \qquad x \sim y \iff (x \preceq y) \wedge (y \preceq x).
\]

Definition 2.1. [2] A policy for an agent $(S, A)$ is a function $\wp$ that assigns to each local state $X \subseteq S$ a total preorder $\preceq^{\wp}_X$.

In what follows, we denote $\wp = \{\preceq^{\wp}_X : X \subseteq S\}$, and if no confusion can occur, we often omit the superscript $\wp$ in the notation $\preceq^{\wp}_X$.
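To make the space cost of such an explicit policy concrete, the following Python sketch stores one ranking per local state. It is only an illustration under stated assumptions: local states are taken to be the non-empty subsets of $S$, and each total preorder is encoded as an integer rank per action (higher rank means more preferred, equal ranks mean indifference). The names `local_states` and `explicit_policy` are hypothetical and do not come from [2].

```python
from itertools import combinations


def local_states(states):
    """Yield every non-empty subset of `states` as a frozenset."""
    items = sorted(states)
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def explicit_policy(states, actions, rank_of):
    """Tabulate a policy: one rank per action for every local state.

    `rank_of(X, a)` is a hypothetical callback returning the rank of
    action `a` at local state `X` (assumption: higher is better).
    """
    return {X: {a: rank_of(X, a) for a in actions} for X in local_states(states)}


# A trivial policy on two states and two actions: the agent is always indifferent.
policy = explicit_policy({"s", "t"}, {"a", "b"}, lambda X, a: 0)

# Number of stored entries: (2^|S| - 1) local states times |A| actions.
table_entries = sum(len(ranks) for ranks in policy.values())
```

With $|S| = 2$ the table has three local states, but the count of local states grows as $2^{|S|} - 1$, which is why an implicit specification is attractive.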
A policy may also be implicitly prescribed by using a value function.

Definition 2.2. [2] A value function $u$ assigns to each action-state pair a real value, i.e. $u : A \times S \to \mathbb{R}$.

For convenience, we call a value function $u : A \times S \to \mathbb{R}$ positive if $u(a, s) > 0$ for all $(a, s) \in A \times S$. Given a value function $u$ on $A \times S$, we define the regret function $\mathrm{reg}_u : A \times S \to \mathbb{R}$ as

\[
\mathrm{reg}_u(a, s) = \max_{a' \in A} u(a', s) - u(a, s).
\]

If $u$ is positive, then we define the competitive ratio function $\mathrm{cmpr}_u : A \times S \to \mathbb{R}$ as

\[
\mathrm{cmpr}_u(a, s) = \max_{a' \in A} u(a', s) / u(a, s).
\]

Now, we can formalize the three qualitative decision criteria as follows.

Definition 2.3. [2] A policy $\wp = \{\preceq_X : X \subseteq S\}$ has a maximin representation if there exists a value function $u$ on $A \times S$ such that for any local state $X$ and any two actions $a, a'$,

\[
a \prec_X a' \quad \text{iff} \quad \min_{s \in X} u(a, s) < \min_{s \in X} u(a', s). \tag{1}
\]

Definition 2.4. [2] A policy $\wp = \{\preceq_X : X \subseteq S\}$ has a minmax regret (competitive ratio, resp.) representation if there exists a (positive) value function $u$ on $A \times S$ such that the condition specified in (2) ((3), resp.) is satisfied for any local state $X$ and any two actions $a, a'$, where

\[
a \prec_X a' \quad \text{iff} \quad \max_{s \in X} \mathrm{reg}_u(a, s) > \max_{s \in X} \mathrm{reg}_u(a', s), \tag{2}
\]

\[
a \prec_X a' \quad \text{iff} \quad \max_{s \in X} \mathrm{cmpr}_u(a, s) > \max_{s \in X} \mathrm{cmpr}_u(a', s). \tag{3}
\]

Noticing that minmax regret and competitive ratio are very similar, the following result is clear.

Proposition 2.1. [2] A policy is minmax regret representable if and only if it is competitive ratio representable.

3. When does a policy have minmax regret representation?

Brafman and Tennenholtz [2] and Hesselink [4] gave representation theorems for maximin policies. This section gives a representation theorem for minmax regret (competitive ratio) policies. Note that by Proposition 2.1 we need only consider minmax regret policies. We begin with the following proposition.

Proposition 3.1. [2] A minmax regret policy is maximin representable.

Proof. Suppose $\wp$ is minmax regret represented by $\bar{u}$. Set $u$ to be the value function that is specified by

\[
u(a, s) = -\mathrm{reg}_{\bar{u}}(a, s) = \bar{u}(a, s) - \max_{a' \in A} \bar{u}(a', s).
\]

For any local state $X$ and any two actions $a, a'$, we have

\[
\begin{aligned}
a \prec_X a' \quad &\text{iff} \quad \max_{s \in X} \mathrm{reg}_{\bar{u}}(a, s) > \max_{s \in X} \mathrm{reg}_{\bar{u}}(a', s) \\
&\text{iff} \quad \min_{s \in X} -\mathrm{reg}_{\bar{u}}(a, s) < \min_{s \in X} -\mathrm{reg}_{\bar{u}}(a', s) \\
&\text{iff} \quad \min_{s \in X} u(a, s) < \min_{s \in X} u(a', s).
\end{aligned}
\]

This means $\wp$ is maximin represented by $u$.

The following example shows, however, that the converse of the above proposition is not true.

Example 3.1 (A counter-example). Suppose $S = \{s, s'\}$, $A = \{a, a'\}$. Consider the policy $\wp$ that is specified as follows:

\[
a \sim_{\{s\}} a', \qquad a' \prec_{\{s'\}} a, \qquad a \sim_{\{s,s'\}} a'. \tag{4}
\]

$\wp$ is maximin representable but not minmax regret representable (see Table 1). In fact, set $u(a, s) = u(a', s) = u(a', s') = 0$ and $u(a, s') = 1$. Then $\wp$ is maximin represented by $u$.

Suppose we also have a value function $\bar{u}$ that minmax regret represents $\wp$. Write $\bar{u}(a, s) = p_1$, $\bar{u}(a', s) = p_2$, $\bar{u}(a, s') = q_1$, and $\bar{u}(a', s') = q_2$. Then by $a \sim_{\{s\}} a'$ we know $\max\{p_1, p_2\} - p_1 = \max\{p_1, p_2\} - p_2$, i.e. $p_1 = p_2$; and by $a' \prec_{\{s'\}} a$ we know $\max\{q_1, q_2\} - q_1 < \max\{q_1, q_2\} - q_2$, i.e. $q_1 > q_2$. Therefore

\[
\mathrm{reg}_{\bar{u}}(a, s) = \mathrm{reg}_{\bar{u}}(a', s) = \mathrm{reg}_{\bar{u}}(a, s') = 0 < q_1 - q_2 = \mathrm{reg}_{\bar{u}}(a', s').
\]

We also have

\[
\max\{\mathrm{reg}_{\bar{u}}(a, s), \mathrm{reg}_{\bar{u}}(a, s')\} = 0 < q_1 - q_2 = \max\{\mathrm{reg}_{\bar{u}}(a', s), \mathrm{reg}_{\bar{u}}(a', s')\}.
\]

According to the minmax regret criterion, the agent would prefer $a$ to $a'$ at $\{s, s'\}$. This contradicts the assumption $a \sim_{\{s,s'\}} a'$. Consequently, $\wp$ is not minmax regret representable.
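The counter-example can also be checked numerically. The Python sketch below is a hedged illustration, not part of the paper: it encodes a policy as comparison signs (+1, 0, -1) for every local state and ordered pair of actions, confirms that the value function $u$ of Example 3.1 induces policy (4) under maximin, and then searches a small integer grid for a value function that induces the same policy under minmax regret. The helper names, the sign encoding, and the grid $\{-3, \dots, 3\}$ are assumptions made for the example. A finite search proves nothing by itself; the proof is the argument given in Example 3.1.

```python
from itertools import combinations, product

STATES = ("s", "s'")
ACTIONS = ("a", "a'")


def local_states(states):
    """Yield every non-empty subset of `states` as a frozenset."""
    for size in range(1, len(states) + 1):
        for combo in combinations(states, size):
            yield frozenset(combo)


def sign(x):
    return (x > 0) - (x < 0)


def compare_all(score):
    """Map (X, a, b) to +1 / 0 / -1: a is better than / tied with / worse than b at X."""
    return {(X, a, b): sign(score(a, X) - score(b, X))
            for X in local_states(STATES) for a in ACTIONS for b in ACTIONS}


def maximin_policy(u):
    # Criterion (1): a larger worst-case value is better.
    return compare_all(lambda a, X: min(u[(a, s)] for s in X))


def minmax_regret_policy(u):
    # Criterion (2): a smaller worst-case regret is better, so negate it.
    def regret(a, s):
        return max(u[(b, s)] for b in ACTIONS) - u[(a, s)]
    return compare_all(lambda a, X: -max(regret(a, s) for s in X))


# Value function u from Example 3.1.
u = {("a", "s"): 0, ("a'", "s"): 0, ("a", "s'"): 1, ("a'", "s'"): 0}
policy_4 = maximin_policy(u)

# Search all integer value functions with entries in {-3, ..., 3}.
keys = [(a, s) for a in ACTIONS for s in STATES]
regret_matches = [vals for vals in product(range(-3, 4), repeat=len(keys))
                  if minmax_regret_policy(dict(zip(keys, vals))) == policy_4]
```

On this grid `regret_matches` stays empty, in line with the contradiction derived above.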
So a maximin policy is not necessarily minmax regret representable. The following lemma identifies a necessary and sufficient condition for a maximin policy to be minmax regret representable.

Table 1
A maximin policy that has no minmax regret representation

Local state    $\{s\}$         $\{s'\}$          $\{s, s'\}$
$\wp$          $a \sim a'$     $a' \prec a$      $a \sim a'$

$u$     $s$   $s'$      $\bar{u}$   $s$     $s'$      $\mathrm{reg}_{\bar{u}}$   $s$   $s'$
$a$     0     1         $a$         $p_1$   $q_1$     $a$                        0     0
$a'$    0     0         $a'$        $p_2$   $q_2$     $a'$                       0     $q_1 - q_2$

Lemma 3.1. A policy is minmax regret representable iff it can be maximin represented by a value function $u : A \times S \to \mathbb{R}$ such that $\max_{a \in A} u(a, s) = 0$ for any $s \in S$.

Proof. Suppose $\wp$ is minmax regret represented by $\bar{u} : A \times S \to \mathbb{R}$. Set $u$ to be the value function that is specified by

\[
u(a, s) = -\mathrm{reg}_{\bar{u}}(a, s) = \bar{u}(a, s) - \max_{a' \in A} \bar{u}(a', s).
\]

By the proof of Proposition 3.1, we know $\wp$ is maximin represented by $u$. It is also clear that $\max_{a \in A} u(a, s) = 0$ for any $s \in S$.

On the other hand, suppose $\wp$ is maximin represented by a value function $u$ such that $\max_{a \in A} u(a, s) = 0$ for any $s \in S$. We show $\wp$ is also minmax regret represented by $u$. In fact, since $\mathrm{reg}_u(a, s) = \max_{a' \in A} u(a', s) - u(a, s) = -u(a, s)$, we have

\[
a \prec_X a' \quad \text{iff} \quad \min_{s \in X} u(a, s) < \min_{s \in X} u(a', s) \quad \text{iff} \quad \max_{s \in X} \mathrm{reg}_u(a, s) > \max_{s \in X} \mathrm{reg}_u(a', s).
\]

Therefore $\wp$ is minmax regret representable.

The above lemma suggests that, in order to characterize minmax regret policies, we need only characterize those maximin policies that have a representing value function $u$ such that $\max_{a \in A} u(a, s) = 0$ for all $s \in S$. The following example gives a clue.

Example 3.2. Suppose $S = \{s, s'\}$, $A = \{a, a'\}$. Consider the policy $\wp$ that is specified as follows:

\[
a \sim_{\{s\}} a', \qquad a' \prec_{\{s'\}} a, \qquad a' \prec_{\{s,s'\}} a. \tag{5}
\]

$\wp$ is minmax regret represented by the value function $u$ which is specified by $u(a, s) = u(a', s) = u(a, s') = 0 > -1 = u(a', s')$.

Note the two policies given in Examples 3.1 and 3.2 differ only in the local state $\{s, s'\}$.

Definition 3.1. A policy $\wp$ is best-equally strict if, for any pair of states $s$ and $t$, and any pair of best choices $a$ and $b$ at $s$ such that $a$ is better than $b$ at $t$, we have that $a$ is better than $b$ at $\{s, t\}$. Or more formally, $\wp$ is best-equally strict if, for all $s, t \in S$ and all $a, b \in A$ we have

\[
a \sim_{\{s\}} b \;\wedge\; (\forall c \in A)\, c \preceq_{\{s\}} a \;\wedge\; b \prec_{\{t\}} a \;\rightarrow\; b \prec_{\{s,t\}} a. \tag{6}
\]

Note that while the policy given in Example 3.2 is best-equally strict, the one given in Example 3.1 is not.

The next proposition gives a characterization of the best-equally strict maximin policies.

Proposition 3.2.
For a maximin policy $\wp$, the following two conditions are equivalent:

1. $\wp$ is best-equally strict;
2. $\wp$ is maximin represented by a value function $u : A \times S \to \mathbb{R}$ which satisfies $\max_{a \in A} u(a, s) = 0$ for all $s \in S$.

Proof. (Necessity) Suppose $\wp$ is maximin represented by a value function $u$ such that $\max_{a \in A} u(a, s) = 0$ for all $s \in S$. For any $a, a'$ and any $s, s'$, suppose $a, a'$ are two best choices at $\{s\}$, and $a$ is better than $a'$ at $\{s'\}$. We now show $a$ is also better than $a'$ at $\{s, s'\}$. Since $a, a'$ are two best choices at $\{s\}$, we have $u(a, s) = u(a', s) = 0$. Moreover, $a' \prec_{\{s'\}} a$ implies $u(a', s') < u(a, s') \le \max_{\tilde{a} \in A} u(\tilde{a}, s') = 0$. Now, by

\[
\min\{u(a', s), u(a', s')\} = u(a', s') < u(a, s') = \min\{u(a, s), u(a, s')\},
\]

we know $a$ is better than $a'$ at $\{s, s'\}$, i.e. $a' \prec_{\{s,s'\}} a$. Hence $\wp$ is best-equally strict.

(Sufficiency) Suppose $\wp$ is a best-equally strict policy that is maximin represented by a value function $u$. We next define a new value function $\bar{u}$ such that $\max_{a \in A} \bar{u}(a, s) = 0$ for all $s \in S$ and show that $\wp$ is maximin represented by $\bar{u}$. For $(a, s) \in A \times S$, define

\[
\bar{u}(a, s) =
\begin{cases}
0, & \text{if } u(a, s) = \varphi(s); \\
u(a, s) - k, & \text{otherwise,}
\end{cases}
\]

where $\varphi(s) = \max_{a \in A} u(a, s)$, and $k = \max_{s \in S} \varphi(s) = \max_{(a,s) \in A \times S} u(a, s)$. Note that $u(a, s) - k \le \bar{u}(a, s) \le 0$ for all $(a, s) \in A \times S$.

In order to show that $\wp$ is also maximin represented by $\bar{u}$, we need only show the following condition (7) holds for any local state $X$, and any actions $a, a'$:

\[
\min_{s \in X} u(a', s) < \min_{s \in X} u(a, s) \iff \min_{s \in X} \bar{u}(a', s) < \min_{s \in X} \bar{u}(a, s). \tag{7}
\]

($\Rightarrow$) Suppose $\min_{s \in X} u(a', s) < \min_{s \in X} u(a, s)$. Take $s_1 \in X$ such that $u(a', s_1) = \min_{s \in X} u(a', s)$. Clearly, $u(a', s_1) < u(a, s)$ for each $s \in X$. In particular, by $u(a', s_1) < u(a, s_1) \le \varphi(s_1)$ we know $\bar{u}(a', s_1) = u(a', s_1) - k$. For any $s \in X$, since $u(a, s) - k \le \bar{u}(a, s)$, we have $\bar{u}(a', s_1) = u(a', s_1) - k < u(a, s) - k \le \bar{u}(a, s)$. This means $\bar{u}(a', s_1) < \bar{u}(a, s)$ for all $s \in X$. Therefore $\min_{s \in X} \bar{u}(a', s) < \min_{s \in X} \bar{u}(a, s)$.

($\Leftarrow$) Suppose $\min_{s \in X} \bar{u}(a', s) < \min_{s \in X} \bar{u}(a, s)$. Take $s_1 \in X$ such that $\bar{u}(a', s_1) = \min_{s \in X} \bar{u}(a', s)$. Clearly, $\bar{u}(a', s_1) < \bar{u}(a, s)$ for all $s \in X$. We next show $u(a', s_1) < u(a, s)$ for all $s \in X$.

We note that $\bar{u}(a', s_1) = u(a', s_1) - k$ because $\bar{u}(a', s_1) < \bar{u}(a, s_1) \le 0$. Moreover, for each $s \in X$, we have either $u(a, s) < \varphi(s)$, or $u(a', s) < u(a, s) = \varphi(s)$, or $u(a', s) = u(a, s) = \varphi(s)$.

Suppose $u(a, s) < \varphi(s)$. Then we have $\bar{u}(a, s) = u(a, s) - k$. Therefore, by $\bar{u}(a', s_1) < \bar{u}(a, s)$, we know $u(a', s_1) < u(a, s)$.

Suppose $u(a, s) = \varphi(s)$ and $u(a', s) < u(a, s)$. Then by $\bar{u}(a', s) = u(a', s) - k$ and $\bar{u}(a', s_1) \le \bar{u}(a', s)$, we know $u(a', s_1) \le u(a', s) < u(a, s)$.

Suppose $u(a, s) = u(a', s) = \varphi(s)$. Recall that $\wp$ is maximin represented by $u$. This means $a$ and $a'$ are two best choices of $\wp$ at $\{s\}$. By $\bar{u}(a', s_1) < \bar{u}(a, s_1)$ we know $u(a', s_1) < u(a, s_1)$, i.e. $a' \prec_{\{s_1\}} a$. Since $\wp$ is best-equally strict, we know $a' \prec_{\{s,s_1\}} a$. This means $\min\{u(a', s), u(a', s_1)\} < \min\{u(a, s), u(a, s_1)\}$, i.e. $\min\{\varphi(s), u(a', s_1)\} < \min\{\varphi(s), u(a, s_1)\}$. This is possible if and only if $u(a', s_1) < \varphi(s) = u(a, s)$.

In summary, $u(a', s_1) < u(a, s)$ holds for all $s \in X$. Therefore, $\min_{s \in X} u(a', s) < \min_{s \in X} u(a, s)$.

As a corollary of Lemma 3.1 and Propositions 2.1 and 3.2, we have

Theorem 3.1.
A maximin policy is minmax regret (competitive ratio) representable iff it is best-equally strict.

Note that if $\wp$ has a strictly best choice at each singleton local state, then $\wp$ is best-equally strict. In particular, a deterministic policy is best-equally strict, where a policy $\wp$ is deterministic if $\preceq_X$ is a total order for each local state $X$. This proves the next two corollaries.

Corollary 3.1. Suppose $\wp$ is a maximin policy such that at each singleton local state $\{s\}$ the agent has a strictly best choice. Then $\wp$ is minmax regret representable.

Corollary 3.2. A deterministic policy $\wp$ is maximin representable iff $\wp$ is minmax regret representable.

4. Conclusions

The axiomatic approach is the prominent approach for understanding and justifying the rationality of decision criteria. This paper showed that, unlike what was claimed in [2, Theorem 5, p. 466], there are policies that are maximin representable, but not minmax regret representable. We then identified a necessary and sufficient condition for a maximin policy to be minmax regret (competitive ratio) representable, which allows the agent to take the same value for all best choices at all singleton local states.

Recall that Brafman and Tennenholtz [2] and Hesselink [4] have obtained representation theorems for maximin policies. We therefore conclude that a policy is minmax regret (competitive ratio) representable if and only if it satisfies (1) the closure under unions property [2], (2) the acyclicity condition [4], and (3) the best-equally strictness.
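The characterization can be exercised on the two running examples. The Python sketch below is an illustration under stated assumptions rather than the paper's own procedure: policies are encoded as comparison signs, `is_best_equally_strict` tests condition (6) directly, and `normalise` builds the value function $\bar{u}$ used in the sufficiency part of the proof of Proposition 3.2. The function names and the value function `u_shifted` (a hypothetical maximin representation of policy (5) whose best values are not 0) are introduced only for this example.

```python
from itertools import combinations

STATES = ("s", "s'")
ACTIONS = ("a", "a'")


def local_states(states):
    """Yield every non-empty subset of `states` as a frozenset."""
    for size in range(1, len(states) + 1):
        for combo in combinations(states, size):
            yield frozenset(combo)


def sign(x):
    return (x > 0) - (x < 0)


def compare_all(score):
    """Map (X, a, b) to +1 / 0 / -1: a is better than / tied with / worse than b at X."""
    return {(X, a, b): sign(score(a, X) - score(b, X))
            for X in local_states(STATES) for a in ACTIONS for b in ACTIONS}


def maximin_policy(u):
    return compare_all(lambda a, X: min(u[(a, s)] for s in X))


def minmax_regret_policy(u):
    def regret(a, s):
        return max(u[(b, s)] for b in ACTIONS) - u[(a, s)]
    return compare_all(lambda a, X: -max(regret(a, s) for s in X))


def is_best_equally_strict(policy):
    """Check condition (6) for all states s, t and all actions a, b."""
    for s in STATES:
        for t in STATES:
            for a in ACTIONS:
                for b in ACTIONS:
                    at_s = frozenset({s})
                    tied_at_s = policy[(at_s, a, b)] == 0
                    a_best_at_s = all(policy[(at_s, c, a)] <= 0 for c in ACTIONS)
                    a_beats_b_at_t = policy[(frozenset({t}), a, b)] > 0
                    if tied_at_s and a_best_at_s and a_beats_b_at_t:
                        if policy[(frozenset({s, t}), a, b)] <= 0:
                            return False
    return True


def normalise(u):
    """Value function u-bar from the sufficiency proof of Proposition 3.2."""
    phi = {s: max(u[(a, s)] for a in ACTIONS) for s in STATES}
    k = max(phi.values())
    return {(a, s): 0 if u[(a, s)] == phi[s] else u[(a, s)] - k
            for a in ACTIONS for s in STATES}


# Example 3.1: maximin representable, but not best-equally strict.
u_31 = {("a", "s"): 0, ("a'", "s"): 0, ("a", "s'"): 1, ("a'", "s'"): 0}
policy_4 = maximin_policy(u_31)

# Hypothetical maximin representation of policy (5) with non-zero best values.
u_shifted = {("a", "s"): 2, ("a'", "s"): 2, ("a", "s'"): 5, ("a'", "s'"): 1}
policy_5 = maximin_policy(u_shifted)
u_bar = normalise(u_shifted)
```

Here `policy_5` passes the check and `u_bar` represents it under both criteria, while `policy_4` fails the check and normalising `u_31` changes the induced policy, which matches Theorem 3.1 on these two cases.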

Acknowledgement

We thank the anonymous reviewers for their invaluable suggestions that greatly improved the paper. In particular, the term "best-equally strict" is suggested by one referee for replacing the debatable term "best-equal".

References

[1] C. Boutilier, Toward a logic for qualitative decision theory, in: KR, 1994, pp. 75–86.
[2] R.I. Brafman, M. Tennenholtz, An axiomatic treatment of three qualitative decision criteria, J. ACM 47 (3) (2000) 452–482.
[3] D. Dubois, H. Fargier, P. Perny, Qualitative decision theory with preference relations and comparative uncertainty: An axiomatic approach, Artificial Intelligence 148 (1–2) (2003) 219–260.
[4] W.H. Hesselink, Preference rankings in the face of uncertainty, Acta Inf. 39 (3) (2003) 211–231.
[5] C.H. Papadimitriou, M. Yannakakis, Shortest paths without a map, Theoret. Comput. Sci. 84 (1) (1991) 127–150.
[6] L.J. Savage, Foundations of Statistics, John Wiley & Sons, New York, 1954.
[7] S. Tan, J. Pearl, Qualitative decision theory, in: AAAI, 1994, pp. 928–933.