Title: Adversarial Search AIMA: Chapter 5 (Sections 5.1, 5.2 and 5.3)


B.Y. Choueiry, Instructor's notes #9
Title: Adversarial Search
AIMA: Chapter 5 (Sections 5.1, 5.2 and 5.3)
Introduction to Artificial Intelligence, CSCE 476-876, Fall 2017
URL: www.cse.unl.edu/ choueiry/f17-476-876
Berthe Y. Choueiry (Shu-we-ri) (402)472-5444

Outline
- Introduction
- Minimax algorithm
- Alpha-beta pruning

Context
- In a multiagent system (MAS), agents affect each other's welfare
- The environment can be cooperative or competitive
- Competitive environments yield adversarial search problems (games)
- Approaches: mathematical game theory and AI games

Game theory vs. AI
- AI games: fully observable, deterministic environments; players alternate; utility values are equal (draw) or opposite (winner/loser)
- In the vocabulary of game theory: deterministic, turn-taking, two-player, zero-sum games of perfect information
- Games are attractive to AI: states are simple to represent, agents are restricted to a small number of actions, and outcomes are defined by simple rules
- Not croquet or ice hockey, but typically board games
- Exception: soccer (RoboCup, www.robocup.org/)

Board game playing: an appealing target of AI research
Board games: chess (since early AI), Othello, Go, backgammon, etc.
- Easy to represent
- Fairly small number of well-defined actions
- Environment fairly accessible
- Good abstraction of an enemy, without real-life (or war) risks :)
But also: bridge, ping-pong, etc.

Characteristics
- Unpredictable opponent: contingency problem (interleaves search and execution)
- Not the usual type of "uncertainty": no randomness and no missing information (such as in traffic), but the opponent's moves are expected to be non-benign
Challenges:
- Huge branching factor
- Large solution space
- Computing the optimal solution is infeasible
- Yet, decisions must be made. Forget A*...

Discussion
- What are the theoretically best moves?
- Techniques for choosing a good move when time is tight:
  - Pruning: ignore irrelevant portions of the search space
  - Evaluation function: approximate the true utility of a state without doing search

Two-person games
- Two players: Min and Max
- Max moves first
- Players alternate until the end of the game
- Gain awarded to the winning player, penalty given to the loser
Game as a search problem:
- Initial state: board position and an indication of whose turn it is
- Successor function: defines the legal moves a player can make; returns a set of (move, state) pairs
- Terminal test: determines when the game is over; states that satisfy the test are terminal states
- Utility function (a.k.a. payoff function): numerical value for the outcome, e.g., chess: win = 1, loss = -1, draw = 0

Usual search
- Max finds a sequence of operators leading to a terminal goal state that scores as a win according to the utility function
Game search
- Min's actions are significant
- Max must find a strategy to win regardless of what Min does: a correct action for Max for each action of Min
- Need to approximate (no time to envisage all possibilities). Difficulty: a huge state space and an even larger search space
- e.g., chess: $10^{40}$ different legal positions; average branching factor = 35 and 50 moves/player, giving a search tree of about $35^{100}$ nodes
- Performance in terms of time is very important
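To get a feel for the chess figures quoted above, the arithmetic can be checked directly. This is only a back-of-the-envelope sketch: the variable names are illustrative, and the tree size is approximated as the branching factor raised to the number of plies.

```python
# Rough size of the chess search tree from the figures on the slide.
branching_factor = 35          # average number of legal moves per position
plies = 2 * 50                 # 50 moves per player, two players

nodes = branching_factor ** plies      # exact integer value of 35^100
order_of_magnitude = len(str(nodes)) - 1

print(f"35^100 is roughly 10^{order_of_magnitude}")  # prints: 35^100 is roughly 10^154
```

The search tree (about $10^{154}$ nodes) is vastly larger than the state space (about $10^{40}$ positions) because the same position can be reached along many different move sequences.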

Example: Tic-Tac-Toe
- Max has 9 alternative moves
- Utility of terminal states: Max wins = 1, Max loses = -1, draw = 0
[Figure: partial game tree for tic-tac-toe. Levels from top to bottom: MAX (X), MIN (O), MAX (X), MIN (O), ..., TERMINAL. Utilities shown for terminal states: -1, 0, +1]
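The four components of a game as a search problem (initial state, successor function, terminal test, utility function) could be written down for tic-tac-toe roughly as follows. This is a minimal sketch, not the course's code: the state encoding (a tuple of 9 cells plus the player to move) and all function names are assumptions made for illustration.

```python
from typing import List, Optional, Tuple

Board = Tuple[str, ...]        # 9 cells, each "X", "O" or " " (assumed encoding)
State = Tuple[Board, str]      # (board, player to move)

# The eight winning lines: rows, columns, diagonals.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def initial_state() -> State:
    """Empty board; Max (playing X) moves first."""
    return (tuple(" " * 9), "X")

def winner(board: Board) -> Optional[str]:
    """Return "X" or "O" if a line is complete, else None."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None

def successors(state: State) -> List[Tuple[int, State]]:
    """Successor function: list of (move, resulting state) pairs."""
    board, player = state
    next_player = "O" if player == "X" else "X"
    result = []
    for i, cell in enumerate(board):
        if cell == " ":
            new_board = board[:i] + (player,) + board[i + 1:]
            result.append((i, (new_board, next_player)))
    return result

def terminal_test(state: State) -> bool:
    """The game is over when someone has won or the board is full."""
    board, _ = state
    return winner(board) is not None or " " not in board

def utility(state: State) -> int:
    """Utility from Max's point of view: win = 1, loss = -1, draw = 0."""
    w = winner(state[0])
    return 1 if w == "X" else -1 if w == "O" else 0

print(len(successors(initial_state())))  # Max has 9 alternative first moves
```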

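Example: 2-ply game tree
- Max's actions: $a_1$, $a_2$, $a_3$
- Min's actions: $b_1$, $b_2$, $b_3$ (and likewise $c_1$, $c_2$, $c_3$ and $d_1$, $d_2$, $d_3$)
[Figure: 2-ply game tree]
- MAX level: root A, backed-up value 3
- MIN level: B (reached by $a_1$), value 3; C (reached by $a_2$), value 2; D (reached by $a_3$), value 2
- Leaves below B: $b_1$ = 3, $b_2$ = 12, $b_3$ = 8
- Leaves below C: $c_1$ = 2, $c_2$ = 4, $c_3$ = 6
- Leaves below D: $d_1$ = 14, $d_2$ = 5, $d_3$ = 2
The Minimax algorithm determines the optimal strategy for Max, i.e., decides which is the best move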
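As a worked check of the backed-up values in this example (taking the leaf values in the order given on the slide), the computation is:

```latex
\begin{aligned}
\text{value}(B) &= \min(3, 12, 8) = 3 \\
\text{value}(C) &= \min(2, 4, 6) = 2 \\
\text{value}(D) &= \min(14, 5, 2) = 2 \\
\text{value}(A) &= \max(3, 2, 2) = 3
\end{aligned}
```

Max's best move at the root is therefore $a_1$, which leads to B.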

Minimax algorithm
- Generate the whole tree, down to the leaves
- Compute the utility of each terminal state
- Iteratively, from the leaves up to the root, use the utility of nodes at depth $d$ to compute the utility of nodes at depth $(d - 1)$:
  - MIN row: minimum of children
  - MAX row: maximum of children

$$\text{Minimax-Value}(n) = \begin{cases} \text{Utility}(n) & \text{if } n \text{ is a terminal node} \\ \max_{s \in \text{Succ}(n)} \text{Minimax-Value}(s) & \text{if } n \text{ is a Max node} \\ \min_{s \in \text{Succ}(n)} \text{Minimax-Value}(s) & \text{if } n \text{ is a Min node} \end{cases}$$
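The recursive definition of Minimax-Value translates almost line for line into code. The sketch below is one possible rendering, under the assumption that the game tree is given explicitly as nested dictionaries (move name to subtree) with integer utilities at the leaves; the names `minimax_value`, `minimax_decision` and `TREE` are illustrative, not part of the notes.

```python
from typing import Dict, Union

# A tree is either a leaf utility (int) or a dict {move: subtree}.
Tree = Union[int, Dict[str, "Tree"]]

def minimax_value(node: Tree, is_max: bool) -> int:
    """Backed-up minimax value of a node."""
    if not isinstance(node, dict):          # terminal node: return its utility
        return node
    values = [minimax_value(child, not is_max) for child in node.values()]
    return max(values) if is_max else min(values)

def minimax_decision(root: Dict[str, Tree]) -> str:
    """Move at a Max root whose Min successor has the highest value."""
    return max(root, key=lambda move: minimax_value(root[move], is_max=False))

# The 2-ply example tree from the slides.
TREE = {
    "a1": {"b1": 3, "b2": 12, "b3": 8},
    "a2": {"c1": 2, "c2": 4, "c3": 6},
    "a3": {"d1": 14, "d2": 5, "d3": 2},
}

print(minimax_value(TREE, is_max=True))  # 3
print(minimax_decision(TREE))            # a1
```

Because the recursion is depth-first, only the current path needs to be kept in memory, which is what makes the space cost modest even though the time cost is exponential.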

Minimax decision
- MAX's decision: the minimax decision maximizes utility under the assumption that the opponent will play perfectly to his/her own advantage
- The minimax decision maximizes the worst-case outcome for Max (if the opponent plays otherwise, Max is guaranteed to do at least as well)
- If the opponent is sub-optimal, other strategies may reach a better outcome than the minimax decision

Minimax algorithm: Properties
- $m$: maximum depth; $b$: number of legal moves
- Using depth-first search, the space requirement is:
  - $O(bm)$ if generating all successors at once
  - $O(m)$ if considering successors one at a time
- Time complexity: $O(b^m)$
- Real games: time cost totally unacceptable

Multiple players games
- Utility(n) becomes a vector whose size is the number of players
- For each node, the vector gives the utility of the state for each player
[Figure: three-player game tree with players A, B and C moving in turn]
- Root (A to move): (1, 2, 6)
- B to move: (1, 2, 6), (1, 5, 2)
- C to move: (1, 2, 6), (6, 1, 2), (1, 5, 2), (5, 4, 5)
- Terminal states: (1, 2, 6), (4, 2, 3), (6, 1, 2), (7, 4, 1), (5, 1, 1), (1, 5, 2), (7, 7, 1), (5, 4, 5)
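With utility vectors, each player simply picks the successor whose vector is best in that player's own component. The following sketch illustrates this on the three-player tree from the slide. It assumes the leaves are paired left to right under the C nodes, represents inner nodes as lists and leaves as tuples, and uses illustrative names (`multiplayer_value`, `TREE`).

```python
from typing import List, Tuple, Union

Utility = Tuple[int, ...]                 # one entry per player
Node = Union[Utility, List["Node"]]       # leaf vector, or list of children

def multiplayer_value(node: Node, player: int, n_players: int) -> Utility:
    """Back up utility vectors: the player to move maximizes its own entry.

    Ties are broken in favor of the first (leftmost) child, an arbitrary choice.
    """
    if isinstance(node, tuple):           # terminal state: return its vector
        return node
    next_player = (player + 1) % n_players
    child_values = [multiplayer_value(c, next_player, n_players) for c in node]
    return max(child_values, key=lambda u: u[player])

# Players 0, 1, 2 stand for A, B, C; leaf pairing is inferred from the slide.
TREE = [
    [[(1, 2, 6), (4, 2, 3)], [(6, 1, 2), (7, 4, 1)]],
    [[(5, 1, 1), (1, 5, 2)], [(7, 7, 1), (5, 4, 5)]],
]

print(multiplayer_value(TREE, player=0, n_players=3))  # (1, 2, 6)
```

At the root both of A's options give A a utility of 1, so the result depends on the tie-breaking rule; the leftmost choice reproduces the value shown on the slide.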

Alliance formation in multiple players games
- How about alliances?
- A and B are in weak positions, but C is in a strong position
- A and B make an alliance to attack C (rather than each other)
- Collaboration emerges from purely selfish behavior!
- Alliances can be made and undone (careful about social stigma!)
- When a two-player game is not zero-sum, players may end up automatically making alliances (for example, when the terminal state maximizes the utility of both players)

Alpha-beta pruning
- Minimax requires computing all terminal nodes: unacceptable
- Do we really need to compute the utility of all terminal nodes?
- ... No, says John McCarthy in 1956: it is possible to compute the correct minimax decision without looking at every node in the tree
- Use pruning (eliminating useless branches in a tree)

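Example of alpha-beta pruning
[Figure: six panels (a) to (f) showing the range of possible values at each node of the 2-ply game tree as the leaves are examined]
- (a) Leaf 3 examined below B: A in $[-\infty, +\infty]$, B in $[-\infty, 3]$
- (b) Leaves 3, 12 examined: A in $[-\infty, +\infty]$, B in $[-\infty, 3]$
- (c) Leaves 3, 12, 8 examined: B in $[3, 3]$, A in $[3, +\infty]$
- (d) Leaf 2 examined below C: C in $[-\infty, 2]$, so the remaining successors of C are pruned; A in $[3, +\infty]$
- (e) Leaf 14 examined below D: D in $[-\infty, 14]$, A in $[3, 14]$
- (f) Leaves 14, 5, 2 examined below D: D in $[2, 2]$, A in $[3, 3]$
Try 14, 5, 2, 6 below D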

General principle of alpha-beta pruning
- If Player has a better choice m at the parent node of n, or at any choice point further up, then n will never be reached in actual play
- Once we have found out enough about n (e.g., through one of its descendants), we can prune it (i.e., discard all its remaining descendants)
[Figure: a path down the tree alternating Player and Opponent levels, with the better choice m higher up and node n further down]

Mechanism of alpha-beta pruning
- α: value of the best choice found so far for MAX (maximum)
- β: value of the best choice found so far for MIN (minimum)
Alpha-beta search:
- updates the values of α and β as it goes along
- prunes a subtree as soon as it is known to be worse than the current α or β
[Figure: a path down the tree alternating Player and Opponent levels, with choice m higher up and node n further down]
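The bookkeeping of α and β can be sketched as follows. This is a minimal illustration rather than the notes' own pseudocode: the tree is assumed to be given as nested lists with numeric leaves, and the optional `visited` list is a hypothetical addition used only to record which leaves are actually examined.

```python
import math

def alphabeta(node, is_max, alpha=-math.inf, beta=math.inf, visited=None):
    """Minimax value of node, skipping subtrees that cannot affect the result."""
    if not isinstance(node, list):        # terminal node
        if visited is not None:
            visited.append(node)          # record the examined leaf
        return node
    if is_max:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta, visited))
            if value >= beta:             # Min already has a better option above
                break                     # prune the remaining children
            alpha = max(alpha, value)     # best choice so far for Max
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta, visited))
        if value <= alpha:                # Max already has a better option above
            break                         # prune the remaining children
        beta = min(beta, value)           # best choice so far for Min
    return value

# The 2-ply example tree: Min nodes B, C, D under the Max root.
TREE = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
visited = []
result = alphabeta(TREE, is_max=True, visited=visited)
print(result)    # 3
print(visited)   # [3, 12, 8, 2, 14, 5, 2]
```

On this tree the leaves 4 and 6 below C are never examined: once C is known to be worth at most 2, it cannot beat the value 3 already secured through B.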

Effectiveness of pruning
- The effectiveness of pruning depends on the order in which new nodes are examined
[Figure: the six-panel alpha-beta example on the 2-ply game tree, panels (a) to (f), repeated]
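The dependence on move ordering can be observed by counting the leaves that alpha-beta examines for two orderings of the same tree. The sketch below assumes nested lists with numeric leaves; the function name `alphabeta_count` and the reordered tree are illustrative.

```python
import math

def alphabeta_count(node, is_max, alpha=-math.inf, beta=math.inf):
    """Return (minimax value, number of leaves examined) with alpha-beta pruning."""
    if not isinstance(node, list):        # terminal node counts as one leaf
        return node, 1
    best = -math.inf if is_max else math.inf
    examined = 0
    for child in node:
        value, count = alphabeta_count(child, not is_max, alpha, beta)
        examined += count
        if is_max:
            best = max(best, value)
            alpha = max(alpha, best)
        else:
            best = min(best, value)
            beta = min(beta, best)
        if alpha >= beta:                 # the rest of this node cannot matter
            break
    return best, examined

# Same leaves below D, examined worst-first (as on the slide) and best-first.
slide_order = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
better_order = [[3, 12, 8], [2, 4, 6], [2, 5, 14]]

print(alphabeta_count(slide_order, is_max=True))   # (3, 7)
print(alphabeta_count(better_order, is_max=True))  # (3, 5)
```

Both orderings return the same minimax value; examining D's lowest leaf first lets the search cut off D after a single leaf instead of three.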

Savings in terms of cost
- Ideal case: alpha-beta examines $O(b^{d/2})$ nodes (vs. Minimax: $O(b^d)$)
- Effective branching factor: $\sqrt{b}$ (vs. Minimax: $b$)
- Successors ordered randomly:
  - for $b > 1000$, asymptotic complexity is $O((b/\log b)^d)$
  - for reasonable $b$, asymptotic complexity is $O(b^{3d/4})$
- Practically: fairly simple heuristics work (fairly) well
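The ideal-case saving can be read as a reduction of the effective branching factor. As a worked illustration, using the chess branching factor of 35 quoted earlier in these notes:

```latex
b^{d/2} = \left(\sqrt{b}\right)^{d},
\qquad
\sqrt{35} \approx 5.9
```

Put differently, with perfect ordering and the same node budget, alpha-beta can search roughly twice as deep as plain Minimax, since $b^{d/2}$ nodes at depth $d$ is the same budget as $b^{d'}$ nodes at depth $d' = d/2$.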