Decision analytical perspectives into voting machines

Mat-2.108 Independent Research Project in Applied Mathematics
Tommi Kauppinen 58020R
November 21, 2007
Helsinki University of Technology, Department of Automation and Systems Engineering, Systems Analysis Laboratory

Contents
1 INTRODUCTION
2 SPATIAL THEORY OF VOTING
3 MATHEMATICAL FRAMEWORK
3.1 Perspective of a voter and decision analytics involved
3.2 Perspective of a nominee: probability of a social choice
3.3 Communal perspective: voting theory and social choice
4 VOTING MACHINE OF HELSINGIN SANOMAT FOR THE PRESIDENTIAL ELECTIONS IN 2006
4.1 Algorithm used in the voting machine of Helsingin Sanomat
4.2 Evaluation of the voting machine from the perspective of the voter and the nominee
4.3 Evaluation of the voting machine with some common criteria
5 CONCLUSION
REFERENCES

1 Introduction

The term "voting machine" is easily confused with other things. Here it refers to interactive internet-based services provided prior to elections. In Finland, voting machines have been used before elections of the European and Finnish Parliaments, municipal elections and presidential elections. Voters use these services to receive guidance on whom of the nominees to vote for and to learn the nominees' views on specific issues. This paper concentrates on the algorithms that voting machines use to recommend suitable candidates.

Voting machines had up to half a million users prior to the election of the Finnish Parliament in 2003 (Turun Sanomat 12.5.2004). As this corresponds to nearly 20% of Finnish voters, the recommendations given by voting machines may have considerable influence, and research on voting machines is therefore needed. As a multifaceted subject of study, however, the principles of voting machines can be approached from different theoretical frameworks. The framework presented in this paper is only one possible way of analysing them. For example, the spatial theory of voting used here is only one possible angle of approach. It complements other theories, e.g. the social-psychological theory of voting.

The aim of this study is to provide a decision analytical perspective on the use of voting machines by different users and user groups. The theoretical framework used is that of decision analysis, preference modelling and voting theory. The spatial theory of voting is used to analyse the relationship between these theories and, more specifically, the case of the voting machine of Helsingin Sanomat prior to the presidential elections of 2006. The perspectives taken into account are those of a voter, a nominee and the community. Both the use and the usefulness of voting machines, in general and in the case studied, are analysed from these perspectives.

This study first discusses the spatial theory of voting in Section 2.
Section 3 provides different perspectives on voting machines, and these are used to analyse the case study in Section 4. Section 5 presents conclusions. Works by Nurmi (e.g. 1987, 1999 and 2003) have been an important source for this paper, and his line of thought is evident in the sections discussing the communal perspective. Where individual decision making is concerned, the most important reference is Keeney & Raiffa's classic work (1976).

2 Spatial theory of voting

In the following, the spatial theory of voting is briefly presented.[1] The spatial theory of voting actually consists of two separate theories. The first concerns committee voting and the second concerns elections. For the study of voting machines only the latter is relevant, and this paper is limited to election voting (this is what is meant when referring to the spatial theory of voting).

The key element of spatial models is the relationship between a voter's preferences and the distance of those preferences from the preferences of a nominee (Enelow and Hinich 1984:15). The key question of a spatial theory is therefore the formulation of the distance between preferences. A multitude of methods for measuring this distance quantitatively have been introduced (for some used in the context of voting preferences, see Nurmi [2003:39,73]). In the following, a difference measure for the purposes of this paper is formulated. It is based on the mathematical definition of a metric (e.g. Nurmi 2003:72) and on the formulation of weights for different preferences (e.g. Enelow and Hinich 1984:41).

Following standard notation, a metric is a nonnegative function $d(x, y)$ describing the distance between points of a given set $A$. A metric satisfies the triangle inequality (1); it is symmetric (2); and the distance between two points is zero if and only if the points are identical (3). The definition of a metric can be written as

$$d(x, y) + d(y, z) \geq d(x, z), \tag{1}$$
$$d(x, y) = d(y, x), \tag{2}$$
$$d(x, y) = 0 \iff y = x, \tag{3}$$
$$d(x, y) \geq 0, \tag{4}$$

for $x, y, z \in A$. Now $A$ is defined as the set of all possible alternatives concerning some policy on a specific issue. The function $d(x, y)$ then maps two policy alternatives $x$ and $y$ to some distance greater than or equal to zero. The greater the difference between the policies $x$ and $y$, the greater the distance between them.

[1] For a more detailed account, see Enelow and Hinich (1984).
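The axioms (1)-(4) can be checked numerically on a finite sample of points. The following is a minimal sketch, not part of the original study: the function names `is_metric`, `abs_diff` and `squared_diff` and the one-dimensional sample positions are illustrative assumptions. A passing check on a sample does not prove that a function is a metric, but a failing check shows that it is not.

```python
from itertools import product

def is_metric(d, points, tol=1e-12):
    """Check axioms (1)-(4) for the function d on a finite sample of points."""
    for x, y in product(points, repeat=2):
        if d(x, y) < -tol:                      # (4) nonnegativity
            return False
        if abs(d(x, y) - d(y, x)) > tol:        # (2) symmetry
            return False
        if (d(x, y) <= tol) != (x == y):        # (3) zero distance iff identical points
            return False
    for x, y, z in product(points, repeat=3):
        if d(x, y) + d(y, z) < d(x, z) - tol:   # (1) triangle inequality
            return False
    return True

def abs_diff(x, y):
    """Absolute difference of two policy positions on a line."""
    return abs(x - y)

def squared_diff(x, y):
    """Squared difference: symmetric and nonnegative, but not a metric."""
    return (x - y) ** 2

# Hypothetical policy positions on a single issue (illustrative values only).
positions = [0.0, 1.0, 2.0, 5.0]

print(is_metric(abs_diff, positions))      # satisfies (1)-(4) on this sample
print(is_metric(squared_diff, positions))  # violates (1): 1 + 1 < 4 for 0, 1, 2
```

The squared difference is included because it is a common choice of "distance" that fails the triangle inequality, so it does not qualify as a metric in the sense used here.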

If policies concerning $n$ issues are studied, the total difference between the given sets of offered voter policy alternatives, $X$ and $Y$, should be studied. These sets can be defined as being composed of elements $x_j \in X$ and $y_j \in Y$, where $x_j$ and $y_j$ stand for offered policies, $j$ stands for the $j$th issue and $j = 1, \ldots, n$. If $d(x_j, y_j)$ is defined as the difference of policy alternatives, then the total difference $D_{X,Y}$ of the offered voter policies $X$ and $Y$ in $n$ issues can be defined as

$$D_{X,Y} = \sum_{j=1}^{n} d(x_j, y_j). \tag{5}$$

Equation (5) does not account for the fact that different policy issues have different degrees of importance to the voters. For the given sets $X$ and $Y$ of voter alternatives, preference weight sets $W^X$ and $W^Y$ can therefore be defined. The voter alternative set $X$ has a voter preference weight set $W^X$ such that the $j$th element of $X$ has a corresponding $j$th element in $W^X$, and similarly for the set $Y$. The $j$th element of $W^X$ is written $w_j^X$ and the $j$th element of $W^Y$ is written $w_j^Y$. In the general case the weighted total difference of voter preference $D^w_{X,Y}$ can then be defined as

$$D^w_{X,Y} = \sum_{j=1}^{n} f_j\big(d(x_j, y_j), w_j^X, w_j^Y\big), \tag{6}$$

where $f_j$ maps the metric difference and the weights concerning issue $j$ into some real number, namely $f_j : \mathbb{R}^3 \to \mathbb{R}$, $j = 1, \ldots, n$. If the function $f_j$ is the same for all issues and, further, if it is defined as a product of two independent factors, namely a weight function $g(w_j^X, w_j^Y)$ and a metric difference $d(x_j, y_j)$, equation (6) can be simplified into the form (e.g. Enelow and Hinich 1984:16)

$$D^w_{X,Y} = \sum_{j=1}^{n} g(w_j^X, w_j^Y)\, d(x_j, y_j), \tag{7}$$

where the function $g$ is uniform over the given set of issues, depending only on the preference weights of $X$ and $Y$, viz. $g : \mathbb{R}^2 \to \mathbb{R}$. Equation (7) can also be written in the form of a vector multiplication. It holds that

$$D^w_{X,Y} = \hat{g}(w^X, w^Y)\, \hat{d}(x, y), \tag{8}$$

where $w^X$, $w^Y$, $x$ and $y$ can be written as vectors of length $n$ and the exponent $T$ stands for transpose: $w^X = (w_1^X, w_2^X, \ldots, w_j^X, \ldots, w_n^X)^T$, $w^Y = (w_1^Y, w_2^Y, \ldots, w_j^Y, \ldots, w_n^Y)^T$, $x = (x_1, x_2, \ldots, x_j, \ldots, x_n)^T$ and $y = (y_1, y_2, \ldots, y_j, \ldots, y_n)^T$. In addition, $\hat{g} : \mathbb{R}^{2n} \to \mathbb{R}^{1 \times n}$ and $\hat{d} : \mathbb{R}^{2n} \to \mathbb{R}^{n \times 1}$. Therefore $\hat{g}(w^X, w^Y)$ is a $1 \times n$ matrix and $\hat{d}(x, y)$ is an $n \times 1$ matrix.

The metric difference $d(x_j, y_j)$ for issue $j$ can further be defined, e.g. in Euclidean space, as the Euclidean distance between the two policy alternatives $x_j$ and $y_j$. However, to use a norm (that is, to measure a distance) it is necessary to introduce countable qualities to the policy alternatives. One way to accomplish this is to break the policy alternatives down into other factors, which can then be quantified more easily.[2] Following this rationale, it is established that the policy alternatives concerning issue $j$ have $m$ attributes $a_k^X$ and $a_k^Y$, $k = 1, \ldots, m$. This formulation enables the interpretation of $x_j$ and $y_j$ as $m$-vectors of attributes $a_k^X$ and $a_k^Y$, respectively (e.g. Enelow and Hinich 1984:15).

If the Euclidean space is adopted as the frame of reference, then the Euclidean distance between the points $x_j = (a_1^X, a_2^X, \ldots, a_m^X)^T$ and $y_j = (a_1^Y, a_2^Y, \ldots, a_m^Y)^T$ is defined as the norm

$$\lVert x_j - y_j \rVert = \left[ (a_1^X - a_1^Y)^2 + (a_2^X - a_2^Y)^2 + \cdots + (a_m^X - a_m^Y)^2 \right]^{1/2}. \tag{9}$$

Now the distance $d(x_j, y_j)$ is identified in Euclidean space as the Euclidean distance of the vectors $x_j$ and $y_j$, given by the norm $\lVert x_j - y_j \rVert$. Moreover, the weighted total difference of voter preference $D^w_{X,Y}$ can be identified as a weighted sum of these Euclidean distances.

The preference weight sets $W^X$ and $W^Y$ can be formulated by any means suitable for weighting the distance measurement.[3] However, the minimum requirement for solving the weighted total difference of voter preference $D^w_{X,Y}$ is to give the weights $w_j^X$ and $w_j^Y$ for all issues $j$ and a scalar interpretation for the function $g$. It can be seen that $D^w_{X,Y}$ is used as a quantitative measure of the distance between different preferences over alternative policy sets. Hence the term: spatial theory of voting.

[2] Following Keeney and Raiffa (1976), the policy alternatives would be named objectives and the factors attributes. Keeney and Raiffa (1976) discuss these matters at length; for an introduction see (1976:34-41).
[3] For possible lines of approach, see e.g. Keeney and Raiffa (1976:267).
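The weighted total difference of eq. (7) with the Euclidean distance of eq. (9) can be sketched in a few lines. This is an illustrative sketch under stated assumptions: the study leaves the weight function $g$ open, so the choice of $g$ as the mean of the two weights (`mean_weight`) is hypothetical, as are all names and numerical values below.

```python
import math

def euclidean(x_j, y_j):
    """Euclidean distance between two attribute vectors, as in eq. (9)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x_j, y_j)))

def weighted_total_difference(x, y, w_x, w_y, g):
    """Weighted total difference of eq. (7): sum over issues of g(w_x, w_y) * d(x_j, y_j)."""
    return sum(
        g(wx, wy) * euclidean(xj, yj)
        for xj, yj, wx, wy in zip(x, y, w_x, w_y)
    )

def mean_weight(wx, wy):
    """Hypothetical weight function g: the mean of the two issue weights (an assumption)."""
    return 0.5 * (wx + wy)

# Two issues (n = 2), each described by two attributes (m = 2); illustrative values only.
voter = [(0.0, 0.0), (1.0, 1.0)]
nominee = [(3.0, 4.0), (1.0, 1.0)]
w_voter = [2.0, 1.0]
w_nominee = [1.0, 1.0]

D = weighted_total_difference(voter, nominee, w_voter, w_nominee, mean_weight)
print(D)  # issue 1 contributes 1.5 * 5.0, issue 2 contributes 1.0 * 0.0
```

Passing `g` as a parameter mirrors the text: the distance computation is fixed by eq. (9), while the weighting can be formulated by any suitable means.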

3 Mathematical framework

A voting machine has different functions depending on who is using it. The three perspectives on voting machines discussed here are those of the voter, the nominee and the voting community. The voter can extract information on the nominees using the voting machine; the nominee can advertise himself or herself to the voters; and the voting community could develop into a society where more informed voting decisions are made. The voting community does not take part directly (how could it?) in the design of any voting machine. However, it is assumed here that through discourse the community can contribute to the voting machine's development. A fourth perspective is that of the designer, but it is left out of consideration here. It is simply assumed that the voting machine has been designed with the best intentions, contributing as constructively as possible to the well-being of the community and its individuals.

The different perspectives are also described by Talponen (2006). Table 1 gives a view of the different inputs and outputs of a voting machine.

Table 1. Inputs and outputs of a voting machine from different perspectives [4]

Perspective | Input | Output
Voter | Set of preferences | Information on nominees
Nominee | Set of preferences | Advertisement for the campaign
Community | Discourse | Developed democratic decisions
Designer/administrator | Technicalities, content | Service, advertisement and data

Next, the first three of the perspectives mentioned in Table 1 are considered, with the aim of providing a theoretical framework for the discussion of the case study.

3.1 Perspective of a voter and decision analytics involved

Formulating a limited number of questions and alternatives so that the preference sets can be compared with some accuracy is a challenging task for the designer. Here, however, the interest is in characterising the theoretical context in which the (prospective) voter uses the voting machine.
The basic use of a voting machine consists of choosing from the answer alternatives to the questions presented, by means of which the voting machine's algorithm finds the nominees whose set of preferences resembles that of the voter. Effective use of a voting machine is therefore related to the theory of decision analysis and preference modelling, on which there is plenty of theory. Here some of the main issues are addressed, with emphasis on theoretical aspects.

[4] See also Talponen (2006:47, in Finnish).

The preference relation over the attributes of answer alternatives $z_j$ of question $j$ can be viewed as a $\langle P, I \rangle$-structure expressed by an asymmetric and transitive binary relation $R$ for which the definitions of a weak order hold (e.g. Öztürk et al. 2003:12). That is, for every preference relation over any attributes $k, k' \in K$, where $K$ is the set of attributes for alternatives $z_j$ of question $j$, the relation is of the form of preference $P(k, k')$, meaning preference of $k$ over $k'$, and/or indifference $I(k, k')$, meaning indifference between $k$ and $k'$. The combination of these forms the (asymmetric and transitive) binary relation $R$:

$$\forall k, k' \in K : \; R \subseteq K \times K, \quad R(k, k') \iff (P \cup I)(k, k'), \tag{10}$$

where $R(k, k')$ means weak preference of $k$ over $k'$. Here it is assumed that both the voter and the nominee can order all the attributes $k$ of the answer alternatives $z_j$ of any question $j$ in binary relations $R$, thereby forming a preference ranking over the attributes $k$. A weak preference relation for preferences over weights can then be defined such that

$$R(k, k') \iff w_k \geq w_{k'}. \tag{11}$$

Use of a voting machine is here seen as a declaration of a set of weak preference relations over attributes relating to given alternatives, in the context of the questions presented. For every alternative the decision maker considers $m$ issue-related attributes $k$. This can be formulated as (e.g. Keeney and Raiffa 1976:80)

$$V(z_j) = \sum_{k} w_k\, v_k(z_j), \tag{12}$$

where $V$ maps the total value of an answer alternative $z_j$, $z_j = 1, \ldots, r$, for question $j$; $v_k$ maps the score of alternative $z_j$ regarding attribute $k$; and $w_k$ is the weight associated with attribute $k$, $k = 1, \ldots, m$. In other words, $v_k$ stands for the score of the outcome resulting from alternative $z_j$ regarding attribute $k$, and $w_k$ stands for the decision maker's preference regarding that outcome.

The goal of a voter answering the questions is to find the nominee that best suits him or her (e.g. Enelow and Hinich 1984). This goal can be formulated as the voter's intention to imitate the ideal alternative set for a nominee. To put it otherwise, the voter tries to minimize the distance between his or her ideal alternative set and his or her answers to the questions given. This is done by answering the questions as the voter believes his or her ideal nominee would. The distance between the alternatives that the voter's answers dictate, $X$, and his or her ideal answer set, $X^*$, can be given as (see eq. (7)):

$$D_I = \min D^w_{X,X^*} = \min \sum_{j=1}^{n} g(w_j, w_j^*)\, d(x_j, x_j^*). \tag{13}$$

The elements of the weight set $W$ are included in the minimization as variables for the sake of generality; a weighting function $g$ can be formulated to give a smaller value to answers that share the same weight for an issue $j$ or a weak preference relation for attributes $k$ concerning issue $j$.

Considering equation (12), it can be noted that maximising the function $V$ with respect to the alternatives $z_j$ results in the choice of the ideal alternatives in relation to the value functions $v_k(z_j)$ and weights $w_k$. Under complete information on the voter's preference ranking and no uncertainty concerning the outcome, the value function and weights can be formed so as to solve for the ideal alternative set of the voter. Therefore,

$$(x_j = x_j^*) \Rightarrow D_I = 0. \tag{14}$$

However, the limited number of questions and of answer alternatives to these questions, and on the other hand the multitude of perspectives on different issues, have an impact on the decision making of the voter, making it uncertain and incompletely informed. The binary relation $R$ can also be used to express a fuzzy weak order (e.g. Öztürk et al. 2003:17), which in turn can be used to describe the incompleteness of information and/or the uncertainty associated with decision making. So far the binary relation between attributes $k$ and $k'$ has been either true (=1) or false (=0). Under uncertainty or fuzziness, the fuzzy binary relation can be defined as a mapping giving the probability that $R(k, k')$ holds. Then it must hold that

$$R : K \times K \to [0,1], \quad P : K \times K \to [0,1], \quad I : K \times K \to [0,1], \quad R(k, k') = b \Rightarrow P(k', k) = 1 - b, \tag{15}$$

where $b$ is the probability that the relation $R$ holds.
Under uncertainty and incomplete information, the voter does not know exactly his or her own preferences over questions $j$ or attributes $k$ (incomplete information), and is unsure of the outcomes of his or her answers (uncertainty). This affects the evaluation of the distance between the voter's answers and his or her ideal answer set. In other words:

$$(x_j \neq x_j^*) \Rightarrow D_I \geq 0. \tag{16}$$
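To make the role of the additive value function of eq. (12) concrete, the following minimal sketch evaluates $V(z) = \sum_k w_k v_k(z)$ for three answer alternatives and picks the one with the greatest total value. It assumes complete information, i.e. point-valued weights and scores; all names and numbers are hypothetical and chosen only for illustration.

```python
def total_value(scores, weights):
    """Additive value of eq. (12): V(z) = sum over attributes k of w_k * v_k(z)."""
    return sum(w * v for w, v in zip(weights, scores))

# Hypothetical attribute weights w_k for m = 3 attributes (assumed to sum to one).
weights = [0.5, 0.3, 0.2]

# Hypothetical scores v_k(z) of r = 3 answer alternatives z = 1, 2, 3 to one question.
scores_by_alternative = {
    1: [0.2, 0.9, 0.4],
    2: [0.8, 0.1, 0.5],
    3: [0.6, 0.6, 0.6],
}

values = {z: total_value(v, weights) for z, v in scores_by_alternative.items()}
best = max(values, key=values.get)  # alternative with the greatest total value

print(values)
print(best)
```

With point-valued inputs the maximisation is a simple comparison of the totals; it is exactly this step that becomes ambiguous once weights or scores are only known imprecisely.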

In the context of decision analysis, a score interval for the value function $v_k(z)$ can be formed to represent a range of possible outcomes, and a weight interval for the weights $w_k$ to represent uncertainties in the decision maker's preferences. This in turn leads to overall value intervals $[V_{\min}(z), V_{\max}(z)]$ for each alternative. As a single greatest overall value may not be available, the maximisation procedure must be modified in some fashion. For example, a dominance structure can be constructed for the alternatives (Salo and Hämäläinen 1992).

There are various other ways of approaching incomplete information and uncertainty. Here a Bayesian presentation of decision making (e.g. Parsons et al. 2002) and the Robust Portfolio Modelling approach (Liesiö et al. 2007) are introduced to provide an intuition of how different mathematical formalisms present the aspects discussed so far.

The Bayesian approach in particular shows the idea of probability in a compact form. For the voter's expected utility function $EU_v$ for a question $j$ with answer alternative $z_j$ it holds that

$$EU_v(z_j) = \sum_{S \in \mathbf{S}} \Pr(S \mid z_j)\, U_v(S), \tag{17}$$

where $S$ is a possible configuration of the attributes of the voter's ideal nominee, $\mathbf{S}$ is the set of all possible attribute configurations, and $U_v$ is the voter's utility function for a given configuration of the ideal nominee's attributes.

Equation (17) can be used to elaborate on the problems of incomplete information and uncertainty. The uncertainty in choosing the proper answer alternative $z_j$ from the limited number of $r$ alternatives is given by the conditional probability. The incomplete information in turn is described by the utility function: the voter should be able to evaluate the kind of attributes his or her ideal nominee should take into account while assessing the question, which in turn demands from the voter an insight into his or her ideal nominee.

One further way of presenting this set of problems is the Robust Portfolio Modelling formulation, or RPM for short (Liesiö et al. 2007).
RPM is based on the evaluation of different projects with regard to different attributes. In this formulation, every vector $\mathbf{z}$ of answers $z_j$ to the given questions $j$ can be interpreted as a portfolio. The different projects are the answers $z_j$, belonging to the set $Z_j$. Following Keeney and Raiffa (1976:76), it is known that

$$V(z_j) = \sum_{k=1}^{m} V_k(z_j), \tag{18}$$

where $V_k$ is a cardinal value function for the attribute $k$. Therefore, summing over the different questions $j$, a combined total value for the different projects $z_j$ can be formed, and the different portfolios $\mathbf{z}$ can be evaluated. The uncertainties and incompleteness of information involved are now all presented by the individual attributes' value functions $V_k$.

The voter has $n$ questions to answer, each with $m$ related attributes, and it has been assumed here that each question relates to only one issue $j$ for all voters $i$, $i = 1, \ldots, l$. In the case of voting machines, however, this does not always hold. The cases in which many questions relate to the same issue for some voter are called cases of preferential dependence, as opposed to preferential independence (Keeney and Raiffa 1976:109 [5]). Cases of preferential dependence can be viewed in relation to the spatial theory of voting. Given that two questions are preferentially dependent for some voter $i$ and thus connected with the same issue, then according to the spatial interpretation the answer alternatives $z_j$, $z_{j'}$ for the two questions $j$, $j'$, $j \neq j'$, are linearly dependent and form a shared dimension vector. The shared dimension makes it possible to predict the answer given by voter $i$ to one of the preferentially dependent questions from the other, making one of the questions redundant.

There is also another kind of dependence possible between the questions, namely logico-causal dependence (Talponen 2006). This form of dependence refers to questions in which virtually the same thing is asked with different phrasing. This kind of dependence makes the other question redundant for all voters $i$.

These sorts of dependencies can be of use to the nominee, and it is in the best interests of the voter to be aware of the possible dependencies between the questions.
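The Bayesian formulation of eq. (17) in this section can be illustrated with a small numerical sketch. It assumes a finite set of three configurations of the ideal nominee's attributes; the labels `"S1"`, `"S2"`, `"S3"`, the utilities and the conditional probabilities are all hypothetical values, not data from the study.

```python
def expected_utility(cond_prob, utility):
    """Expected utility of eq. (17): sum over configurations S of Pr(S | z) * U_v(S)."""
    return sum(p * utility[s] for s, p in cond_prob.items())

# Hypothetical utilities U_v(S) of three configurations of the ideal nominee's attributes.
utility = {"S1": 1.0, "S2": 0.4, "S3": 0.0}

# Hypothetical conditional probabilities Pr(S | z) for r = 3 answer alternatives.
# Each row sums to one.
pr_given_z = {
    1: {"S1": 0.7, "S2": 0.2, "S3": 0.1},
    2: {"S1": 0.3, "S2": 0.6, "S3": 0.1},
    3: {"S1": 0.1, "S2": 0.3, "S3": 0.6},
}

eu = {z: expected_utility(p, utility) for z, p in pr_given_z.items()}
best_answer = max(eu, key=eu.get)  # answer alternative maximising expected utility

print(eu)
print(best_answer)
```

In this sketch the conditional probabilities carry the uncertainty about what an answer signals, while the utility table carries the voter's (possibly incomplete) knowledge of his or her ideal nominee, matching the two roles described for eq. (17).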
3.2 Perspective of a nominee: probability of a social choice

For the nominee, the voting machine is a tool for advertising his or her views on issues and for his or her campaign for office in general. The opportunistic goal for a nominee is therefore to answer in such a manner that the largest possible number of voters using the voting machine would give approximately similar answers. This maximizes the visibility of the nominee's campaign and the acceptability of the nominee for office. Some degree of manipulation of the views expressed may contribute to this objective (see e.g. Gibbard 1973). The question a nominee has to answer is then how extensive a misrepresentation of preferences is needed for an efficient campaign.

[5] For the discussion concerning voting machines, see also Talponen (2006), Kauppinen (2006) (in Finnish).

One important implicit assumption has already been made, namely that the voters using the voting machine have no knowledge of the attributes $k$ considered by the nominee answering each question $j$; otherwise they would be able to tell which of the nominees are opportunistic. The assumption can be seen as reasonable considering the time and resources available to the voters for studying the alternatives chosen by a particular nominee. However, this leads to asking what kind of data the nominees can expect the voters to have available for considering the credibility of a nominee's answers and, as importantly, what kind of data the nominees have available on the voters using a voting machine.

From the previous section some possible levels of data available to the nominees can already be identified. First, it is possible to have data available only on the questions presented. This data is acquired simply by using the voting machine. In this case no particular data on the alternatives chosen is available, and therefore no mathematical method can be used for the analysis of the alternatives chosen.

Second, it is possible to have data available on the distribution of alternatives $z_j$ chosen by the voters $i$ when asked a single question $j$ also presented in the voting machine. This case is discussed more thoroughly in Section 3.2.1. Here it is only noted that the social choice theory of Intriligator (1973) and the preferential independence of the questions become important for conducting an effective analysis.

Third, it is possible to have data available on the choices of alternatives $z_{ij}$ made by some number of voters $i$, $i = 1, \ldots, l_{data}$, for some number of questions $j$, $j = 1, \ldots, n_{data}$, presented in the voting machine. This kind of data can be analysed with a number of methods, including statistical ones [6].
Fourth, it is noted that if, in addition, all the voter attributes having an effect on the decision making are known for all voters $i$, $i = 1, \ldots, l$, the exact probabilistic distributions of the main attributes and their effect on the decision making of the population are known. In this case of theoretical all-inclusive information, all the exact distributions are available to the nominee. Table 2 lists these four cases of data depth, their availability and the mathematical methods available.

[6] An introductory volume on these methods is also available on the internet, see Mustonen (1995, in Finnish).

Table 2. Different levels of data precision, data availability and respective mathematical methods

Exact data on | Data availability | Mathematical methods available
Questions used | Available to both voters and nominees | -
Single alternatives $z_j$ chosen per question $j$ by a population of n voters | Available publicly on the voters | Social choice theory (Intriligator 1973), exact probabilistic distribution in the case of preferential independence of questions [7]
Answer alternatives $z_{ij}$ chosen by the voter $i$ on the question $j$, $i = 1, \ldots, l_{data}$, $j = 1, \ldots, n_{data}$ | Possibility of obtaining on the voters by individual nominees | Factor analysis, k-clustering and other statistical methods [8]; Bayesian [9] and neural [10] networks
Attributes on which the choice of alternatives is based | In practice very challenging to obtain | Exact probabilistic distributions of different voters and alternatives

As shown in Table 2, it is assumed that the majority of voters have available only the exact data on the questions used in the voting machine. It is therefore easier for a nominee to misrepresent his or her preferences without the majority of voters noticing. In addition, as established above, a nominee's misrepresentation of preferences can benefit from gathering and analysing accurate data on the voters. Without going into the details, with data on the answer alternatives $z_{ij}$ the misrepresentation of preferences can be done more effectively than with less exact data. The cases of more exact data are left without further discussion in this study. Gathering exact data on the answer alternatives $z_{ij}$ is, however, possible e.g. with an internet inquiry form.

While the statistical tools involved would offer an interesting area of discussion, the intention in this study is to discuss the possibility of nominees misrepresenting their preferences. The case of public data therefore both presents the boundary case of preference misrepresentation and gives a possibility of a more detailed analysis of the difficulties a nominee can face when handling preferentially dependent data on the voters' preferences.

[7] The statement is discussed further in Section 3.2.1.
[8] See e.g. Mustonen (1995, in Finnish).
[9] See e.g. Jensen (2001).
[10] See e.g. Mandic & Chambers (2001).

Next the discussion is extended to a case where the nominee aims to misrepresent his or her views on the basis of publicly available data. Typically, this could be data gathered from simple opinion polls, including only the distribution of respondents with respect to single specific questions.

3.2.1 Social choice theory and preferential dependence of questions

Michael D. Intriligator (1973, 1982) has suggested a probabilistic presentation of the choice made by a number of people, which avoids some problems of voting theory, particularly Arrow's theorem (Arrow 1963 [11]). Intriligator's presentation (1973) assumes that

1. There exists a probability vector for every individual, giving the probability that the individual answers according to an alternative, for all alternatives presented.
2. The average of these individual probabilities presents a social probability, the probability for a community to have a set of certain preferences, resulting in a probability vector giving a collective choice probability of each alternative.

These and a few minor axioms satisfy a property of collective rationality, meaning that the resulting social probability vector can be interpreted as a preference ranking.

Intriligator's discussion can be understood in the context of this paper if the different answer alternatives $z_j \in Z_j$ to a question $j$ are each taken as a possible alternative, for each question $j$ and each individual $i$. Then the concept of social probability can be used to give recommendations to a nominee on how to choose among the answer alternatives $z_j$: a higher social probability means a higher probability for a random voter to choose according to the social probability.

The social probability is not necessarily the same as the probability of a random voter's choice of alternatives. In order for the social probability to give the probability of a random voter's choice of alternatives $z_j$, the questions must be preferentially independent.
If the questions are not preferentially independent, it is possible to choose the alternatives which have the highest social probability, only to find that answering with the very combination of these alternatives $z_j$ is highly improbable. This can be the case for example in national defence policy: in Finland there is overwhelming national support for strong sovereign defence forces. On the other hand, there is high support for banning certain types of mines which cause casualties long after a conflict has ended. However, a large number of the people who advocate the ban of this type of mine do not necessarily support strong defence forces, and vice versa. There are then at least two major voter groups, and a nominee should not choose his or her answer alternatives $z_j$ simply on the basis of highest social probability.

[11] For the discussion, see e.g. Nurmi (2003), or for philosophical analysis, MacKay (1980).

The preferential dependence of answers to different questions is the main problem faced from the perspective of a nominee. The attributes $k$ for each answer alternative $z_j$ form a preference structure on the data of alternatives chosen. Preferential dependence hinders the efficient use of the voting machine by the voter and the nominee; it also offers some possibilities of manipulation.

To develop the efficiency of the nominee's use of the voting machine, it should be possible to provide a more comprehensive theory in which individual decisions would be presented as a total formed by all individuals' answers to individual questions. The main problem of this theoretical formulation is to combine the different questions into a total collective choice probability vector, because of the possible preferential dependencies between the questions. One way of approaching the problem would be to use as individual probabilities the total vectors $\mathbf{z}$ of different alternatives. However, there are many different vectors $\mathbf{z}$. For example, in a case of five questions and three alternatives for each question, the number of different vectors is $3^5 = 243$. With an average of four alternatives and 30 questions, the number of vectors is of order $10^{18}$.

Here it is first assumed that the questions are preferentially independent. Then an individual probability vector $q_{ij}$ for individual $i$, question $j$ and alternatives $z = 1, \ldots, r$ is formed (Intriligator 1973):

$$q_{ij} = (q_{ij1}, \ldots, q_{ijz}, \ldots, q_{ijr}), \quad q_{ijz} \geq 0 \;\; \forall i, j, z, \quad \sum_{z=1}^{r} q_{ijz} = 1 \;\; \forall i, j. \tag{19}$$

For some number of individuals $l$, the social probabilities are simple averages of the individual probabilities for question $j$.
It therefore holds for a social probability vector s that, for an alternative z, the social probability is of the form

$$s_z = \frac{1}{l} \sum_{i=1}^{l} q_{iz}, \tag{20}$$

which states that the social probabilities for an alternative z are simple averages of the individual probabilities for alternative z.

Proposition (following Enelow and Hinich 1984:65). If the questions are preferentially independent and for every question there is an alternative $\hat{z}$ such that $s_{\hat{z}} \ge s_z$ for all $z \ne \hat{z}$, then it follows that the optimal voting rule for every question is to choose the alternative $\hat{z}$, which weakly dominates the alternatives z. Moreover, preferential independence implies that the vector of these alternatives $\hat{z}$ is the overall weakly dominant strategy for the nominee.

Proof. Suppose that for some question there is an alternative $z'$ which is more preferred by a nominee than alternative $\hat{z}$, that is, $P(z', \hat{z})$. For the alternative $z'$ to be preferable to the alternative $\hat{z}$, the probable distance between a random voter i and the nominee must be shorter for alternative $z'$ than for alternative $\hat{z}$. But the most probable alternative that a voter i chooses is the alternative $\hat{z}$, or some alternative $\tilde{z}$ such that $I(\tilde{z}, \hat{z})$. Therefore it holds that $s_{\hat{z}} \ge s_{z'}$. But this contradicts $P(z', \hat{z})$. Therefore there is no such alternative $z'$. The nominee should choose alternative $\hat{z}$ or, when an alternative $\tilde{z}$ is available, either $\hat{z}$ or $\tilde{z}$, on every question.

With known probabilities, the proof for the existence of alternative $\hat{z}$ is trivial. However, the discussion can be extended to uncertain probabilities, although with added complexity. Because of the probabilities involved, the most natural way to describe the decision the nominee is facing is to use the Bayesian form of problem formulation (eq. (17)). As the attributes considered by the voter are taken as unknown to the nominee, the distance between the alternative z chosen by the voter and the alternative $\hat{z}$ chosen by the nominee is the only clue the nominee has when deciding on his or her choice of alternatives. For alternatives z and $\hat{z}$ for which it holds that $z, \hat{z} \in [1, 2, 3, \dots, r]$, the following norm can be adopted to measure the distances of alternatives as natural numbers:

$$\lVert z - \hat{z} \rVert = |z - \hat{z}|. \tag{21}$$

Now S in eq. (17) can be interpreted as a distance between the voter's and the nominee's choice of an alternative, instead of a possible configuration of the attributes of a voter's ideal nominee. Then it holds for question j that

$$S_j = \lVert z_j - \hat{z}_j \rVert. \tag{22}$$

As this distance is uncertain, and there is only a finite number of absolute distances possible, any single distance has some probability greater than zero. Now, in order to find an acceptable interval in terms of utility for the distance measure, the utility function $U_v$ is defined as a decreasing function of ε, the distance between z and $\hat{z}$. Thus, for questions j = 1, ..., n and voters i = 1, ..., l,

$$z^* = \arg\max_{z \in Z} \sum_{i=1}^{l} \sum_{j=1}^{n} \Pr\big(\operatorname{abs}(\hat{z} - z) = \varepsilon \mid z\big)\, U(\varepsilon), \tag{23}$$

where $z^*$ gives the optimal alternatives for questions j, considering l voters.

If the dominant strategies $\hat{z}$ for questions j are unavailable because of some probability distribution for the values of the social probability vector s, the most probable alternatives chosen by a random voter i can be found as follows. Let $s_z$, the social probability for every answer alternative z, be an element of distribution S. Every element of S has an expected value $\bar{s}_z$ based e.g. on the public data of opinion polls. In the case of preferential independence, the problem of choosing the answer alternative elements $z_j^*$ of the most probable social choice vector for all the questions can be written

$$z_j^* = \arg\max_{z \in Z} \bar{s}_z, \quad j = 1, \dots, n, \tag{24}$$

where $z_j^*$ is the jth element of the vector $z^*$ of most probable alternatives chosen by a random voter i, and $\bar{s}_z$ represents the expected value of the social probability for every answer alternative z and question j.

To account for preferential dependence when public data is used, the probability vectors $q_{ij}$ of the different voters i are functions of the probability vectors of the other questions, as the probabilities are formed by preferential dependencies between attributes k. This can be written

$$q_{ij} = \hat{q}_{ij}(q_{i1}, \dots, q_{i(j-1)}, q_{i(j+1)}, \dots, q_{in}). \tag{25}$$

Therefore, taking a simple average over individual probability vectors, as in eq. (20), hides the structure of preferential dependence. The different structures should therefore first be classified and the social choice vector solved inside these classified structures. The inclusion of these classified structures leads to a study of different answer alternative vectors z, for which identifying the best option is not straightforward. For inquiry on these, see e.g. McKelvey (1976), Miller (1977) and, for an introduction, Nurmi (2003).
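As a minimal sketch of the averaging in eq. (20) and the selection in eq. (24), the following Python fragment computes social probabilities for one question and picks the most probable alternative. The function names and the three-voter data are hypothetical and serve only as an illustration; the sketch assumes preferential independence, so each question can be treated separately.

```python
def social_probabilities(q):
    """Eq. (20): average the individual probabilities q[i][z] over the l voters."""
    l = len(q)
    r = len(q[0])
    return [sum(q_i[z] for q_i in q) / l for z in range(r)]


def most_probable_alternative(s):
    """Eq. (24) for a single question: index of the largest social probability.

    Ties are resolved in favour of the lowest index (an arbitrary choice).
    """
    return max(range(len(s)), key=lambda z: s[z])


# Hypothetical data: 3 voters, one question with 3 answer alternatives.
q = [
    [0.50, 0.25, 0.25],
    [0.25, 0.50, 0.25],
    [0.50, 0.25, 0.25],
]
s = social_probabilities(q)
z_star = most_probable_alternative(s)
print(s, z_star)
```

Under preferential dependence (eq. (25)) this per-question averaging would no longer be sufficient, since the probability vectors of different questions could not be treated in isolation.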
3.3 Communal perspective: voting theory and social choice

In this section different voting theoretical considerations and their relation to the voting machine are considered. The main questions presented are the following: what are the voting machine's mathematical features in the context of voting theory? Are there some things in the logical structure of the voting machine that should be considered in relation to social choice?

A substantial part of voting theory is concerned with preference sets in relation to some issue. Preference sets are one way of introducing preference relations between different alternatives. For example, if there is an issue and alternatives a, b and c, then the possible preference sets are abc, acb, bac, bca, cab and cba, meaning, for example, that preference set abc includes preference relations P(a,b) and P(b,c). These preference sets are formed by pairwise comparisons, in each of which a preference relation is established between two alternatives.

Voting machines do not consider preference sets in relation to issues, as they measure distances only between the single answers of the voter and the nominees. However, preference sets are given in relation to nominees, giving every voter a list of nominees that under some distance metric best match their opinions. The distance metric has a crucial role in constructing the social choice. Yet a widely agreed method of voting, let alone a formulation for such a distance metric, has not been developed. An in-depth discussion by Nurmi (1999) shows that there are great theoretical obstacles to forming a flawless method of voting. Any method of voting has a certain number of paradoxes which have to be dealt with. In the context of voting machines, then, both the public and the designers of voting machines should be aware of them.

Different ways of testing the performance of a voting procedure exist. These criteria of performance take the voting procedure as given, and compare the initial preferences of the voter to those which the voting machine is able to produce. It has been established that a good voting procedure should be able to distinguish some of the differences between nominees A, B, C and so on.
If the performance of a procedure meets the established criteria, then it is a better one. In this context different performance criteria can be introduced. Next the performance criteria given by Nurmi (2003:35-36) are introduced.

a. Condorcet winner and loser. The Condorcet winner is an alternative that beats every other alternative in pairwise comparison. The Condorcet loser is an alternative that loses to every other alternative in pairwise comparison. The Condorcet winner should win first place in the voting machine's comparison. The Condorcet loser should lose to all other options studied in the voting machine's comparison.

b. Majority winning. If some nominee receives the majority of votes, then this nominee should win the voting machine's comparison. The discussion on how to solve for the votes received by a nominee in the case of a voting machine can be found in Section 4.3.

c. Monotonicity. If some nominee has been placed first in the comparison by the voting machine, and receives additional support by modification of the votes, this does not alter the result.

d. Pareto. Whenever all votes are distributed so that in all the questions considered nominee D is preferred to nominee E, then E is not the winner.

e. Consistency. If there is such a partition of the set of votes that a nominee D wins the voting machine's comparison in all subsets, then D must also win in the whole set of votes.

f. Independence of irrelevant alternatives. If two questions agree on the ranking of nominees D and E with identical profiles considering D and E, then the votes received by D and E from these two questions should also be identical.

These different criteria establish a viable framework for an evaluation of a voting procedure, and by convenient modification (see Section 4.3) they can be used to analyse a voting machine's performance. It is a common interest for the individuals of a community to find the right candidate to vote for, and there is therefore implicit or explicit interest in the performance of the voting machine relating to its use. If performance is not optimal in the sense described, the voting machine can be deemed suboptimal in its presentation of social choice. Therefore it is in the interest of the community to discuss the possibility of different voting machines. Of particular interest is the inclusion of a more thorough investigation of preference sets in the functioning of voting machines, and the possible gains attained. However, as a major contribution to the communal view, a rigorous investigation of different distance metrics in relation to voting machines could be conducted.
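The Condorcet criterion (a) can be checked mechanically once preference sets are available. The following Python sketch is only an illustration: it assumes that every preference set is given as a string such as "abc" (most preferred first), and the function names and the two example profiles are hypothetical, not part of any voting machine.

```python
def prefers(ranking, a, b):
    """True if the ranking (best first) places a ahead of b, i.e. P(a, b)."""
    return ranking.index(a) < ranking.index(b)


def beats(rankings, a, b):
    """True if a strict majority of the rankings prefer a to b."""
    wins = sum(prefers(r, a, b) for r in rankings)
    return wins > len(rankings) - wins


def condorcet_winner(rankings, alternatives):
    """Alternative that beats every other one in pairwise comparison, or None."""
    for a in alternatives:
        if all(beats(rankings, a, b) for b in alternatives if b != a):
            return a
    return None


def condorcet_loser(rankings, alternatives):
    """Alternative that loses to every other one in pairwise comparison, or None."""
    for a in alternatives:
        if all(beats(rankings, b, a) for b in alternatives if b != a):
            return a
    return None


profile = ["abc", "abc", "bac", "cab"]  # hypothetical preference sets of four voters
cycle = ["abc", "bca", "cab"]           # cyclic majorities: a beats b, b beats c, c beats a

winner = condorcet_winner(profile, "abc")
loser = condorcet_loser(profile, "abc")
no_winner = condorcet_winner(cycle, "abc")
print(winner, loser, no_winner)
```

The second profile shows why the criterion is conditional: with cyclic majorities neither a Condorcet winner nor a Condorcet loser exists, so the criterion places no requirement on the procedure.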

4 Voting machine of Helsingin Sanomat for the Presidential elections in 2006

Helsingin Sanomat produced a voting machine for the Presidential elections in 2006. Here this voting machine's operation is described and evaluated. First the functioning of the voting machine's algorithm is presented, along with the exact formulas used. Afterwards this functioning is studied in the light of the discussion above.

Figure 1. Voter's user interface of the voting machine of Helsingin Sanomat: questions

The voting machine has 29 questions, which are divided by theme: ten questions discuss Presidential responsibilities, eleven discuss foreign and defence policy, and the President's character is discussed in the final eight. Questions have two to five different answer alternatives: there are 17 questions with five alternatives, 1 with four alternatives, 5 with three alternatives and 6 with two alternatives. Every answer can be weighted with a weighting coefficient, with alternatives of {-, ±0, +}. Nominees have to answer every question, but a voter can leave questions unanswered, in which case these questions are left out of the preference elicitation and comparison. Figure 1 shows the user interface of the voting machine for voters.

Figure 2. Voter's user interface of the voting machine of Helsingin Sanomat: result page

After answering the questions, the voter is presented with the result page, where the ranking of the different candidates by the given answers is presented. Both the candidate's name and party are shown. On the right, the points given by the algorithm measuring the distances of the preference sets are shown (Figure 2). Moreover, the voter has the possibility of studying the answers given by the nominees. Nominees can in addition give a short justification for the choices they made, shown to interested voters (Figure 3).

Figure 3. Voter's user interface of the voting machine of Helsingin Sanomat: answer comparison

4.1 Algorithm used in the voting machine of Helsingin Sanomat

The voting machine's questions can be divided into three categories by the procedure of calculating the points received when measuring the correspondence between two sets of preferences (Kolunkulma 2006). These categories are choice, scale and yes/no questions. The choice category includes questions in which the alternatives are mutually exclusive policies or views. Scale questions are questions where the alternatives discuss matters of scale, e.g. the amount of social services needed or the size of the defence budget. Yes/no questions are normally in use, but in the Presidential elections this category was dropped. The points received from the different categories of questions are presented in Table 3 below. For scale questions, the number shown with the question category gives the number of answer alternatives. When answers to all the questions which the voter wishes to answer are given, the algorithm processes the answers by comparing them to every nominee's answers, giving points to every nominee according to Table 3. These points for question number j can be written as

$$e_j^Y = d_{hs}(x_j, y_j), \quad j = 1, \dots, n, \tag{26}$$

where $e_j^Y$ stands for the points received by the nominee Y from question j.

Table 3. Point count according to question categories in the voting machine of Helsingin Sanomat

                     Distance between the answers of a voter and a nominee
Question category      0      1      2      3      4
yes/no                10    -10
scale (3)              8      0     -8
scale (4)              8      3     -3     -8
scale (5)              8      4      0     -4     -8
choice                 7     -7     -7     -7     -7

Both the nominee and the voter have the possibility of weighting the importance of a question and the related topic on the scale {-, ±0, +}. The corresponding weighting coefficients are {6, 8, 10} for the nominee y (weight $w_j^Y$) and {3, 7, 10} for the voter x (weight $w_j$). Therefore the point count in favour of the nominee in the matter of question j, $p_j^Y$, can be expressed as

$$p_j^Y = w_j\, w_j^Y\, e_j^Y. \tag{27}$$

Next the points of the n-vector $p^Y$ are added together, scaling with the total sum of each nominee's weight coefficients. This way the total points $D^{hs}_{x,Y}$ are received for the comparison of the preference set of the voter and the preference set of nominee Y:

$$D^{hs}_{x,Y} = \frac{232}{\sum_{j=1}^{29} w_j^Y} \sum_{j=1}^{29} p_j^Y, \tag{28}$$

where the arbitrary natural number 232 stems from the fact that the sum of indifferent weightings of the answers by the nominee amounts to 232 (29 questions with the neutral weight 8).

4.2 Evaluation of the voting machine from the perspective of the voter and the nominee

In the light of the mathematical framework provided in previous sections, it seems that ideally all m attributes having an effect on the decision-making should be included. However, as was briefly mentioned in Section 3.2, measuring these attributes is very challenging. The distance measure of the HS voting machine, defined in Section 4.1, comprises a weight for each question and a distance metric defined in relation to the type of question j and the other answer alternatives z for question j. It can be seen then that the HS voting machine takes no account of the m attributes having an effect on the decision-making. This conceals a structure of preferential dependencies in relation to the m attributes behind the answers given.¹²

Using a point table (Table 3) as the definition of the distance measure $d(x_j, y_j)$ between the voter's and the nominee's answers to a question is consistent, if somewhat arbitrary considering the more common metrics available. The answers are given points one at a time, based on the distance and the weights given. The point table is approximately linear and the weights influence the end result in an understandable manner, which can be used intuitively to compare different nominees.

Some minor flaws can be observed as well. Choice and scale questions have different maximum and minimum values. Preferential and logico-causal dependencies have been reported (e.g. Kauppinen 2006). In any event, no significant flaws have been found in simulations done earlier (Riski 2003).

There are also major points worth considering. The scaling of weights may be quite counterintuitive to the voter. The weight should be defined by a number of attributes, which in turn would determine the preferences over weights in questions. Now all the m related attributes are aggregated into a single value function (the value being the alternative chosen), which makes the usage of the voting machine challenging, even if all the preferences over attributes are known. At least an unbounded scale of weights would give a voter a better idea of how the weight variables function (for example, the RPM software takes these points into account; see e.g. Liesiö et al. 2007).
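The point count of Section 4.1 can be sketched in a few lines of Python. This is a reading of Table 3 and eqs. (26)-(28) under stated assumptions, not the actual implementation used by Helsingin Sanomat: the last entry of the scale (4) row is taken to be -8, the points of a question are assumed to be multiplied by both the voter's and the nominee's weight, and the scaling constant is assumed to be the neutral nominee weight (8) times the number of answered questions, which gives 232 for 29 questions. All names in the code are hypothetical.

```python
# Point table after Table 3; the list index is the distance between the answers.
# Assumption: the last scale (4) entry is -8 (its sign is unclear in the source).
POINTS = {
    "yes/no": [10, -10],
    "scale3": [8, 0, -8],
    "scale4": [8, 3, -3, -8],
    "scale5": [8, 4, 0, -4, -8],
    "choice": [7, -7, -7, -7, -7],
}

# Weighting coefficients for the symbols {-, ±0, +}.
VOTER_WEIGHT = {"-": 3, "0": 7, "+": 10}
NOMINEE_WEIGHT = {"-": 6, "0": 8, "+": 10}


def question_points(category, x, y):
    """Eq. (26): points e_j from the distance between answers x and y."""
    return POINTS[category][abs(x - y)]


def total_points(questions):
    """Eqs. (27)-(28) under the assumptions stated in the lead-in.

    Each item is (category, voter_answer, nominee_answer, voter_w, nominee_w);
    questions the voter left unanswered are simply not listed.
    """
    total_p = 0
    total_w = 0
    for category, x, y, vw, nw in questions:
        e = question_points(category, x, y)
        total_p += VOTER_WEIGHT[vw] * NOMINEE_WEIGHT[nw] * e  # p_j, eq. (27)
        total_w += NOMINEE_WEIGHT[nw]
    neutral = NOMINEE_WEIGHT["0"] * len(questions)  # 232 for 29 questions
    return neutral * total_p / total_w              # eq. (28)


example = [
    ("scale5", 1, 2, "+", "0"),  # distance 1 gives 4 points, weights 10 and 8
    ("choice", 3, 3, "0", "+"),  # same answer gives 7 points, weights 7 and 10
]
score = total_points(example)    # 16 * (320 + 490) / 18
print(score)
```

In the two-question example the nominee stresses the second question, so the sum of nominee weights (18) exceeds the neutral sum (16) and the raw points are scaled down accordingly.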
¹² The effects are mostly negative, as discussed below. See also Section 3.2.1 and Kauppinen (2006).

The problem posed by the m attributes not being evaluated can be analysed in the following manner. As in equation (11), let

$$V_j(z) = \sum_k w_k v_k(z), \tag{29}$$

where $V_j(z)$ is the total value function of alternative z of question j over attributes k. In the HS voting machine, a similar equation has been used for the point count:

$$V_i(j) = \sum_z w_z v_z(j), \quad \left(= \frac{1}{w_j^Y}\, p_j^Y\right), \tag{30}$$

where $V_i(j)$ is the total value function of question j of voter i over alternatives z. On the other hand, it is also true for the HS voting machine that the total points $V_T(i)$ given to nominee y by the alternatives chosen by voter i are given by

$$V_T(i) = \sum_j w_j v_j(i), \quad (= e^Y). \tag{31}$$

Now it is stated that if an evaluation of question j is performed regarding alternative z, this is the same evaluation as when alternative z is evaluated regarding question j. That is,

$$v_z(j) = V_j(z). \tag{32}$$

And similarly for question j and voter i:

$$v_j(i) = V_i(j). \tag{33}$$

Using equation (31), substituting first equations (33) and (30), after which equations (32) and (29) are substituted in the same straightforward manner, the following result is formed:

$$e^Y = \sum_j \underbrace{w_{ij}}_{\text{weight}} \underbrace{\sum_z w_{iz} \Big[ \sum_k w_{ik} v_{ik}(z) \Big]}_{\text{chosen alternative}}. \tag{34}$$

Equation (34) shows that the weights associated with the HS voting machine are not weights of attributes k, but weights of questions j. Weights for different attributes, $w_{ik}$, or $w_k$ for short, are therefore included in the choice of an alternative. As in the HS voting machine only one of the alternatives z can be chosen, the weights $w_{iz}$ for the different alternatives can be set as uniformly equal to 1.

In effect, the voter should firstly have some knowledge of his or her preferences over the weights and of the scores of the value functions of his or her attributes for the answer alternatives z. Secondly, the voter should also be aware that the weight functions can only be used to weight the importance of one question against other questions. And thirdly, the voter should be aware that his or her preferences over attributes k might not be known exactly; the voter should consider his or her answers in the light of uncertain weights $w_k$ and his or her incomplete information on the scores of the attribute value functions $v_k(z)$ realised by the alternatives z.
If this context of expressing preferences is unfamiliar, one should be aware of the possibility of a mismatch between the skills offered by the voting machine and the skills wanted by the voter for the office in question.
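The way attribute weights disappear into the choice of an alternative can be illustrated with a small Python sketch. It assumes the additive value function of eq. (29) and a voter who picks the alternative with the highest total value; the two attributes, the score table and the three voters are hypothetical and chosen only for illustration.

```python
def total_value(weights, scores):
    """Eq. (29): V_j(z) as the weighted sum of attribute scores v_k(z)."""
    return sum(w * v for w, v in zip(weights, scores))


def chosen_alternative(weights, scores_by_alternative):
    """Index of the alternative z with the highest total value for this voter."""
    return max(
        range(len(scores_by_alternative)),
        key=lambda z: total_value(weights, scores_by_alternative[z]),
    )


# Hypothetical scores v_k(z) of three alternatives on two attributes k.
scores = [
    [1.0, 0.0],  # alternative 0 serves only attribute 1
    [0.6, 0.6],  # alternative 1 is a compromise
    [0.0, 1.0],  # alternative 2 serves only attribute 2
]

voter_a = [0.55, 0.45]  # attribute weights w_k of three hypothetical voters
voter_b = [0.45, 0.55]
voter_c = [0.90, 0.10]

choice_a = chosen_alternative(voter_a, scores)
choice_b = chosen_alternative(voter_b, scores)
choice_c = chosen_alternative(voter_c, scores)
print(choice_a, choice_b, choice_c)
```

Voters a and b weight the attributes differently but end up with the same answer, so a machine that records only the chosen alternative and a question weight cannot tell them apart; only a sufficiently different weighting, as for voter c, changes the visible answer.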

In other words, the two sources of error originating from the lack of attribute evaluation in the HS voting machine can be written in the form of eqs. (14) and (16). The assumptions presented correspond to the assumed similarity of the voter's declared preferences over the relative weights of questions and choice of alternatives, and those of the voter's ideal nominee's preferences over the relative weights of questions and choice of alternatives. Namely, $w^o = w^*$ and $x^o = x^*$. As discussed in Section 3.1, these can be erroneous assumptions. In addition, the second of these assumptions can be easier to accept as true than the first. However, even the second one is most likely in error, because the limited number of questions and alternatives does not allow the voter to describe his or her preferences over different attributes accurately. Indeed, it is doubtful that this could be done with any number of different alternatives. In contrast, the first assumption is a problem of proportionality: the voter cannot answer accurately according to his or her own preferences over the weights of questions, because he or she does not know accurately the effect that different weights might have on the end result.

As voters might have problems forming answers to questions based on alternatives alone, nominees can take advantage of the structure of the voting machine's questions. Theories concerning this kind of structure are available.¹³ In addition to the structure of the questions themselves, the structure between the questions can also be used opportunistically. These sorts of structures, formed by preferential dependencies, have been in focus in Section 3.2.

Accounting for preferential dependencies requires some skill and data. It can be identified that there are two possible types of preferential dependence in the case of the HS voting machine. The first is the effect of a hidden attribute or attributes k on decision-making, influencing a group of voters' answers to a number of questions.
Although public data is not available for the nominees to take advantage of these attributes, every nominee has the opportunity of collecting data on his or her voters, for example with an internet inquiry. The second type is the more common dependence of a logico-causal nature. These dependencies are fairly easy to recognize, as they can be identified on the level of answer alternatives (giving a possibility of advocating an opinion on the same issue in several questions).

¹³ For one, see prospect theory by Kahneman and Tversky (1979).

Thirdly, questions that are clearly preferentially independent can be used to get more