Ya'akov Gal, Avi Pfeffer, Barbara Grosz
A mix of social agents and game theory (social vs. analytic strategies)
Motivation
- computer agents interacting with people
- computers and people making decisions together
- people's social behavior is varied and complex
- need to build agents that interact successfully in these environments
The challenge
People's behavior is affected by multiple variables:
- social preferences (selfish/altruistic)
- type of environment (cooperative or uncooperative)
- social context (who needs whom)
People make mistakes (they do not adhere to the economic definition of rationality)
Problem and proposed Solution
- difficult for analytical approaches (e.g. Game Theory) to capture diversity of behavior
- To build a socially adaptive agent: define social factors in a precise network, learn them through observation, etc...
Hypothesis
Agents need to learn social prefs to interact with people
A socially competent agent will be more successful than an analytical agent and will be able to generalize to people and situations it has not seen before
Approach
Use a game for testing decision-making in groups comprised of people and computer agents
Social utility is a function of social preferences: individual benefit, social welfare, advantageous inequality
Framework: Colored Trails (CT). Each player has resources they must surrender to reach the goal, but may not have enough resources initially. Players may need to bargain/negotiate with other players.
- resources are chips of different colors
- a player surrenders a chip of the color of the square they want to move to
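The chip rule can be sketched in a few lines of Python. This is only an illustration: the function names and the list-of-colors representation are assumptions, not part of the CT software. It checks whether a player's chips cover a path and which chips would have to come from negotiation.

```python
from collections import Counter

def can_follow_path(chips, path_colors):
    """True if the player holds one chip for every square color on the path."""
    have = Counter(chips)
    needed = Counter(path_colors)
    return all(have[color] >= count for color, count in needed.items())

def missing_chips(chips, path_colors):
    """Chips the player would still have to obtain from another player."""
    # Counter subtraction keeps only positive counts.
    return Counter(path_colors) - Counter(chips)

# Hypothetical example: the path needs a green chip the player does not own.
chips = ["red", "red", "blue"]
path = ["red", "green", "blue"]
print(can_follow_path(chips, path))
print(dict(missing_chips(chips, path)))
```

A player with missing chips is exactly the one who "may need to bargain/negotiate" in the notes above.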
Scenario
- Allocator makes a proposal
- Deliberator responds to proposal
- Movement towards goal
Score depends on distance from goal and number of chips at end of game
Games are non-cooperative (success does not depend on others)
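A scoring rule of this shape might look like the sketch below. The notes only say that the score depends on distance from the goal and on the chips left over, so the goal bonus, distance penalty, and chip value are all hypothetical constants chosen for illustration.

```python
def ct_score(distance_to_goal, chips_left,
             goal_bonus=100, distance_penalty=10, chip_value=5):
    """Illustrative CT score; all three constants are assumptions."""
    bonus = goal_bonus if distance_to_goal == 0 else 0
    return bonus - distance_penalty * distance_to_goal + chip_value * chips_left

# Reaching the goal with 2 spare chips vs. stopping 3 squares short with 4 chips.
print(ct_score(0, 2))
print(ct_score(3, 4))
```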
Social preferences in CT
Reference points: No Negotiation Alternative (score of both players if don't negotiate, BATNA), Proposed Outcome (score of both if accept)
Social preferences of Deliberator are defined in terms of outcome: Selfishness, Social Welfare (increase in overall outcome), Advantage of Outcome (score over opponent), Advantage of Trade
Modeling the Deliberator
Given an exchange x, the social utility u(x) for the Deliberator is a weighted sum of social preferences
A sigmoid function converts u(x) into the probability that the Deliberator accepts x
Utility also measures the degree to which a decision is preferred
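As a minimal sketch of this model, the code below computes four social-preference features from the two reference points, combines them in a weighted sum, and passes the result through a sigmoid. The exact feature formulas and the weights are assumptions (one plausible reading of the definitions above), not values taken from the talk.

```python
import math

def social_features(nn_d, nn_a, po_d, po_a):
    """Features for the Deliberator (d) given the Allocator (a).

    nn_* are No Negotiation Alternative scores, po_* are Proposed Outcome scores.
    The formulas are an assumed reading of the four preferences.
    """
    return [
        po_d - nn_d,                      # selfishness / individual benefit
        (po_d + po_a) - (nn_d + nn_a),    # social welfare
        po_d - po_a,                      # advantage of outcome
        (po_d - nn_d) - (po_a - nn_a),    # advantage of trade
    ]

def social_utility(weights, features):
    """Weighted sum of the social-preference features."""
    return sum(w * f for w, f in zip(weights, features))

def accept_probability(utility):
    """Sigmoid of the utility, read as the chance the proposal is accepted."""
    return 1.0 / (1.0 + math.exp(-utility))

# Hypothetical scores and weights.
features = social_features(nn_d=50, nn_a=40, po_d=70, po_a=90)
u = social_utility([0.05, 0.01, 0.02, 0.0], features)
print(features, u, accept_probability(u))
```

With a sigmoid, a utility of zero corresponds to indifference (probability 0.5), and larger utilities correspond to more strongly preferred decisions.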
Allocator will propose the exchange that maximizes its outcome. Uses probability of acceptance for given proposal to compute expected outcome. Uses model of how Deliberator makes decision.
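The Allocator's choice can be sketched as an expected-value maximization. The assumption here is that a rejected proposal leaves the Allocator with its no-negotiation score; the dictionary keys and the numbers are hypothetical.

```python
import math

def accept_probability(utility):
    """Sigmoid of the Deliberator's modeled utility for a proposal."""
    return 1.0 / (1.0 + math.exp(-utility))

def expected_outcome(proposal, nn_allocator_score):
    """Allocator's expected score: accepted outcome vs. no-negotiation fallback."""
    p = accept_probability(proposal["deliberator_utility"])
    return p * proposal["allocator_score"] + (1.0 - p) * nn_allocator_score

def best_proposal(proposals, nn_allocator_score):
    """Proposal with the highest expected outcome for the Allocator."""
    return max(proposals, key=lambda pr: expected_outcome(pr, nn_allocator_score))

# Hypothetical proposals: a greedy one the Deliberator dislikes, and a fairer one.
proposals = [
    {"name": "greedy", "allocator_score": 120, "deliberator_utility": -2.0},
    {"name": "fair", "allocator_score": 90, "deliberator_utility": 1.5},
]
print(best_proposal(proposals, nn_allocator_score=40)["name"])
```

In this toy case the greedy offer pays more if accepted but is rarely accepted, so the fairer offer has the higher expected outcome.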
Mixture model of Deliberator types
Allocator will propose the deal that maximizes its utility over all possible Deliberator types
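One way to read the mixture model is that each Deliberator type has its own weight vector, and the Allocator averages the acceptance probability over types according to how common each type is. The sketch below assumes that reading; the two types, their priors, and their weights are invented for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def mixture_accept_probability(type_priors, type_weights, features):
    """Acceptance probability averaged over Deliberator types.

    type_priors[k] is the assumed probability of type k, and
    type_weights[k] is that type's social-preference weight vector.
    """
    total = 0.0
    for prior, weights in zip(type_priors, type_weights):
        utility = sum(w * f for w, f in zip(weights, features))
        total += prior * sigmoid(utility)
    return total

# Hypothetical types: one cares only about its own benefit,
# the other only about its advantage over the Allocator.
features = [20, 70, -20, -30]
priors = [0.6, 0.4]
weights = [[0.05, 0.0, 0.0, 0.0], [0.0, 0.0, 0.05, 0.0]]
print(mixture_accept_probability(priors, weights, features))
```

The Allocator would then plug this averaged probability into its expected-outcome calculation in place of a single-type probability.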
Data Collection
Used 32 subjects over 2 trials
Paid participants based on how well they did
Parameters learned using EM and gradient descent
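The notes give no details of the learning procedure. As a much simplified sketch, the code below fits the weights of a single Deliberator type by gradient ascent on the log-likelihood of observed accept/reject decisions. The EM step that would handle multiple types is omitted, and the data, learning rate, and epoch count are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_weights(samples, n_features, lr=0.1, epochs=500):
    """Fit sigmoid-model weights from (features, accepted) pairs.

    accepted is 1 if the Deliberator accepted the proposal, else 0.
    Single-type simplification: no mixture, no EM.
    """
    w = [0.0] * n_features
    for _ in range(epochs):
        grad = [0.0] * n_features
        for features, accepted in samples:
            p = sigmoid(sum(wi * fi for wi, fi in zip(w, features)))
            for i, fi in enumerate(features):
                # Gradient of the log-likelihood for one observation.
                grad[i] += (accepted - p) * fi
        w = [wi + lr * g / len(samples) for wi, g in zip(w, grad)]
    return w

# Toy data with one feature (individual benefit): larger benefit, more accepts.
samples = [([-2.0], 0), ([-1.0], 0), ([-1.0], 1),
           ([1.0], 1), ([1.0], 0), ([2.0], 1)]
print(fit_weights(samples, n_features=1))
```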
Model evaluation
Played between computer Allocator and human Deliberator, as well as between two humans
Types of computer Allocators:
- Social agent: used the social utility model to make an offer
- Nash equilibrium
- Nash bargaining
Ranking: Social agent, Human, Nash bargaining, Nash equilibrium. The Social agent and the Human had an equal number of exchanges accepted. The Nash equilibrium (NE) agent had the most declined offers. The Nash bargaining (NB) agent and the Human had very few declined.
Conclusion
Social behavior must be learned
Learned model of human play
Their model outperformed traditional game theory (analytic behavior)
Q and A
NE assumed a single game. Would a repeated game do better?
Did people have enough time to learn the game?




