Post by steadyeddie on Nov 24, 2015 14:16:01 GMT -8
I've been experimenting with simultaneous play games.
In the beginning I just picked the child with the biggest joint move score for me (like a regular MCTS tree), but that was rubbish in loads of ways.
So I wrote a very elegant Nash algorithm that looked for dominant strategies, eventually using matrices to find mixed equilibria. Sadly that didn't work, because the data fed into it came from rollouts of the biased MCTS. So I moved the Nash algorithm into the tree selection, and that also performed poorly.
Then I recalled a mathematician friend of mine recently saying how everyone solves problems with Monte Carlo these days, and how rubbish it was. So I tried each side picking the best moves for themselves (yes, yes, that's why SteadyEddie is so rubbish at multi-player games - I'll add it eventually), and I thought doing this in MCTS at selection time might yield the Nash mixed strategy I was looking for.
Consider chicken with a payoff matrix like this:
            B straight   B swerve
A straight    (0,0)       (5,3)
A swerve      (3,5)       (4,4)
Assuming for now I ignore the whole iteration thing, I think there are 3 Nash equilibria here:
100% straight vs 100% swerve and the other way round
75% swerve, 25% straight for both sides - the mixed strategy I think is best play.
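Those numbers are easy to sanity-check. Here's a small Python sketch (mine, not from any engine) verifying the indifference condition at the claimed mix: against an opponent who goes straight 25% of the time, both of A's actions pay the same, which is exactly what makes the mix an equilibrium.

```python
# Chicken payoffs from the matrix above; action 0 = straight, 1 = swerve.
# A_pay[i][j] is A's payoff when A plays i and B plays j.
A_pay = [[0, 5],
         [3, 4]]

def expected(pay, row_mix, col_mix):
    """Expected payoff of `pay` under the given mixed strategies."""
    return sum(pay[i][j] * row_mix[i] * col_mix[j]
               for i in range(2) for j in range(2))

# B's mixed-equilibrium strategy: 25% straight, 75% swerve.
q = (0.25, 0.75)
a_straight = expected(A_pay, (1, 0), q)  # A commits to straight
a_swerve   = expected(A_pay, (0, 1), q)  # A commits to swerve
print(a_straight, a_swerve)  # both 3.75, so A is indifferent
```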
The issue is that at the mixed-strategy equilibrium, if player A has a slight bias towards straight, player B starts to prefer swerve, so I think the mixed solution is unstable under this kind of approach. And I randomly end up (50-50) in one of the two 100% straight/swerve cases.
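That drift can be reproduced with a toy best-response loop (a sketch of mine, not the actual MCTS code): nudge A slightly toward straight, let each side repeatedly best-respond to the other's current mix, and play collapses to one of the pure equilibria.

```python
def best_response(p_straight):
    """Pure best response (prob. of going straight) against an opponent
    who goes straight with probability p_straight. Chicken payoffs above."""
    straight = 0 * p_straight + 5 * (1 - p_straight)
    swerve   = 3 * p_straight + 4 * (1 - p_straight)
    return 1.0 if straight > swerve else 0.0

# Start at the mixed equilibrium, then give A a tiny bias toward straight.
pA, pB = 0.26, 0.25
for _ in range(5):
    pB = best_response(pA)  # B reacts to A's bias by always swerving...
    pA = best_response(pB)  # ...which makes A prefer always going straight.
print(pA, pB)  # 1.0 0.0 - locked into the A-straight / B-swerve equilibrium
```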
In this particular case I think I could employ the Nash matrices effectively on the confident nodes and work up the tree. But I'd prefer a Monte Carlo solution, because that would likely work in all sorts of simultaneous games. So the question is: what is the prevailing thinking on simultaneous play games? A brief Google suggests statistical methods may be doomed.
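For what it's worth, the per-node solve in the "Nash matrices on confident nodes" idea has a closed form for 2x2 games via the indifference conditions: each player's mix is chosen to make the *other* player indifferent. A hedged sketch (the function name `mixed_2x2` is mine, and it assumes a fully mixed equilibrium exists, i.e. nonzero denominators and probabilities in [0, 1]):

```python
def mixed_2x2(A, B):
    """Mixed-strategy equilibrium of a 2x2 general-sum game.
    A[i][j]/B[i][j]: payoffs to the row/column player. Returns (p, q):
    the probability each player assigns to their first action."""
    # Row mix p makes the COLUMN player indifferent between her actions.
    p = (B[1][1] - B[1][0]) / (B[0][0] - B[1][0] - B[0][1] + B[1][1])
    # Column mix q makes the ROW player indifferent.
    q = (A[1][1] - A[0][1]) / (A[0][0] - A[0][1] - A[1][0] + A[1][1])
    return p, q

# Chicken matrix from above (action 0 = straight):
A = [[0, 5], [3, 4]]
B = [[0, 3], [5, 4]]
print(mixed_2x2(A, B))  # (0.25, 0.25): 25% straight, 75% swerve each
```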