|
Post by Steve Draper on Feb 19, 2014 6:34:45 GMT -8
Those of you who do (Greenshell and I think at least A2, possibly a few others, seem to do ok) how do you play a decent game of Connect5!?
It's been something of an Achilles heal for me, and I've been looking at it for a day or two. The problem is simply branching factor, which manifests as basic MTCS players (at least mine!) just missing obvious losses every now and again (basically when the opponent places 3 in a line with open spaces at both ends, 2 open on at least one end, so that they can next make a 4-in-a-line with both ends open). The branching factor near the start of the game is circa 60, so seeing that placing the 4th mark (after some arbitrary move of our own), an then winning with the 5th (after another move of our own) requires a look-ahead of 4-ply, which gives about 60^4 ~= 10 million states. Getting a good sample of that for MCTS purposes in 20 seconds you have for a move is a tall order unless you have really really stellar state machine performance!
Given the above it is totally unsurprising that C5 is not played well, but the mystery is that a few players DO seem to play fairly well (notably Greenshell), so I'm wondering how...?
Note - I am attempting to tackle this (or working on trying to do so) by reducing the search space (in effect) by biasing more sharply towards the moves that are likely to matter, using move action histories (but I haven't got it working well yet). I'm intrigued by what other approaches people have used in similar circumstances (basically large branching factor games, with critical win-loss moves)
|
|
|
Post by alandau on Feb 20, 2014 17:53:08 GMT -8
I've tried an approach with some versions of Alloy involving searching "the game tree" a few levels as if only one player were moving and the others weren't doing anything ("plans ignoring opponents"). This finds the type of threat you described quite promptly. The problem is that I haven't quite figured out the right way to tie this into UCT yet; this led to improvements in Connect Five, but worse performance in other games, so I've stopped using it until I get a chance to rejigger its integration with the search.
For something out-of-the-box that's more likely to work, I suspect RAVE/MAST would be able to point your searching in the right direction. The Cadiaplayer researchers (Finnsson and Bjornsson) have published some papers and presentations on the technique.
|
|
|
Post by Steve Draper on Feb 21, 2014 5:17:14 GMT -8
I've tried an approach with some versions of Alloy involving searching "the game tree" a few levels as if only one player were moving and the others weren't doing anything ("plans ignoring opponents"). This finds the type of threat you described quite promptly. The problem is that I haven't quite figured out the right way to tie this into UCT yet; this led to improvements in Connect Five, but worse performance in other games, so I've stopped using it until I get a chance to rejigger its integration with the search. For something out-of-the-box that's more likely to work, I suspect RAVE/MAST would be able to point your searching in the right direction. The Cadiaplayer researchers (Finnsson and Bjornsson) have published some papers and presentations on the technique. I have been quite successful over the last couple of days (at a prototype level anyway - not yet checked/tuned its impact on other games) by propagating move action histories (weights for moves) around the tree (up on the score update pass, with decay at each level, then down again, also with decay on the select pass). The main issues this gives me, which I need to address are: - It introduces essentially 3 (perhaps collapsible to 2) new somewhat arbitrary parameters, which might turn out to be fairly game-specific, so it's currently unclear if I can arrive at a no-net-negative tuning for them generically across all games
- It's somewhat processor intensive for th update and select passes, though this still seems to be ok in relation to the greater cost of the rollout themselves, so currently not a major issue
- Holding sets of move weights at each tree node would be prohibitively memory consumptive, so I have to do it at a restricted set (currently using an LRU associative cache)
You're idea of searching on the assumption the opponent does not get to move is a fascinating idea. I can see that having great potential in many games for identifying key lines.
I'll update this when I get a bot further down the line with tuning the move action histories (though I'm away on business for the next week, so it might be a while)
|
|
|
Post by Steve Draper on Feb 21, 2014 8:36:36 GMT -8
I was thinking a bit more about your suggested approach of considering only your own moves to a shallow level, in order to find key moves.
This seems to generalize quite nicely as follows:
Given a game, G generate simplified games {G'} such that each G' is a 'simplification' of G (e.g. - fix one role's legals to a noop, artificially introducing it if necessary, to get your suggested case). Identify subsets of {G'} that (in a loose sense) span G (e.g. - the two simplification that restrict each role to only noops in a 2 players game). The union of all elements of such a subset is then an approximation to G as a factorized game (except when G ACTUALLY factorizes it's only an approximation of course!). Perform limited search in each element of such a subset to identify interesting moves and weight them according to the results of this search. Apply the move weights so derived to the MCTS expansion of the full game search (use the move weights to influence the exploration component of the UCT value for a move at each node, during selection)
Speaking extremely loosely, we're trying to find a way to partition a game up into sub-games which are factors of another game (obtained from the union of the factors), which is then an approximation of the original game. The trick is to find good partitioning mechanisms to apply to the original game to generate the 'factors', but if we can find one then we'll reduce the search space size in pretty much the same way that true factorization would, so dividing into two reduces O(N) to O(N^.5)
Possible approximate factorizations that might apply quite widely:
1) Each factor has only one role making moves (your suggestion)
2) In a board game, divide the board into discrete regions and consider only moves within that region (e.g. - in Breakthrough consider the left and right hand halves of the board to be separate games, and search them independently. That is to say that the approximation game is exactly Breakthrough with walls. This should find good move sequences near the edges fairly effectively. We could consider an overlapping third partition in this case of the center four files to improve the approximation.
|
|
|
Post by Sam Schreiber on Feb 21, 2014 11:45:04 GMT -8
MCTS doesn't search the entire space comprehensively, so it shouldn't need to search the entire 10 million possible states before it sees a loss at depth 4. It should bias the opponent's behavior towards states where they have a lot of observed victories, which should lean towards those states where the opponent is building up a promising line across the board. GreenShell isn't doing anything special for this game, and is only getting ~6k depth charges per second from the initial state (passing over ~270k states per second).
|
|
|
Post by Steve Draper on Feb 23, 2014 1:02:20 GMT -8
While it's true that MCTS is not evenly searching the space, it does need to sample the wins during the playouts a few times before it starts to converge, and with the probability of hitting them being quite low, it takes a fair amount of sampling before it starts to concentrate its search. Maybe I'm just slow at simulating that state machine for some reason - I think I get a bit less than 2k playouts per second.
|
|