SteadyEddie (part 10/16): Confidence worker

steadyeddie
Junior Member

Posts: 58

SteadyEddie (part 10/16): Confidence worker Dec 18, 2016 6:22:55 GMT -8

Quote

Post by steadyeddie on Dec 18, 2016 6:22:55 GMT -8

I doubt my implementation of this is particularly great- I've only just added it. And, in fact, I didn't turn it on for the tiltyard open, instead focussing the CPU on rollouts.

Basically, instead of doing a rollout on a selected node, I just do a full search of the subtree attempting to see if the chosen node can be marked as 100% confident. Maybe I should have done a fake rollout on success, but I didn't. I just wrote the new confidence into the tree.

Oddly I don't even get much better results on Rubiks cube with this on.

Last Edit: Dec 18, 2016 12:13:40 GMT -8 by steadyeddie

Andrew Rose
Global Moderator

Posts: 100

SteadyEddie (part 10/16): Confidence worker Jan 27, 2017 14:12:46 GMT -8

Quote

Post by Andrew Rose on Jan 27, 2017 14:12:46 GMT -8

I don't quite get what you're saying here. Is this using a single core to pick a node and then do min-max (possible with alpha-beta pruning) to get a complete result? How do you decide which nodes to do this for? How long do you persist before deciding that the tree below that point is still much too large?

Why do you expect this to help with Rubiks cube?

As an observation, for 0/100 puzzles, doing MCTS is always wrong. You save all the time for selection & back-prop, etc. by just doing a depth-first search of the game tree. (For many puzzles, it's worth at least considering maintaining a hash table of already visited states, to avoid all the work below them. However, that gives you the significant overhead of memory management and hash lookups.)

steadyeddie
Junior Member

Posts: 58

SteadyEddie (part 10/16): Confidence worker Jan 30, 2017 15:44:32 GMT -8

Quote

Post by steadyeddie on Jan 30, 2017 15:44:32 GMT -8

Yes, I'm saying for N threads I set them searching a little tree looking for a complete result. Most of the time this is useless. If I was clever I'd use some sort of calculation using depth and choices to work out what nodes might be searchable. But towards the end of the game I figured it'd help. For example I was expecting a boost in Breakthrough (assuming # rollouts were the same). Nothing doing.

For rubiks cube (as I see it) GGP players fumble around until near the end of the game, when confidence/completeness (pick your term) is king. I figured that if I could extend the horizon by 4 ply I'd find some results better than the signal coming back from rollouts. But it doesn't work AFAIK.

And yes, for a node search, each thread has a hash table of visited nodes.

[ And on the 0/100 as per another thread, actually depth of 0 is actually useful surprisingly often] .

General Game Playing

SteadyEddie (part 10/16): Confidence worker

Post by steadyeddie on Dec 18, 2016 6:22:55 GMT -8

Post by Andrew Rose on Jan 27, 2017 14:12:46 GMT -8

Post by steadyeddie on Jan 30, 2017 15:44:32 GMT -8

Quick Reply