Amazing how subtle MCTS bug failure modes can be!

Steve Draper
Global Moderator

Posts: 143

Post by Steve Draper on Jan 12, 2014 16:04:11 GMT -8

Since before Christmas my player has been running with a bug I inadvertently introduced. It seems like it should be pretty serious, but it doesn't appear to have handicapped the player much in actual play (apart from a few cases - see below).

The bug is that the 'random' playouts systematically never play the last available move (in enumeration order from the propnet) for a role in any given state, and play the first one with twice the probability of each of the others!

I finally pinned this down after growing suspicious about some bad results I've been having in nine-board TicTacToe (losses to Fortress, and a 100% loss rate to Greenshell lately). Initially I couldn't convince myself anything was definitely wrong (I could see no obvious killer errors in play - it just lost), but I finally got round to staging some test matches against an early version (the one I used in the Coursera finale, which I tend to use as a reference build when this sort of thing crops up), and found its loss rate to that version was about 75%.

I had various suspicions about recent changes that might have caused the issue, but after backing them all out one by one the results were the same. I then binary-chopped back through the version history and eventually (it takes SO LONG to do this sort of thing [annoying]) found the version in which it had broken. Unfortunately this was a major representation change (I switched from passing Move objects around to passing info structures that contain, among other things, the actual proposition for the move's 'does'), so not something I could turn on or off in parts. I then crawled through the version diffs, reviewing the code changes line by line, and eventually found the culprit - I'd moved an 'index--' out of an expression into an 'if' condition, leaving the expression still using 'index' (doh!).
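To make the failure mode concrete, here is a minimal sketch of one way a misplaced decrement can produce exactly the skew described above (last move never played, first move played twice as often). It is a hypothetical reconstruction, not the actual player code: the function names `pick_intended`, `pick_broken` and `distribution` are invented for illustration, and Python's `index -= 1` stands in for the original `index--`.

```python
from collections import Counter
from fractions import Fraction


def pick_intended(moves, r):
    """Intended behaviour: r is a uniform draw from range(len(moves))."""
    return moves[r]


def pick_broken(moves, r):
    """Hypothetical bug: the decrement now happens in the condition,
    so the lookup below sees an index that is already shifted down."""
    index = r
    if index > 0:
        index -= 1  # stands in for the misplaced 'index--'
    return moves[index]


def distribution(pick, moves):
    """Exact selection probabilities, found by enumerating every draw r."""
    n = len(moves)
    counts = Counter(pick(moves, r) for r in range(n))
    return {m: Fraction(counts[m], n) for m in moves}


moves = ["a", "b", "c", "d"]
print({m: str(p) for m, p in distribution(pick_intended, moves).items()})
print({m: str(p) for m, p in distribution(pick_broken, moves).items()})
# broken picker: a=1/2, b=1/4, c=1/4, d=0
```

With only four moves the skew is glaring, but in a state with many legal moves the broken picker still looks almost uniform, which fits how small the strength loss was in most games.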

The amazing thing here is how subtle the break was! It makes life that much harder when bugs tend to result in really quite small strength reductions rather than dramatic failures.

Andrew Rose
Global Moderator

Posts: 100

Post by Andrew Rose on Jan 16, 2014 1:34:18 GMT -8

There must be an opportunity for a basic regression suite here. To catch this problem, create a very simple game in which every possible move gives the same likelihood of winning, play it 100 times, and check for a statistically unlikely distribution of chosen moves.
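A rough sketch of what such a check could look like, using a chi-square goodness-of-fit test against the uniform distribution. As a simplification it samples a move picker directly rather than playing whole games; `uniform_pick` and `skewed_pick` are hypothetical stand-ins for a real playout policy, and the hard-coded threshold is the standard critical value for 3 degrees of freedom (four moves) at p = 0.001.

```python
import random
from collections import Counter

# Upper critical value of chi-square with 3 degrees of freedom at p = 0.001.
CHI2_CRIT_DF3_P001 = 16.266


def chi_square_uniform(counts):
    """Chi-square statistic of observed counts against a uniform expectation."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


def looks_biased(counts):
    """True if four move counts are statistically unlikely under uniformity."""
    if len(counts) != 4:
        raise ValueError("the threshold above is only valid for 4 moves")
    return chi_square_uniform(counts) > CHI2_CRIT_DF3_P001


def sample_counts(pick, moves, trials, rng):
    """Call the picker `trials` times and tally how often each move is chosen."""
    tally = Counter(pick(moves, rng) for _ in range(trials))
    return [tally[m] for m in moves]


def uniform_pick(moves, rng):
    # Hypothetical correct policy: every move equally likely.
    return moves[rng.randrange(len(moves))]


def skewed_pick(moves, rng):
    # Hypothetical buggy policy: never the last move, first move twice as often.
    return moves[max(rng.randrange(len(moves)) - 1, 0)]


MOVES = ["a", "b", "c", "d"]
rng = random.Random(2014)
for name, pick in (("uniform", uniform_pick), ("skewed", skewed_pick)):
    counts = sample_counts(pick, MOVES, 100, rng)
    print(name, counts, "biased" if looks_biased(counts) else "ok")
```

With 100 trials and four moves the skewed picker always fails, because a move that is never chosen contributes 25 to the statistic on its own. A game with many more moves would need more trials and a different critical value.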

talinsalway
New Member

Posts: 6

Post by talinsalway on Feb 3, 2014 15:32:55 GMT -8

Additionally, if you create such a test and you're using git for version control, 'git bisect' can automatically handle the binary chopping to find the revision where the test starts failing. It's pretty useful for tracking down regressions that were introduced a long time ago.
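As a sketch of how that could be wired up: 'git bisect run <command>' treats exit code 0 as a good revision, 125 as "skip this revision", and other codes from 1 to 127 as bad. The function below maps a line of move counts to one of those codes. The input format (whitespace-separated counts, one per move) and the 50% tolerance are assumptions made for illustration, and the tolerance is a crude threshold rather than a proper statistical test.

```python
GOOD, BAD, SKIP = 0, 1, 125  # exit codes understood by 'git bisect run'


def bisect_exit_code(text, tolerance=0.5):
    """Map whitespace-separated move counts to a 'git bisect run' exit code.

    A revision counts as bad if any move's count deviates from the uniform
    expectation by more than `tolerance` times that expectation.
    """
    try:
        counts = [int(tok) for tok in text.split()]
    except ValueError:
        return SKIP  # unparseable output, e.g. the build failed here
    if len(counts) < 2 or sum(counts) == 0:
        return SKIP  # nothing usable at this revision
    expected = sum(counts) / len(counts)
    worst = max(abs(c - expected) for c in counts)
    return BAD if worst > tolerance * expected else GOOD


print(bisect_exit_code("26 24 23 27"))  # 0: close to uniform
print(bisect_exit_code("50 25 25 0"))   # 1: last move never played
print(bisect_exit_code(""))             # 125: no data, skip
```

To use it as a script, end the file with 'sys.exit(bisect_exit_code(sys.stdin.read()))' (after 'import sys') and pipe in the counts from whatever harness samples playouts at the checked-out revision. Then mark the endpoints with 'git bisect start', 'git bisect bad' and 'git bisect good <known-good-revision>', and hand the command to 'git bisect run'.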
