SteadyEddie (part 7/16): The control thread

steadyeddie
Junior Member

Posts: 58

Post by steadyeddie on Dec 18, 2016 6:18:57 GMT -8

The more I write, the more I realise that having a single control thread is clearly barmy in the long run (in a world of high CPU core counts). Oh well, it's what I've got for now.

Turn on a profiler against your Player for the first time and you'll find a bunch of crass performance bugs. Once I'd taken mine out, for a game where the control thread was the bottleneck (C4 is my favorite), I found the following:

⦁ If the profiler shows you blocked in methods near a synchronized block, the synchronization itself is the problem. Find another way.

⦁ You need to be very careful with processing results and back propagation (see later).

⦁ Actually, posting and reading from the worker queues is not that expensive if you get the locking right.

⦁ Selection of nodes is surprisingly expensive. First up, I discovered an inaccurate but fast version of Math.log that has helped me a lot. Caching values to avoid recalculation helped too.
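
To make the last point concrete, here is a minimal sketch of the two ideas together: a cheap approximate logarithm and a cached log term in a UCB1-style selection. This is not necessarily the Math.log replacement used in the player described above. The bit trick shown is one well-known variant, and the names SelectionSketch, Node, fastLog and select are all hypothetical. It assumes float precision is good enough for the exploration term and that selection runs on a single control thread, so the cache needs no locking.

```java
final class SelectionSketch {
    private static final float LN2 = 0.6931472f;

    // Approximate natural log for x > 0 (assumption: one common bit trick,
    // not the author's exact version). The raw float bits, read as a
    // fixed-point number, approximate log2(x) + 127. Exact at powers of two,
    // and low by at most roughly 0.06 elsewhere.
    static float fastLog(float x) {
        int bits = Float.floatToRawIntBits(x);
        float log2 = bits * (1.0f / (1 << 23)) - 127.0f;
        return log2 * LN2;
    }

    static final class Node {
        int visits;
        double totalScore;
        Node[] children = new Node[0];

        // Cached ln(visits), recomputed only when the visit count changes.
        private int cachedForVisits = -1;
        private float cachedLog;

        float logVisits() {
            if (cachedForVisits != visits) {
                cachedLog = fastLog(Math.max(1, visits));
                cachedForVisits = visits;
            }
            return cachedLog;
        }
    }

    // UCB1-style selection: the log term is fetched once per parent,
    // not once per child.
    static Node select(Node parent, double explorationConstant) {
        float logParent = parent.logVisits();
        Node best = null;
        double bestValue = Double.NEGATIVE_INFINITY;
        for (Node child : parent.children) {
            if (child.visits == 0) {
                return child; // always try unvisited children first
            }
            double mean = child.totalScore / child.visits;
            double value = mean
                    + explorationConstant * Math.sqrt(logParent / child.visits);
            if (value > bestValue) {
                bestValue = value;
                best = child;
            }
        }
        return best;
    }
}
```

Whether this pays off depends on the player, so it is worth profiling before and after rather than assuming a win.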

Last Edit: Dec 18, 2016 12:14:48 GMT -8 by steadyeddie

Andrew Rose
Global Moderator

Posts: 100

Post by Andrew Rose on Jan 27, 2017 13:57:48 GMT -8

Node selection is very expensive for Sancho too. I'm pretty sure I tried the cheap and cheerful version of Math.log (and friends) that was mentioned in the chat channel during the last competition. Sadly, for Sancho, I don't think I saw any difference in performance, even for C4 (which is also my go-to game for tree-thread bottlenecks).

steadyeddie
Junior Member

Posts: 58

Post by steadyeddie on Jan 30, 2017 16:00:45 GMT -8

To put a figure on how fast my selection is now: towards the end of a game, when I'm faking a lot of rollouts and don't have to bother the OS with annoying context switching, I get 500K-1M selections per second. I'm not as good when I'm full-on sending to and receiving from worker threads. On C4 I'm seeing (using your propnet at 40,000-50,000 rollouts/second/thread) a total throughput of ~200K rollouts/sec in the tree.
