Claus
New Member
Posts: 18
|
Post by Claus on Jul 31, 2014 9:26:28 GMT -8
Considerable dissatisfaction with the existing HTTP messaging protocol used for GGP was expressed during the 2014 Competition chat. While there are ideas for improvements, no one is willing to provide the resources to rework Game Managers (in particular) to handle a substantially different protocol, and there is concern that some protocols might allow players to cheat by claiming to be "recovering" from a communication error while actually sneaking in extra computation time or a different move.
However, the existing protocol could be made considerably more robust and extensible by making modest changes to the existing messages in the current protocol. I estimate that the level of effort to implement and test this would be 20-40 hours for a game manager and under 10 hours in a game player.
Robustness
1. Include timestamp in play messages sent by the manager (or even all messages)
The benefit is that this could be used to tune players for local communication lags. While this could be done dynamically, it does not have to be. At the least, it provides real data that can be analyzed statistically and used to set a static delay. I think the raw millisecond clock from the manager would suffice, as it is enough to estimate the offset and jitter relative to the player's clock. I am not opposed to formatting the time in human-readable UTC, but I want to observe that drift between the clocks on different computers makes the absolute UTC time differ between the manager and the player, so one is still reduced to estimating the offset and jitter between the computers' clocks when looking at lags on the order of seconds or less.
Thus the play message sent by the manager might look like: ( play game# ( noop (move 1 2 3 ) ) ( @time 999999999999 ) )
The manager could also echo back the time at which it received the previous move from the player, but I think that adds complexity without much additional benefit.
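A minimal sketch of the static analysis described above, assuming the player logs one pair per play message: the manager's `@time` value and its own clock reading on arrival. The function name and the sample numbers are hypothetical, and the two clocks are not assumed to be synchronized.

```python
import statistics

def estimate_offset_and_jitter(samples):
    """samples: list of (manager_ms, player_ms) pairs, one per play message.

    manager_ms is the @time value sent by the manager; player_ms is the
    player's own millisecond clock when the message arrived.
    """
    deltas = [player_ms - manager_ms for manager_ms, player_ms in samples]
    # The smallest delta approximates clock offset plus the minimum one-way lag.
    baseline = min(deltas)
    # The spread of the deltas approximates the jitter.
    jitter = statistics.pstdev(deltas)
    # Largest extra lag seen beyond the baseline; a candidate static safety delay.
    worst_extra_lag = max(deltas) - baseline
    return baseline, jitter, worst_extra_lag

# Hypothetical log of four play messages.
samples = [(1000, 1520), (2000, 2535), (3000, 3510), (4000, 4560)]
baseline, jitter, worst_extra_lag = estimate_offset_and_jitter(samples)
```

Only differences between deltas matter here, which is why the raw millisecond clock is enough and the absolute offset between the machines never has to be known.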
2. Include the last several moves, with turn number, in play message
This allows the player both to detect a missed move, based on the turn number, and to recover, since it receives the last few moves. The number of prior moves to include is limited by the desirable feature that the entire message fit in a single network packet, to avoid the overhead associated with assembling multiple packets. I think current network packets are 1K or 4K (can someone confirm this?) so that is not overly restrictive. The problem is that game obfuscation usually increases the number of characters used to express a move.
This might look like: ( play game# ( 2 ( noop ( move 1 2 3 ) ) ) ( 1 ( ( move 3 2 1 ) noop ) ) ( @time 99999999999 ) )
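A rough sketch of how a player might read such a message and detect a missed turn. The message layout is the one proposed above; the parser and the helper names (`read_play`, `missed_turns`) are hypothetical and not part of any existing GGP code base.

```python
def tokenize(text):
    # Pad parentheses with spaces so that split() yields them as separate tokens.
    return text.replace("(", " ( ").replace(")", " ) ").split()

def parse(tokens):
    """Parse one S-expression, consuming tokens from the front of the list."""
    token = tokens.pop(0)
    if token != "(":
        return token
    items = []
    while tokens[0] != ")":
        items.append(parse(tokens))
    tokens.pop(0)  # drop the closing ")"
    return items

def read_play(message):
    """Return (game_id, {turn: joint_move}, manager_time_ms) from a play message."""
    expr = parse(tokenize(message))
    history, timestamp = {}, None
    for item in expr[2:]:
        if item[0] == "@time":
            timestamp = int(item[1])
        else:
            history[int(item[0])] = item[1]
    return expr[1], history, timestamp

def missed_turns(last_seen_turn, history):
    """Turns between the last one the player processed and the newest one received."""
    return list(range(last_seen_turn + 1, max(history)))

message = "( play game# ( 2 ( noop ( move 1 2 3 ) ) ) ( 1 ( ( move 3 2 1 ) noop ) ) ( @time 99999999999 ) )"
game_id, history, timestamp = read_play(message)
# A player that last processed turn 0 can see that it missed turn 1
# and can replay it from the history before applying turn 2.
gap = missed_turns(0, history)
```

If a missed turn is not present in `history`, the player knows it cannot recover from this message alone.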
Extensibility
Simply including a version number and a minimal version negotiation would make it much easier to improve the protocol. What I propose here is a little more cumbersome than it could be, but I want to allow existing players to be supported during the transition. While this is aimed primarily at the existing messaging protocol, to minimize implementation effort, it also enables larger changes to the protocol or even the GDL language by indicating different versions.
1. Player response to the info request indicates the versions supported
I recommend having the player enumerate all the versions it supports rather than indicating just the latest version with some convention about how many older versions "must" be supported.
This might look like ( available ( versions 1.0 1.1 1.2 ) )
Some existing managers may already be able to parse this, recognize the available (or busy) and ignore the versions. Others would require a minor change to accept this response even if they do not implement any newer version of the messages.
Managers supporting newer versions could select one supported by both the Manager and the player.
2. Manager includes version on Start request
For backward compatibility, the version would be omitted for version 1.0, but later versions of the communication protocol would include ( version 1.1 ) or similar before or after the game rules.
For robustness, I would recommend that players respond (immediately) with "unsupported" rather than "ready" if the player does not support the indicated version. This allows the manager to abort the match rather than waste time trying to play it. Obviously players using the existing (version 1.0) messages will not know to do this. Players could also respond "unsupported" or perhaps "failed" if they are unable to interpret the GDL during metagaming.
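A minimal sketch of the negotiation on both sides, assuming versions are dotted strings as in the examples above and that a player whose info reply lists no versions is treated as a version 1.0 player. The function names are hypothetical.

```python
def parse_version(text):
    # "1.10" -> (1, 10), so that 1.10 sorts above 1.9.
    return tuple(int(part) for part in text.split("."))

def choose_version(manager_versions, info_reply_versions):
    """Manager side: pick the highest version supported by both parties.

    An empty info_reply_versions means the player did not advertise any
    versions, which is taken here to mean an existing 1.0 player.
    """
    player_versions = info_reply_versions or ["1.0"]
    common = set(manager_versions) & set(player_versions)
    if not common:
        return None  # no shared version; the manager should not start the match
    return max(common, key=parse_version)

def start_reply(requested_version, supported_versions):
    """Player side: answer a start request that names a version."""
    return "ready" if requested_version in supported_versions else "unsupported"

chosen = choose_version(["1.0", "1.1", "1.2"], ["1.0", "1.1"])
reply = start_reply("1.2", {"1.0", "1.1"})
```

Enumerating versions explicitly keeps the manager's choice a simple set intersection, with no convention needed about how far back support must reach.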
Feedback
Does anyone else see merit in this proposal and think it would be feasible for the community to pursue?
Does anyone see a technical problem that would make this harder to implement than what I outlined here?
I would be happy to develop this into a more formal specification, if there is interest in implementing it.
|
|
|
Post by Sam Schreiber on Jul 31, 2014 12:59:59 GMT -8
I'm happy to work on the GGP Base game manager and player network interfaces, but if we're going to be changing the communication protocol, there are a couple of other things we should consider.
I really think we should allow players to repeatedly submit "draft" moves to the server before their play clock is over, and have the server just choose the latest submitted draft move for each player when they run out of time. The draft moves can be tagged with the step number to avoid confusion. Requiring players to send moves back at exactly the last possible minute, but no later, adds a latency-sensitive aspect to the communication that's not actually needed, and many teams waste time engineering around this, time that could be better spent improving their players.
Another thing that would be nice would be to remove the requirement of having long-lived (e.g. 30+ second) HTTP connections. These make it hard to build a player that's robust to individual machines that it's running on crashing or restarting. If the network connections are quick, it's much easier to have a player fail over to another machine (e.g. using a load balancer), which makes it easier to build robust players.
|
|
|
Post by Steve Draper on Aug 1, 2014 4:19:36 GMT -8
Ideally we should also reverse the client server relationship, so as to sidestep all the firewall issues we get with players having to accept incoming connections.
For backward compatibility I would suggest we extend the existing protocol minimally JUST to add a negotiation to allow a hand-off to the new protocol to occur if both parties are capable of it. That can be as simple as adding some protocol-defined, semantically meaningful elements in the info request/response, which would simply be ignored by older players.
|
|
|
Post by GGP fan on Aug 1, 2014 4:20:33 GMT -8
I would strongly suggest reversing roles (server - client) of the Game Manager and the players. Many more people would be able to participate then.
|
|
Post by Claus on Aug 1, 2014 6:46:17 GMT -8
I intentionally did not propose reversing the client server relationship between Game Manager and players because that appears to cause the greatest concerns about the impact on the Game Manager. Perhaps I should have made that more explicit.
Beyond the complexities in the Game Manager of inverting the control structure, it requires a significant change to the protocol to authenticate the client and ensure that the client cannot be impersonated by another entity. That becomes especially significant if we want to enable the client to fail over to a different node, in which case the IP address of the client would change, so we can't just use the client's IP as a proxy for identity (and using the IP address for identity is inadequate for other well-known reasons, like ease of spoofing). Reversing the client server relationship is not as simple as just tweaking the messages to reverse the request flow, unless you go for a peer-to-peer approach where the Game Master starts as the client (sending HTTP requests) and then reverses roles so that the player becomes the client (making HTTP requests) during the game.
On this thread, I would like to keep the discussion to a more minimalist approach to modifying the existing protocol, maintaining the existing client server relationship.
I am happy to participate in a discussion of the protocol implications of reversing the client server relationship on another thread. In fact, I would be very happy to see such a change. Currently I host my player in the cloud at Amazon (c3 large instance for $3 a day) even though I have more compute power idling at home because I am unwilling to take on the security risk of hosting a server behind my home firewall without adequate security infrastructure. (Maybe I'm paranoid, or perhaps I've just spent too many years aware of best practices for corporate systems and you can read in the newspapers how well that is going. Not.)
|
|
Post by Claus on Aug 1, 2014 7:39:49 GMT -8
I can see two options for addressing the concerns that Sam raised within my self-imposed constraint of minimal changes to the existing protocol. I'll cover one option in this reply and the second later.
This option allows for draft moves and, while reducing the duration of the HTTP request, does not fully address the fail over concern.
This option introduces an additional message used during play. The sequence would be as follows (message format after message flow):
1. The Game Manager sends a play message for the current turn.
2. At any time before the timeout, the player can reply with a move, marked as draft or final, with the turn number.
3. If the move was a draft, the Game Manager immediately sends a new request, let's call it "update", including the last draft move received with its turn number.
4. The player responds to the update request in the same way as for a play request.
5. Steps 3 and 4 iterate as many times as desired. If the player has not sent a move marked as final before the timeout, the manager uses the last draft move received. The manager processes moves from all players and proceeds to the next turn. (If all players have submitted final moves, the manager processes moves and proceeds to the next turn even if the timeout has not expired.)
6. Any responses for a turn received by the manager after the turn's timeout are ignored.
The move response sent by the player for a draft move on turn 3 would look like: ( ( @draft 3 ) ( move 1 2 3 ) ) and final: ( ( @final 3 ) ( move 1 2 3 ) )
The update request from the manager would look like: ( update game# ( @draft 3 ) ( move 1 2 3 ) ( @time 999999999 ) )
(Potentially the manager could confirm a final move by sending ( update game# ( @final 3 ) ( move 1 2 3 ) ) to which the player would reply ready)
Some observations:
1. I am prefixing the new keywords with a funny character to avoid confusion with symbols that might be used for moves.
2. Timestamps are less important now (and could be omitted) since the player can estimate the round trip communication delay from the time between its response to one request and the arrival of the next update request.
3. Due to the intermediate responses, the HTTP response time is reduced. However, since new update requests are only sent after responses, a crash of the node processing a request will not fail over to another node until the play request for the next turn, effectively missing a turn.
4. While this does not add much complexity to the manager, it does mean the player should have two threads, one for processing the messages and another for computing the current move. Most advanced players already do this, so this is mostly a concern for simple players and the GGP base.
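A minimal sketch of the manager-side bookkeeping for one turn under this option. The class and method names are hypothetical, times are plain millisecond integers, and one detail is an assumption rather than part of the proposal: once a player has sent a final move, later submissions from that player are rejected.

```python
class TurnMoves:
    """Tracks draft and final moves for a single turn of a single match."""

    def __init__(self, turn, players, deadline_ms):
        self.turn = turn
        self.players = set(players)
        self.deadline_ms = deadline_ms
        self.latest = {}    # player -> most recent move received
        self.final = set()  # players whose move is marked final

    def submit(self, player, kind, turn, move, now_ms):
        """Record a draft or final move; return True if it was accepted."""
        if turn != self.turn or now_ms > self.deadline_ms:
            return False  # wrong turn number, or received after the timeout
        if player in self.final:
            return False  # assumption: a final move cannot be replaced
        self.latest[player] = move
        if kind == "final":
            self.final.add(player)
        return True

    def all_final(self):
        # When true, the manager may proceed before the timeout expires.
        return self.final == self.players

    def resolve(self):
        # At the timeout, the last accepted move of each player is used.
        return dict(self.latest)

turn = TurnMoves(turn=3, players={"white", "black"}, deadline_ms=10_000)
turn.submit("white", "draft", 3, "( move 1 2 3 )", now_ms=1_000)
turn.submit("white", "draft", 3, "( move 3 2 1 )", now_ms=2_000)
turn.submit("black", "final", 3, "noop", now_ms=2_500)
joint_move = turn.resolve()
```

A player with no entry in `latest` at the timeout has sent nothing usable, which the manager would handle the same way it handles a missing move today.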
|
|
Post by Claus on Aug 1, 2014 8:17:28 GMT -8
This is the second option to fully address Sam's concern about enabling player fail over. Personally, I don't think that node failure is frequent enough to justify the additional complexity.
This is essentially the first option, except that the player is expected to respond to the play and update messages within a fixed interval, perhaps about 3 seconds based on network latency that I have heard reported. If a response is not received, the manager repeats the last request. In this option, the manager timestamp would include two values, the time of the original request and the time of the current request. This allows the player to recognize that it missed the original request and to compensate for the time lost since the original message. I would still include moves from several turns on the play message, as in the original proposal, in case the player recovery takes more than one turn.
Ideally this resending of requests absent a prompt response should be done on all requests, not just play and update. For the start request, the player might respond pending rather than ready. Because all but the simplest players typically use all the metagaming time, there is no need for the manager to poll all the players to see if they are ready (although this might help failover during metagaming). Once all the start requests have been acknowledged, the manager just sends the start request once the start timeout occurs.
This approach adds a lot of additional timers for the manager to maintain. In addition to a timer per game, there now needs to be a timer per player to track the last request sent to the player. For the player, the change is simpler. It still needs to send the same response currently expected, even if it is a duplicate request, but the player does need to recognize the duplicate request. The logic to handle missed turns can be easily extended to handle this, so there is just a little work on start, stop, and abort.
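A rough sketch of the duplicate recognition on the player side, assuming each request carries the two timestamps described above and that the original-request time can serve as the key identifying a request. The class name and the `compute_reply` callback are hypothetical.

```python
class RequestTracker:
    """Recognizes repeated requests and reports the time lost since the original."""

    def __init__(self):
        self.replies = {}  # original_ms -> reply already sent for that request

    def handle(self, original_ms, current_ms, compute_reply):
        """Return (reply, lost_ms, was_duplicate).

        lost_ms is the time elapsed between the manager's original request
        and the copy being handled now; it is zero for the first copy.
        """
        lost_ms = current_ms - original_ms
        if original_ms in self.replies:
            # Duplicate: send the same response again without recomputing.
            return self.replies[original_ms], lost_ms, True
        reply = compute_reply(lost_ms)
        self.replies[original_ms] = reply
        return reply, lost_ms, False

tracker = RequestTracker()
first = tracker.handle(5000, 5000, lambda lost_ms: "( move 1 2 3 )")
again = tracker.handle(5000, 8000, lambda lost_ms: "( move 1 2 3 )")
```

A node that takes over after a failure sees a first copy with a nonzero `lost_ms` and can shorten its search time accordingly.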
|
|
wat
New Member
Posts: 32
|
Post by wat on Aug 10, 2014 10:27:49 GMT -8
Reversing the client-server relationship is the top priority for me. Firewall/router issues are, by far, the most annoying issues in GGP events. It can be done without changing any message in the protocol.
- Configuring the port moves from Player to Server.
- Configuring the IP address moves from Server to Player.
- Player connects to Server and sends login/password.
- Server uses the login to identify the Player, and optionally uses the password for security.
After that, the connection is established and messages can be sent in both directions as usual, i.e. Server sends start message, Player sends ready message, Server sends play message...
Draft moves, or something else that improves communication robustness, come next. Timeouts are the most annoying issues once you have dealt with router issues. Simple players can fall back to only sending the final move, making the need for 2 threads optional.
Avoiding the need for 30+ second connections while keeping real-time behavior is tricky. It gets worse if you reverse the client-server relationship. But the client could reconnect to the server every 30 seconds, like most AJAX Push implementations do.
|
|
|
Post by alandau on Aug 12, 2014 15:26:58 GMT -8
For the client/server reversal... I haven't worked much with networking code directly, but could the client send requests along the lines of "send me the moves played in the game on turn N" that block until a response is ready? (In other words, a request for the Nth "play" message from the existing protocol.) The server could include in its response the milliseconds elapsed since the moves were actually played server-side.
Normally the client would send the message for the upcoming turn, possibly right after it sends its "play" message (i.e. for a single-threaded gamer), possibly beforehand. If a connection is lost, the client can retry getting the play message for the same turn and won't lose track of the game state permanently.
As I said before, I haven't had to write this type of code, so although this seems like a better approach in the abstract, I don't know how painful it would be to implement at the networking level.
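A minimal in-process sketch of the blocking "moves for turn N" request, leaving out the HTTP layer entirely. The `MoveLog` class is hypothetical; it uses a condition variable to hold a request until the turn's moves exist, and reports the milliseconds elapsed since they were recorded.

```python
import threading
import time

class MoveLog:
    """Server-side store that lets a request block until a turn's moves exist."""

    def __init__(self):
        self._cond = threading.Condition()
        self._moves = {}  # turn -> (moves, monotonic time when recorded)

    def record(self, turn, moves):
        with self._cond:
            self._moves[turn] = (moves, time.monotonic())
            self._cond.notify_all()  # wake every request waiting on a turn

    def wait_for(self, turn, timeout_s):
        """Block until the moves for `turn` are known or the timeout passes.

        Returns (moves, elapsed_ms since they were played), or None on timeout.
        """
        with self._cond:
            ready = self._cond.wait_for(lambda: turn in self._moves, timeout=timeout_s)
            if not ready:
                return None
            moves, played_at = self._moves[turn]
            elapsed_ms = int((time.monotonic() - played_at) * 1000)
            return moves, elapsed_ms

log = MoveLog()
log.record(1, ["noop", "( move 1 2 3 )"])
result = log.wait_for(1, timeout_s=0.1)
```

Because the moves stay in the log, a client that loses its connection can ask for the same turn again and receive the same moves with a larger elapsed time.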
|
|
|
Post by Andrew Rose on Aug 13, 2014 11:09:40 GMT -8
I realise I'm a bit late to the party - sorry. I'd like to make a couple of points.
Firstly, I think if we're going to update the protocol, we should (as far as possible) do it in one go. Sorry Claus, I know you asked for this thread just to cover your relatively modest proposal, but I don't think it's likely to fly because it doesn't address by far the biggest perceived issue with the current protocol (i.e. the unusual choice of client/server roles).
Secondly, from some considerable professional experience in protocol design, I'd advise using a capabilities model rather than versioning. In versioning, for example, V1 implies features A-C, V2 implies features A-F, V3 implies features A-G but dropping incompatible feature B, etc. Instead, the capabilities model allows clients and servers to negotiate the precise set of features they're using by advertising their capabilities. This is much more easily extensible and allows parties to try experimental things without risking breaking everybody else. For example, the Tiltyard could advertise that it had the new "RTT-calculation" capability. Any players that understood that feature would know that they could send a protocol ping and expect an immediate response for use in working out network delay. Any players that didn't understand it (or didn't want to use it) could carry on with their previous hard-coded safety margins.
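A minimal sketch of the capabilities model, assuming each side advertises a plain set of capability names. "RTT-calculation" is the example from the post; "draft-moves" and "x-experimental" are hypothetical names, and the doubling of the measured round trip is an arbitrary illustrative choice.

```python
def negotiate(server_capabilities, player_capabilities):
    """Only features advertised by both sides are active for the match."""
    return set(server_capabilities) & set(player_capabilities)

def safety_margin_ms(active, measured_rtt_ms=None, default_margin_ms=2000):
    """Pick how early the player should send its move.

    With the RTT-calculation capability active and a measurement available,
    use a margin based on the measured round trip; otherwise keep the
    hard-coded default.
    """
    if "RTT-calculation" in active and measured_rtt_ms is not None:
        return 2 * measured_rtt_ms
    return default_margin_ms

server = {"RTT-calculation", "draft-moves"}
new_player = {"RTT-calculation", "x-experimental"}
old_player = set()  # advertises nothing and behaves as it does today

active_new = negotiate(server, new_player)
active_old = negotiate(server, old_player)
```

An unknown capability such as "x-experimental" simply drops out of the intersection, which is what lets one party experiment without breaking anyone else.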
|
|
Post by Claus on Oct 4, 2014 19:04:23 GMT -8
My apologies for my extended absence from this thread. I was consumed by other projects.
Based on the responses, I see that there is little interest in changing the protocol unless we reverse the client / server roles, putting the game master in the server role and the player in the client role. I will start a new thread on this once I have a chance to organize my thoughts. Reversing the roles for actual game play is easy. The hard part is how to start the game without requiring the client to wait potentially a very long time for a response to a request to play in a game. (Consider how long a player on Tiltyard waits between the end of one game and the beginning of the next when there are a large number of active players.)
I agree with Andrew's point about using a capabilities approach being superior to a versioning approach for a very robust protocol, albeit with some additional complexity required for the capabilities negotiation.
|
|
rxe
Junior Member
Posts: 61
|
Post by rxe on Oct 4, 2014 22:13:10 GMT -8
My number one feature addition would be to know who I am playing against. It could be as simple as a non-intrusive addition to the GDL:
(player <role1> <player_name1>)
(player <role2> <player_name2>)
Uses:
(a) It would be good to know who you are playing against when looking at logs, instead of having to jump to and fro to Tiltyard, for instance.
(b) You could do some opponent modeling if your player is so inclined.
(c) Knowing you're playing against "Random" would be a big deal for current n-player games on Tiltyard (with n > 2).
(d) It would allow for some makeshift random-like games without major changes to the protocol or players. See for example ggp.boards.net/thread/62/card-games
This could be an entirely optional addition to GDL, so a game server could choose not to emit it, and a game player could choose not to do anything with it, thus acting as current bots do.
|
|