Post by Claus on Oct 16, 2014 17:18:36 GMT -8
The primary obstacle to changing the client / server relationship between the Game Manager and the players is the process of starting the match, which has potential response time and scalability issues. Once the match is started, everything is straightforward. The purpose of this post is to explore what the protocol for starting a match might look like. I hope to get the ball rolling and stimulate some suggestions to improve my initial proposal. Once we get that figured out, then we can turn to the protocol for playing the game, incorporating the other suggested enhancements to improve reliability and extensibility.
Once the player becomes a client, the client will need to authenticate with the Game Manager server, so the Game Manager can verify that the player is who it claims to be. The problem arises with the next step. Once the player is authenticated, it requests a match and expects a "timely" response from the Game Manager. Typically, on the Internet, this means a response time of seconds. We could stretch this out to as long as a minute, but even so that will not be enough time in many cases. If there are too few players, another player may not make a request for some time. (We probably do not want the Game Manager to resort to a single player game every time this happens.) If there are too many players, the Game Manager may not have the capacity to start another match until an existing match finishes.
The simplest solution would be for the Game Manager to fail the player's request after 30 to 60 seconds when it cannot arrange a match for the player. While the player remains interested in waiting, it continues to poll the Game Manager with a match request. This works fine with few players but does not scale well when there are a large number of players, because the Game Manager will need to maintain a connection with essentially all the players waiting for a match. This has the potential to exhaust the available connections, so some requests will be refused. (To avoid interfering with active matches, the match requests would need to be on a different port than the match moves.)
A more scalable solution would be for the Game Manager to maintain a queue of players waiting for matches without the players having an active connection with a request. When the Game Manager fails a request, it also returns a retry delay based on an estimate of the expected wait until a match can be scheduled for the player. The approximate time for existing games to complete is easy to estimate if the Game Manager (like Tiltyard) keeps statistics on the average number of turns per game, which is then simply multiplied by the play clock. To discourage premature retrying of match requests (before the retry delay expires), violators could be bumped to the end of the queue.
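To make the estimate and the queue penalty concrete, here is a minimal Python sketch. The names (`estimate_match_seconds`, `WaitQueue`, `on_retry`) and the use of plain numbers for timestamps are assumptions made for illustration; they are not part of any existing Game Manager.

```python
from collections import deque


def estimate_match_seconds(avg_turns, play_clock_s):
    """Rough match duration: average turns per game times the play clock."""
    return avg_turns * play_clock_s


class WaitQueue:
    """Players waiting for a match, without holding a connection open (hypothetical)."""

    def __init__(self):
        self._queue = deque()   # player ids in waiting order
        self._not_before = {}   # player id -> earliest time a retry is allowed

    def enqueue(self, player, now, delay):
        """Record the retry delay handed to the player; new players join the end."""
        if player not in self._not_before:
            self._queue.append(player)
        self._not_before[player] = now + delay

    def on_retry(self, player, now):
        """Return True for an on-time retry; bump early retriers to the end."""
        if now < self._not_before.get(player, now):
            self._queue.remove(player)
            self._queue.append(player)
            return False
        return True

    def order(self):
        return list(self._queue)
```

For example, a game averaging 40 turns with a 15 second play clock gives an estimate of 600 seconds, and a player that retries after only 10 seconds would drop behind everyone else in the queue.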
I think this addresses the main issues with starting the match. I provided this background so everyone understands why the proposed solution is notably more complex than what a Game Manager does today as the client. With this, I can sketch out a possible solution. There are some further details to work out, but I first want to get some feedback on the general idea.
First, the view of the protocol from the player's perspective:
-Player makes a login request.
-Game Manager authenticates the player, and if successful responds with a list of capabilities, probably as a JSON or XML document.
-Player makes a match request, selecting the capabilities it wants to use, again as a JSON or XML document.
-If the Game Manager cannot start a match within 30-60 seconds due to absence of other players or due to capacity limits, the Game Manager fails the match request. The failure response will include a retry delay (which could be 0) indicating when the player may make another match request.
-Player expects match requests to fail and is prepared to retry match requests, observing the retry delay, so long as it continues to wish to play. Should the Player no longer wish to play, it will log out.
-If the Game Manager can start a match, it responds to all the included players with a match ID and the GDL. This begins the start clock. The player begins metagaming.
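The player-side steps above can be sketched as a simple retry loop. This is only an illustration under stated assumptions: the response fields (`status`, `retry_delay`, `match_id`, `gdl`) are hypothetical, the transport is abstracted away as plain callables, and the player simply accepts every capability it is offered.

```python
import time


def play_when_matched(login, request_match, logout, still_interested,
                      sleep=time.sleep):
    """Log in, then retry match requests until a match starts or interest ends.

    login() returns the capability list, or None if authentication fails.
    request_match(chosen) returns a dict with the hypothetical fields
    'status' ('started' or 'failed') plus either 'retry_delay' or
    'match_id' and 'gdl'.
    """
    capabilities = login()
    if capabilities is None:
        return None
    # This sketch selects every capability the Game Manager offered.
    chosen = {"capabilities": capabilities}
    while still_interested():
        response = request_match(chosen)
        if response["status"] == "started":
            # The start clock is now running; metagaming would begin here.
            return response["match_id"], response["gdl"]
        # Failure is the expected case: wait out the retry delay and try again.
        sleep(response["retry_delay"])
    logout()
    return None
```

Passing `sleep` as a parameter is a design choice for the sketch: it lets the loop be exercised without real waiting.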
From the Game Manager's perspective, life is more complex. While the Game Manager will always accept all login and logout requests, the handling of match requests depends on how busy the Game Manager is.
If the Game Manager has the capacity to begin another match, it selects a game using some algorithm. This could be the existing algorithm so long as it does not depend on who the players are. As player match requests come in, they are associated with the game. If the match request of any player assigned to the game ages past some maximum (30-60 seconds), all the players associated with the game are removed and sent a failure response with a 0 retry delay. Presumably, all the players will retry immediately, get reassociated with the game, and continue to wait for more players. However, it is possible that a player may elect to drop out (not send a match request), which is why all players get disassociated from the game when the failure response is sent. Obviously, single player games don't need to wait for an additional player, and for a two player game, just one player is waiting. (Note: For games with 3 or more players, Tiltyard appears to create random players if there are not enough active players. A Game Manager could continue to implement this type of policy when there are few active players. We probably also don't want the Game Manager to spawn a random player every time a player is waiting for an opponent in a two player game.)
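A rough sketch of this case, assuming a hypothetical `PendingMatch` record and a fixed 45 second maximum wait chosen from within the 30-60 second range, might look like the following. When any request ages out, every waiting player is failed with a retry delay of 0 and the association is cleared.

```python
from dataclasses import dataclass, field

MAX_WAIT_S = 45  # assumed value inside the 30-60 second range


@dataclass
class PendingMatch:
    """A selected game that is collecting players (hypothetical structure)."""
    game: str
    roles_needed: int
    waiting: dict = field(default_factory=dict)  # player -> request arrival time

    def add_request(self, player, now):
        """Associate the player; start the match once every role is filled."""
        self.waiting.setdefault(player, now)
        if len(self.waiting) >= self.roles_needed:
            players = list(self.waiting)
            self.waiting.clear()
            return ("start", players)
        return ("wait", [])

    def expire(self, now):
        """If any request is too old, fail all players with a retry delay of 0."""
        if any(now - arrived > MAX_WAIT_S for arrived in self.waiting.values()):
            failed = list(self.waiting)
            self.waiting.clear()  # a player must retry to be associated again
            return [(player, 0) for player in failed]
        return []
```

Clearing the whole association on expiry means a player who has silently dropped out never blocks the match: only players that actually retry are counted again.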
If the Game Manager does not have the capacity to begin another match, it still selects a game using its algorithm, but now it estimates a start time for the match based on the matches currently active. As match requests come in, the Game Manager associates the player with the game and responds immediately with a failure including a retry delay corresponding to the estimated start time of the match. Once the match has enough associated players, the Game Manager selects another game and associates subsequent match requests with this new match. When the scheduled start time for a match occurs, all the associated players will retry their match request and the game can be started, assuming there is available capacity. While the details are straightforward and I won't describe them now, there are several alternate scenarios that need to be handled (prior match takes longer than estimated, so Game Manager lacks capacity to start new match at the scheduled time; a player associated with a scheduled match logs out before the scheduled match start time; a player associated with the scheduled match fails to send a match request at the scheduled start time).
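The main path of the no-capacity case could be sketched as below. Everything here is an assumption for illustration: the `Scheduler` class, the `pick_game` and `estimate_start` callables supplied by the Game Manager, and the response fields. The alternate scenarios listed above (late prior matches, logouts, missed retries) are deliberately left out.

```python
class Scheduler:
    """Schedules matches for later when no capacity is free (hypothetical)."""

    def __init__(self, pick_game, estimate_start):
        self.pick_game = pick_game            # returns (game_name, roles_needed)
        self.estimate_start = estimate_start  # now -> estimated start time
        self.scheduled = []                   # every match scheduled so far
        self._open = None                     # match still collecting players

    def on_match_request(self, player, now):
        """Associate the player with the open match and fail the request at once."""
        if self._open is None:
            game, roles = self.pick_game()
            self._open = {"game": game, "roles": roles, "players": [],
                          "start": self.estimate_start(now)}
            self.scheduled.append(self._open)
        match = self._open
        match["players"].append(player)
        if len(match["players"]) >= match["roles"]:
            # Match is full: the next request triggers selection of a new game.
            self._open = None
        # The retry delay points the player at the estimated start time.
        return {"status": "failed",
                "retry_delay": max(match["start"] - now, 0)}
```

With a two player game and an estimated start time of 600, a request arriving at time 100 would be failed immediately with a retry delay of 500, and a third player would be associated with a newly selected game.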
Complexity aside, does anyone see reasons why this would not work? Are there ways to simplify this while still satisfying the response time and scalability constraints? Given the complexity, is this worth doing? I would appreciate feedback from others in the community.