01 January 2010

[Literature] Optimizing Online FPS Game Server Discovery through Clustering Servers by Origin Autonomous System

by Grenville Armitage, 2008 located here.


Armitage is Australian, and Australians seem to suffer from latency problems. This might explain why he was looking for an efficient way to solve a latency problem: server discovery in CS:S. According to the author, the (client-side) server discovery process algorithm is as follow:

  1. Issue a getservers query to UDP port 27011 on the Steam master server
  2. Receive a packet containing a list of 231 active servers <IP address:port>
  3. Probe each game server in the previously received list
  4. Repeat until the Steam master server has no more game servers to return

The speed of server discovery is actually configured by the user: it is the "What is your connection capacity" which possible parameters are "Modem 56k", "DSL", "DSL+", etc. This parameter changes the speed of game server probation (phase 3 in the algorithm above). There are approximately 30K game servers and with a Steam client configured with "DSL > 256K", 140 servers can be probed per second. Without optimization, it takes 230s to complete the discovery process with a Steam client configured with "DSL > 256K".

In general, players want the game server with the smallest ping or RTT. A first solution could be grouping servers based on their location, but it is not always true that the closer geographically the server is, the smaller the ping will be. The paper proposes clustering game servers by the Autonomous System to which each game server's IP address belongs (its origin AS), ie grouping servers based on the Internet topology. An AS relies on IP routing prefixes to group IP together. The author also added an automatic termination of server discovery when servers RTT are above a specified threshold such as 200ms. Here is the algorithm:

  1. Retrieve all game servers
  2. Group servers in clusters based on their AS number
  3. Calibration process: this eventually splits clusters into smaller ones. In the end, the RTT of a randomly-picked server inside a cluster should be within a 40ms range of the cluster median RTT.
  4. Rank clusters in ascending order of their median RTT
  5. Probe the servers which have not been probed during the calibration process until a certain RTT threshold (200ms for instance)

Approximately 3k servers (10% of all the servers) are probed during the calibration phase. Around 1k clusters are created during the calibration process. In locations such as Australia, Taiwan or Japan, the algorithm stops within a minute and gives nearly all the available servers which RTT is below the specified threshold. Countries like Germany, the USA or the UK are closer to game servers and their RTT is smaller in average. Therefore, reaching the 200ms RTT threshold takes nearly the same time for players in these countries. However, if the RTT threshold is lowered to 100ms (for instance), the algorithm may stop much faster. If the player could specify the RTT threshold he/she wants the algorithm to stop, the proposed solution could be an efficient improvement to the current discovery process of Valve Counter-Strike game servers.

No comments:

Post a Comment

Note: Only a member of this blog may post a comment.