07 April 2012

MapReduce for MMOs

MapReduce is a powerful tool to parallelize batches of computations. MMOs may sometimes have to run batches, but from what local game companies tell me, nobody in the game industry is currently using MapReduce. I guess this is mostly because studios do not know what to do with it. Here are some examples.

Business intelligence

Basic metrics such as weekly play time or stop rate can give a rough perspective of the retention of an MMO. These metrics can be estimated with a couple of SQL queries on dumps of the production database(s). It starts taking more time and effort to break them down across server shard, faction, race, or class. Still, a SQL script running for a few hours can do the job. Fancier analyses such as machine learning or social network graph exploration take even more time and effort. MapReduce can be used to tune machine learning algorithms through Mahout, and even to process graphs (Google's Pregel also seems interesting for parallel processing of graphs: the Pregel version of PageRank takes 15 lines of code).
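To make the per-shard, per-faction, per-class breakdown concrete, here is a minimal single-process sketch of the map and reduce steps in Python. The record layout, the sample data, and the names `map_session`, `reduce_minutes`, and `run_mapreduce` are all made up for illustration; a real job would run on a cluster and read from a database dump.

```python
from collections import defaultdict

# Hypothetical session records: (shard, faction, class, minutes played this week).
SESSIONS = [
    ("shard-1", "red", "mage", 120),
    ("shard-1", "red", "mage", 60),
    ("shard-1", "blue", "warrior", 30),
    ("shard-2", "red", "mage", 90),
]


def map_session(record):
    """Map step: emit one (key, value) pair per session."""
    shard, faction, char_class, minutes = record
    yield (shard, faction, char_class), minutes


def reduce_minutes(key, values):
    """Reduce step: average the play time of all sessions sharing a key."""
    values = list(values)
    return key, sum(values) / len(values)


def run_mapreduce(records, mapper, reducer):
    """Toy in-memory driver: group mapper output by key, then reduce each group."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return dict(reducer(key, values) for key, values in sorted(groups.items()))


AVERAGES = run_mapreduce(SESSIONS, map_session, reduce_minutes)
```

Adding race to the breakdown would only mean adding a field to the key; the reducer stays the same.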

Detecting bots, hacks, or gold farmers is not as straightforward, but I think it is doable. First, the typical deviant behaviors have to be determined and made explicit by humans. For instance, speed-hackers send too many messages per second to the server, while gold farmers interact with fewer players, but more intensely, than normal players. Then, detecting deviant behaviors can be a machine learning classification or a graph parsing problem. In both cases, MapReduce can help.
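The speed-hack rule is simple enough to sketch without any machine learning: count messages per player per second, then flag players who exceed a limit. The log format ("timestamp player_id"), the threshold of 20, and the function names below are assumptions for illustration only.

```python
from collections import Counter

# Hypothetical limit; a real one would be derived from normal client behavior.
MAX_MESSAGES_PER_SECOND = 20


def map_message(log_line):
    """Map step: turn a 'timestamp player_id' line into a (player, second) key."""
    timestamp, player = log_line.split()
    return player, int(float(timestamp))


def find_speed_hackers(log_lines, limit=MAX_MESSAGES_PER_SECOND):
    """Reduce step: count messages per key and keep players above the limit."""
    counts = Counter(map_message(line) for line in log_lines)
    return sorted({player for (player, _), n in counts.items() if n > limit})
```

On a cluster, each mapper would handle a slice of the server logs and the counting would happen in the reducers. The gold-farmer rule needs the interaction graph and is not covered by this sketch.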

Game-specific

Matchmaking and ladder: Some pre-calculations or updates to parameters of the ladder and match-making algorithms could be done offline by a small MapReduce cluster. A player's skill is unlikely to change much in 12 hours, so a cron task could run the job twice a day. According to Josh Menke from Blizzard, matchmaking involves gradient descent or Gaussian Density Filtering. Not sure whether Mahout supports GDF, but gradient descent is supported.
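To give an idea of what an offline gradient descent job could look like, here is a toy sketch that fits one skill number per player from a list of match results, using a logistic win-probability model. This is my own illustrative stand-in, not Blizzard's method and not Mahout code; the model, the learning rate, and the names `win_probability` and `fit_skills` are all assumptions.

```python
import math


def win_probability(skill_a, skill_b):
    """Logistic model: probability that the player with skill_a beats skill_b."""
    return 1.0 / (1.0 + math.exp(skill_b - skill_a))


def fit_skills(matches, learning_rate=0.1, epochs=200):
    """Batch gradient descent on the negative log-likelihood of the results.

    matches is a list of (winner, loser) pairs.
    """
    skills = {}
    for winner, loser in matches:
        skills.setdefault(winner, 0.0)
        skills.setdefault(loser, 0.0)
    for _ in range(epochs):
        gradient = dict.fromkeys(skills, 0.0)
        for winner, loser in matches:
            p = win_probability(skills[winner], skills[loser])
            # d(-log p)/d(winner skill) = -(1 - p); the loser gets the opposite sign.
            gradient[winner] -= 1.0 - p
            gradient[loser] += 1.0 - p
        for player in skills:
            skills[player] -= learning_rate * gradient[player]
    return skills
```

The inner loop over matches is the part that parallelizes: mappers could compute the per-match gradient contributions and reducers could sum them per player, with one MapReduce pass per epoch.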

Tuning and balancing can take days for system designers. MapReduce could do that automatically: each mapper job is given a particular set of system parameters: player 1 has skill A (costs x SP and inflicts y damage) and skill B (costs z SP and heals w HP), player 2 has skill C (...) and skill D (...). Mappers run a few hundred Monte-Carlo simulations of a player 1 versus player 2 match with a fixed set of parameters (player1:A,B; skillA:x,y; skillB:z,w; ...). When done, mappers pass average statistics (win/loss ratio, average amount of gold at the end of the match, ...) over those matches to reducers, which sort them. The interesting configurations for balance are those where player 1 wins close to 50% of the matches. Naturally, this brute-force way of balancing assumes a proficient AI, and designers will still have to tweak the configurations returned by MapReduce so that they feel fun.
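Here is a toy sketch of that balancing loop. The combat model is deliberately trivial and entirely made up: two players with 100 HP alternate attacks that hit half of the time, and a configuration is just the damage of each player's attack. The names `Config`, `simulate_match`, `mapper`, and `reducer` are hypothetical, and the "AI" is nothing more than attacking every turn.

```python
import random
from typing import NamedTuple


class Config(NamedTuple):
    """Hypothetical parameter set: damage dealt by each player's only skill."""
    damage1: int
    damage2: int


def simulate_match(config, rng, hit_points=100, accuracy=0.5):
    """Play one duel; return True if player 1 wins. Player 1 attacks first."""
    hp1 = hp2 = hit_points
    while True:
        if rng.random() < accuracy:
            hp2 -= config.damage1
            if hp2 <= 0:
                return True
        if rng.random() < accuracy:
            hp1 -= config.damage2
            if hp1 <= 0:
                return False


def mapper(config, matches=200, seed=0):
    """Map step: run Monte-Carlo matches for one configuration."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    wins = sum(simulate_match(config, rng) for _ in range(matches))
    return config, wins / matches


def reducer(results):
    """Reduce step: sort configurations, most balanced (closest to 50%) first."""
    return sorted(results, key=lambda item: abs(item[1] - 0.5))
```

A driver would generate the grid of configurations, hand one to each mapper, and pass the collected results to the reducer, for example `reducer(mapper(c) for c in configs)`. Since each configuration is simulated independently, the map step scales with the number of machines.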

Practical concerns

Engineering detail: MMOs have hundreds of shards, but really only one MapReduce cluster should be needed. Each shard could send its jobs to the MapReduce cluster when it needs them done, and wait asynchronously for the MapReduce answer on a particular port. If the MapReduce job uses data from the production database, producing a daily dump may induce a temporary extra load on the shard's database machines, but this should be fine during empty hours.

MapReduce can be a double-edged sword if overused. Exploring the parameter space of learning algorithms too aggressively may overfit them to the data at hand, which leads to models that are less accurate on new data.


Edit: Some people have been using MapReduce for analytics: mogade's platform and keighl have been using it through MongoDB, but it's more of an engineering constraint (scatter-gather queries in a NoSQL DB to build a ladder board) than an analytics or machine-learning endeavor.
