Gameon10


Editor’s notes

  1. Parallel programming is not a new area of computer science. However, it is new in the context of video game development. With the current adoption of multi-core technology, it is essential for improving game performance: the PlayStation 3 is multi-core, the Xbox 360 is multi-core, and your home PC or laptop is quite likely multi-core too. There is also GPGPU and CUDA, which use graphics cards for general-purpose computing. There are two main approaches in academia. Client-side work usually involves enhancing the complexity of artificial intelligence, simulation, and on-screen graphics. Server-side work targets the multiplayer component of games, usually to increase maximum player capacity. This is what we’re interested in.
  2. Abdelkhalek et al. investigated the performance of interactive multiplayer game servers. They found that the performance characteristics are similar to those of web servers. When the server is overloaded with players, CPU usage reaches 100% and performance starts to degrade. There are potential gains to be found by leveraging parallel execution.
  3. The same group continued the investigation and developed a parallel version of the QuakeWorld server. This was a traditional multi-threaded approach using the pthreads library. One of the biggest difficulties was synchronising access to shared data structures. Threads can all read and write the same data, and the results are undefined if two threads write to the same data simultaneously. Mutexes are used to access shared data in a serial fashion, but this reduces the time spent executing concurrently and can drastically reduce performance. Load balancing is static: players are assigned to threads based on their position within the game world. What happens if all the players go into the same corner of a level? We’re no longer using the available threads effectively.
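The weakness of static, position-based load balancing can be sketched in a few lines. This is only an illustration, not the QuakeWorld code: the function names, the one-dimensional world of width 4096, and the four equal-width regions are all assumptions made for the example.

```python
from collections import Counter

def assign_by_position(x, world_width=4096, num_threads=4):
    """Static assignment: the thread is fixed by the region the player stands in."""
    region_width = world_width / num_threads
    return min(int(x // region_width), num_threads - 1)

def players_per_thread(positions, num_threads=4):
    """Count how many players each thread ends up handling."""
    counts = Counter(assign_by_position(x, num_threads=num_threads) for x in positions)
    return [counts.get(t, 0) for t in range(num_threads)]

spread_out = [100, 1500, 2500, 3900, 600, 1900, 2900, 3500]  # hypothetical x coordinates
one_corner = [10, 20, 35, 50, 80, 120, 200, 300]             # everyone in the same corner

print(players_per_thread(spread_out))  # [2, 2, 2, 2]
print(players_per_thread(one_corner))  # [8, 0, 0, 0]
```

With the players spread out, every thread has work; once they cluster, one thread does everything and the other three sit idle.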
  4. This approach tries to tackle the complexities experienced in Abdelkhalek’s version. Using a unique Linux kernel feature, it simplifies the synchronisation of shared data: we selectively choose the data that we wish to keep, and child threads do not modify the global data unless we specifically let them. It still has quite high overhead, for different reasons. It performs pre-processing to identify groups of entities which can be processed in parallel, and attempts dynamic load balancing to ensure optimum performance. We’re mostly interested in this. To what extent does load balancing affect performance?
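The idea of children that cannot change global data unless explicitly allowed can be illustrated with copy-on-write memory. The sketch below assumes that the kernel feature in question behaves like copy-on-write after `fork` (the notes mention COW further on); it uses processes rather than the server's actual mechanism, and the `world` dictionary and `process_in_child` helper are hypothetical. It runs on Unix-like systems only.

```python
import os

world = {"player_1": (0, 0), "player_2": (5, 5)}  # stand-in for global game state

def process_in_child(entity, new_position):
    """Fork a child, let it update its private copy, and send back one chosen value."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: this write only touches the child's copy-on-write pages.
        os.close(read_fd)
        world[entity] = new_position
        os.write(write_fd, repr(world[entity]).encode())
        os.close(write_fd)
        os._exit(0)
    # Parent: collect only the value we chose to keep.
    os.close(write_fd)
    data = os.read(read_fd, 1024)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return data.decode()

result = process_in_child("player_1", (3, 4))
print(world["player_1"])  # the parent's copy is unchanged: (0, 0)
print(result)             # the value explicitly sent back: (3, 4)
```

No mutex is needed because the child never shares writable memory with the parent; the cost moves to creating the child and copying results back, which is one plausible source of overhead.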
  5. This is a fairly simple game loop. Client input is received via the network: clients send a request to move or perform an action. Prediction is used in the client program to provide uninterrupted gameplay. Client requests are then processed. A request may have been illegal (e.g. moving too fast); otherwise the server performs the movement or action and creates or removes entities as necessary, such as spawning projectiles or removing pick-ups. Response packets are then sent via the network connection. IMPORTANT: the time between request and response MUST be short! Otherwise the players will experience “lag” and gameplay will degrade. Players are susceptible to delays of just a few tens of ms.
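The loop described above can be sketched as follows. This is a simplified model with the networking left out: requests arrive as a plain list and responses are returned as a list. `MAX_SPEED`, the request tuple layout, and the function names are assumptions made for the example.

```python
MAX_SPEED = 10  # assumed per-frame movement limit, in game units

def process_request(positions, request):
    """Validate one move request and apply it if it is legal."""
    player, dx, dy = request
    if abs(dx) > MAX_SPEED or abs(dy) > MAX_SPEED:
        # Illegal request (moving too fast): leave the player where they are.
        return (player, "rejected", positions[player])
    x, y = positions[player]
    positions[player] = (x + dx, y + dy)
    return (player, "ok", positions[player])

def run_frame(positions, incoming_requests):
    """One server frame: take all requests, process them, build the responses."""
    return [process_request(positions, req) for req in incoming_requests]

positions = {"alice": (0, 0), "bob": (100, 100)}
requests = [("alice", 3, 4), ("bob", 50, 0)]  # bob is moving too fast
responses = run_frame(positions, requests)
print(responses)
# [('alice', 'ok', (3, 4)), ('bob', 'rejected', (100, 100))]
```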
  6. We’ve added more complexity to the game loop. Entity pre-processing is a very important step. It identifies groups of clients that can be processed in parallel, creating groups based on a range of 256 game units. Based on their potential movement and actions, clients cannot affect entities outside their groups. See video (if there’s time). Load balancing distributes the groups between threads according to an algorithm (LPTF by default). To what extent does this affect server performance?
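One way to picture the pre-processing step is as a clustering pass. The sketch below assumes that “a range of 256 game units” can be treated as a straight-line distance threshold and that entities are chained into the same group transitively; the union-find approach and all names are choices made for this example, not necessarily how the server does it.

```python
import math

RANGE = 256  # group range in game units, taken from the notes

def build_groups(positions, reach=RANGE):
    """Group entities so that no two entities in different groups are within reach."""
    names = list(positions)
    parent = {n: n for n in names}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving keeps the trees shallow
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if math.dist(positions[a], positions[b]) <= reach:
                parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())

positions = {
    "a": (0, 0), "b": (200, 0), "c": (400, 0),  # chained: a-b and b-c are in range
    "d": (2000, 2000), "e": (2100, 2000),
    "f": (5000, 0),
}
print(build_groups(positions))  # [['a', 'b', 'c'], ['d', 'e'], ['f']]
```

Each resulting group can be handed to a different thread, since nothing in one group is close enough to interact with anything in another.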
  7. We’re using the Linux version of the QuakeWorld server. A Windows version is available, but it isn’t compatible with COW. We call it a “bot” in the loosest sense of the word: it performs random inputs every 10 frames. This is not representative of human behaviour, but it is enough to simulate clients and benchmark the server.
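A bot of this kind takes very little code. The sketch below is an assumption about its general shape only: the action list, the fixed seed, and the generator interface are invented for the example.

```python
import random

ACTIONS = ["forward", "back", "left", "right", "jump", "fire"]  # illustrative set

def bot_inputs(num_frames, interval=10, seed=0):
    """Yield (frame, action) pairs, picking a new random action every `interval` frames."""
    rng = random.Random(seed)  # seeded so that benchmark runs are repeatable
    for frame in range(num_frames):
        if frame % interval == 0:
            yield frame, rng.choice(ACTIONS)

inputs = list(bot_inputs(50))
print([frame for frame, _ in inputs])  # [0, 10, 20, 30, 40]
```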
  8. We’re interested in whether the initial weight penalty is justified. Their justification is that the main thread must perform some other work in addition to entity processing.
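To make the penalty concrete, here is a minimal sketch that reads LPTF as longest-processing-time-first: groups are sorted heaviest first and each goes to the least-loaded thread. Thread 0 stands in for the main thread, and `main_penalty` is a hypothetical parameter modelling the initial weight penalty; the weights are made up.

```python
def lptf(group_weights, num_threads, main_penalty=0):
    """Assign each group, heaviest first, to the currently least-loaded thread.

    Thread 0 stands for the main thread; main_penalty is a starting load
    that models its extra work outside entity processing.
    """
    loads = [0] * num_threads
    loads[0] = main_penalty
    assignment = [[] for _ in range(num_threads)]
    for weight in sorted(group_weights, reverse=True):
        target = loads.index(min(loads))  # ties go to the lowest thread index
        assignment[target].append(weight)
        loads[target] += weight
    return assignment

weights = [7, 5, 4, 3, 3, 2]
print(lptf(weights, 3))                  # [[7, 2], [5, 3], [4, 3]]
print(lptf(weights, 3, main_penalty=4))  # [[4, 2], [7, 3], [5, 3]]
```

With the penalty, the main thread receives 6 units of entity work instead of 9. Whether that head start is worth it depends on how much other work the main thread really has.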
  9. This is Cordeiro’s version of LPTF with the imbalance on Thread 1. The first few iterations are identical.
  10. This is the first step where the results are different. With 8 threads there are more spaces to distribute groups.
  11. Which one of these is best? Technically 4 threads provides the best balance. The cost of using 8 threads may overshadow the benefit.
  12. FPS is just a general measure of performance. Workload distribution helps us visualize the profile of the load balancing algorithm. Server throughput is perhaps the most important metric: it shows us the server’s capacity and its ability to respond to incoming requests. Responses per second increase linearly with the number of players/clients, then dip once the server has become saturated. Intra-frame wait time measures how long the main thread waits for the supplementary threads to complete. It cannot continue until they have done so, so any time spent waiting is wasted performance.
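Intra-frame wait time can be expressed as a one-line calculation. This is a simplified model that assumes all threads start the frame together and the main thread idles until the slowest supplementary thread finishes; the function name and the timings are invented for illustration.

```python
def intra_frame_wait(main_time, worker_times):
    """Time the main thread idles after finishing its own share of a frame."""
    return max(0.0, max(worker_times) - main_time)

# Main thread finishes early and has to wait for the slowest worker.
print(intra_frame_wait(4.0, [6.0, 5.5, 6.5]))  # 2.5
# Main thread is the slowest, so it never waits.
print(intra_frame_wait(7.0, [6.0, 5.5, 6.5]))  # 0.0
```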
  13. A pretty even distribution across all threads. Notice the imbalance on Thread 2, caused by the initial weight penalty. We wanted to determine whether this was justified.
  15. This algorithm appears to produce a better balance than LPTF. We didn’t expect this: it is an indication that a better metric may be required. It does not correlate with the server throughput results (in a few slides’ time).
  16. Produces a fairly even distribution, but apparently not as good as that of the previous algorithms.
  17. Pretty large imbalance on the first thread, but what effect will this have on overall performance?
  18. Server FPS does not really tell us much about the overall performance of the server. Cordeiro bases his conclusion that 3 threads is better on this.
  19. The number of response packets sent per second goes up linearly with player count. Point out the linear increase, until the server peaks. It’s worth noting that LPTF and SRR exhibit very similar performance. This was confusing, because they have very different workload distributions. Intra-frame wait time needs to be considered.
  20. Interestingly, wait time for SRR decreases instead of increasing! LPTF does spend time waiting (doing nothing), and so the imbalance is not required. SRR performs well because we’re reducing the time spent doing nothing. Performance is likely to degrade when the wait reaches zero, at which point the advantage will be lost.
  21. We can see that the deviation is high across all threads for SPTF. This demonstrates that the accumulated distribution was not a good measure of algorithm performance. This also backs up the results we saw in the server throughput graph.
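A small made-up example shows why an accumulated distribution can hide what the per-frame deviation reveals. The workload numbers below are invented for illustration and are not measurements from the server; rows are frames and columns are threads.

```python
from statistics import pstdev

steady = [[10, 10], [10, 10], [10, 10], [10, 10]]
erratic = [[18, 2], [2, 18], [18, 2], [2, 18]]

def accumulated(frames):
    """Total workload per thread over all frames."""
    return [sum(col) for col in zip(*frames)]

def per_thread_deviation(frames):
    """Standard deviation of each thread's workload from frame to frame."""
    return [pstdev(col) for col in zip(*frames)]

print(accumulated(steady), accumulated(erratic))  # [40, 40] [40, 40]
print(per_thread_deviation(steady))               # [0.0, 0.0]
print(per_thread_deviation(erratic))              # [8.0, 8.0]
```

Both schedules look perfectly balanced when summed, yet in the erratic one a thread is badly overloaded in every single frame.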
  22. SRR did not perform as well on this machine. 6 threads *does* perform better, but the improvement is small at best.
  23. Due to the current hardware trend, there’s essentially no way for game developers to avoid the need for parallel programming. A full set of metrics is required; Cordeiro’s work did not take the bigger picture into account. The imbalance on the main thread is not required, and further work could investigate this. In the context of the QuakeWorld server, the best load balancing strategy is a trade-off between consistently producing the same distribution on a frame-by-frame basis and keeping the main thread’s wait as short as possible; otherwise we’re wasting performance! Thanks: questions?