My Time on Gears 5
May 2018 - October 2019
The beginning of an obsession with performance
Legacy Content Porting
The first thing I worked on when I joined The Coalition was porting legacy multiplayer maps.
I quickly learned the best practice of separating the ported data into raw source art files, environment files, character files, material files, and miscellaneous files. Having encountered GUIDs (globally unique identifiers) previously at EA, I soon realized that many of the GUIDs from the previous title could not be transferred one-to-one to the new title. The new title had already claimed GUIDs used in the previous title, so the work involved manually vetting and redirecting references and finding replacement materials within the new title’s library.
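The vetting described above can be pictured as sorting every ported asset into one of three buckets. The following is only a minimal sketch, not the studio’s tooling: the `Asset` type, the `plan_redirects` function, and the idea of matching replacements by name are all hypothetical, invented here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    """Hypothetical stand-in for a ported asset record."""
    guid: str
    name: str


def plan_redirects(legacy_assets, occupied_guids, replacements):
    """Sort legacy assets into keep / redirect / manual-review buckets.

    occupied_guids: GUIDs already taken in the new title (assumed input).
    replacements:   legacy asset name -> GUID of a vetted replacement
                    in the new title's library (assumed input).
    """
    keep, redirect, manual = [], {}, []
    for asset in legacy_assets:
        if asset.guid not in occupied_guids:
            keep.append(asset.guid)  # no collision, the GUID can carry over
        elif asset.name in replacements:
            redirect[asset.guid] = replacements[asset.name]  # re-point references
        else:
            manual.append(asset.guid)  # collision with no known replacement
    return keep, redirect, manual
```

Anything that lands in the `manual` bucket is the part that still needs a person to find a suitable replacement by eye.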
As you can imagine, this was a very tedious manual process, and each map I was asked to port averaged about 16,000 files. At that scale, Perforce would choke and freeze whenever I shelved parts of the changelists, which meant waiting multiple hours at a time just for operations to finish. A porting tool existed at the time, but it merely brought files into a single changelist without their references, and even that took roughly 8 hours.
I ended up being responsible for all the ported Gears of War 4 maps that launched with Gears 5. This included their performance, redoing gameplay blueprints that had broken during the porting process, finding replacement materials and textures, and updating the maps to use the same lighting technology as Gears 5.
I continued porting maps throughout my time at The Coalition on top of my other tasks. Dawn and Gridlock were mostly complete but had not been released when I departed The Coalition.
This video does a pretty good breakdown of the differences between the lighting of the two games. Gears 5 uses many more real-time raytraced shadows, combined with cascaded shadow maps in the distance, in an effort to reduce the memory and disk space taken up by baked light maps and baked shadows.
These techniques often produced a sharper image, but they can sometimes look unnatural because the lighting samples a much lower-resolution LOD of the geometry. For example, objects farther from the ground would normally cast softer, more diffuse shadows but now cast very sharp ones instead. On the whole, though, this was a massive win for memory and disk space footprint and was worth the trade-off.
VFX Optimization
One of my larger contributions to performance was the optimization of weapon VFX. Many of the weapon VFX came from the Gears 4 campaign, where the target frame rate was 30 fps, while Gears 5 targeted 60 fps. To measure my changes accurately, I measured each effect before and after in an empty scene.
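Doubling the target frame rate halves the time available per frame, which is why effects authored for 30 fps needed another pass. The arithmetic is sketched below; the function names and any sample timings are illustrative, not measurements from the game.

```python
def frame_budget_ms(target_fps):
    """Time available to produce one frame, in milliseconds."""
    if target_fps <= 0:
        raise ValueError("target_fps must be positive")
    return 1000.0 / target_fps


def percent_saved(before_ms, after_ms):
    """Relative saving between a 'before' and 'after' timing of the same scene."""
    if before_ms <= 0:
        raise ValueError("before_ms must be positive")
    return 100.0 * (before_ms - after_ms) / before_ms
```

`frame_budget_ms(30)` is about 33.3 ms and `frame_budget_ms(60)` is about 16.7 ms, so an effect that fit comfortably at 30 fps has half as much time to share at 60 fps.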
The challenge with the flashbang was the lighting and shadow cost on lingering smoke particles. Reducing the smoke lifetime by half a second and changing the timing of the light flash cut the cost of the VFX by roughly 20% on the render thread.
For the boomshot, it is crucial for the lit area to be large enough to imply overwhelming brightness. The GPU sparks lived a bit too long, so I cut their lifetime to be just long enough to bounce once off the ground. The rocket itself also has a light attached, and initially the lights from the rocket and the explosion coincided, causing a significant performance hit. The solution was to turn off the rocket’s light upon impact. The overall gain from this optimization was somewhere around 10% off the render thread.
On the breaker mace, the cost was purely overdraw from the large number of particles taking up screen space. There is a small light that goes off at the point of impact, and I reduced its size by about half to gain back some performance from lighting. Otherwise, the bulk of the optimization was lowering the smoke so it covers no more than half the screen and reducing the smoke’s lifetime. The gain from this optimization was somewhere around 10% off the render thread.
The Overkill shotgun has a large light and takes up a lot of screen space. The tuning here was to make sure the light and particles stop as soon as possible, so that a reasonable firing rate would not leave multiple lights and sets of particles alive at the same time. Previously, the particles lingered well into a normal second shot and caused a large overdraw cost. This cut the cost of the VFX by about 50% under normal playing conditions.
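The overlap problem comes down to how an effect’s lifetime compares with the time between shots. A minimal sketch, assuming a steady firing rate and a fixed effect lifetime; the function name and the millisecond values mentioned afterwards are hypothetical, not the game’s actual tuning:

```python
def max_concurrent_effects(lifetime_ms, shot_interval_ms):
    """Peak number of effect instances alive at once under steady fire.

    Assumes one effect spawns per shot, each lives exactly lifetime_ms,
    and an effect that expires at the instant of the next shot does not
    count as overlapping. Integer milliseconds avoid float rounding.
    """
    if lifetime_ms <= 0 or shot_interval_ms <= 0:
        raise ValueError("lifetime and interval must be positive")
    # Ceiling division: each started interval within the lifetime adds one instance.
    return -(-lifetime_ms // shot_interval_ms)
```

With a hypothetical 1,000 ms between shots, a 1,500 ms effect yields two overlapping instances, while trimming it to 1,000 ms or less brings that back to one.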
Dropshot optimization was very similar to boomshot optimization. Much of the cost came from the lights of the landing shot and the explosion having overlapping lifetimes. Tuning the explosion light size down by a quarter also helped cut general shadow costs during gameplay.
The dropshot also shared its VFX with the frag grenade, so combined this was a significant cost reduction of somewhere between 10% and 20% on the render thread, depending on how close the player is.
Level Performance Optimization
One of my primary duties was optimization of art content for multiplayer and eventually campaign levels. Among the concerns were draw calls that were hammering the render thread.
On the left is an example of a level with draw call concerns due to the overwhelming amount of foliage decorating a very long sight line. I tackled this with Unreal’s Hierarchical LODs (HLODs), which allowed me to bundle multiple pieces of foliage into one LOD at manually tuned distances. On occasion, HLODs also contained other HLODs.
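The draw call saving from bundling can be illustrated with a toy grouping pass. This is only a sketch of the idea, not how Unreal builds HLOD clusters: it buckets 2D instance positions into a uniform grid, `cluster_by_cell` and `cell_size` are hypothetical names, and it assumes each distant cluster is drawn as one merged mesh.

```python
from collections import defaultdict


def cluster_by_cell(positions, cell_size):
    """Group instance indices by the grid cell their (x, y) position falls in."""
    if cell_size <= 0:
        raise ValueError("cell_size must be positive")
    clusters = defaultdict(list)
    for index, (x, y) in enumerate(positions):
        cell = (int(x // cell_size), int(y // cell_size))
        clusters[cell].append(index)
    return dict(clusters)
```

If each cluster is drawn as a single merged proxy at distance, the distant draw call count drops from the number of instances to `len(clusters)`. A larger `cell_size` merges more aggressively, which is the kind of trade-off that manual distance tuning has to balance against visual quality.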
I also used this for buildings and other environmental props, especially small props that might be strewn about a level as set dressing.
Another part of performance optimization is fitting everything within the memory budget. Given the memory constraints of the Xbox One S, there is simply no way to make everything show up with the correct texel density all the time. This is especially true in the open-world chapters.
I was part of the effort to get texture memory under control in these sections, and it often took a good understanding of how players interact with the content to make the correct choices.
For example, on the left is the reveal of Mount Kadar upon opening giant metal gates into the next section of the open-world map. This is an 8k texture that took up a very significant amount of memory, but it is a showpiece moment for the player and could not be optimized lower.
We assigned this texture top streaming priority as well, so it would always be at full resolution ahead of other textures in the scene. To preserve this moment, I instead hunted for small props, metal plates, trims, and small environmental objects to squeeze memory out of.
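To see why a single 8k texture weighs so heavily on the budget, it helps to run the numbers. The sketch below assumes a square texture stored at 1 byte per texel (typical of block-compressed formats such as BC7; the actual format is not stated here) with a full mip chain, and it ignores the minimum block size of the smallest mips, so treat the result as an approximation.

```python
MIB = 1024 * 1024


def texture_bytes(size, bytes_per_texel=1.0, with_mips=True):
    """Approximate memory for a square texture of size x size texels."""
    if size < 1:
        raise ValueError("size must be at least 1")
    total = 0.0
    while True:
        total += size * size * bytes_per_texel
        if not with_mips or size == 1:
            return total
        size //= 2  # each mip level halves the resolution
```

Under these assumptions the top mip of an 8192 x 8192 texture is 64 MiB and the full chain is roughly 85 MiB. By comparison, the top mip of a 2048 x 2048 prop texture is 4 MiB, which is why the savings had to be gathered across many small assets.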
Sacrifices like this also happen in multiplayer maps, where players have paid for weapon skins and must receive full value for their purchase. In many cases I prioritized streaming weapon and character textures ahead of the map for multiplayer.
Part of the graphical upgrade from the previous title was also to use more dynamic shadows, both because they are less memory intensive and because they are more accurate when there are dynamic lighting elements in the scene.
We used raytraced shadows up close and transitioned to cascaded shadow maps in the distance to reduce performance costs. The transition distances were manually tuned for each level, and objects were opted in to the cascaded shadow maps.
In the example below, see the shadows on the red toolbox on the left. Note how there is little to no self-shadowing from the top of the box when the player is farther from it, but as the player gets closer the self-shadowing from the geometry appears. This is because the cascaded shadow maps also use a lower-resolution LOD than the raytraced shadows. However, due to a fading transition between the two shadow techniques, the player largely won’t notice this.
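A fading transition like this can be sketched as a distance-based blend between the two shadow results. The code below is an illustrative assumption, not the engine’s implementation: the linear fade, the function names, and the `fade_start` and `fade_end` distances are all hypothetical.

```python
def shadow_map_weight(distance, fade_start, fade_end):
    """Weight of the distant shadow-map result: 0 before fade_start, 1 after fade_end."""
    if fade_end <= fade_start:
        raise ValueError("fade_end must be greater than fade_start")
    t = (distance - fade_start) / (fade_end - fade_start)
    return min(1.0, max(0.0, t))


def blended_shadow(raytraced, shadow_mapped, distance, fade_start, fade_end):
    """Linearly blend the near (raytraced) and far (shadow-mapped) shadow terms."""
    w = shadow_map_weight(distance, fade_start, fade_end)
    return (1.0 - w) * raytraced + w * shadow_mapped
```

Inside the fade band both results contribute, so a difference in self-shadowing between the two techniques appears gradually rather than popping.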
Every single piece of geometry in every level got tweaked like this, and it was a lot of what I did to optimize shadow costs. Sometimes the cascaded shadows don’t match the shape of the raytraced shadows closely enough, so we ‘create’ shadows by putting in invisible ‘black flag’ geometry that influences the cascaded shadow maps.
Learnings
Through my time at The Coalition, I’ve developed a deep appreciation for performance optimization. I now see it as a core part of my interest in tech art; it always feels extremely satisfying to find ways to present a beautiful game while hitting performance targets.
You could say that PIX is my new best friend.