My Time on Age of Empires IV
October 2019 - November 2023
Giving the artists means to hit performance targets
Carrying on with performance
When I joined Age of Empires IV, I came in with the same focus on performance that I had on Gears 5. The first thing I noticed was how different the performance constraints of an RTS game like Age were from those of a third-person shooter like Gears. An RTS carries a massive CPU load, with up to 1600 fully fledged units all performing their own actions while buildings are being destroyed. On Gears, by contrast, the render thread was our big bottleneck.
I also noticed that there was no easy way to find out which buildings were causing huge CPU performance problems. There was no way to reproduce a scenario, easily dictate what happens in it, and capture the performance profile.
So I tasked myself with making a performance data capture harness that is easy to modify. I knew that physics costs were a massive concern for the engineering team, and I knew that they captured that data using Telemetry captures. I also noticed that a Telemetry capture can be requested through a launch argument when the game starts. The next step was to figure out how to have the game run a custom scenario, with scripted sequences, in which users can dictate which buildings are spawned.
I then discovered that campaign missions were being written in SCAR (short for SCripting At Relic; it’s really just Lua), so I wrote a custom scenario where buildings and units can be spawned in. Once that was done, I wrote a function to decrement the health of the objects spawned in the scene over time.
The first request I had for this tool was to stagger the destruction of the buildings, while keeping some overlap to better represent real game play scenarios. My first version destroyed all buildings at the same time, which caused a single large spike in the report.
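The staggering with overlap can be sketched as follows. The real harness was scripted in SCAR, so this Python version is only an illustration of the timing; the function names, the `overlap` parameter, and the linear health drain are assumptions, not the shipped code.

```python
def staggered_start_times(count, destroy_seconds, overlap=0.5):
    """Return the time (in seconds) at which each building starts taking damage.

    `overlap` is the fraction of one building's destruction that is shared
    with the next building. 0.0 means strictly one after another; values
    close to 1.0 approach destroying everything at once.
    """
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in the range [0.0, 1.0)")
    step = destroy_seconds * (1.0 - overlap)
    return [i * step for i in range(count)]


def health_at(t, start, destroy_seconds, max_health=100.0):
    """Drain health linearly from max_health to 0, beginning at `start`."""
    progress = (t - start) / destroy_seconds
    progress = min(max(progress, 0.0), 1.0)  # clamp before start / after end
    return max_health * (1.0 - progress)


# Three buildings, 10 seconds each, half of each destruction overlapping.
starts = staggered_start_times(3, 10.0, overlap=0.5)
for index, start in enumerate(starts):
    print(index, start, health_at(7.5, start, 10.0))
```

With an overlap of 0.5, each building begins to collapse when the previous one is half destroyed, which spreads the cost out instead of stacking every collapse on the same frames.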
Once I had that working, I let the user specify the buildings to spawn, by name, in a JSON file that is read when the game launches. This gives the user the ability to choose what they’re testing and narrows down potential culprits in the performance capture reports.
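A minimal sketch of what reading such a file could look like is below. The schema (a `buildings` list with `name` and `count` fields) is purely hypothetical; only the idea of selecting buildings by name from JSON comes from the tool described here.

```python
import json

# Hypothetical example of a user-authored spawn file.
EXAMPLE_CONFIG = """
{
  "buildings": [
    {"name": "house", "count": 4},
    {"name": "wonder", "count": 1}
  ]
}
"""


def load_spawn_list(text):
    """Expand the JSON description into a flat list of building names."""
    data = json.loads(text)
    spawn = []
    for entry in data.get("buildings", []):
        name = entry["name"]
        count = int(entry.get("count", 1))  # default to a single instance
        if count < 1:
            raise ValueError(f"count for {name!r} must be at least 1")
        spawn.extend([name] * count)
    return spawn


print(load_spawn_list(EXAMPLE_CONFIG))
```

Keeping the file to names and counts means someone who is not comfortable with scripting can still set up a test.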
I then added an option to capture video of the scenario so the user can compare it to the report and find out where the spikes are happening. Of course, running video capture software (in this case ffmpeg or Camtasia) causes a performance hit of its own, so the scenario is run twice: once without video capture and once with it. The resulting files are then saved to a performance capture folder, and an Excel sheet is generated with the numbers from the Telemetry capture, along with the typical log files from running the game.
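To illustrate how spikes in the report can be lined up with the video, here is a small sketch that groups consecutive over-budget samples into time windows. The input shape (a list of `(time_seconds, cost_ms)` pairs) and the threshold are assumptions for illustration; the actual Telemetry export format is not shown here.

```python
def find_spikes(samples, threshold_ms):
    """Group consecutive samples above threshold_ms into spike windows.

    Returns a list of (start_seconds, end_seconds, peak_ms) tuples, which
    can be used as timestamps to jump to in the recorded video.
    """
    spikes = []
    current = None  # [start, end, peak] of the spike being built, if any
    for time_s, cost_ms in samples:
        if cost_ms > threshold_ms:
            if current is None:
                current = [time_s, time_s, cost_ms]
            else:
                current[1] = time_s
                current[2] = max(current[2], cost_ms)
        elif current is not None:
            spikes.append(tuple(current))
            current = None
    if current is not None:  # capture ended while still inside a spike
        spikes.append(tuple(current))
    return spikes


# Made-up sample data: (time in seconds, physics cost in milliseconds).
samples = [(0.0, 2.0), (0.5, 9.0), (1.0, 12.0), (1.5, 3.0), (2.0, 8.5)]
for start, end, peak in find_spikes(samples, threshold_ms=8.0):
    print(f"spike from {start}s to {end}s, peak {peak} ms")
```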
This Frankenstein tool (a combination of Lua, command line, JSON, and Python) ended up being used to identify how civilizations compare to one another in terms of physics CPU costs during destruction. Ultimately it was decided that the physics costs for all buildings were problematic enough that the art needed to be reworked.
How to rework 500 buildings without reworking 500 buildings
When we identified that the buildings simply could not ship as is (they were collectively about 5 times over the CPU budget), we were only 4 months away from the launch of Age of Empires IV. We had to come up with a plan that did not involve refracturing every building, because there was simply not enough time to do that for all 500 buildings. Compounding the problem, the fracture tool we had at the time did not run quickly in 3DS Max and was prone to user error.
All of this together meant we needed a new tool and a new strategy to redo all the buildings without actually redoing all the buildings. The solution we came up with was three-pronged.
First, we came to an agreement with the design team that the dynamic, directional destruction we had at the time was not readable from a game play standpoint. It was difficult to know how damaged a building was without a consistent silhouette. This meant going from dynamic destruction to scripted destruction sequences for readability’s sake.
This created an opportunity to replace the previous tool with a streamlined tool in 3DS Max. The previous tool had users jump between different tabs to set up material properties and physics properties, fracture the building, and set up quadrants for directional damage. With a scripted destruction sequence, we could do away with all of that.
Buildings with directional destruction
Buildings with scripted destruction
My tool breaks the building lifecycle into buckets of % construction progress and health. The number of buckets (stages) depends on the type of building. A wonder, for example, uses 5% health increments, versus 20% health increments on a house.
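The bucketing by health can be sketched like this. It is a simplified illustration that only covers the health side (not construction progress), and the exact boundary handling (for example, whether 80% health already counts as the next stage) is an assumption rather than the tool's confirmed behaviour.

```python
def stage_thresholds(increment_percent):
    """Health percentages at which a building switches to the next stage."""
    if increment_percent <= 0 or 100 % increment_percent != 0:
        raise ValueError("increment must be a positive divisor of 100")
    return list(range(100, -1, -increment_percent))


def stage_for_health(health_percent, increment_percent):
    """Return the stage index for a health value; 0 is the intact building."""
    health = min(max(health_percent, 0), 100)  # clamp to a valid percentage
    return int((100 - health) // increment_percent)


print(stage_thresholds(20))          # house-style 20% increments
print(len(stage_thresholds(5)) - 1)  # number of damage steps for a wonder
print(stage_for_health(79, 20))
```

Smaller increments give large, long-lived buildings such as wonders many more visible damage states than a house gets.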
I then laid out the UI top down in list form. The artist fills out each stage in order, from 100% down to full destruction, by selecting which fracture meshes they want in each stage and assigning them. Once they get to the bottom of the UI there is an Export button, and they’re finished. This worked very well, especially with outsourcing artists working in different languages, because top to bottom is a very standard reading order, whereas left to right versus right to left varies.
The tool then has a clustering step where an artist tags pieces that would otherwise each get their own physics rigid body, and the tool generates a single rigid body for the entire selected group. This clustering step is a big contributor to getting the physics cost down to almost 1/5 of the original unoptimized cost.
Another performance win here is having the health in stages, which effectively spreads the destruction out over a longer time rather than triggering it all at once. This smooths out the performance spikes considerably. On top of that, I created a budget for artists to hit: each building must not exceed a set number of physics pieces, and anything beyond that must be tagged as no physics. Pieces tagged as no physics simply pop out, so artists would tastefully select the ones that would be least visibly jarring.
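The effect of clustering and the no-physics tag on the budget can be sketched as below. The piece representation (dictionaries with optional `cluster` and `no_physics` keys) and the budget value are hypothetical, and counting the budget in rigid bodies rather than raw pieces is an assumption.

```python
def rigid_body_count(pieces):
    """Count the rigid bodies a building would generate.

    Each untagged piece gets its own rigid body, every distinct cluster
    contributes exactly one, and pieces tagged no_physics contribute none.
    """
    clusters = set()
    loose = 0
    for piece in pieces:
        if piece.get("no_physics"):
            continue
        cluster = piece.get("cluster")
        if cluster is None:
            loose += 1
        else:
            clusters.add(cluster)
    return loose + len(clusters)


def over_budget_by(pieces, budget):
    """How many rigid bodies must still be clustered or tagged no_physics."""
    return max(0, rigid_body_count(pieces) - budget)


# Hypothetical building: two roof pieces clustered, one loose, one popped.
pieces = [
    {"name": "roof_a", "cluster": "roof"},
    {"name": "roof_b", "cluster": "roof"},
    {"name": "wall_a"},
    {"name": "chimney", "no_physics": True},
]
print(rigid_body_count(pieces), over_budget_by(pieces, budget=1))
```

A check like this gives artists a concrete number to work against before export, rather than discovering the cost in a later performance capture.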
With this tool, we were able to optimize all 500 buildings across 8 launch civilizations to a reasonable performance cost while keeping game play readability, with art direction ensuring the silhouettes still looked beautiful.
After launch we also moved the fracturing process from 3DS Max into Houdini to further accelerate the workflow.
Not unlike Gears, Age of Empires IV also had a very tight texture memory budget. In this case it was because we were also shipping for Gen 7 hardware (roughly equivalent to Surface Pro 2 specs) and targeting 60 FPS on that platform.
Among the texture memory optimizations were terrain and VFX. I tackled the VFX memory optimization along with our VFX artist Jeremy West. Originally the sprites were 2048x2048 albedo and normal. The normals were making the least visual impact from game play distance so I shrank normals to 256x256. Albedo would be severely visually impacted at 256x256 so I only shrank them to 1024x1024.
Visually speaking, they appear slightly softer but still beautiful. Because of the super tight memory budget, these optimizations were applied to almost all VFX in the game. Together this saved about 40 MB of VRAM.
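As a rough worked example of why shrinking these textures pays off, the sketch below estimates the memory of a square texture. It assumes 1 byte per texel (as with common block-compressed formats such as BC3 or BC7) and approximates a full mip chain as 4/3 of the base level; the actual formats used on the project are not stated here, so treat the numbers as illustrative only.

```python
MIB = 1024 * 1024


def texture_bytes(size, bytes_per_texel=1.0, with_mips=False):
    """Approximate memory for a square texture of size x size texels."""
    base = size * size * bytes_per_texel
    # A full mip chain adds roughly one third on top of the base level.
    return base * 4.0 / 3.0 if with_mips else base


def savings_bytes(old_size, new_size, **kwargs):
    """Memory saved by shrinking one texture from old_size to new_size."""
    return texture_bytes(old_size, **kwargs) - texture_bytes(new_size, **kwargs)


normal_saved = savings_bytes(2048, 256)   # normals: 2048 -> 256
albedo_saved = savings_bytes(2048, 1024)  # albedo: 2048 -> 1024
print(f"normal map: {normal_saved / MIB:.2f} MiB saved")
print(f"albedo map: {albedo_saved / MIB:.2f} MiB saved")
```

Under these assumptions, each normal map saves a little under 4 MiB and each albedo map saves 3 MiB before mipmaps, so the total adds up quickly across a game's worth of sprite sheets.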
VFX optimization
Artist and outsourcing support
During my time on Age of Empires IV, I was once again responsible for artist support and outsourcing support, as I had been at FIFA. This time I was the sole technical artist fielding outsourcing requests and technical issues, as well as reviewing art submissions on charring for buildings.
I was supporting teams from the Philippines, China, Russia, and the UK. I would often nap after work, wake up to support another team, and then nap again before the next team came online. While this work schedule was not ideal, I learned a lot about how to document for ease of translation into different languages, how to build tools to be more outsourcing-proof, and how to break down the workflow so there is much less confusion.
On the translation side, I’ve learned that voice-over videos are not as useful, because someone has to spend time listening to the video and translating it into their own text document. Diagrams and text are much preferred in this situation. Arrows and numbered lists are also very helpful for indicating the steps in a process.
Tools that have hard-coded references to internal network locations and folder structures break when they are passed to outsourcing, because outsourcers often work on multiple contract projects at the same time and have specific setups for that use case. I learned it is best practice not to rely on folder structure and network locations. UX built on a common reading order is also useful here. Top to bottom is very well understood, whereas left to right may not be standard in all cultures.
It also leaves less room for error when a tool has one specific workflow and result rather than multiple workflows and possible results. The latter has been shown to cause bugs and misunderstandings.