The performance upgrade in Particle Playground 3

14 December 2015

[Image: Playground profiler]

Particle Playground 3 was in development from April 2015 and was released on December 8, 2015, alongside Unity 5.3. That is quite a long iteration cycle compared to previous releases. There are many reasons why this release took so long to finish, but the main one is that I wanted to make sure I had the know-how to implement all the very technical features version 3 comes with. Particle Playground has always evolved one step at a time, generally in short iteration cycles, but this release required a lot more groundwork to be successful.

I also wanted to get early alpha versions out into different environments, so that companies that had asked for a specific feature could test it early during their production. This was mainly done with the Playground Trails and the Playground Recorder, since they, together with an upgraded thread pool, were the most critical implementations. Early in 2015 our beloved Unity founder David Helgason had told me about a grand update to Shuriken, so I knew there was something around the corner, just not exactly when or what. Particle Playground 3 went through quite a few iterations during the 5.3 beta, where everything worked from the beginning apart from a fair list of warnings about deprecated members. Particle Playground 3 is fully compatible with the Shuriken upgrade in Unity 5.3, but it still lacks some key features, such as Vector3 rotations, which just couldn’t make the deadline. This is coming in the next update!

[Image: Particle Playground 3 trails]

To do insane things like this with good performance, some things just had to be done.

Particle Playground has been running asynchronous calculations since version 2.0, and version 2.12 introduced bundling of calculations onto the same threads. This in itself was a major performance boost for scenes with a large number of Particle Playground systems, and it has helped projects run immensely better on low-end and mobile platforms. There was still an overhead from the memory allocations generated each frame, which has been troublesome for mobile platforms in some scenarios, especially when utilizing all CPU cores. While a Particle Playground system in itself only generates 24 bytes, each created thread needed up to 700 bytes of freshly allocated memory. This quickly becomes an issue if your platform can run on several CPU cores but has very limited memory, which is usually the case for newer mobile devices, because garbage collection then needs to kick in every so often.

Version 3 introduces a new thread pool called Playground Pool, a self-managed thread pool that reuses threads dedicated to Particle Playground. Many tests in different environments show that it not only improved the CPU time spent inside the calculations by around 2x, but also reduced the amount of GC allocations by up to 6x, depending on the type of particle system setup you have. Each thread in the Playground Pool generates 128 bytes.
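
To make the idea of reusing threads more concrete, here is a minimal sketch of a self-managed pool. It is written in Python purely for illustration and is not Playground's actual code: `ReusableThreadPool`, `simulate_frames` and the job shape are hypothetical names made up for this example. A fixed set of workers is created once and then fed jobs every frame, so no new thread has to be allocated per calculation.

```python
import queue
import threading


class ReusableThreadPool:
    """Hypothetical sketch: a fixed set of long-lived workers fed from one job queue."""

    def __init__(self, max_threads):
        self._jobs = queue.Queue()
        self._workers = [
            threading.Thread(target=self._run, daemon=True)
            for _ in range(max_threads)
        ]
        for worker in self._workers:
            worker.start()

    def _run(self):
        # Each worker loops forever, picking up the next job instead of exiting.
        while True:
            job = self._jobs.get()
            if job is None:  # shutdown sentinel
                self._jobs.task_done()
                return
            try:
                job()
            finally:
                self._jobs.task_done()

    def submit(self, job):
        self._jobs.put(job)

    def wait(self):
        # Block until every submitted job has finished (a per-frame barrier).
        self._jobs.join()

    def shutdown(self):
        for _ in self._workers:
            self._jobs.put(None)
        for worker in self._workers:
            worker.join()


def simulate_frames(pool, frames, systems_per_frame):
    """Submit one job per particle system per frame; return the thread names that ran them."""
    used = set()
    lock = threading.Lock()

    def update_system():
        with lock:
            used.add(threading.current_thread().name)

    for _ in range(frames):
        for _ in range(systems_per_frame):
            pool.submit(update_system)
        pool.wait()
    return used


pool = ReusableThreadPool(max_threads=4)
threads_used = simulate_frames(pool, frames=100, systems_per_frame=8)
pool.shutdown()
```

Even though 800 jobs are submitted here, `threads_used` can never contain more than four names, because the same workers are reused frame after frame. A production pool would also need to handle exceptions thrown by jobs, which this sketch leaves out.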

Scene memory has previously been an issue, because ever since the very first release of Playground the Shuriken component had been set to an outrageous default of 100k particles, which generated 3.8 MB in itself. In version 3 this is now set automatically in the Editor to match your Particle Playground system’s Particle Count, which instead generates a more pleasant 13 kB. If you upgraded from a previous version of Playground, all you need to do is select the particle system in the Hierarchy and this will be set automatically (remember to apply any prefab changes).
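
As a back-of-the-envelope check on those numbers, the sketch below works out the per-particle cost. It rests on two assumptions that are not stated above: that 3.8 MB means 3,800,000 bytes, and that the memory scales linearly with the maximum particle count with no fixed overhead. The function name `estimated_scene_bytes` is hypothetical.

```python
# Figures quoted in the post.
OLD_DEFAULT_PARTICLES = 100_000
OLD_DEFAULT_BYTES = 3_800_000  # "3.8 MB", assumed to be decimal megabytes

# Assumption: cost is purely proportional to the max particle count.
bytes_per_particle = OLD_DEFAULT_BYTES / OLD_DEFAULT_PARTICLES


def estimated_scene_bytes(particle_count):
    """Rough estimate of scene memory for a given max particle count."""
    return particle_count * bytes_per_particle


# Example: a system capped at 1,000 particles instead of 100,000.
estimate_for_1000 = estimated_scene_bytes(1_000)
```

Under these assumptions each particle slot costs roughly 38 bytes, so the savings come entirely from no longer reserving space for particles the system never uses. Treat this as an estimate only and check the real figure in the profiler.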

To help you further in improving performance there’s now an article available about the ins and outs of configuring the multithreading. Here you can see how Playground performs under different scenarios using different multithreading techniques:

Performance in Particle Playground

My advice is to profile any changes you make to see how they affect your particular scene setup. Keep in mind that you need to account for the target platform; for example, setting Max Threads in the Editor to the number of CPU cores on the target device will give you more realistic profiling. All multithreading settings can also be changed seamlessly at runtime, which makes it easier to test performance directly on the device.
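
The Max Threads advice boils down to capping the worker count at what the target device can offer, rather than what your development machine has. A minimal, language-neutral sketch of that rule in Python follows; `profiling_thread_count` is a hypothetical helper for illustration, not a Playground setting or API.

```python
import os


def profiling_thread_count(target_device_cores):
    """Pick a worker count that mimics the target device on the development machine."""
    host_cores = os.cpu_count() or 1  # cpu_count() may return None
    # Never go below one worker, and never above what the host can actually run.
    return max(1, min(target_device_cores, host_cores))


# Example: profiling for a quad-core phone on a many-core workstation.
threads_for_phone = profiling_thread_count(4)
```

On a workstation with more than four cores this returns 4, so the profiler numbers are not flattered by parallelism the phone does not have.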

If you missed the release video of Particle Playground 3, which covers some of the fresh additions, you can watch it here!

See this article to read more about the Particle Playground 3 update.