Easier or faster particle system in HF5/6

I'm back to thinking about how to achieve something similar to this result:

https://www.youtube.com/watch?v=XlmIJg-RKXo

So far I've been able to achieve this:

https://www.youtube.com/watch?v=hISNPwtQaW8

I found two main limitations in my own, admittedly limited, push for better results (using HF3P; I suppose the situation is not drastically different in HF4):

- The particle system UI is really crowded: a long list of parameters with many elements to remember.

- With everything on the CPU, it can become really slow. The current simulation is about 10,000 particles for the smoke and 500 for the flowers, and just testing and trying various options is difficult. Caching doesn't help during the design process, since you need to keep changing and testing, so no caching is possible (AFAIK).

Looking around, I find e.g. the Trapcode Particular UI much more visual. There are also particle systems partially based on the GPU, e.g. in game engines such as Unity3d, Unreal or CryEngine, which allow a big jump in speed.

I'm interested to know what you think would be more beneficial and achievable for e.g. HF 5 (or 6?): UI improvements, speed improvements, neither, or both.

My current system is an i7-860 2.8GHz, 14GB RAM, HD7950 Boost (= R9 280).

Comments

  • Copy and paste this and put it on the "HitFilm wishlist page." :)

  • Triem23 Moderator

    I believe the CPU calculates particle positions and physics, but the render is GPU.

    Looking at that logo, I feel like it was done using techniques similar to these:

    https://youtu.be/9GH_Ue5yUpQ

    With additional particle sim elements being controlled with a technique similar to this (project file in description):

    https://youtu.be/nxsBkrx-J7I

    Which is the exact technique used for this:

    https://youtu.be/MHssOp2hAgc

     

  • edited June 2016

    @Triem23

    Yes, I remember these tutorials. IMHO the main problem is in the design phase: as long as the effect works the way I want, I have no problem waiting for the render. The problem is testing and achieving the effect in the viewport, in terms of both process (UI) and speed.

    Maybe no one is interested in particle sim evolution, so I just need to post this in the Wishlist and hope for the best.

  • @Triem23 is correct, all particle rendering is done on the GPU. The physics used by the particle system is calculated on the CPU.

  • Yes, I suppose the main question is whether there is any plan to port the particle physics simulation to the GPU, or to update the UI.

  • Triem23 Moderator

    If you think no one is interested in improving the particle system, you've never seen my channel. ;-) 

    Might I suggest you put some thought into a more specific form, then hit the wishlist? Right now your questions/suggestions are, to me, vague. Basically, one is "can it go faster?" and the other is "UI has lots of stuff." What would improve the experience? I actually prefer Hitfilm's particle UI to Trapcode--Trapcode sticks Life over Time graphs in the main UI, Hitfilm moved them to the Lifetime Panel. I find Hitfilm's organization better. The only things Trapcode has over Hitfilm IMHO are spherical forces and deflectors and air resistance. On the other hand, Hitfilm can have more than two deflectors in a scene, has 3D unified space and can use 3D models as particles.

    But I digress.  

    Unity 3D has far fewer options than Hitfilm, and lacks much of Hitfilm's power. No 3D model particles, limited frames on sprites, etc.

    Unreal also lacks 3D model particles. Its UI is a nightmare of cascading windows, and, while it has some really great functions--particles as light sources and LOD (Level of Detail) switching--those two functions are CPU only.

    Ok, so, anyway, come up with specific changes and wishlist 'em. You might want to "+1" some of the recent keyboard shortcut suggestions. Open/Close tree hotkeys and the like are great for speeding up navigation of the particle sim. Its complexity does require a lot of interface space--after all it's broken into two panels!

  • edited June 2016

    @Triem23 you're always right and useful.

    Digging a bit more, what I saw was not just Particular but the new Effects Builder GUI:

    https://youtu.be/io-d7SbrKMc

    From minute 3:37 it covers the new builder. As far as I understand, it's an interface that uses Particular's standard features, allowing a live GPU-driven preview plus visual testing and composition of effects using feature blocks. Just an interface that eases the work and speeds up the design.

    IMHO a similar GUI would be valuable and a perfect complement to the power of the HFP particle system.

  • edited June 2016

    Looking for GPU-based particle systems in Unity, here are some examples:

    https://youtu.be/wf9nadEEmtw

    https://youtu.be/aDPCfKKB34A

    https://youtu.be/oIewYxncXYY

    Only the last one is available, but all are based on DX so there's no Mac version. Edit: compute shaders are also available in OpenGL, so it should be possible to achieve this on Mac too.

    But you can simulate a lot of particles in realtime.

  • edited June 2016

    Unreal engine GPUSpriteEmitter

    https://youtu.be/DcesEW380lc

    Automatically interacting with the environment through the z-buffer, with live preview and a hundred thousand particles.

    As stated in the Unreal manual, just being able to simulate so many particles in realtime changes the way you design a scene, and that's my point. I won't need to search for the right material or worry about visual glitches when particles interact with objects in the scene; using millions of particles, I'll get this "vapor" effect by itself.

    I did say GPU particles are possible; for sure they need more development than a new particle GUI like the one in my other post.

    From my point of view, if using 3D models as particles has to fall back to the CPU, no problem; I can't think of a case right now where I'd need thousands of them. Using many more, simpler particles on the GPU I'd still be able to achieve a lot of useful effects, including volumetric ones such as vapor or clouds.

    Can we call it enhanced atomic particles? :)

    Meaning atomic particles with these features

    - In true 3d space

    - With emitters of various shape

    - Possibility to change emission rate, size, shape (predefined), opacity

    - Setting life based on space, time, random

    - Change of speed, speed variance in 3 axis

    - Change of color based on properties (position, speed, life)

    - Basic forces (directional, turbulence)

    - Interaction with 3d elements, bouncing, viscosity, attraction

    - Smoke and gas shader and material

    - Possibly rigid body physics among particles
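A rough sketch of how part of the wishlist above might map onto a settings structure. Every name here (`EmitterSettings`, `spawn`, the field names and default values) is hypothetical and for illustration only; none of it is HitFilm's API.

```python
import random
from dataclasses import dataclass

# Hypothetical settings block mirroring part of the wishlist above.
# None of these names come from HitFilm; they are illustration only.
@dataclass
class EmitterSettings:
    shape: str = "sphere"                    # emitter shape (predefined)
    rate: int = 1000                         # particles emitted per second
    size: float = 1.0
    opacity: float = 1.0
    life: float = 2.0                        # seconds
    life_random: float = 0.5                 # +/- random variation on life
    speed: tuple = (0.0, 1.0, 0.0)           # base velocity on x, y, z
    speed_variance: tuple = (0.2, 0.2, 0.2)  # per-axis variation
    gravity: tuple = (0.0, -9.8, 0.0)        # a basic directional force

def spawn(settings, rng):
    """Create one particle as a plain dict using the settings."""
    velocity = tuple(
        s + rng.uniform(-v, v)
        for s, v in zip(settings.speed, settings.speed_variance)
    )
    life = settings.life + rng.uniform(-settings.life_random, settings.life_random)
    return {"pos": (0.0, 0.0, 0.0), "vel": velocity, "life": life,
            "size": settings.size, "opacity": settings.opacity}

particle = spawn(EmitterSettings(), random.Random(0))
```

Forces, collisions and shading are left out on purpose; the point is only that the list reads naturally as a small set of per-emitter parameters.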

  • Updating here since I was in the meantime experimenting with Unity Shuriken particle system.

    Shuriken is not perfect, but it is fully multithreaded, meaning I can put all 8 threads of my (old) i7 to work on it.

    With HitFilm locked to a single thread, it's much easier to start slowing down. I know multithreaded development is absolutely not easy for a small company to digest, but I will put it on the list for a future release.

    Next step will be a full GPU based particle system.

  •  HF is single thread?!? O.o

    I use a 5820k (6core 3.5ghz). So I'd be better off with a dual core 4ghz?

  • In my experiments with HF3P, yes, the particle physics is single-threaded.

  • Triem23 Moderator

    @TriFlixFilms davide Staff would have to verify, but I believe Hitfilm designates threads to tasks. One thread to decode. One thread for particles, etc. 

  • edited October 2016

    To reiterate my concern... is there any benefit at all of using an expensive $300+ CPU with 6 cores if HF has 0 capability of utilizing all of it? @triem23 @axelwilkinson @joshdaviesCEO

    It's been a long day; if that comes across as hasty, understand it's not aimed at anyone hahaha

  • Triem23 Moderator

    @TriFlixFilms CPU and GPU utilization in ANY NLE or VFX software is a complex issue. 

    If you search out CPU utilization data for any NLE you'll see for Vegas, Premiere, Avid and yes, FCPX users either complaining that their CPU utilization is super low, or that it's at 100%. For the 100% people it's usually a crap CPU bottlenecking the system. 

    In general there is no such thing as true multitasking. A CPU core can do one thing at a time (even hyperthreading is just quick task switching). It seems most NLEs hold and allocate tasks to individual threads. So, instead of dividing a particle sim across all cores (with the overhead of splitting and combining resources--why multicore optimized code still only shows 25-50% increases per core), one core is tasked with the particle sim and another with encoding. Specific details on Hitfilm code are out of my pay grade.
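To make the split-and-combine idea concrete, here is a minimal sketch in Python. It is not HitFilm's code: `step`, `step_chunked`, the frame rate and the worker count are all made-up names and values. It advances a list of particles once on a single thread and once split into chunks across a thread pool, so the two results can be compared.

```python
from concurrent.futures import ThreadPoolExecutor

DT = 1.0 / 30.0   # one frame at 30 fps (assumed)
GRAVITY = -9.8

def step(particles):
    """Advance (y, vy) particles by one frame on a single thread."""
    return [(y + vy * DT, vy + GRAVITY * DT) for y, vy in particles]

def step_chunked(particles, workers=4):
    """Same update, but split into chunks, run in a pool, then recombined."""
    size = max(1, (len(particles) + workers - 1) // workers)
    chunks = [particles[i:i + size] for i in range(0, len(particles), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(step, chunks))   # map keeps chunk order
    return [p for chunk in results for p in chunk]

particles = [(float(i), 0.0) for i in range(10_000)]
single = step(particles)
split = step_chunked(particles)
```

The slicing before and the flattening after are the splitting and combining overhead. In CPython the threads won't actually make this faster because of the global interpreter lock; the sketch only shows the structure.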

    Incidentally this post from Creative Cow's Vegas forum discusses a similar issue in Vegas.

    @NormanPCN this was you. ;-)

    "Your CPU has four cores with hyperthreading and so Windows shows you 8 cores in the task manager. So 12.5% is one logical core fully utilized. The extra logical cores only give you a 0-20% boost.

    To use all cores the algorithm must run on all cores and fully saturate each core. For this argument this algorithm is encoding a file.

    The encoding algorithm must be split into separate functions that can be run in parallel to fully saturate your 8 logical cores. Suppose that one of these functions finishes before the others, and that function gets its next piece of work from one of the other functions running on another core. So it must wait for that other function to finish to get another piece of work before it can start again.

    This is just one example of a great many as to why the CPU will not be 100% loaded as is your desire."
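The Task Manager arithmetic in that quote can be checked in a couple of lines. This is just the division, assuming the overall figure is the average across logical cores; `overall_utilization` is a made-up helper name.

```python
def overall_utilization(busy_cores, logical_cores):
    """Overall CPU % when `busy_cores` logical cores are fully busy and the rest idle."""
    return 100.0 * busy_cores / logical_cores

one_of_eight = overall_utilization(1, 8)     # 4 cores with hyperthreading
one_of_twelve = overall_utilization(1, 12)   # 6 cores with hyperthreading
```

So a single-threaded task that pegs one logical core shows up as 12.5% on a 4-core hyperthreaded CPU, and as a bit over 8% on a 6-core one.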

  • @TriFlixFilms Rest easy, your CPU is of benefit. HitFilm is a multi-threaded application even if parts of it are single-threaded.

    Let's say you're dealing with multiple 4k streams. Members of this forum have pretty well figured out dealing with 4k can basically swamp a quad core processor before you get to the particle sim. This was done with some informal testing, running the numbers on the amount of data involved and checking the specs for several other applications just to confirm and everything says 4k is a resource hog and you need a lot of horsepower to deal with it effectively. Now let's say you do want to add a particle sim. A quad core user is going to be pushing his CPU to the max. You won't be. 

    This brings me to something I wanted to toss out for @davide445 . Making the particle sim multi-threaded might be a good idea but it might be a terrible idea instead. Unity Shuriken doesn't have to deal with the same kind of CPU intensive tasks that HitFilm or any other editor has to. In terms of overall performance keeping it single threaded might actually be the best choice. I'm not saying it is, mind you, because I honestly don't know; I'm just tossing it out as a possibility.

  • Triem23 Moderator

    @Aladdin4d as I point out oft, in 10 years we've gone from 640x480 to 3840x2160...27 times the pixel data... Oh, and from 8 bit to 10 or 12 bit. We'll stick with 10, which takes us to about 34 times the data. And this ignores additional load for overhead from an Alpha channel.

    In that time, processors and GPUs got 20-25 times faster and storage got 10 times faster.

    We old farts who did this in the 90's know this means inevitable slowdown. 
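A quick check of the resolution arithmetic, assuming raw data scales with pixel count times bits per channel and ignoring frame rate, alpha and compression:

```python
old_pixels = 640 * 480
new_pixels = 3840 * 2160

pixel_ratio = new_pixels / old_pixels      # pixels only
data_ratio_10bit = pixel_ratio * 10 / 8    # going from 8-bit to 10-bit
data_ratio_12bit = pixel_ratio * 12 / 8    # going from 8-bit to 12-bit
```

That works out to 27 times the pixels, about 34 times the data at 10-bit, and about 40 times at 12-bit.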

  • Some part, or parts, of the particle sim do seem to be single threaded, and I can show many cases where my GPU is not loaded and my CPU is not loaded, so there is some bottleneck there. Is it the single threaded thing or something else? You really need a profile analysis to tell.

    IMO, what is known is that Hitfilm has very real bottlenecks and these do affect Hitfilm performance. The good news is that if you can reduce the bottlenecks a lot of performance can be gained.

    I understand that CPU utilization is limited to video file decode and graphic dispatch to the GPU and to export file encoding. Video file decode is multi-threaded to some extent and dispatch will be single threaded. So when I do something that is pure CGI, say particle sim, I understand that my CPU will max out around 12-15%. 4 core + hyperthread. This is expected. 

    If one graphics dispatch, particle sim, CPU thread is pegged to the wall and my playback is not real time then I would expect my GPU to be pegged to the wall. The GPU is the limiter in this experiment. Most times I only see my GPU (GTX 980) at 30% in this situation. There is something bottlenecking the dataflow.

    I once did a simple experiment. I took my CG image and made it 4K instead of HD. At that point I could jack my GPU up near the wall. A 4K image has 4x the pixels of HD and thus 4x the work. That simple test would seem to indicate that there is overhead going from one effect (computation) to the next. The GPU was done with the HD compute quickly and Hitfilm does not pipeline, or pipelines poorly, so the GPU sits idle waiting for something to do. The 4K compute took far longer while the in-between overhead likely stayed constant, so GPU utilization effectively went up.
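One way to see why the 4K test raises GPU utilization is a toy model, assuming each effect costs a fixed idle gap plus compute time that scales with pixel count. The numbers are illustrative (the 30% starting point is borrowed from the GTX 980 observation above), and `gpu_utilization` is a made-up name, not anything measured inside Hitfilm.

```python
def gpu_utilization(compute_time, overhead_time):
    """Fraction of time the GPU is busy if every compute is followed by a fixed idle gap."""
    return compute_time / (compute_time + overhead_time)

hd_compute = 1.0   # arbitrary time units
# Pick the overhead so that HD lands at 30% utilization.
overhead = hd_compute * (1 - 0.30) / 0.30

hd_util = gpu_utilization(hd_compute, overhead)
uhd_util = gpu_utilization(4 * hd_compute, overhead)   # 4x the pixels, same gap
```

Under these assumptions utilization goes from 30% to about 63% just because the same gap is spread over four times as much work. The real numbers will differ, but the direction matches the experiment.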

    One simple bottleneck fix Hitfilm can make is to get off the OpenGL 2.0 spec they seem to limit themselves to. It is an ancient spec. Their minimum hardware requirements list GPUs that all support OpenGL 3.3. I have a number of apps that list GL 3.3 as minimum. Later versions of GL have features that add flexibility to rendering which might let an app keep the pipeline moving. GL 2.1 has a real important feature Hitfilm needs to use. PBOs. These will dramatically speed up frame buffer read backs. Something Hitfilm really sucks at compared to other apps I have running on my same hardware. Don't blame the hardware.

  • edited October 2016

    Just to clarify I'm not discussing or complaining about HF generally speaking.

    I'm just a bit more interested with particles and so focusing my analysis and discovery in that region.

    So from my point of view HF is probably well balanced in managing its many tasks, needing to deal with image and video processing, VFX, compositing, 3D hardware rendering and editing. Some tasks are easy to parallelize, some aren't, or can't be parallelized at all. For particle dynamics there are examples of parallelization, with evident benefits.

    Multi-core CPUs are useful in these cases, and while many articles say that more than 2 threads per core are normally a waste of precious transistors, there are some processors with deeper hyperthreading, e.g. IBM POWER.

    GPU processing enables even bigger speed gains in these massively parallel cases. Speaking of Unity, my first asset purchase was a realtime dynamic global illumination plugin using GPU compute shaders. It's wonderful to have global illumination with no CPU usage!

    I'll post some example of my testing to show what's happening in my very specific setup.

  • HitFilm uses the GPU for rendering particles, and for rendering everything else. Particle physics are calculated on the CPU, but all lighting and rendering is done on the GPU.

  • edited October 2016

    I learned something today! :D

    So @axelwilkinson has HF tested CPUs to find 'the most optimal CPU' for particles? Just curious as I don't plan on changing mine but for future reference :)

    Also thank you very much @triem23 and @aladdin4d for easing my mind, lot of insight shared here today :)

    So far in the last year of using HF I think @NormanPCN just answered my main question:

    "GL 2.1 has a real important feature Hitfilm needs to use. PBOs. These will dramatically speed up frame buffer read backs. Something Hitfilm really sucks at compared to other apps I have running on my same hardware. Don't blame the hardware."

  • edited October 2016

    Generally, the "most optimal" CPU for anything in Hitfilm is the fastest clock rate you can get. That speeds up everything that does anything with the CPU. A high clock rate can really help playback not stutter with things like most AVC files.

    As to how many cores should you have, then it gets more complicated. HD vs 4K, how many simultaneous media file layers do you commonly composite, and so on.

  • Triem23 Moderator

    To make it more complex, what other software are you using? If one switches between Hitfilm, Vegas, AE, Mocha Pro, Blender, Photoshop, etc, it's not just Hitfilm one is optimizing. Do you game? 

  • edited October 2016

    I was asking whether anyone had benchmarks in HitFilm using a 5820k vs 6700k vs 6800k, etc., the $300+ i7s.

    Hopefully HF5 implements benchmark testing built into the software with CPU/GPU monitoring, as opposed to the current export speed test of the Harry Potter clip.

    I think the OP and Development team would also benefit from this :)

    @Triem23 I do game, but literally any 4-core i7 at 3.5GHz and up can easily handle AAA games right now. The GPU was the bottleneck until Pascal was released.

    HF is the only software that lags for me, including 3ds max. And I generally only have one app open at a time. 6 cores with 64GB of RAM easily handle a browser open alongside HF as I research tips and tricks while editing.

  • @TriFlixFilms just to confirm, yes, your extra CPU cores will be used in HF. There are various parts of the software that are multithreaded. We know it's not perfect and that there's room for improvement but these things take time to get right.

    @NormanPCN HF is using OpenGL 2.1 and PBOs.

  • Such a great community, Thank you :)

  • @CedricBonnier "HF is using OpenGL 2.1 and PBOs."

    That's depressing.

  • edited November 2016

    Discovered another great realtime particle system, PopcornFX. This is just an old example:

    https://youtu.be/CMPsOXI6Wfs

    On my old system I'm able to simulate up to 300,000 particles in realtime, using both CPU and GPU with stream processing. The problem is that the learning curve doesn't look quick.

    I think it's something too specialized for FXHome, but it would be nice to get something like this in HF6-7 (I suppose HF5 is nearly out).
