Would it be possible to optimize for Ryzen?

So I just recently updated my PC to a Ryzen 1700 CPU. When I rendered out a video I got about 50% use on all cores. That kinda seems like maybe there's a GPU bottleneck in my rendering (GTX 980) but I'm not sure. The video in question was 720P 60FPS, was 14 minutes long, and took around 12 minutes to render. Not bad but not blowing me away either.

Basically I would like to know if there's any plans to patch HF4E and HF2017 to optimize rendering for Ryzen? 

Comments

  • It's normal to see lower CPU and/or GPU utilization numbers in Hitfilm. It's just the way things are: its resource utilization can be a bit low, and it is very specific to the exact circumstances of what you are doing.

    Also, remember that on a hyperthreaded CPU, 50% utilization is really very close to a real-time, real-world 100%. Hyperthreading typically adds only about 10% real performance above the basic full-core performance.

  • @NormanPCN I know HF doesn't use the entire CPU, but the main reason I upgraded was for speedier render times. It's definitely faster than my old fx8350, but I wanted to make sure HF was getting the most out of this new architecture.

    I guess my real question should be does that render time seem right? Seems slow to me considering how this is an 8 core processor at 3.8GHz. I will need to test it in 1080p60fps to see how it handles that.

  • edited April 2017

    Render times are extremely variable depending on exactly what you are sending the encoder. I am assuming you are encoding to MP4/H.264/AVC.

    The AVC encoder operation should be (I hope) asynchronous to Hitfilm rendering the video stream. The AVC encoder should make reasonable use of 8 cores versus 4 cores. However, if Hitfilm is slow at feeding the AVC encoder frames, then the encoder just does not have much work to do: it will finish what it can and then block, waiting for another frame from Hitfilm.
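    The slow-producer/fast-consumer behavior described above can be sketched with a toy producer/consumer pair. This is purely illustrative, not HitFilm's actual code; the frame and encode timings are made-up numbers chosen so the "renderer" is the bottleneck:

```python
import queue
import threading
import time

def renderer(frames_out, n_frames, frame_time):
    # Slow producer: stands in for HitFilm compositing each frame.
    for i in range(n_frames):
        time.sleep(frame_time)
        frames_out.put(i)
    frames_out.put(None)  # sentinel: no more frames coming

def encoder(frames_in, encode_time, stats):
    # Fast consumer: stands in for an asynchronous AVC encoder thread.
    while True:
        t0 = time.monotonic()
        frame = frames_in.get()   # blocks until a frame arrives
        stats["idle"] += time.monotonic() - t0
        if frame is None:
            return
        time.sleep(encode_time)
        stats["encoded"] += 1

stats = {"encoded": 0, "idle": 0.0}
frames = queue.Queue()
worker = threading.Thread(target=encoder, args=(frames, 0.001, stats))
worker.start()
renderer(frames, 20, 0.005)   # rendering 5x slower than encoding
worker.join()
# The encoder spends most of its time blocked waiting for frames, which
# shows up as low CPU utilization even though nothing is broken.
print(f"encoded {stats['encoded']} frames; idle ~{stats['idle']:.2f}s")
```

    The encoder thread finishes each frame quickly and then sits blocked in `get()`, which is exactly the "finish what it can do and then block" pattern: idle wait time, not lack of cores, caps the utilization.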

  • Does RAM (I have 8 gigs) have a lot to do with performance in Windows 7?


  • edited April 2017

    "does ram I have 8 gigs have a lot to do with ware performance in windows 7?"

    RAM use is variable depending on what you are doing in Hitfilm. Here is a simple check you can do.

    Open your project in Hitfilm.

    Open the operating system task manager.

    Look at the available/free memory reported by task manager. If there is a reasonable amount of memory available, say 10-20%, then Hitfilm should be getting the RAM it wants to use and there is still memory free in the system. The free RAM never really gets to zero; when memory gets tight, the system frees up other resources to make free RAM.

    You can also look at this during an Export. The memory consumed will be higher during that operation.

    Another item you can look at is the memory allocated to Hitfilm in the processes tab of task manager. If your media files are Quicktime MOV files, then you also have to look at the Quicktime media server Hitfilm sets up (FxQtmediaServer). It will be using some RAM on behalf of Hitfilm "proper".

    You will probably see Hitfilm using less than 1 or 2 GB of RAM for most things until you use the RAM preview feature. The amount of RAM available to that is controlled by you in the application options. RAM preview is not used during export.

    Bottom line is that for many/most things, 8GB is probably good, even with a 4GB RAM preview buffer in use. If you want the most RAM available for Hitfilm, just don't run other apps at the same time as Hitfilm.
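    The eyeball test above (is a reasonable fraction of RAM still available?) boils down to one ratio. Here is a minimal sketch of that check; the function name and the 10% threshold are my own, and the byte counts are example figures. (Programmatically, the third-party psutil library's `virtual_memory()` reports the real numbers.)

```python
def memory_headroom_ok(total_bytes, available_bytes, min_free_fraction=0.10):
    # Mirrors the Task Manager eyeball test: is roughly 10% or more of
    # RAM still available? The OS keeps free RAM from ever reaching
    # zero by evicting other resources as memory gets tight.
    return available_bytes / total_bytes >= min_free_fraction

GiB = 1024 ** 3
# 8 GB machine with ~1.2 GB available: about 15% free, so fine.
print(memory_headroom_ok(8 * GiB, int(1.2 * GiB)))   # True
# Same machine down to ~400 MB available: about 5%, getting tight.
print(memory_headroom_ok(8 * GiB, int(0.4 * GiB)))   # False
```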


  • Can someone explain in the simplest way possible why games (and other applications) are able to reach 90-100% on the CPU, whereas HitFilm generally can't, except on a weaker CPU like my own (judging by past posts)?

  • Triem23 Moderator

    @CNK Mostly because Hitfilm primarily utilizes the GPU. 

    Hitfilm only uses the CPU for I/O, encode/decode and particle physics. A game is using CPU for a lot of other things. 

    Additionally, as Norman often notes, hyperthreading splits a single core into two threads. Effectively, 50% is full-blast. 

  • Do we know if that 50% really is 8 cores though, and not 4 cores and 4 threads, for instance, because applications just see the threads, not which ones are fake?

    Applications create and manage their own threads. They do so based on the number of physical and logical processors available. They can't determine which is which; you can know that you have 4 physical and 8 logical processors, but all of the logical processors look the same, and all of the physical processors look the same. (There are exceptions: some systems have a big/little design with high-performance cores and low-power cores, but you don't find that on performance machines.)

    50% could be due to the processors being I/O bound, and it could also be due to there only being one HitFilm thread per physical processor. If HitFilm allocates as many threads as it has physical processors available, the OS will spread them out over the physical processors.

    What will be interesting is seeing how HitFilm's render times change as you add more and more effects. That I'm looking forward to.
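    The point that "applications just see the threads" is easy to verify from the standard library. A minimal sketch (the printed count depends on the machine it runs on):

```python
import os

# os.cpu_count() reports *logical* processors, so a 4-core/8-thread
# CPU shows up as 8, not 4. The standard library offers no portable
# physical-core count; the third-party psutil package exposes
# psutil.cpu_count(logical=False) for that. Either way, the OS does
# not tell a process which logical processors share a physical core.
logical = os.cpu_count()
print(f"logical processors visible to this process: {logical}")
```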

     

  • @CNK As @Triem23 stated, Hitfilm only uses the CPU for certain things, the biggest one being media file decode. So if you are doing pure CG, then at most you are going to peg one CPU thread to the wall, plus misc housekeeping. With a 4-core/8-thread CPU (4C/8T), one thread at the wall registers 12.5% by itself.

    As to my comment about 50% being close to max, it goes like this: if you have a 4C/8T CPU and 4 threads at the wall, this registers 50%. One might think the CPU is only half used. Not true. Hyperthreading typically only adds 10% extra performance, because most threads, though not all, cannot really keep a core fully loaded 100% of the time.

    That said, what does 50% mean on a 4C/8T CPU? You have to look at the per-CPU utilization to see what is adding up to that 50%.

    That said, most times I never see Hitfilm peg CPU threads to the wall during playback. I can envision logical/understandable/reasonable reasons for this during playback, situationally dependent of course (as always). During render we kick in another CPU item: only the AVC encoder is really multi-threaded. I hope it is asynchronous with the timeline, but I don't know. Either way, it is still limited by Hitfilm feeding it frames to encode. So it will be hard for the encoder to crank up the threads for any length of time, because they finish their work and need another frame to do more. If Hitfilm is not ready with a frame when the encoder wants one, the encoder threads have to block and wait for something to arrive.

    The bottom line is that when we see CPU utilization below 100%, parts of the CPU are sitting there waiting for something to do. The same goes for GPU utilization.

    So when I do something CG, playback is not making real time, and I only see 15% CPU, I understand why. With no media decode, the GPU dispatch logic is about the only thing running on the CPU. This is the CPU code that organizes and tells the GPU what to do.

    What makes me cry is to see only 40-50% GPU utilization in that situation. I want something pegged to the wall when things don't make real time. Maybe the GPU dispatch code is single-threaded and cannot set up my GPU (GTX 980) fast enough, and that is why the GPU is waiting for work. Maybe the work is organized in tiles, and this tile size limits the number of GPU kernel threads (different than CPU threads) on HD frame-size material. I see 4K frame sizes peg my GPU much more easily. Same effect, just a frame-size change. I once did a quick HD-to-UHD comparison test of pixels per second processed, to try to detect any effect-to-effect overhead in Hitfilm on the same frame. There may be something there, but I don't trust that my quickie test was properly illuminating.
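    The 12.5% and 50% figures in the comment above fall straight out of how Task Manager computes its overall number. A tiny sketch (function name is my own):

```python
def overall_utilization(busy_threads, logical_threads):
    # Task Manager's overall figure is just busy logical processors
    # divided by the total, so it hides which cores do the work and
    # ignores that two hyperthreads share one physical core.
    return 100.0 * busy_threads / logical_threads

# One thread pegged on a 4C/8T CPU reads as 12.5%.
print(overall_utilization(1, 8))   # 12.5
# Four threads pegged (one per physical core) reads as 50%, even
# though hyperthreading would only add roughly 10% more throughput.
print(overall_utilization(4, 8))   # 50.0
```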

  • @CNK

    "Can someone explain in the simplest way possible why games (and other applications) are able to reach 90-100% on the CPU"

    Forgot about that one. Consider games. In many ways they are much simpler apps than something like Hitfilm. It is easier for them to keep the CPU and GPU pipelines moving and not stalling. They simulate a single 3D environment: you compute one frame for time T, then compute the next frame for time T+1, and so on. The gameplay logic, physics and such are easily asynchronous to the graphics display of said "universe". This allows you to keep as much computation going on in parallel as possible, which better utilizes multiple cores. For graphics, even using multiple GPUs via SLI is what they want; combining multiple GPUs into a logical virtual GPU works for them.

    Now consider something like Hitfilm. We have many layers, each generally independent, and they are then composited together in a given order; in Hitfilm, bottom-to-top layer/track order with some special rules. Media decode in Hitfilm may be asynchronous to the timeline, so let's assume it is. Decode is CPU.

    Can we compute/composite multiple layers in parallel? Nope, since Hitfilm uses the GPU, a single GPU, for all effects and compositing. Technically you could, but that would slow things down by adding the overhead of a single GPU trying to do two separate things. You could do the graphics dispatch logic (CPU) for multiple layers in parallel, but without parallel GPUs why put in the effort. Unless of course it helps; I have no way of knowing that level of detail. For apps like Hitfilm, using multiple GPUs is probably easier done not the virtual-GPU SLI way of games, but by using each GPU independently. With two GPUs you could fully compute two layers in parallel, each layer on its own GPU. This all assumes you are using a bunch of effects, because without those the GPU is not doing much of anything.

    So, to trivialize the comparison: games have a single complex scene (lots of stuff), which makes it much easier to pipeline all the things you do. Besides being asynchronous, with minimal need to sync, keeping the pipeline moving is everything. Hitfilm is generally made up of lots of items that are combined to build up the scene. Each item may not be very complex, and as such it is hard to keep the pipeline filled and thus moving. Anytime something stalls, portions of the CPU and/or GPU are waiting for something to do; something else has to finish first. That waiting shows up as lower GPU utilization, assuming the app in general could fully utilize the resources absent stalls.

    IMO Hitfilm is very synchronous even disregarding the single GPU thing.
