Video Encoding Tested: AMD GPUs Still Lag Behind Nvidia, Intel

First of all, thanks for the test!

What surprises me a bit are the generational regressions/performance changes, e.g. within the Nvidia cards: I find it hard to imagine that someone at Nvidia would somehow reimplement a lesser IP block for a codec because there was a shift in popularity. It would be cut & paste wherever possible, and unless there was a specific need to improve or fix a bug, such blocks would be left alone. Of course clock rates or energy budgets might change somewhat between generations, and a different fab node means the physical design might change with a performance impact. But anything outside a few percent?

When it comes to multiple encoder blocks, I wonder if they really can speed up a single encode: I'd be more inclined to believe that they simply allow for additional transcoding queues and target cloud use.

I am utterly astonished at the AV1 software encoding performance figures you have obtained. Admittedly I am rather lazy and use Handbrake, but Handbrake is little more than a front end for FFmpeg and should not perform vastly differently. While FFmpeg just got bumped to 6.0, I assume you still used 5.1, which is what Handbrake 1.6.1 would also contain, and it's the SVT-AV1 encoder by default in both cases, I believe.

I am getting 17 FPS encoding AV1 at 2500kbps on a 5800X3D and 26 FPS on a 5950X from a 1080p source, far below the better-than-real-time speeds you and others here are reporting. Both H.264 and HEVC are significantly faster, but not the triple-digit FPS the hardware codecs achieve.

So I wonder if you could post the option set for the software encodes as well... and btw: there seems to be an error (I hope!) in your "tuned Intel" parameters, as they call the _amf instead of the _qsv codec ;-)
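
For anyone double-checking their own command lines: in FFmpeg the Intel Quick Sync encoders end in _qsv, the AMD AMF ones in _amf, and Nvidia's in _nvenc. A quick bash-style way to see which families a given build actually exposes (use findstr instead of grep on Windows):

Code:
# The exact list depends on how the ffmpeg build was configured.
ffmpeg -hide_banner -encoders | grep -E "_qsv|_amf|_nvenc"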
 
It'll depend on the codec you're using.

AV1 has... 4 encoders out there? OBS I know has 2 (SVT and AOM) and one of them (SVT) is super punishing on the CPU, but (in theory) delivers better quality. I haven't found a visual quality difference at 6 Mbps (the Twitch cap) at 1080p (downsized from VR's viewport in OBS). This is for "1-pass", on-the-fly encoding for streaming and recording. I haven't done higher bitrates, as capping at 6 Mbps makes the most sense to me because of Twitch and Discord (although they don't expose any encoding flags or anything).

For "proper" encoding I use VirtualDubMod nowadays with any codec I want on raw sources. It's a tad painful at times, but it's great.

Regards.
 
I think your VMAF scores are wrong.

From 1Encodes CPU-13900K-Medium.txt, I got this sample VMAF command line:

ffmpeg.exe -i BL3-Seq1-1080p-CPU265-13900K-Medium-3M.mp4 -i BL3-Seq1-1080p.mp4 -lavfi [0:v]setpts=PTS-STARTPTS[reference];[1:v]setpts=PTS-STARTPTS[distorted];[distorted][reference]libvmaf=n_threads=20 -f null -

But you mixed up the labels: [0:v] is tagged as [reference] and [1:v] as [distorted]

Based on the order given in ffmpeg -i ... -i ... :
0:v is BL3-Seq1-1080p-CPU265-13900K-Medium-3M.mp4 (and is the distorted)
1:v is BL3-Seq1-1080p.mp4 (and is the reference)

The above command gives VMAF 79.897865

The correct VMAF command line should be:

ffmpeg.exe -i BL3-Seq1-1080p-CPU265-13900K-Medium-3M.mp4 -i BL3-Seq1-1080p.mp4 -lavfi [1:v]setpts=PTS-STARTPTS[reference];[0:v]setpts=PTS-STARTPTS[distorted];[distorted][reference]libvmaf=n_threads=20 -f null -

and gives a VMAF of 70.303203

The order in VMAF is VERY important. As you can see, it gives entirely different results.
See also here: https://stackoverflow.com/questions/67598772/right-way-to-use-vmaf-with-ffmpeg
Ugh. The good news is I have all the files and can easily make a script to recalc the VMAF. I wonder if I always had it wrong or accidentally shifted the order at some point? Stay tuned...
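
In case it's useful to anyone re-checking their own numbers, here's a minimal bash-style sketch of such a recalculation loop, using the corrected label order from the post above. The file-name pattern and thread count are assumptions based on the names in this thread, not the actual script:

Code:
# Hypothetical layout: all encodes sit next to the BL3-Seq1-1080p.mp4 source clip.
for f in BL3-Seq1-1080p-*-3M.mp4; do
  echo "== $f"
  ffmpeg -hide_banner -i "$f" -i BL3-Seq1-1080p.mp4 \
    -lavfi "[1:v]setpts=PTS-STARTPTS[reference];[0:v]setpts=PTS-STARTPTS[distorted];[distorted][reference]libvmaf=n_threads=20" \
    -f null - 2>&1 | grep "VMAF score"
done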
 
It'll depend on the codec you're using.

AV1 has... 4 encoders out there? OBS I know has 2 (SVT and AOM) and one of them (SVT) is super punishing on the CPU, but (in theory) delivers better quality.

That's what I wondered also, but everybody just seems to converge on SVT-AV1 for software, which is the FFmpeg (and Handbrake) default.

It's 'punishing' in the sense that it uses every ounce of CPU power it can find, but it's supposed to reward you with the best performance. AOM, by contrast, was supposed to be functionally complete but very little optimized, at least initially.

From what I've read there are other custom software encoders out there, but they are proprietary and niche.
 
I'm just using the built-in libsvtav1, nothing special AFAIK. Here's the 1080p AV1 encoding command:

Code:
ffmpeg -i BL3-Seq1-1080p.mp4 -c:v libsvtav1 -b:v 3M -y -g 120 BL3-Seq1-1080p-CPUAV1-13900K-Medium-3M.mp4
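
If anyone wants to play with the speed/quality trade-off, the libsvtav1 wrapper also exposes an explicit -preset (lower = slower and better) and a -crf mode instead of a target bitrate. A hedged variation on the command above; the preset/CRF values and output name are just for illustration:

Code:
# Illustrative values only, not the settings used for the article.
ffmpeg -i BL3-Seq1-1080p.mp4 -c:v libsvtav1 -preset 8 -crf 35 -g 120 -y BL3-Seq1-1080p-CPUAV1-preset8-crf35.mp4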
 
We tested multiple generations of AMD, Intel, and Nvidia GPUs to look at both encoding performance and quality. Here's how the various cards stack up.

Video Encoding Tested: AMD GPUs Still Lag Behind Nvidia, Intel : Read more
Easily the most helpful article I have read this year; it answered many of my questions. I live stream with vMix at 4K for corporate events. Typically, using my AMD 7950X / Nvidia RTX A4000 rig, I am able to easily stream 4K/60 @ 200 Mbps while simultaneously recording four ISOs and the program at 70 MB/s each. Still, I am always looking for more, and seriously contemplating Intel Arc for an associated NDI rig I will build with a 13700K. I do test a bit with OBS, although my vMix Pro gives me dial-in guests, multi-streaming, and much more stability. I have been wanting to test AV1; today I avoid HEVC and stick with H.264. When is Intel's next high-end discrete card expected? October?
 
I'm just using the built-in libsvtav1, nothing special AFAIK. Here's the 1080p AV1 encoding command:

Code:
ffmpeg -i BL3-Seq1-1080p.mp4 -c:v libsvtav1 -b:v 3M -y -g 120 BL3-Seq1-1080p-CPUAV1-13900K-Medium-3M.mp4

I can confirm that I get close to 300 FPS on my Ryzen 9 5950X with SVT-AV1 now, using FFmpeg (v6) directly and your parameters.

I even get much smaller files at larger data rates, and on top of that MSU VQMT reports excellent VMAF scores...

Either I have my brain in a twist or Handbrake really manages to mess up the performance very badly: I'll keep on testing to find an answer.

For now the only major difference that I see is audio, where I use DTS pass-through on Handbrake, but that should lighten the compute load, if anything.
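
One way to rule audio out entirely when comparing Handbrake and plain FFmpeg runs is to copy the track untouched or drop it; a quick sketch (input/output names are placeholders, and MKV is used because DTS in MP4 is awkward):

Code:
# Copy the audio as-is alongside the AV1 video encode...
ffmpeg -i input.mkv -c:v libsvtav1 -b:v 3M -g 120 -c:a copy output.mkv
# ...or drop audio completely for a pure video-speed comparison.
ffmpeg -i input.mkv -c:v libsvtav1 -b:v 3M -g 120 -an output-video-only.mkv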
 
Thanks for the very comprehensive test. As I am doing quite a bit of video editing myself, this was actually an incredibly useful and informative read!

One question, though: Why didn't you incorporate at least some real video footage alongside those games? The material would have been much more complex than the gaming stream captures and thus challenge those encoders in a somewhat different way.
 
It actually depends a lot on the type of video you use. Some movies, for example, have even less movement (relatively speaking) than a game, because constantly panning camera shots can be annoying to viewers. I also didn't want lengthier encodes; as you can imagine, with ~380 different encodings done, it takes plenty long even with ~30 second clips.

But the biggest issue is getting very high bitrate (and thus quality) source material. I basically want 4K and 60 fps, though I suppose 4K 30fps or 24fps might suffice. Like, downloading a YouTube 4K video would result in a source that's at best maybe 16Mbps or so, and thus re-encoding that to 8Mbps will yield far higher VMAF scores than starting with a source that's 100Mbps H.264.

Big Buck Bunny had a 4K 60fps version available for download. It's encoded at 8Mbps. The Doctor Strange: Multiverse of Madness trailer, meanwhile, was about 12Mbps. Like I said, getting source material that starts out at very high quality is difficult, unless you rip a Blu-ray maybe, but I don't even own any Blu-rays that would qualify.

I suspect there's less difference between typical movies and the game recordings I used than you might imagine. But also, this was about streaming quality, where someone would use a GPU in the first place, which inherently implies game streaming is more likely than anything else.
 
That's what I wondered also, but everybody just seems to converge on SVT-AV1 for software, which is the FFmpeg (and Handbrake) default.

It's 'punishing' in the sense that it uses every ounce of CPU power it can find, but it's supposed to reward you with the best performance. AOM, by contrast, was supposed to be functionally complete but very little optimized, at least initially.

From what I've read there are other custom software encoders out there, but they are proprietary and niche.

Personally, I've found aom-av1 more efficient in quality/bitrate terms than svtav1 in ffmpeg encodes. But svtav1 has better multithreading, and in some (few) encodes it gives better results than aom-av1.
 
Ugh. The good news is I have all the files and can easily make a script to recalc the VMAF. I wonder if I always had it wrong or accidentally shifted the order at some point? Stay tuned...

That's really good news. Perhaps the results would surprise you. I ran some tests myself (not on a 13900K, on my 12400):

VMAF File
70.303203 BL3-Seq1-1080p-CPU265-13900K-Medium-3M.mp4
75.315422 BL3-Seq1-1080p-CPUAV1-13900K-Medium-3M.mp4

I ran some tests on quality presets that approximate 3M (but using QP):
82.707927 BL3-Seq1-1080p_aom-av1-yuv420p12le.1pass.quality48.CPUPreset4.0.MP4
76.217054 BL3-Seq1-1080p_x265-yuv420p12le.1pass.quality33.PresetSlow.MP4

You can see that the more CPU-punishing presets (x265 slow, aom-av1 CPU preset 4.0) give much, much better quality.
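
For reference, runs along those lines would look roughly like the following in plain ffmpeg. The CRF/preset values and pixel format are inferred from the file names above, so treat this as a sketch rather than the actual commands used (12-bit output also needs libaom/x265 builds with 10/12-bit support):

Code:
# Constant-quality aom-av1 (cpu-used 4) and x265 (preset slow); values are guesses from the file names.
ffmpeg -i BL3-Seq1-1080p.mp4 -c:v libaom-av1 -crf 48 -b:v 0 -cpu-used 4 -pix_fmt yuv420p12le out-aom-crf48.mp4
ffmpeg -i BL3-Seq1-1080p.mp4 -c:v libx265 -crf 33 -preset slow -pix_fmt yuv420p12le out-x265-crf33.mp4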
 
So everyone went with Nvidia's NVENC

This has been a really useful tool for me.
It allows even my older PCs to encode at a decent speed with Handbrake.

I record videos often: literally almost every co-op with friends on the Europe and Asia servers of Genshin, because I want to preserve and re-watch the fun times. Then I re-encode the files with Handbrake NVENC to make them easier to store on flash drives. :)

edit: a small tip for people interested in using Handbrake to encode videos: use "constant frame rate", as it allows for better quality.
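
For anyone doing that kind of NVENC re-encode from the command line instead of the Handbrake GUI, ffmpeg can force a constant frame rate too. A sketch with made-up file names and bitrate (-fps_mode needs ffmpeg 5.1+; older builds use -vsync cfr):

Code:
# Re-encode a recording to constant 60 fps H.264 with NVENC, copying the audio.
ffmpeg -i coop-recording.mkv -c:v h264_nvenc -preset p5 -b:v 8M -fps_mode cfr -r 60 -c:a copy coop-recording-cfr.mp4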
 
OMG. That's still a thing? I last used it like 15 years ago!

Since I discovered AviDemux, I've switched to Linux for what little video tweaking & transcoding I've needed to do.
I don't know if it's "a thing" still, but at the very least it is still being used, for sure. There's VirtualDub2 and Mod, so I use one or the other for specific things (different optimizations). Mod was really good for MKV in the early days when it was a new container, but I believe VD2 has basically replaced all other versions and revisions.

Making multiple passes in long queues is so seamless; I love it.

Regards.
 
That's really good news. Perhaps the results would surprise you. I ran some tests myself (not on a 13900K, on my 12400):

VMAF File
70.303203 BL3-Seq1-1080p-CPU265-13900K-Medium-3M.mp4
75.315422 BL3-Seq1-1080p-CPUAV1-13900K-Medium-3M.mp4

I ran some tests on quality presets that approximate 3M (but using QP):
82.707927 BL3-Seq1-1080p_aom-av1-yuv420p12le.1pass.quality48.CPUPreset4.0.MP4
76.217054 BL3-Seq1-1080p_x265-yuv420p12le.1pass.quality33.PresetSlow.MP4

You can see that the more CPU-punishing presets (x265 slow, aom-av1 CPU preset 4.0) give much, much better quality.
All of the charts and text are now fully updated with corrected VMAF scores. There's also a new torrent file, because I wanted to rename files, add the new logs, etc. VMAF scores dropped anywhere from, I think, 10% to as much as (nearly) 40% in some cases. AMD's older GPUs doing 4K at 8Mbps with H.264 had a particularly rough time, with results of ~33 VMAF on the Borderlands 3 4K video.
 
I'm glad you took the time to include the integrated graphics. However, I'm very disappointed the iGPU has such low throughput, especially at 4K. I would think Intel would be too lazy to change the encoder block much between implementations, so I wonder if it's bottlenecked by memory speed, or something else.
 
To be fair, all hardware encoding looks significantly worse than software encoding, at least for consumer chips.

I wish that were not the case, since any GPU is so much faster. I've honestly just bought more disk space for all my recordings, since I might as well just save them in the original recording format (though those are usually encoded in MP4/H.264 or something of the sort; I don't have pro hardware) and just not worry too much about it, since VLC is now on mobile devices too.

Though it does make a lot of sense: AMD is just now killing it with GPUs, and this is quite the edge use case for the consumer cards.
 
Can you please make the graphs readable? You used 5 shades of blue!!!
Did you click the full-screen icon? That helps a lot.

One thing that makes it tricky is that some of the line styles are just an outline. That results in a pair of lines for some of the GPUs. I think the main issue is just that they cluster so tightly that there's a lot of overlapping.

Although it's usually considered bad practice to offset graphs so they don't start at 0, it might be permissible when you have so many lines that are bunched so closely together.
 
I am sure the first kind of graph was aimed at women, with those shades of grayish blue. And it adds to the confusion that the continuous and dashed lines don't match the color-code samples ...
 
Can you please make the graphs readable? You used 5 shades of blue!!!
I don't know quite how Excel ended up with those colors, but there's also a ton of overlap. I'm working to write some VBA to update all the colors to be the same, as otherwise trying to manually do it for 13 cards in 12 charts sucks.

There is an image with the full table of data lower in the article if you're really looking for precision, though.
 
If you need help with this, DM me. Send me the code and the data, not the workbook (sorry, I don't open workbooks from the internet).
It's done now. I wrote lots of VBA stuff for these charts already; I just never explicitly set the line colors. I've cleaned things up as well, so hopefully the various lines (where they're not fully obscured) are a bit easier to pick out. Lots of AMD overlapping AMD and Nvidia overlapping Nvidia, though. I could potentially drop some of those lines, but the data's all there now, so...
 