Support AMD GPU encoding #45

Open
tve opened this issue Jul 2, 2022 · 19 comments
Labels
enhancement New feature or request ffmpeg possibly-fixed Maybe fixed already, but I can't test it or would like confirmation

Comments

@tve

tve commented Jul 2, 2022

I just got a new box with an integrated AMD GPU. Of course, getting hardware encode/decode to work cost a bunch of hair off my head... I'm using Linux; it's possible that the 'AMF' drivers on Windows make things easier, dunno.

The simplest profile settings AFAIK are:

  "vaapi": {
    "input": ["-hwaccel", "vaapi"],
    "output": ["-vcodec", "h264_vaapi"]
  },

The result is:

Impossible to convert between the formats supported by the filter 'Parsed_overlay_0' and the filter 'auto_scale_1'

which is ffmpeg's way to say that the output of the overlay filter can't be piped like that into the h264_vaapi encoder 'cause the latter expects the frame to be in the hardware/gpu. What's needed is a 'hwupload' filter that does the upload to the gpu memory. E.g., in FFMPEGOverlay instead of

            "-filter_complex", f"[0:v][1:v]overlay{filter_extra}",

it needs

            "-filter_complex", f"[0:v][1:v]overlay{filter_extra},hwupload",

nice, eh?

BTW, I noticed that ffmpeg has an overlay_vaapi filter, so this would mean the decoded video frames would stay in gpu, get overlaid, and then encoded. Sadly the AMD vaapi driver doesn't support that... I believe the Intel one might.
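
A minimal sketch of the change described above, assuming a hypothetical helper named `overlay_filter` (this is not the project's actual code). It appends `hwupload` only when the selected encoder is a VAAPI one; `filter_extra` is the name used in the snippet above, everything else is illustrative.

```python
def overlay_filter(vcodec: str, filter_extra: str = "") -> str:
    """Build the -filter_complex value that overlays input 1 on input 0.

    Hypothetical helper: VAAPI encoders (h264_vaapi, hevc_vaapi, ...)
    expect frames in GPU memory, so the chain ends with hwupload for them.
    """
    graph = f"[0:v][1:v]overlay{filter_extra}"
    if vcodec.endswith("_vaapi"):
        # Upload the composited software frame to the GPU for the encoder.
        graph += ",hwupload"
    return graph


print(overlay_filter("libx264"))
print(overlay_filter("h264_vaapi"))
```

Software encoders such as libx264 keep the original filter string, so existing profiles would behave as before.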

@time4tea
Owner

time4tea commented Jul 2, 2022

This is great info, thanks for the detective work. I'll see if I can find a way to introduce this to the profile concept. Ffmpeg is super powerful but it doesn't seem to abstract the complexity away sometimes...

@tve
Author

tve commented Jul 3, 2022

Every time I need to do something different with ffmpeg I have to spend time looking up docs, blogs, and stackexchange...

After digging into it, I'm not sure whether it's worth pursuing the vaapi encoding. The quality I'm getting is crap. The only really useful way to use it (for me) is with constant-quality mode (-qp flag). The default quality (-qp 20) is good, but the file size is ~2.5x the original. I find -qp 23 at the limit of what I'd accept (stuff just starts to get soft) and the file size is still ~2x the original. -qp 24 is noticeably soft and the file size is still 2x.

Compared to libx264... the veryfast preset you use by default is 20% slower than vaapi encoding (vaapi takes the same time regardless of setting), produces a file a bit over half the size of the original, and the quality is decent, perhaps similar to -qp 23 above. The superfast preset is faster than vaapi for me and produces a file whose size falls between the veryfast output and the original. Then there's also ultrafast, which is good quality and very fast, but produces a file a tad larger than the original.

The one big caveat is that I'm using an AMD Ryzen 9 5900HX with integrated GPU. I don't know and have not found any info on how the video encoding block on that iGPU compares to those on higher-end discrete AMD GPUs. I also don't know what limitations the Linux VAAPI driver has vs. the actual HW capabilities that may be accessible on Windows. I do find complaints that the AMD HW doesn't produce B-frames, which I can verify looking at the files produced.

Anyway, I'm planning to use VAAPI decoding (no quality harm there) with the libx264 ultrafast preset to get an initial rendering, and then redo it using the medium preset. (Medium produces great quality and files ~60% the size of the original, but takes 2x as long as the veryfast preset.)
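
The plan above (VAAPI decode, libx264 encode) could be expressed as two profiles along these lines. This is a sketch only: the profile names are made up for illustration, and the `input`/`output` keys follow the profile format shown at the top of this issue.

```python
import json


def vaapi_decode_x264_encode(preset: str) -> dict:
    """Profile that decodes on the GPU and encodes with libx264 on the CPU."""
    return {
        # No -hwaccel_output_format here, so ffmpeg copies decoded frames
        # back to system memory, where the overlay filter and libx264 run.
        "input": ["-hwaccel", "vaapi"],
        "output": ["-vcodec", "libx264", "-preset", preset],
    }


# Hypothetical profile names: a quick first pass and a slower final render.
profiles = {
    "vaapi-decode-draft": vaapi_decode_x264_encode("ultrafast"),
    "vaapi-decode-final": vaapi_decode_x264_encode("medium"),
}

print(json.dumps(profiles, indent=2))
```

Because the frames come back to system memory, no `hwupload` step is needed before the encoder in this setup.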

@time4tea
Owner

Based on your comment, "I'm not sure whether it's worth pursuing the vaapi encoding" - I wasn't planning to do anything with AMD GPU support. If that's not what you meant, please let me know!
In any case, I don't have an AMD GPU, so I'd be relying on you completely for implementation information... :-)

@time4tea time4tea added enhancement New feature or request ffmpeg labels Jul 22, 2022
@time4tea time4tea added the not-planned Its not quite like wontfix, but no current plans to implement, could change label Aug 5, 2022
@DemiMarie

@tve what quality can one get with libx264 with the same file size as -qp 20?

@time4tea
Owner

I don't know if anyone has experimented much with AMD GPU settings, but if there are recommendations, I'd be happy to include them in the documentation.
I don't have an AMD GPU, so I can't offer much, I'm afraid...

@time4tea
Owner

time4tea commented Jan 8, 2023

@tve I revisited the vaapi config a little bit, and I think that I can make the config possible, by adding an optional "filter" parameter to the profile. Did you have any success with getting vaapi to work? What parameters did you use? Thanks!!

@tve
Author

tve commented Jan 9, 2023

I did not pursue vaapi further after my last comment above. I'm using libx264 and the veryfast setting.

@time4tea
Owner

time4tea commented Jun 4, 2023

Since 0.93.0, which added support for the input/filter/output settings in the "profiles" configuration, this should be possible.

I don't know the settings, but check the PERFORMANCE_GUIDE doc; the same sort of thing should work for vaapi...

@paxunix

paxunix commented Jan 28, 2024

Just wanted to confirm that the "filter" property in profiles does indeed let you use vaapi. I'm currently using:

{
  "vaapi": {
    "input": [
      "-hwaccel", "vaapi",
      "-hwaccel_device", "/dev/dri/renderD128",
      "-hwaccel_output_format", "vaapi"
    ],
    "filter": "[1:v]format=rgba,hwupload[overlay];[0:v][overlay]overlay_vaapi",
    "output": [
      "-vcodec", "h264_vaapi",
      "-movflags", "faststart"
    ]
  }
}

@time4tea
Owner

This is fantastic info. Thank you for sharing it.

@time4tea time4tea added possibly-fixed Maybe fixed already, but I can't test it or would like confirmation and removed not-planned Its not quite like wontfix, but no current plans to implement, could change labels Jan 28, 2024
@igutidze

Unfortunately, the AMD driver on Linux does not support the overlay_vaapi filter, so this is what I'm using as an alternative to a full vaapi pipeline:

{
  "vaapi": {
    "input": ["-hwaccel", "vaapi", "-hwaccel_output_format", "vaapi"],
    "filter": "[0:v]hwdownload,format=nv12[a],[a][1:v]overlay,hwupload",
    "output": ["-c:v", "hevc_vaapi", "-b:v", "25M"]
  }
}

It should work on all platforms. Shall I create a PR?

@time4tea
Owner

Hi. This is great info. Maybe we can call it vaapi-linux? Then it can be added to the built-in profiles. Don't worry about a PR, I can add it.

Thanks!

@igutidze

@time4tea as you see fit!

@igutidze

Btw, you can use either h264_vaapi or hevc_vaapi; I just preferred HEVC in my particular case.

@RaveGun

RaveGun commented Aug 27, 2024

Hello,
I've read all the above and I am still not sure if it is possible to use the HW acceleration just for generating the overlay.

I tried all the above overlay configurations and still get the:
Impossible to convert between the formats supported by the filter 'Parsed_overlay_0' and the filter 'auto_scale_1'
error.

I am on Ubuntu 24.04 with a 6650TX Radeon.

Can generating the overlay be accelerated?

Thanks.

@igutidze

@RaveGun what command line are you using and where do you add the ffmpeg configuration?

@RaveGun

RaveGun commented Aug 29, 2024

@igutidze I have a file created at this location:
~/.gopro-graphics/ffmpeg-profiles.json

And the command line is:
venv/bin/gopro-dashboard.py --units-speed kph --use-gpx-only --gpx=../GPX/25.08.2024.gpx --layout-xml=layout.xml --profile=overlayhw --overlay-size=1920x1080 25.08.2024_lres.mp4

And the overlayhw profile has changed many times. Currently it looks like this:

  "overlayhw": {
    "input": [],
    "filter": "[0:v]hwdownload,format=nv12[a],[a][1:v]overlay,hwupload",
    "output": ["-vcodec", "h264_vaapi", "-q:v", "65"]
  }, 

I am not happy with the system's performance when editing 4K videos anyway. I am not a content creator, so this will be a one-time event for which I have to use the computer for 4K video editing.

@igutidze

@RaveGun please add the --show-ffmpeg command-line switch and paste the full ffmpeg execution options here. It should look something like this:

Executing [PosixPath('bin/ffmpeg'), '-y', '-hide_banner', '-loglevel', 'info', '-hwaccel', 'vaapi', '-hwaccel_output_format', 'vaapi', '-i', '/home/irakli/Videos/GH010098.MP4', '-f', 'rawvideo', '-framerate', '10.0', '-s', '2704x1520', '-pix_fmt', 'rgba', '-i', '-', '-filter_complex', '[0:v]hwdownload,format=nv12[a],[a][1:v]overlay,hwupload', '-c:v', 'h264_vaapi', '-b:v', '25M', '-movflags', 'faststart', '/home/irakli/Videos/champ-dashboard.mp4']
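
To make it easier to see where each part of a profile ends up, here is a rough reconstruction of how the logged command above lines up with the `input`, `filter` and `output` settings. The function `build_ffmpeg_args` and the file names are hypothetical; the argument order is simply read off the log line, not taken from the project's source.

```python
def build_ffmpeg_args(profile, video, output, size,
                      framerate="10.0", ffmpeg="ffmpeg"):
    """Assemble an ffmpeg argv in the same order as the logged command."""
    return [
        ffmpeg, "-y", "-hide_banner", "-loglevel", "info",
        *profile.get("input", []),           # profile "input" precedes the video
        "-i", video,                         # input 0: the source video
        "-f", "rawvideo", "-framerate", framerate,
        "-s", size, "-pix_fmt", "rgba",
        "-i", "-",                           # input 1: overlay frames on stdin
        "-filter_complex", profile["filter"],
        *profile.get("output", []),          # profile "output" precedes the file
        output,
    ]


profile = {
    "input": ["-hwaccel", "vaapi", "-hwaccel_output_format", "vaapi"],
    "filter": "[0:v]hwdownload,format=nv12[a],[a][1:v]overlay,hwupload",
    "output": ["-c:v", "h264_vaapi", "-b:v", "25M", "-movflags", "faststart"],
}

args = build_ffmpeg_args(profile, "input.mp4", "output.mp4", "2704x1520")
print(" ".join(args))
```

If the `-hwaccel` options or the `-filter_complex` string are missing from the `--show-ffmpeg` output, the profile is probably not being picked up the way it was intended.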

@RaveGun

RaveGun commented Aug 29, 2024

I did the test:

Executing ['ffmpeg', '-hide_banner', '-y', '-hide_banner', '-loglevel', 'info', '-f', 'rawvideo', '-framerate', '10.0', '-s', '1920x1080', '-pix_fmt', 'rgba', '-i', '-', '-r', '30', '-vcodec', 'h264_vaapi', '-q:v', '65', '25.08.2024_lres.mp4']
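
One way to read a logged command like this is to list which inputs and hardware-related options it actually contains. The helper below, `describe_ffmpeg_args`, is a hypothetical sketch for that purpose. For the command above it reports a single input (the rawvideo overlay on stdin), no `-filter_complex` and no `-hwaccel`, so the profile's "filter" string did not reach ffmpeg in this run. Whether that is a consequence of `--use-gpx-only` is an assumption; it is not confirmed anywhere in this thread.

```python
def describe_ffmpeg_args(args):
    """Summarise the inputs and hardware-related options in an ffmpeg argv."""
    # Every "-i" is followed by its input ("-" means stdin).
    inputs = [args[i + 1] for i, a in enumerate(args[:-1]) if a == "-i"]
    return {
        "inputs": inputs,
        "has_filter_complex": "-filter_complex" in args,
        "has_hwaccel": "-hwaccel" in args,
    }


# The command exactly as logged in the comment above.
logged = ['ffmpeg', '-hide_banner', '-y', '-hide_banner', '-loglevel', 'info',
          '-f', 'rawvideo', '-framerate', '10.0', '-s', '1920x1080',
          '-pix_fmt', 'rgba', '-i', '-', '-r', '30',
          '-vcodec', 'h264_vaapi', '-q:v', '65', '25.08.2024_lres.mp4']

print(describe_ffmpeg_args(logged))
```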


6 participants