Server not using scale_cuda for 4k HEVC


4 replies to this topic

#1 terahz

  • Members

Posted 22 October 2019 - 10:51 PM

Hi,

 

I just want to report that for some HEVC files, Emby doesn't use full CUDA acceleration and thus can't keep up when transcoding a single 4K HEVC stream to 1080p H.264.

 

ex:

4K H.264 (h264 (High) (avc1 / 0x31637661), yuv420p(tv, bt709, progressive), 3840x2160 [SAR 1:1 DAR 16:9], 82970 kb/s, Level 51, 30 fps, 30 tbr, 30 tbn, 60 tbc (default)) to 1080p H.264: about 150 fps (h264_to_h264.txt.gz)

 

/opt/emby-server/bin/ffmpeg -hwaccel cuda -hwaccel_device 0 -hwaccel_output_format cuda  -c:v h264_cuvid  -f mp4 -i file:"/nfs/testfile.mp4" -threads 0 -map 0:0 -map 0:1 -sn -c:v:0 h264_nvenc -filter_complex "[0:0]scale_cuda=w=trunc(min(max(iw\,ih*dar)\,1920)/2)*2:h=trunc(ow/dar/2)*2"  -b:v:0 14680001 -maxrate 14680001 -bufsize 29360002 -profile:v:0 high -g:v:0 90 -keyint_min:v:0 90 -sc_threshold:v:0 0  -copyts -vsync -1 -codec:a:0 copy -disposition:a:0 default -f segment -max_delay 5000000 -avoid_negative_ts disabled -map_metadata -1 -map_chapters -1 -start_at_zero -segment_time 3  -individual_header_trailer 0 -segment_format mpegts -segment_write_temp 1 -segment_list_type m3u8 -segment_start_number 0 -segment_list "/tmp/transcoding-temp/fbc2e74a36a640812fcb214d61147d5d.m3u8" -y "/tmp/transcoding-temp/fbc2e74a36a640812fcb214d61147d5d%d.ts"

 

4K HEVC (hevc, none, 3840x2160, SAR 1:1 DAR 16:9, 23.98 fps, 23.98 tbr, 1k tbn (default)) to 1080p H.264: about 15 fps (hevc_to_h264.txt.gz)

 

/opt/emby-server/bin/ffmpeg -ss 00:41:45.000 -c:v hevc_cuvid -f matroska -i file:"/nfs/testfile.mkv" -threads 0 -map 0:0 -map 0:3 -sn -c:v:0 h264_nvenc -filter_complex "[0:0]scale=trunc(min(max(iw\,ih*dar)\,1920)/2)*2:trunc(ow/dar/2)*2" -pix_fmt yuv420p  -b:v:0 14616000 -maxrate 14616000 -bufsize 29232000 -profile:v:0 high -g:v:0 72 -keyint_min:v:0 72 -sc_threshold:v:0 0  -copyts -vsync -1 -codec:a:0 libmp3lame -metadata:s:a:0 language=eng -disposition:a:0 default -ac:a:0 2 -ab:a:0 192000 -af:a:0 "volume=2" -f segment -max_delay 5000000 -avoid_negative_ts disabled -map_metadata -1 -map_chapters -1 -start_at_zero -segment_time 3  -individual_header_trailer 0 -segment_format mpegts -segment_write_temp 1 -segment_list_type m3u8 -segment_start_number 835 -segment_list "/tmp/transcoding-temp/93b669574d545f9d56fbffcdd3abe839.m3u8" -y "/tmp/transcoding-temp/93b669574d545f9d56fbffcdd3abe839%d.ts"

 

Modifying the command to add the CUDA hwaccel flags and use scale_cuda, like this, results in a steady 80 fps:

 

/opt/emby-server/bin/ffmpeg -ss 00:41:45.000 -hwaccel cuda -hwaccel_device 0 -hwaccel_output_format cuda -c:v hevc_cuvid -f matroska -i file:"/nfs/testfile.mkv" -threads 0 -map 0:0 -map 0:3 -sn -c:v:0 h264_nvenc -filter_complex "scale_cuda=w=trunc(min(max(iw\,ih*dar)\,1920)/2)*2:h=trunc(ow/dar/2)*2,hwdownload,format=p010le" -pix_fmt yuv420p  -b:v:0 14616000 -maxrate 14616000 -bufsize 29232000 -profile:v:0 high -g:v:0 72 -keyint_min:v:0 72 -sc_threshold:v:0 0  -copyts -vsync -1 -codec:a:0 libmp3lame -metadata:s:a:0 language=eng -disposition:a:0 default -ac:a:0 2 -ab:a:0 192000 -af:a:0 "volume=2" -f segment -max_delay 5000000 -avoid_negative_ts disabled -map_metadata -1 -map_chapters -1 -start_at_zero -segment_time 3  -individual_header_trailer 0 -segment_format mpegts -segment_write_temp 1 -segment_list_type m3u8 -segment_start_number 835 -segment_list "/tmp/transcoding-temp/93b669574d545f9d56fbffcdd3abe839.m3u8" -y "/tmp/transcoding-temp/93b669574d545f9d56fbffcdd3abe839%d.ts"

 

 

Emby version 4.2.1.0

CentOS 7

Nvidia drivers 440.26

Cuda 10.2

 

Also, thanks for the nice work so far! Based on my limited testing, I've already purchased Emby Premiere!




#2 Luke

  • Administrators

Posted 22 October 2019 - 11:28 PM

@softworkz can comment on this. Thanks.



#3 softworkz

  • Developers

Posted 23 October 2019 - 12:27 AM

@terahz - Handling color conversions is currently a weak point. When the source video is 10-bit, we avoid hardware scaling because we can't handle it properly.

 

I understand that one might think this would be easy to fix, which is obviously true for that exact situation. But there are hundreds of different cases that we need to account for, and that's where things get more complex. It's not unsolvable, but for historical reasons we handle input, filtering and output more or less independently of each other, and that model is no longer suited to handling color conversions in combination with hardware acceleration. That's why it is a huge step for us, but we're already working on it!

 

There's no doubt that the command line you're showing performs much better than what Emby currently does. But it's still not the ideal solution, because it copies all video data from GPU memory back to system memory after hardware scaling, converts the color format on the CPU, and then transfers the video data to GPU memory again for encoding.

Ideally, the color conversion would happen in hardware to avoid the copying and CPU processing.
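
To make the two paths concrete, here is a minimal shell sketch that picks a filter chain from the source pixel format. It is only an illustration and not Emby's actual logic: the function name pick_filter_chain and the pixel-format matching are assumptions, while the two filter chains are taken from the commands posted earlier in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not part of Emby or ffmpeg): choose a filter chain
# based on the source pixel format, e.g. as reported by
#   ffprobe -v error -select_streams v:0 -show_entries stream=pix_fmt -of csv=p=0 INPUT

# Scale expression copied from the commands posted in this thread.
SCALE='scale_cuda=w=trunc(min(max(iw\,ih*dar)\,1920)/2)*2:h=trunc(ow/dar/2)*2'

pick_filter_chain() {
  case "$1" in
    *p10le|*p10be|p010le|p010be)
      # 10-bit source: scale on the GPU, then download the frames so the
      # 10-bit -> 8-bit conversion happens on the CPU (the workaround from
      # post #1; the encoder has to upload the frames again afterwards).
      printf '%s,hwdownload,format=p010le\n' "$SCALE"
      ;;
    *)
      # Assumed 8-bit source: frames can stay in GPU memory end to end.
      printf '%s\n' "$SCALE"
      ;;
  esac
}

pick_filter_chain yuv420p10le
pick_filter_chain yuv420p
```

The first call prints the chain with the GPU-to-CPU download appended; the second prints the GPU-only chain. A hardware color conversion would remove the need for the first branch.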

 

We will get there - stay tuned!


Edited by softworkz, 23 October 2019 - 12:27 AM.


#4 terahz

  • Members

Posted 23 October 2019 - 10:46 AM

@softworkz, I completely understand. I'm sure you guys will figure it out. I can't even imagine the number of combinations you have to worry about when tuning ffmpeg.

 

I've seen some other threads about using resizing in the decoder. That yields the best performance for me and there is no need to use a scale filter. I hope that's still on the table, especially when someone selects a manual quality setting.
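
For reference, decoder-side resizing relies on the -resize option of ffmpeg's cuvid decoders. The sketch below only prints a benchmark-style command instead of running it. The helper name print_decoder_resize_cmd and the fixed 1920x1080 target are assumptions for illustration; the thread does not confirm how this behaves with 10-bit sources, and unlike the scale expression above, the decoder does not compute an aspect-correct size on its own.

```shell
#!/bin/sh
# Hypothetical helper: print an ffmpeg command that resizes inside the
# hevc_cuvid decoder (decoder option -resize WIDTHxHEIGHT) and encodes with
# h264_nvenc. Audio is dropped (-an) and output goes to the null muxer, so
# the printed command is only useful for measuring throughput.
print_decoder_resize_cmd() {
  input="$1"   # source file
  size="$2"    # target size, e.g. 1920x1080 (caller must preserve aspect ratio)
  printf 'ffmpeg -hwaccel cuda -hwaccel_output_format cuda -c:v hevc_cuvid -resize %s -i "%s" -an -c:v h264_nvenc -b:v 14616000 -f null -\n' \
    "$size" "$input"
}

print_decoder_resize_cmd /nfs/testfile.mkv 1920x1080
```

No scale or scale_cuda filter appears in the printed command, because the frames leave the decoder already at the target size.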

 

Meanwhile, consider putting a checkbox in the Transcoding section of the settings for HW scaling of HEVC. As it is, those files are not watchable for me on anything that can't handle the direct stream. I'll take inferior quality/color over 10fps :) Also, I can't migrate to Emby until I can get my kids' videos playable by grandparents' tablets and computers. Half of them are done by a camera that shoots HEVC 10bit. My next step is to figure out how to tell Emby to reuse the Optimized Versions I've already generated using that other platform ;)

 

 

Also, thanks for the detailed and quick response. This right here is one of the main reasons I'm planning to switch to Emby. 



#5 softworkz

  • Developers

Posted 23 October 2019 - 03:11 PM

All I can say is that it's a product management decision. If it were up to me, we would already have a few more options to configure.

While I don't advocate an abundant set of options that would require users to be transcoding experts, there are some that would be reasonable, in cases where Emby cannot automatically make the right decisions to adapt to a user's requirements. I hope we'll make some progress in that area soon.





