mbnwa, posted February 16, 2018 (edited)

Hello, I am seeing an issue: with HEVC decoding enabled in the transcode section, playing a 4K video that needs to be transcoded due to bandwidth limits gives me only around 19 FPS via the P4000, with 12-18% utilization. If I disable HEVC in the hardware decode settings and let the CPU handle it, I get around 45-50 FPS. I want to offload all transcoding tasks to the P4000. I previously had a K2200, which worked great for non-4K content but did not support HEVC, hence the upgrade to the P4000.

C:\...\MediaBrowser == Junction link to my RAID SSD array
S:\...\Media Storage\ == 45 drive RAID 10 array (platter based + RAM cache backed)
T:\...\Transcode == RAID SSD array

ffmpeg version git-2017-12-31-2906363 Copyright © 2000-2017 the FFmpeg developers

Here is the FFmpeg command that is being executed according to the logs:

C:\Users\BLAH\AppData\Roaming\MediaBrowser-Server\System\ffmpeg.exe -c:v hevc_cuvid -i file:"S:\Media Storage\Media\MEDIA\Meida (4K UHD).mkv" -threads 0 -map 0:0 -map 0:1 -map -0:s -codec:v:0 h264_nvenc -vf "scale=trunc(min(max(iw\,ih*dar)\,720)/2)*2:trunc(ow/dar/2)*2" -pix_fmt yuv420p -preset default -b:v 1116000 -maxrate 1116000 -bufsize 2232000 -profile:v high -force_key_frames "expr:if(isnan(prev_forced_t),eq(t,t),gte(t,prev_forced_t+3))" -copyts -vsync -1 -codec:a:0 libmp3lame -ac 2 -ab 384000 -af "volume=2" -f segment -max_delay 5000000 -avoid_negative_ts disabled -map_metadata -1 -map_chapters -1 -start_at_zero -segment_time 3 -individual_header_trailer 0 -segment_format mpegts -segment_list_type m3u8 -segment_start_number 0 -segment_list "T:\transcoding-temp\4095b6eda9031249089b2bba97c7cbfb.m3u8" -y "T:\transcoding-temp\4095b6eda9031249089b2bba97c7cbfb%d.ts"

Stream mapping:
  Stream #0:0 -> #0:0 (hevc (hevc_cuvid) -> h264 (h264_nvenc))
  Stream #0:1 -> #0:1 (truehd (native) -> mp3 (libmp3lame))
Press [q] to stop, [?] for help
[segment @ 000002baa812c900] Opening 'T:\transcoding-temp\4095b6eda9031249089b2bba97c7cbfb0.ts' for writing
Output #0, segment, to 'T:\transcoding-temp\4095b6eda9031249089b2bba97c7cbfb%d.ts':
  Metadata:
    encoder : Lavf58.3.100
  Stream #0:0: Video: h264 (h264_nvenc) (High), yuv420p, 720x404 [SAR 404:405 DAR 16:9], q=-1--1, 1116 kb/s, 23.98 fps, 90k tbn, 23.98 tbc
    Metadata:
      encoder : Lavc58.9.100 h264_nvenc
    Side data:
      cpb: bitrate max/min/avg: 1116000/0/1116000 buffer size: 2232000 vbv_delay: -1
  Stream #0:1: Audio: mp3 (libmp3lame), 48000 Hz, stereo, fltp (24 bit), 384 kb/s (default)
    Metadata:
      encoder : Lavc58.9.100 libmp3lame
frame= 3 fps=0.0 q=18.0 size=N/A time=00:00:00.21 bitrate=N/A speed=0.43x
frame= 14 fps= 13 q=10.0 size=N/A time=00:00:00.72 bitrate=N/A speed=0.689x
frame= 25 fps= 16 q=10.0 size=N/A time=00:00:01.12 bitrate=N/A speed=0.713x
frame= 36 fps= 17 q=10.0 size=N/A time=00:00:01.63 bitrate=N/A speed=0.775x
frame= 47 fps= 18 q=10.0 size=N/A time=00:00:02.04 bitrate=N/A speed=0.77x
frame= 56 fps= 18 q=10.0 size=N/A time=00:00:02.44 bitrate=N/A speed=0.771x
frame= 66 fps= 18 q=10.0 size=N/A time=00:00:02.85 bitrate=N/A speed=0.776x
frame= 77 fps= 18 q=10.0 size=N/A time=00:00:03.36 bitrate=N/A speed=0.796x
frame= 87 fps= 18 q=22.0 size=N/A time=00:00:03.74 bitrate=N/A speed=0.789x
frame= 97 fps= 18 q=21.0 size=N/A time=00:00:04.15 bitrate=N/A speed=0.785x
frame= 107 fps= 18 q=26.0 size=N/A time=00:00:04.56 bitrate=N/A speed=0.785x
frame= 116 fps= 18 q=27.0 size=N/A time=00:00:04.94 bitrate=N/A speed=0.779x
frame= 126 fps= 18 q=28.0 size=N/A time=00:00:05.35 bitrate=N/A speed=0.78x
frame= 136 fps= 18 q=24.0 size=N/A time=00:00:05.76 bitrate=N/A speed=0.779x
frame= 147 fps= 19 q=23.0 size=N/A time=00:00:06.26 bitrate=N/A speed=0.789x
frame= 157 fps= 19 q=27.0 size=N/A time=00:00:06.64 bitrate=N/A speed=0.787x
frame= 167 fps= 19 q=28.0 size=N/A time=00:00:07.05 bitrate=N/A speed=0.786x
frame= 176 fps= 19 q=28.0 size=N/A time=00:00:07.44 bitrate=N/A speed=0.784x
frame= 187 fps= 19 q=29.0 size=N/A time=00:00:07.94 bitrate=N/A speed=0.792x
frame= 197 fps= 19 q=30.0 size=N/A time=00:00:08.32 bitrate=N/A speed=0.789x
frame= 207 fps= 19 q=28.0 size=N/A time=00:00:08.73 bitrate=N/A speed=0.788x
frame= 217 fps= 19 q=26.0 size=N/A time=00:00:09.14 bitrate=N/A speed=0.789x
frame= 226 fps= 19 q=24.0 size=N/A time=00:00:09.55 bitrate=N/A speed=0.787x
frame= 236 fps= 19 q=24.0 size=N/A time=00:00:09.93 bitrate=N/A speed=0.786x
frame= 246 fps= 19 q=24.0 size=N/A time=00:00:10.36 bitrate=N/A speed=0.788x
[segment @ 000002baa812c900] Opening 'T:\transcoding-temp\4095b6eda9031249089b2bba97c7cbfb.m3u8.tmp' for writing
[segment @ 000002baa812c900] Opening 'T:\transcoding-temp\4095b6eda9031249089b2bba97c7cbfb1.ts' for writing
frame= 256 fps= 19 q=24.0 size=N/A time=00:00:10.75 bitrate=N/A speed=0.787x
frame= 267 fps= 19 q=24.0 size=N/A time=00:00:11.25 bitrate=N/A speed=0.793x
frame= 277 fps= 19 q=24.0 size=N/A time=00:00:11.64 bitrate=N/A speed=0.792x
frame= 286 fps= 19 q=25.0 size=N/A time=00:00:12.04 bitrate=N/A speed=0.791x
frame= 296 fps= 19 q=25.0 size=N/A time=00:00:12.45 bitrate=N/A speed=0.792x
frame= 305 fps= 19 q=24.0 size=N/A time=00:00:12.88 bitrate=N/A speed=0.794x
frame= 315 fps= 19 q=23.0 size=N/A time=00:00:13.24 bitrate=N/A speed=0.792x
frame= 325 fps= 19 q=24.0 size=N/A time=00:00:13.65 bitrate=N/A speed=0.792x
frame= 336 fps= 19 q=23.0 size=N/A time=00:00:14.16 bitrate=N/A speed=0.798x
frame= 346 fps= 19 q=24.0 size=N/A time=00:00:14.54 bitrate=N/A speed=0.796x
frame= 356 fps= 19 q=24.0 size=N/A time=00:00:14.95 bitrate=N/A speed=0.795x
frame= 365 fps= 19 q=23.0 size=N/A time=00:00:15.33 bitrate=N/A speed=0.793x
frame= 375 fps= 19 q=22.0 size=N/A time=00:00:15.74 bitrate=N/A speed=0.793x
frame= 385 fps= 19 q=15.0 size=N/A time=00:00:16.15 bitrate=N/A speed=0.792x
frame= 396 fps= 19 q=21.0 size=N/A time=00:00:16.65 bitrate=N/A speed=0.795x
frame= 406 fps= 19 q=23.0 size=N/A time=00:00:17.04 bitrate=N/A speed=0.794x
frame= 416 fps= 19 q=21.0 size=N/A time=00:00:17.44 bitrate=N/A speed=0.794x
frame= 425 fps= 19 q=23.0 size=N/A time=00:00:17.85 bitrate=N/A speed=0.795x
frame= 435 fps= 19 q=21.0 size=N/A time=00:00:18.24 bitrate=N/A speed=0.794x
frame= 445 fps= 19 q=22.0 size=N/A time=00:00:18.67 bitrate=N/A speed=0.795x
frame= 455 fps= 19 q=22.0 size=N/A time=00:00:19.05 bitrate=N/A speed=0.794x
frame= 466 fps= 19 q=22.0 size=N/A time=00:00:19.56 bitrate=N/A speed=0.798x
frame= 476 fps= 19 q=24.0 size=N/A time=00:00:19.94 bitrate=N/A speed=0.797x
frame= 485 fps= 19 q=24.0 size=N/A time=00:00:20.35 bitrate=N/A speed=0.797x
frame= 495 fps= 19 q=23.0 size=N/A time=00:00:20.76 bitrate=N/A speed=0.797x

Edited February 16, 2018 by mbnwa
Luke, posted February 16, 2018

Assuming the drivers are up to date, it's hard to say. You could try turning off HEVC encoding and using only decoding, and vice versa.
mbnwa (author), posted February 16, 2018

Yeah, if I disable HEVC decode things work well, but then we are back to CPU-based decoding, and that's costly on my Dell R710. The drivers are currently the latest available.
Luke, posted February 16, 2018

But it is still encoding with the GPU. Perhaps it just has some inefficiencies on the GPU side.
mbnwa (author), posted February 21, 2018 (edited)

@Luke Here is some testing I have done, please let me know your thoughts. This simple change also roughly doubled my GPU decoder utilization: about 19% with Emby's command vs. around 35-40% with the updated command.

FFmpeg command, 0.95x speed (19-22 fps) - Emby default:

C:\Users\$username\AppData\Roaming\MediaBrowser-Server\System\ffmpeg.exe -c:v hevc_cuvid -i file:"S:\Media Storage\$videos\$media_file\$media_file (4K UHD).mkv" -threads 0 -map 0:0 -map 0:1 -map -0:s -codec:v:0 h264_nvenc -vf "scale=trunc(min(max(iw\,ih*dar)\,640)/2)*2:trunc(ow/dar/2)*2" -pix_fmt yuv420p -preset default -b:v 416000 -maxrate 416000 -bufsize 832000 -profile:v high -force_key_frames "expr:if(isnan(prev_forced_t),eq(t,t),gte(t,prev_forced_t+3))" -copyts -vsync -1 -codec:a:0 libmp3lame -ac 2 -ab 384000 -af "volume=2" -f segment -max_delay 5000000 -avoid_negative_ts disabled -map_metadata -1 -map_chapters -1 -start_at_zero -segment_time 3 -individual_header_trailer 0 -segment_format mpegts -segment_list_type m3u8 -segment_start_number 0 -segment_list "T:\transcoding-temp\1355279e13b40fa96934820627bdeecf.m3u8" -y "T:\transcoding-temp\1355279e13b40fa96934820627bdeecf%d.ts"

frame= 128 fps= 23 q=23.0 size=N/A time=00:00:05.42 bitrate=N/A speed=0.958x

Proposed FFmpeg command, avg 2.9x speed (63-74 fps):

FFmpeg seems to have an issue when using -vf "scale=trunc(min(max(iw\,ih*dar)\,640)/2)*2:trunc(ow/dar/2)*2". If you replace that string with -resize (width)x(height), as in the example below, the speed of the transcode process (decoding / encoding) greatly increases.

C:\Users\$username\AppData\Roaming\MediaBrowser-Server\System\ffmpeg.exe -c:v hevc_cuvid -resize 1280x720 -i file:"S:\Media Storage\$videos\$media_file\$media_file (4K UHD).mkv" -threads 0 -map 0:0 -map 0:1 -map -0:s -codec:v:0 h264_nvenc -pix_fmt yuv420p -preset default -b:v 416000 -maxrate 416000 -bufsize 832000 -profile:v high -force_key_frames "expr:if(isnan(prev_forced_t),eq(t,t),gte(t,prev_forced_t+3))" -copyts -vsync -1 -codec:a:0 libmp3lame -ac 2 -ab 384000 -af "volume=2" -f segment -max_delay 5000000 -avoid_negative_ts disabled -map_metadata -1 -map_chapters -1 -start_at_zero -segment_time 3 -individual_header_trailer 0 -segment_format mpegts -segment_list_type m3u8 -segment_start_number 0 -segment_list "T:\transcoding-temp\1355279e13b40fa96934820627bde999.m3u8" -y "T:\transcoding-temp\1355279e13b40fa96934820627bdeecf9999%d.ts"

640x360 (same as Emby) = frame= 521 fps= 84 q=24.0 Lsize=N/A time=00:00:21.93 bitrate=N/A speed=3.53x
1280x720 = frame= 1438 fps= 73 q=28.0 size=N/A time=00:01:00.07 bitrate=N/A speed=3.06x

EDIT: This also greatly helps H.264:

Emby current command: frame= 2352 fps=186 q=16.0 Lsize=N/A time=00:01:39.32 bitrate=N/A speed=7.86x
Modified command (same as Emby): frame= 4003 fps=573 q=39.0 Lsize=N/A time=00:02:48.19 bitrate=N/A speed=24.1x
Modified command 1280x720: frame= 1795 fps=327 q=49.0 Lsize=N/A time=00:01:16.09 bitrate=N/A speed=13.9x

EDIT2: It looks like you cannot use the -resize option with non-hardware decoding, as it throws the error below. So this would be for NVIDIA and maybe Quick Sync (QS); however, I do not have a QS chip to check.

Codec AVOption resize (Resize (width)x(height)) specified for input file #0 (file:S:\Media Storage\$videos\$media_file\$media_file (4K UHD).mkv) has not been used for any stream. The most likely reason is either wrong type (e.g. a video option with no video streams) or that it is a private option of some decoder which was not actually used for any stream.

EDIT3: My test is a bit skewed, as I think I used a higher resolution than Emby did, judging by the major difference in the q=XX values. I will look at the logs and post updated results below (Emby default resolution added above).

Edited February 21, 2018 by mbnwa
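To illustrate EDIT2 above (-resize is a private option of the cuvid decoders, so it has to be left out when another decoder is in use), here is a minimal Python sketch of how a caller might assemble the options that go before -i. The function name `decoder_args` and the "ends with _cuvid" check are assumptions made for illustration; this is not how Emby actually builds its command line.

```python
from typing import List, Optional


def decoder_args(decoder: Optional[str], width: int, height: int) -> List[str]:
    """Build the options placed before -i for the chosen video decoder.

    decoder=None stands for plain software decoding (no -c:v before -i).
    Hypothetical helper: the name and the suffix check are illustrative only.
    """
    if decoder is None:
        return []
    args = ["-c:v", decoder]
    # -resize is listed under the cuvid decoder's private AVOptions, so only
    # pass it when a *_cuvid decoder is selected; other decoders reject it.
    if decoder.endswith("_cuvid"):
        args += ["-resize", f"{width}x{height}"]
    return args


if __name__ == "__main__":
    print(decoder_args("hevc_cuvid", 1280, 720))
    print(decoder_args(None, 1280, 720))
```

With any other decoder the caller would keep the existing -vf scale filter instead of -resize.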
Luke, posted February 21, 2018

Where did you learn about this? I don't see that in the FFmpeg documentation. Thanks.
mbnwa (author), posted February 21, 2018

I have been digging a lot since our last discussion and found the following: https://lists.ffmpeg.org/pipermail/ffmpeg-user/2017-July/036839.html

I came across it while searching because I was unable to use -hwaccel cuvid and got this error:

"Impossible to convert between the formats supported by the filter 'graph 0 input from stream 0:0' and the filter 'auto_scaler_0' Error reinitializing filters! Failed to inject frame into filter network: Function not implemented Error while processing the decoded data for stream #0:0"

That drove me to look at whether -vf "scale=trunc(min(max(iw\,ih*dar)\,640)/2)*2:trunc(ow/dar/2)*2" can be replaced with something that lets the hardware decoder work better, and I landed on that post about -resize:

> Try something like:
>
> ffmpeg -hwaccel cuvid -c:v h264_cuvid -deint bob -resize 1280x800 -i foo -c:v h264_nvenc -c:a aac ./bar
Luke, posted February 21, 2018

Are they fixed sizes or max sizes? The advantage of -vf scale is that we can feed in max sizes and it handles the rest for us, so we don't depend on knowing the input resolution ahead of time.
mbnwa (author), posted February 21, 2018

I am not sure, as I also could not really find any docs on the option. If staying with -vf scale, maybe substitute scale_npp; however, it does not look like Emby's build has --enable-libnpp. I'll look around for an FFmpeg build already compiled with libnpp and see whether scale_npp is a direct replacement for scale.
mbnwa (author), posted February 21, 2018 (edited)

It looks like NPP is part of non-free, so I am building FFmpeg with NPP locally. If this works, maybe you can add an option on the hardware transcoding page such as "Use libnpp (custom FFmpeg only)" or something of the like. It does not look like you can distribute FFmpeg with NPP built in.

Edit: It looks like I am having a heck of a time getting FFmpeg to compile with libnpp. I'll look into that later; however, I don't think it's a good solution, given that it cannot be distributed already compiled.

Edited February 21, 2018 by mbnwa
mbnwa (author), posted February 21, 2018 (edited)

On -resize: the way I see it, you are telling FFmpeg to resize the input to WxH, so as long as you know your output resolution when the command is executed, it should work. You are doing the resize on the decoder side rather than on the encoder side; at least that's what it looks like from the output files. If -resize works as I think it does, it would be better to use that than to rely on libnpp, which requires builds that have to be done locally rather than distributed.

Edited February 21, 2018 by mbnwa
mbnwa (author), posted February 21, 2018

@Luke this is what I found on -resize from FFmpeg. It clearly looks like -resize is part of the cuvid options, as I do not see a resize option with qsv.

M:\Emby\system>ffmpeg -h decoder=hevc_cuvid
ffmpeg version git-2017-12-31-2906363 Copyright © 2000-2017 the FFmpeg developers
  built with gcc 7.2.0 (GCC)
  configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-bzlib --enable-fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-libbluray --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libtheora --enable-libtwolame --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --enable-libvorbis --enable-libvo-amrwbenc --enable-libmysofa --enable-libspeex --enable-amf --enable-cuda --enable-cuvid --enable-d3d11va --enable-nvenc --enable-dxva2 --enable-avisynth --enable-libmfx
  libavutil 56. 7.100 / 56. 7.100
  libavcodec 58. 9.100 / 58. 9.100
  libavformat 58. 3.100 / 58. 3.100
  libavdevice 58. 0.100 / 58. 0.100
  libavfilter 7. 8.100 / 7. 8.100
  libswscale 5. 0.101 / 5. 0.101
  libswresample 3. 0.101 / 3. 0.101
  libpostproc 55. 0.100 / 55. 0.100
Decoder hevc_cuvid [Nvidia CUVID HEVC decoder]:
  General capabilities: delay
  Threading capabilities: none
  Supported pixel formats: cuda nv12 p010le p016le
hevc_cuvid AVOptions:
  -deint <int> .D.V.... Set deinterlacing mode (from 0 to 2) (default weave)
     weave .D.V.... Weave deinterlacing (do nothing)
     bob .D.V.... Bob deinterlacing
     adaptive .D.V.... Adaptive deinterlacing
  -gpu <string> .D.V.... GPU to be used for decoding
  -surfaces <int> .D.V.... Maximum surfaces to be used for decoding (from 0 to INT_MAX) (default 25)
  -drop_second_field <boolean> .D.V.... Drop second field when deinterlacing (default false)
  -crop <string> .D.V.... Crop (top)x(bottom)x(left)x(right)
  -resize <string> .D.V.... Resize (width)x(height)
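Since -resize shows up only in the decoder's own help output, one way to decide at runtime whether it can be used is to look for it there. The following is a rough Python sketch under that assumption: `lists_option` and `decoder_help` are hypothetical helper names, and the only FFmpeg invocation used is the `ffmpeg -h decoder=...` call shown above (plus `-hide_banner` to drop the version banner).

```python
import re
import subprocess


def lists_option(help_text: str, option: str) -> bool:
    """Return True if a line of the help text starts with '-<option> '."""
    pattern = r"^\s*-" + re.escape(option) + r"\s"
    return re.search(pattern, help_text, flags=re.MULTILINE) is not None


def decoder_help(ffmpeg: str, decoder: str) -> str:
    """Capture the text printed by 'ffmpeg -h decoder=<name>'."""
    result = subprocess.run(
        [ffmpeg, "-hide_banner", "-h", f"decoder={decoder}"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


if __name__ == "__main__":
    # Sample lines taken from the help output quoted in this thread.
    sample = (
        "hevc_cuvid AVOptions:\n"
        "  -drop_second_field <boolean> .D.V.... Drop second field when deinterlacing (default false)\n"
        "  -crop <string> .D.V.... Crop (top)x(bottom)x(left)x(right)\n"
        "  -resize <string> .D.V.... Resize (width)x(height)\n"
    )
    print(lists_option(sample, "resize"))
```

On a real system the check would be something like `lists_option(decoder_help("ffmpeg", "hevc_cuvid"), "resize")`, with the path to the ffmpeg executable adjusted as needed.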
Luke, posted February 21, 2018

OK, it's better than nothing. It's just unfortunate that it requires knowledge of a fixed size.
mbnwa (author), posted February 21, 2018 (edited)

A fixed output size, not input. I assume the transcoding profiles have fixed sizes? For example, a 4K video that needs to be output at 720p, 1 Mbps would use the WxH for that transcoding profile.

I assume you do not want people rolling their own FFmpeg to add NPP? I am currently building FFmpeg with libnpp built in to see if I can use that in place of scale. If I can, maybe you can add an advanced option to use scale_npp, a check box that people select if they build FFmpeg outside of Emby.

I will post back with the libnpp results after this super long FFmpeg compile is done.

Edited February 21, 2018 by mbnwa
Luke, posted February 21, 2018

The client profiles have max sizes; those maxes are all we currently feed into ffmpeg.
mbnwa (author), posted February 21, 2018

OK, so you feed the max size into ffmpeg and it uses -vf to calculate the correct output size based on bitrate, etc.? What would happen if you fed the max size into the -resize option?
Luke, posted February 21, 2018

Don't know. It sounds like it might force a video resize to those max sizes, which could be larger or smaller than the input.
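One way around the "fixed vs. max size" problem discussed above would be to evaluate the existing scale expression outside of ffmpeg and pass the result to -resize. The Python sketch below mirrors trunc(min(max(iw, ih*dar), max_width)/2)*2 : trunc(ow/dar/2)*2. It assumes the input width and height are already known (for example from a prior probe of the file) and, when no display aspect ratio is supplied, that pixels are square; `scaled_size` and `resize_value` are hypothetical names.

```python
import math
from fractions import Fraction
from typing import Optional, Tuple


def scaled_size(iw: int, ih: int, max_width: int,
                dar: Optional[Fraction] = None) -> Tuple[int, int]:
    """Evaluate the scale expression from the thread for a known input size.

    Assumption: if dar is not given, the display aspect ratio equals iw/ih.
    """
    if dar is None:
        dar = Fraction(iw, ih)
    # ow = trunc(min(max(iw, ih*dar), max_width) / 2) * 2
    width = math.trunc(min(max(iw, ih * dar), max_width) / 2) * 2
    # oh = trunc(ow / dar / 2) * 2
    height = math.trunc(width / dar / 2) * 2
    return width, height


def resize_value(iw: int, ih: int, max_width: int) -> str:
    """Format the computed size the way -resize expects: (width)x(height)."""
    width, height = scaled_size(iw, ih, max_width)
    return f"{width}x{height}"


if __name__ == "__main__":
    print(resize_value(3840, 2160, 720))
    print(resize_value(3840, 2160, 640))
    print(resize_value(1280, 720, 1920))
```

For a 3840x2160 source with a 720 max width this gives 720x404, which matches the output size reported in the first log of this thread. Because of the min(), a source narrower than the max width keeps its own size instead of being upscaled.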
mbnwa (author), posted February 21, 2018

Hmm, OK, let's see what happens after my libnpp build is finished.
mbnwa (author), posted February 24, 2018 (edited)

@Luke I finished compiling FFmpeg (side note: pain in the @$$ for sure; to include cuda-sdk in the build you have to have MS Visual C 2015, yes it has to be 2015 or it will not compile, plus the NVIDIA toolkit).

So if you want to support full hardware transcoding and keep the auto scale feature as you have it above, a custom-compiled version of FFmpeg would be required, and I assume it would have to be built by the user as it includes items from the "non free" category. Please let me know if you intend to support this, as my time is running out on returning the Quadro P4000. I have no need for an $800 paperweight if I cannot use it for HEVC transcoding. To support this with the version of FFmpeg currently shipped with Emby, you would need to use the resize option.

OK, now to the fun part that I have spent the last 3 days compiling by trial and error. To keep auto scale in place you have to:

A) Compile FFmpeg with cuda-sdk (libnpp is NOT required, and it does not support 10-bit anyway).

The following two commands require custom-compiled versions of FFmpeg due to the inclusion of --enable-cuda-sdk --enable-nonfree.

The updated command to use the autoscaler for HEVC would be the following:

C:\Users\$username\AppData\Roaming\MediaBrowser-Server\System\ffmpeg.exe -hwaccel cuvid -c:v hevc_cuvid -i file:"S:\Media Storage\$videos\$media_file\$media_file (4K UHD).mkv" -threads 0 -map 0:0 -map 0:1 -map -0:s -codec:v:0 h264_nvenc -vf "scale_cuda=trunc(min(max(iw\,ih*dar)\,640)/2)*2:trunc(ow/dar/2)*2,hwdownload,format=p010le" -pix_fmt yuv420p -preset default -b:v 416000 -maxrate 416000 -bufsize 832000 -profile:v high -force_key_frames "expr:if(isnan(prev_forced_t),eq(t,t),gte(t,prev_forced_t+3))" -copyts -vsync -1 -codec:a:0 libmp3lame -ac 2 -ab 384000 -af "volume=2" -f segment -max_delay 5000000 -avoid_negative_ts disabled -map_metadata -1 -map_chapters -1 -start_at_zero -segment_time 3 -individual_header_trailer 0 -segment_format mpegts -segment_list_type m3u8 -segment_start_number 0 -segment_list "T:\transcoding-temp\1355279e13b40fa96934820627bdeecf.m3u8" -y "T:\transcoding-temp\1355279e13b40fa96934820627bdeecf%d.ts"

The updated command to use the autoscaler for H264 (based on one of my tests) would be the following:

C:\Users\$username\AppData\Roaming\MediaBrowser-Server\System\ffmpeg.exe -hwaccel cuvid -c:v h264_cuvid -i file:"S:\Media Storage\$videos\$media_file\$media_file.m4v" -threads 0 -map 0:1 -map 0:0 -map -0:s -codec:v:0 h264_nvenc -vf "scale_cuda=trunc(min(max(iw\,ih*dar)\,640)/2)*2:trunc(ow/dar/2)*2,hwdownload,format=nv12" -pix_fmt yuv420p -preset default -b:v 674565 -maxrate 674565 -bufsize 1349130 -profile:v high -force_key_frames "expr:if(isnan(prev_forced_t),eq(t,t),gte(t,prev_forced_t+3))" -copyts -vsync -1 -codec:a:0 copy -f segment -max_delay 5000000 -avoid_negative_ts disabled -map_metadata -1 -map_chapters -1 -start_at_zero -segment_time 3 -individual_header_trailer 0 -segment_format mpegts -segment_list_type m3u8 -segment_start_number 0 -segment_list "T:\transcoding-temp\527943752781c5890fd839740c8b28db.m3u8" -y "T:\transcoding-temp\527943752781c5890fd839740c8b28db%d.ts"

In short, you have to tell the decoder to upload the frames to the GPU for processing, then during your autoscaler -vf call download the frames back into system RAM, and you MUST provide the input format that was uploaded to the GPU. In this case, because it was 4K, it was p010le; for the m4v files that I use, I had to use format=nv12.

The FFmpeg currently shipped with Emby can support the following: -resize WxH (with the removal of -vf autoscale)

C:\Users\$username\AppData\Roaming\MediaBrowser-Server\System\ffmpeg.exe -c:v hevc_cuvid -resize 1280x720 -i file:"S:\Media Storage\$videos\$media_file\$media_file (4K UHD).mkv" -threads 0 -map 0:0 -map 0:1 -map -0:s -codec:v:0 h264_nvenc -vf "scale=trunc(min(max(iw\,ih*dar)\,640)/2)*2:trunc(ow/dar/2)*2" -pix_fmt yuv420p -preset default -b:v 416000 -maxrate 416000 -bufsize 832000 -profile:v high -force_key_frames "expr:if(isnan(prev_forced_t),eq(t,t),gte(t,prev_forced_t+3))" -copyts -vsync -1 -codec:a:0 libmp3lame -ac 2 -ab 384000 -af "volume=2" -f segment -max_delay 5000000 -avoid_negative_ts disabled -map_metadata -1 -map_chapters -1 -start_at_zero -segment_time 3 -individual_header_trailer 0 -segment_format mpegts -segment_list_type m3u8 -segment_start_number 0 -segment_list "T:\transcoding-temp\1355279e13b40fa96934820627bde999.m3u8" -y "T:\transcoding-temp\1355279e13b40fa96934820627bdeecf9999%d.ts"

I do have to say that performance is outstanding using the scale_cuda method (HEVC), even if it's a PITA to build. It yielded the following:

HEVC scale_cuda: frame= 8428 fps=138 q=35.0 Lsize=N/A time=00:05:51.72 bitrate=N/A speed=5.74x
HEVC resize: frame= 521 fps= 84 q=24.0 Lsize=N/A time=00:00:21.93 bitrate=N/A speed=3.53x
HEVC no optimization: frame= 128 fps= 23 q=23.0 size=N/A time=00:00:05.42 bitrate=N/A speed=0.958x
H264 scale_cuda: frame= 1857 fps=598 q=27.0 Lsize=N/A time=00:01:18.69 bitrate=N/A speed=25.3x
H264 resize: frame= 2478 fps=570 q=35.0 Lsize=N/A time=00:01:44.59 bitrate=N/A speed=24.1x
H264 no optimization: frame= 1216 fps=173 q=20.0 size=N/A time=00:00:51.98 bitrate=N/A speed=7.39x

Edited February 25, 2018 by mbnwa
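To put the figures above side by side, here is a small Python sketch that pulls the fps and speed fields out of the final progress lines and expresses each method relative to the unoptimized run. The progress lines are copied from the post above; the regular expression and the helper names are assumptions made for illustration.

```python
import re
from typing import Dict, Tuple

# Matches e.g. "fps= 84 ... speed=3.53x" in an ffmpeg progress line.
PROGRESS = re.compile(r"fps=\s*([\d.]+).*?speed=\s*([\d.]+)x")


def parse_progress(line: str) -> Tuple[float, float]:
    """Return (fps, speed) from one ffmpeg progress line."""
    match = PROGRESS.search(line)
    if match is None:
        raise ValueError(f"not a progress line: {line!r}")
    return float(match.group(1)), float(match.group(2))


def speedups(results: Dict[str, str], baseline: str) -> Dict[str, float]:
    """Speed of every run divided by the speed of the baseline run."""
    base_speed = parse_progress(results[baseline])[1]
    return {name: parse_progress(line)[1] / base_speed
            for name, line in results.items()}


HEVC = {
    "scale_cuda": "frame= 8428 fps=138 q=35.0 Lsize=N/A time=00:05:51.72 bitrate=N/A speed=5.74x",
    "resize": "frame= 521 fps= 84 q=24.0 Lsize=N/A time=00:00:21.93 bitrate=N/A speed=3.53x",
    "no optimization": "frame= 128 fps= 23 q=23.0 size=N/A time=00:00:05.42 bitrate=N/A speed=0.958x",
}

if __name__ == "__main__":
    for name, factor in speedups(HEVC, "no optimization").items():
        print(f"{name}: {factor:.2f}x the baseline speed")
```

Note that the runs cover different durations and, per EDIT3 earlier in the thread, not necessarily identical output resolutions, so the ratios are only a rough comparison.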
mbnwa (author), posted February 24, 2018

@Luke I remember that some time ago you could point Emby to a custom FFmpeg directory. I cannot seem to find that option any longer; was it removed? If you are interested in testing the compiled version of FFmpeg I have (assuming you have NVIDIA hardware), let me know and I will PM you a link via Dropbox.
Luke, posted February 24, 2018

That option caused too much troubleshooting for us. You can just replace the Emby executables if you want to. I'll try to incorporate the resize option.
mbnwa (author), posted February 24, 2018

Thanks, I will be on the lookout for the resize option. I guess I will keep the P4000 at this point, seeing as you're willing to look at the resize option as a solution for NVIDIA hardware.
Luke, posted February 24, 2018

Well, actually, based on your research it looks like it should be scale_cuda instead?
mbnwa (author), posted February 24, 2018 (edited)

@Luke scale_cuda does work and works well; however, it would require the end user to compile FFmpeg with the CUDA SDK. resize works with the version of FFmpeg that is distributed with Emby today. I think that for the highest level of compatibility, resize will be advantageous to all users, as doing your own compile is not for everyone.

If you want to take on both, you could update the current NVIDIA profile to support resize and add a new one that uses scale_cuda for advanced users who can replace the bundled version of FFmpeg. I would be happy with either one or both; it just depends on what you want to take on.

Edited February 24, 2018 by mbnwa
mbnwa (author), posted February 25, 2018 (edited)

@Luke I updated the post on the last page to reflect everything in a single post; to sum it up, here are the relevant parts.

The FFmpeg currently shipped with Emby can support the following: -resize WxH (with the removal of -vf autoscale)

C:\Users\$username\AppData\Roaming\MediaBrowser-Server\System\ffmpeg.exe -c:v hevc_cuvid -resize 1280x720 -i file:"S:\Media Storage\$videos\$media_file\$media_file (4K UHD).mkv" -threads 0 -map 0:0 -map 0:1 -map -0:s -codec:v:0 h264_nvenc -vf "scale=trunc(min(max(iw\,ih*dar)\,640)/2)*2:trunc(ow/dar/2)*2" -pix_fmt yuv420p -preset default -b:v 416000 -maxrate 416000 -bufsize 832000 -profile:v high -force_key_frames "expr:if(isnan(prev_forced_t),eq(t,t),gte(t,prev_forced_t+3))" -copyts -vsync -1 -codec:a:0 libmp3lame -ac 2 -ab 384000 -af "volume=2" -f segment -max_delay 5000000 -avoid_negative_ts disabled -map_metadata -1 -map_chapters -1 -start_at_zero -segment_time 3 -individual_header_trailer 0 -segment_format mpegts -segment_list_type m3u8 -segment_start_number 0 -segment_list "T:\transcoding-temp\1355279e13b40fa96934820627bde999.m3u8" -y "T:\transcoding-temp\1355279e13b40fa96934820627bdeecf9999%d.ts"

The following two commands require custom-compiled versions of FFmpeg due to the inclusion of --enable-cuda-sdk --enable-nonfree.

The updated command to use the autoscaler for HEVC would be the following:

C:\Users\$username\AppData\Roaming\MediaBrowser-Server\System\ffmpeg.exe -hwaccel cuvid -c:v hevc_cuvid -i file:"S:\Media Storage\$videos\$media_file\$media_file (4K UHD).mkv" -threads 0 -map 0:0 -map 0:1 -map -0:s -codec:v:0 h264_nvenc -vf "scale_cuda=trunc(min(max(iw\,ih*dar)\,640)/2)*2:trunc(ow/dar/2)*2,hwdownload,format=p010le" -pix_fmt yuv420p -preset default -b:v 416000 -maxrate 416000 -bufsize 832000 -profile:v high -force_key_frames "expr:if(isnan(prev_forced_t),eq(t,t),gte(t,prev_forced_t+3))" -copyts -vsync -1 -codec:a:0 libmp3lame -ac 2 -ab 384000 -af "volume=2" -f segment -max_delay 5000000 -avoid_negative_ts disabled -map_metadata -1 -map_chapters -1 -start_at_zero -segment_time 3 -individual_header_trailer 0 -segment_format mpegts -segment_list_type m3u8 -segment_start_number 0 -segment_list "T:\transcoding-temp\1355279e13b40fa96934820627bdeecf.m3u8" -y "T:\transcoding-temp\1355279e13b40fa96934820627bdeecf%d.ts"

The updated command to use the autoscaler for H264 (based on one of my tests) would be the following:

C:\Users\$username\AppData\Roaming\MediaBrowser-Server\System\ffmpeg.exe -hwaccel cuvid -c:v h264_cuvid -i file:"S:\Media Storage\$videos\$media_file\$media_file.m4v" -threads 0 -map 0:1 -map 0:0 -map -0:s -codec:v:0 h264_nvenc -vf "scale_cuda=trunc(min(max(iw\,ih*dar)\,640)/2)*2:trunc(ow/dar/2)*2,hwdownload,format=nv12" -pix_fmt yuv420p -preset default -b:v 674565 -maxrate 674565 -bufsize 1349130 -profile:v high -force_key_frames "expr:if(isnan(prev_forced_t),eq(t,t),gte(t,prev_forced_t+3))" -copyts -vsync -1 -codec:a:0 copy -f segment -max_delay 5000000 -avoid_negative_ts disabled -map_metadata -1 -map_chapters -1 -start_at_zero -segment_time 3 -individual_header_trailer 0 -segment_format mpegts -segment_list_type m3u8 -segment_start_number 0 -segment_list "T:\transcoding-temp\527943752781c5890fd839740c8b28db.m3u8" -y "T:\transcoding-temp\527943752781c5890fd839740c8b28db%d.ts"

In short, you have to tell the decoder to upload the frames to the GPU for processing, then during your autoscaler -vf call download the frames back into system RAM, and you MUST provide the input format that was uploaded to the GPU. In this case, because it was 4K, it was p010le; for the m4v files that I use, I had to use format=nv12 (I assume the format will be dynamic based on the source video).

I do have to say that performance is outstanding using the scale_cuda method (HEVC), even if it's a PITA to build. It yielded the following:

HEVC scale_cuda: frame= 8428 fps=138 q=35.0 Lsize=N/A time=00:05:51.72 bitrate=N/A speed=5.74x
HEVC resize: frame= 521 fps= 84 q=24.0 Lsize=N/A time=00:00:21.93 bitrate=N/A speed=3.53x
HEVC no optimization: frame= 128 fps= 23 q=23.0 size=N/A time=00:00:05.42 bitrate=N/A speed=0.958x
H264 scale_cuda: frame= 1857 fps=598 q=27.0 Lsize=N/A time=00:01:18.69 bitrate=N/A speed=25.3x
H264 resize: frame= 2478 fps=570 q=35.0 Lsize=N/A time=00:01:44.59 bitrate=N/A speed=24.1x
H264 no optimization: frame= 1216 fps=173 q=20.0 size=N/A time=00:00:51.98 bitrate=N/A speed=7.39x

Edited February 25, 2018 by mbnwa
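The post above says the format given after hwdownload has to match what was uploaded to the GPU: p010le for the 4K HEVC file, nv12 for the m4v files. A minimal Python sketch of building the -vf value under that reading follows. The mapping from bit depth to format (10-bit to p010le, 8-bit to nv12) is an assumption inferred from those two cases, and `cuda_scale_filter` is a hypothetical name; the filter string itself is copied from the commands in this thread.

```python
def download_format(bit_depth: int) -> str:
    """Pick the format to request after hwdownload.

    Assumption based on the two cases in the thread: the 4K HEVC source
    needed p010le and the 8-bit H.264 source needed nv12.
    """
    if bit_depth == 10:
        return "p010le"
    if bit_depth == 8:
        return "nv12"
    raise ValueError(f"no format known for bit depth {bit_depth}")


def cuda_scale_filter(max_width: int, bit_depth: int) -> str:
    """Build the -vf value used in the scale_cuda commands in this thread."""
    fmt = download_format(bit_depth)
    return (
        f"scale_cuda=trunc(min(max(iw\\,ih*dar)\\,{max_width})/2)*2"
        f":trunc(ow/dar/2)*2,hwdownload,format={fmt}"
    )


if __name__ == "__main__":
    print(cuda_scale_filter(640, 10))
    print(cuda_scale_filter(640, 8))
```

The bit depth would have to come from probing the source file first, which is the "dynamic" part mentioned above.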