Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz
Intel Corporation HD Graphics 630 (rev 04)
Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/580] (rev cf)
Software version:
linux 4.12.8-1
mpv 1:0.26.0-3
libva 1.8.3-1
libva-vdpau-driver 0.7.4-2
libvdpau 1.1.1-2
mesa-vdpau 17.1.6-1
vainfo
libva info: VA-API version 0.40.0
libva info: va_getDriverName() returns 0
libva info: Trying to open /usr/lib/dri/radeonsi_drv_video.so
libva info: Found init function __vaDriverInit_0_40
libva info: va_openDriver() returns 0
vainfo: VA-API version: 0.40 (libva )
vainfo: Driver version: mesa gallium vaapi
vainfo: Supported profile and entrypoints
VAProfileMPEG2Simple : VAEntrypointVLD
VAProfileMPEG2Main : VAEntrypointVLD
VAProfileVC1Simple : VAEntrypointVLD
VAProfileVC1Main : VAEntrypointVLD
VAProfileVC1Advanced : VAEntrypointVLD
VAProfileH264ConstrainedBaseline: VAEntrypointVLD
VAProfileH264ConstrainedBaseline: VAEntrypointEncSlice
VAProfileH264Main : VAEntrypointVLD
VAProfileH264Main : VAEntrypointEncSlice
VAProfileH264High : VAEntrypointVLD
VAProfileH264High : VAEntrypointEncSlice
VAProfileHEVCMain : VAEntrypointVLD
VAProfileHEVCMain10 : VAEntrypointVLD
VAProfileNone : VAEntrypointVideoProc
vdpauinfo
display: :1 screen: 0
API version: 1
Information string: G3DVL VDPAU Driver Shared Library version 1.0
Video surface:
name width height types
-------------------------------------------
420 16384 16384 NV12 YV12
422 16384 16384 UYVY YUYV
444 16384 16384 Y8U8V8A8 V8U8Y8A8
Decoder capabilities:
name level macbs width height
----------------------------------------------------
MPEG1 --- not supported ---
MPEG2_SIMPLE 3 65536 4096 4096
MPEG2_MAIN 3 65536 4096 4096
H264_BASELINE 52 65536 4096 4096
H264_MAIN 52 65536 4096 4096
H264_HIGH 52 65536 4096 4096
VC1_SIMPLE 1 65536 4096 4096
VC1_MAIN 2 65536 4096 4096
VC1_ADVANCED 4 65536 4096 4096
MPEG4_PART2_SP 3 65536 4096 4096
MPEG4_PART2_ASP 5 65536 4096 4096
DIVX4_QMOBILE --- not supported ---
DIVX4_MOBILE --- not supported ---
DIVX4_HOME_THEATER --- not supported ---
DIVX4_HD_1080P --- not supported ---
DIVX5_QMOBILE --- not supported ---
DIVX5_MOBILE --- not supported ---
DIVX5_HOME_THEATER --- not supported ---
DIVX5_HD_1080P --- not supported ---
H264_CONSTRAINED_BASELINE 0 65536 4096 4096
H264_EXTENDED --- not supported ---
H264_PROGRESSIVE_HIGH --- not supported ---
H264_CONSTRAINED_HIGH --- not supported ---
H264_HIGH_444_PREDICTIVE --- not supported ---
HEVC_MAIN 186 65536 4096 4096
HEVC_MAIN_10 186 65536 4096 4096
HEVC_MAIN_STILL --- not supported ---
HEVC_MAIN_12 --- not supported ---
HEVC_MAIN_444 --- not supported ---
Output surface:
name width height nat types
----------------------------------------------------
B8G8R8A8 16384 16384 y NV12 YV12 UYVY YUYV Y8U8V8A8 V8U8Y8A8 A8I8 I8A8
R8G8B8A8 16384 16384 y NV12 YV12 UYVY YUYV Y8U8V8A8 V8U8Y8A8 A8I8 I8A8
R10G10B10A2 16384 16384 y NV12 YV12 UYVY YUYV Y8U8V8A8 V8U8Y8A8 A8I8 I8A8
B10G10R10A2 16384 16384 y NV12 YV12 UYVY YUYV Y8U8V8A8 V8U8Y8A8 A8I8 I8A8
Bitmap surface:
name width height
------------------------------
B8G8R8A8 16384 16384
R8G8B8A8 16384 16384
R10G10B10A2 16384 16384
B10G10R10A2 16384 16384
A8 16384 16384
Video mixer:
feature name sup
------------------------------------
DEINTERLACE_TEMPORAL y
DEINTERLACE_TEMPORAL_SPATIAL -
INVERSE_TELECINE -
NOISE_REDUCTION y
SHARPNESS y
LUMA_KEY y
HIGH QUALITY SCALING - L1 y
HIGH QUALITY SCALING - L2 -
HIGH QUALITY SCALING - L3 -
HIGH QUALITY SCALING - L4 -
HIGH QUALITY SCALING - L5 -
HIGH QUALITY SCALING - L6 -
HIGH QUALITY SCALING - L7 -
HIGH QUALITY SCALING - L8 -
HIGH QUALITY SCALING - L9 -
parameter name sup min max
-----------------------------------------------------
VIDEO_SURFACE_WIDTH y 48 4096
VIDEO_SURFACE_HEIGHT y 48 4096
CHROMA_TYPE y
LAYERS y 0 4
attribute name sup min max
-----------------------------------------------------
BACKGROUND_COLOR y
CSC_MATRIX y
NOISE_REDUCTION_LEVEL y 0.00 1.00
SHARPNESS_LEVEL y -1.00 1.00
LUMA_KEY_MIN_LUMA y
LUMA_KEY_MAX_LUMA y
ffmpeg -i "$IN"
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '$IN':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf56.40.101
Duration: 00:10:00.03, start: 0.000000, bitrate: 2078 kb/s
Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1280x720 [SAR 1:1 DAR 16:9], 1967 kb/s, 29.97 fps, 29.97 tbr, 30k tbn, 59.94 tbc (default)
Metadata:
handler_name : VideoHandler
Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 101 kb/s (default)
Metadata:
handler_name : SoundHandler
/usr/bin/time -v mpv -fs --length 10 "$IN"
VO: [opengl] 1280x720 yuv420p
User time (seconds): 3.82
System time (seconds): 0.13
Percent of CPU this job got: 38%
User time (seconds): 3.26
System time (seconds): 0.18
Percent of CPU this job got: 33%
User time (seconds): 3.86
System time (seconds): 0.18
Percent of CPU this job got: 39%
/usr/bin/time -v mpv -fs --length 10 -vo=vaapi "$IN"
VO: [vaapi] 1280x720 yuv420p
User time (seconds): 4.02
System time (seconds): 0.17
Percent of CPU this job got: 41%
User time (seconds): 3.50
System time (seconds): 0.19
Percent of CPU this job got: 36%
User time (seconds): 4.11
System time (seconds): 0.12
Percent of CPU this job got: 41%
/usr/bin/time -v mpv -fs --length 10 -vo=vaapi --hwdec=auto "${IN}"
Using hardware decoding (vdpau-copy).
VO: [vaapi] 1280x720 nv12
User time (seconds): 1.14
System time (seconds): 0.33
Percent of CPU this job got: 14%
User time (seconds): 0.94
System time (seconds): 0.34
Percent of CPU this job got: 12%
User time (seconds): 1.02
System time (seconds): 0.32
Percent of CPU this job got: 13%
/usr/bin/time -v mpv -fs --length 10 -vo=vaapi --hwdec=vaapi "${IN}"
Using hardware decoding (vaapi).
VO: [vaapi] 1280x720 vaapi[nv12]
User time (seconds): 0.51
System time (seconds): 0.25
Percent of CPU this job got: 7%
User time (seconds): 0.53
System time (seconds): 0.22
Percent of CPU this job got: 7%
User time (seconds): 0.60
System time (seconds): 0.25
Percent of CPU this job got: 8%
/usr/bin/time -v mpv -fs --length 10 -vo=vdpau "${IN}"
VO: [vdpau] 1280x720 yuv420p
[vo/vdpau] Compositing window manager detected. Assuming timing info is inaccurate.
User time (seconds): 3.96
System time (seconds): 0.12
Percent of CPU this job got: 40%
User time (seconds): 4.44
System time (seconds): 0.21
Percent of CPU this job got: 45%
User time (seconds): 4.71
System time (seconds): 0.22
Percent of CPU this job got: 48%
/usr/bin/time -v mpv -fs --length 10 -vo=vdpau --hwdec=auto "${IN}"
Using hardware decoding (vdpau).
VO: [vdpau] 1280x720 vdpau[yuv420p]
[vo/vdpau] Compositing window manager detected. Assuming timing info is inaccurate.
User time (seconds): 0.55
System time (seconds): 0.21
Percent of CPU this job got: 7%
User time (seconds): 0.67
System time (seconds): 0.27
Percent of CPU this job got: 9%
User time (seconds): 0.56
System time (seconds): 0.25
Percent of CPU this job got: 8%
Selecting vdpau or vaapi hardware decoding reduces CPU load by about 5x on supported codecs (roughly 7-9% CPU instead of 33-48% in the runs above).
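The 5x figure can be rechecked from the "Percent of CPU" lines above. The following is only a sketch: `avg_cpu` is a helper name made up for illustration, and the inputs are the three vo=opengl software runs and the three --hwdec=vaapi runs copied from the logs.

```shell
#!/bin/sh
# avg_cpu: read `/usr/bin/time -v` output on stdin and print the mean of
# every "Percent of CPU this job got" line, to one decimal place.
avg_cpu() {
    awk -F': ' '/Percent of CPU this job got/ {
        sub(/%/, "", $2); sum += $2; n++
    } END { if (n) printf "%.1f\n", sum / n }'
}

# Values copied from the runs above: software decode (vo=opengl)
# and hardware decode (--hwdec=vaapi).
sw=$(printf 'Percent of CPU this job got: %s%%\n' 38 33 39 | avg_cpu)
hw=$(printf 'Percent of CPU this job got: %s%%\n' 7 7 8 | avg_cpu)

# Ratio of software to hardware CPU load.
ratio=$(awk -v s="$sw" -v h="$hw" 'BEGIN { printf "%.1f", s / h }')
echo "software ${sw}%  hardware ${hw}%  ratio ${ratio}x"
```

To feed it real measurements, GNU time can append each run to a log with `/usr/bin/time -v -a -o run.log mpv ...`, after which `avg_cpu < run.log` prints the mean.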
Resulting config ~/.config/mpv/mpv.conf:
hwdec=auto
vo=vdpau
Tested encoding the following video, which was previously encoded to VP9:
Duration: 00:57:26.89, start: 0.000000, bitrate: 3359 kb/s
Stream #0:0(eng): Video: vp9 (Profile 0), yuv420p(tv, progressive), 1280x720, SAR 1:1 DAR 16:9, 29.97 fps, 29.97 tbr, 1k tbn, 1k tbc (default)
/usr/bin/time -v ffmpeg -hwaccel vaapi -i "${src}" \
-vaapi_device /dev/dri/renderD129 -vf 'format=nv12,hwupload' -vcodec hevc_vaapi \
-pass 1 -crf ${crf} -threads 8 -an -y -f matroska "/dev/null"
/usr/bin/time -v ffmpeg -hwaccel vaapi -i "${src}" \
-vaapi_device /dev/dri/renderD129 -vf 'format=nv12,hwupload' -vcodec hevc_vaapi \
-pass 2 -acodec copy -crf ${crf} -threads 8 -y -f matroska "${src}.hevc.hw.mkv"
Each pass took about 14 minutes.
/usr/bin/time -v ffmpeg -i "${src}" \
-vcodec hevc \
-pass 1 -crf ${crf} -threads 8 -an -y -f matroska "/dev/null"
/usr/bin/time -v ffmpeg -i "${src}" \
-vcodec hevc \
-pass 2 -acodec copy -crf ${crf} -threads 8 -y -f matroska "${src}.hevc.sw.mkv"
Each pass took about 38 minutes.
mkfifo fifo1 fifo2
mpv --msg-level=vd=debug --input-file=fifo1 input1.mkv -pause --start 10:00
mpv --msg-level=vd=debug --input-file=fifo2 input2.mkv -pause --start 10:00
echo pause | tee fifo1 fifo2
Description | Size | Encode Time | Subjective Quality |
---|---|---|---|
VP9 Source | 1.4G | n/a | Good |
HEVC SW | 1G | 2x 38m | Good, same as src |
HEVC HW | 1.3G | 2x 14m | Ok |
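The ratios implied by the table can be checked with a few lines of shell. This is only a sketch: the helper names are made up for illustration, and the inputs are the rounded sizes and per-pass times from the table, so the results are approximate.

```shell
#!/bin/sh
# ratio A B: how many times larger A is than B.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

# pct_larger A B: percentage by which A exceeds B.
pct_larger() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.0f\n", (a / b - 1) * 100 }'
}

# pct_smaller A B: percentage by which A falls short of B.
pct_smaller() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.0f\n", (1 - a / b) * 100 }'
}

# Rounded values from the table: minutes per pass and output size in GB.
echo "HW encode speedup over SW: $(ratio 38 14)x"
echo "HW file larger than SW by: $(pct_larger 1.3 1.0)%"
echo "SW file smaller than HW by: $(pct_smaller 1.0 1.3)%"
```

Note that "larger by" and "smaller by" are not symmetric: the same pair of sizes gives a different percentage depending on which file is taken as the baseline.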
In conclusion, the Intel HEVC Quick Sync encoder is 2.5-3x faster than libx265, but its output is of slightly lower quality and approximately 30% larger.
HEVC (H.265) is favored over VP9 for hardware acceleration here because decoding support is readily available (AMD RX 470 GPU).
The libx265 software codec delivers quality similar to the VP9 source, and its output is roughly a quarter smaller than the Intel Quick Sync output (1G vs. 1.3G).
The end result was to use HEVC for re-encoding some videos, because decoders and encoders are readily available. I will revisit VP9 when I buy my next graphics card in a few years; VP9 has preferable licensing and is standardized for WebM containers.
The libx265 software codec delivers smaller files with higher visual quality at the expense of CPU encoding time. Because storage space and visual quality are the prime concerns, software encoding is used. Intel Quick Sync is the clear winner for high-speed encoding, but seems more difficult to configure for optimal visual quality and file size.
The AMD RX 470 hardware can support 10-bit VP9 decoding with UVD 6.3, but software support for the complete UVD 6.3 feature set is missing in the Mesa VA driver. The GPU can also encode HEVC, but support for VCE 3.4 is likewise missing in the Mesa VA driver. I suspect the HEVC encoding feature will deliver results similar to Quick Sync: faster runtime performance at the expense of file size and visual quality, in which case I'll still prefer libx265.
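Whether the installed VA driver exposes a given codec can be checked against the vainfo profile list. The following is a minimal sketch: `va_supports` and `report` are hypothetical helpers, the sample lines are copied from the vainfo output above, and `VAProfileVP9Profile0` is the libva profile name I assume a VP9-capable driver would list.

```shell
#!/bin/sh
# va_supports PROFILE ENTRYPOINT: read vainfo output on stdin and succeed
# only if some line pairs exactly that profile with that entrypoint.
va_supports() {
    awk -v p="$1" -v e="$2" '
        { gsub(/[ \t]/, "") }               # drop padding around the colon
        $0 == (p ":" e) { found = 1 }
        END { exit !found }'
}

# A few lines copied from the vainfo output earlier in this page.
sample='VAProfileH264High : VAEntrypointVLD
VAProfileH264High : VAEntrypointEncSlice
VAProfileHEVCMain : VAEntrypointVLD
VAProfileHEVCMain10 : VAEntrypointVLD'

# report PROFILE ENTRYPOINT: print whether the sample lists the pair.
report() {
    if printf '%s\n' "$sample" | va_supports "$1" "$2"; then
        echo "$1 $2: listed"
    else
        echo "$1 $2: not listed"
    fi
}

report VAProfileHEVCMain VAEntrypointVLD        # HEVC decode
report VAProfileHEVCMain VAEntrypointEncSlice   # HEVC encode
report VAProfileVP9Profile0 VAEntrypointVLD     # VP9 decode
```

Against a live system the same check would be `vainfo | va_supports VAProfileHEVCMain VAEntrypointEncSlice`, which is a quick way to see whether a driver update has added the missing entrypoints.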
Command being timed: "ffmpeg -t 90 -i casual-test.webm -vcodec hevc -crf 22 -acodec copy -threads 8 -y -f matroska -benchmark casual-test.webm.sw-dec.90.mkv"
User time (seconds): 470.27
System time (seconds): 0.86
Percent of CPU this job got: 739%
Elapsed (wall clock) time (h:mm:ss or m:ss): 1:03.72
Command being timed: "../transcode.sh casual-test.webm"
User time (seconds): 939.82
System time (seconds): 1.63
Percent of CPU this job got: 745%
Elapsed (wall clock) time (h:mm:ss or m:ss): 2:06.34
Add -t 90 -vaapi_device /dev/dri/renderD129 -hwaccel vaapi before the input file.
Entire encoding for first 90 seconds:
Command being timed: "ffmpeg -t 90 -vaapi_device /dev/dri/renderD129 -hwaccel vaapi -i casual-test.webm -vcodec hevc -crf 22 -acodec copy -threads 8 -y -f matroska -benchmark casual-test.webm.hw-dec.90.mkv"
User time (seconds): 451.98
System time (seconds): 2.30
Percent of CPU this job got: 737%
Elapsed (wall clock) time (h:mm:ss or m:ss): 1:01.62
Command being timed: "../transcode.sh casual-test.webm"
User time (seconds): 904.02
System time (seconds): 5.10
Percent of CPU this job got: 737%
Elapsed (wall clock) time (h:mm:ss or m:ss): 2:03.31
File sizes are identical, so I'm assuming hardware and software decoding produce the same frames. In that case Intel Quick Sync decoding is about 3% faster (1:01.62 vs. 1:03.72 elapsed), is essentially free, and leaves slightly more CPU for encoding, but encoding dominates the process.
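The relative speedup can be read off the elapsed times reported above. A small sketch, with `to_seconds` and `pct_faster` as made-up helper names, converts the `time -v` elapsed format to seconds and prints the percentage saved:

```shell
#!/bin/sh
# to_seconds: convert a `time -v` elapsed value (m:ss.cc or h:mm:ss)
# to seconds by folding the colon-separated fields in base 60.
to_seconds() {
    echo "$1" | awk -F: '{
        s = 0
        for (i = 1; i <= NF; i++) s = s * 60 + $i
        printf "%.2f\n", s
    }'
}

# pct_faster BASE NEW: percentage by which NEW is shorter than BASE.
pct_faster() {
    awk -v b="$(to_seconds "$1")" -v n="$(to_seconds "$2")" \
        'BEGIN { printf "%.1f\n", (b - n) / b * 100 }'
}

# Software decode vs. hardware decode, elapsed times from the logs above.
echo "single 90 s encode: $(pct_faster 1:03.72 1:01.62)% faster"
echo "transcode.sh run:   $(pct_faster 2:06.34 2:03.31)% faster"
```

These are single runs, so a difference of a few percent is close to what run-to-run noise alone could produce; repeating each measurement would make the comparison more convincing.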