This study compares 8 different image formats: AOM AV1, BPG, Daala, FLIF, JPEG XR, JPEG 2000, JPEG and WebP. We use five algorithms to compare the formats: Y-SSIM, RGB-SSIM, Y-MSSSIM, PSNR-HVS-M and VMAF.
The image set comprises 50 images from subset 1 and subset 2 maintained by Xiph. All images are YCbCr 4:2:0 Y4M files.
AV1 (https://aomedia.googlesource.com/aom/): the versions used are built from Git revisions 02affd269df5d8abbfc75f5bdad0c080308e0ce1 (October 2016), ce7272d2d00f224475849c1b1bca0a97b70ea0c4 (July 2017) and 7d3bd8daba6e51566f0458e3f842e246a559ea82 (February 2018).
Daala (https://git.xiph.org/?p=daala.git): the version used is built from Git revision 72783687ce4963478b8ab4d97809510f40c7c855.
FLIF (https://github.com/FLIF-hub/FLIF): the version used is built from Git revision c17459bab5399ed5009c262e9954d474f275db7f.
JPEG XR (https://jxrlib.codeplex.com/): the version used is built from Git revision e922fa50cdf9a58f40cad07553bcaa2883d3c5bf.
JPEG 2000 (http://kakadusoftware.com/downloads/): the version used is 7.10.2.
MozJPEG (https://github.com/mozilla/mozjpeg): the version used is 3.3.1.
VP9 (https://chromium.googlesource.com/webm/libvpx): the version used is 1.7.0.
WebP (https://chromium.googlesource.com/webm/libwebp): the version used is 0.6.1.
Pik (https://github.com/google/pik): the version used is built from Git revision 52f2d45cc8e35e45278da54615bb8b11b5066f16.
The VMAF (Video Multi-Method Assessment Fusion) metric is computed using vmafossexec, provided by Netflix (https://github.com/Netflix/vmaf). The version used is built from Git revision 7ebbde0c64493af978da66cb7ebe2946fb12dec2. vmafossexec compares two YUV files, given their subsampling and dimensions.
Y-MSSSIM, Y-SSIM, RGB-SSIM and PSNR-HVS-M are computed by the tools dump_msssim, dump_ssim and dump_psnrhvs, provided by the Daala repository (https://git.xiph.org/?p=daala.git). The version used is built from Git revision 05243557bc3e59872fd043c99dc4c17ca33bcb1b. Each metric compares two Y4M files.
ffmpeg is used for image format conversion. The version used is ffmpeg 3.3.2.
gm identify is used to determine the width and height of images. The version used is GraphicsMagick 1.3.25.
Each Y4M image is exported to 4:2:0 PNG, YUV and PPM files with ffmpeg:
ffmpeg -loglevel quiet -y -i [input] -pix_fmt yuv420p [output]
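As an illustration only, the export step could be driven from Python along the following lines. This is a minimal sketch, not the study's script: the function names and the output naming scheme are hypothetical, and only the ffmpeg arguments are taken from the command above.

```python
from pathlib import Path


def ffmpeg_export_cmd(src, dst):
    """Build the ffmpeg argument list shown above for one input/output pair."""
    return ["ffmpeg", "-loglevel", "quiet", "-y",
            "-i", str(src), "-pix_fmt", "yuv420p", str(dst)]


def export_targets(y4m_path, out_dir):
    """Return PNG, YUV and PPM output paths for one Y4M image.

    The '<stem>.<ext>' naming is an assumption made for this sketch.
    """
    stem = Path(y4m_path).stem
    return [Path(out_dir) / f"{stem}.{ext}" for ext in ("png", "yuv", "ppm")]
```

Passing each command to `subprocess.run(cmd, check=True)` would then perform the conversion, assuming ffmpeg is on the PATH.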
All images are compressed losslessly and over a range of qualities for each codec:
BPG:
bpgenc -m 8 -f 420 -lossless -o [output] [input(PNG)]
bpgenc -m 8 -f 420 -q $q -o [output] [input(PNG)]
AV1:
aomenc --cpu-used=2 --tile-columns=4 --passes=2 --lossless=1 -o [output] [input(Y4M)]
aomenc --cpu-used=2 --tile-columns=4 --passes=2 --end-usage=q --cq-level=$q -o [output] [input(Y4M)]
Daala:
encoder_example -v 0 -o [output] [input(Y4M)]
encoder_example -v $q -o [output] [input(Y4M)]
FLIF:
flif -Q 100 [input(PNG)] [output]
flif -Q $q [input(PNG)] [output]
JPEG2000:
kdu_compress -no_info Creversible=yes -slope 0 -o [output] -i [input(PPM)]
kdu_compress -no_info -slope $q -o [output] -i [input(PPM)]
JPEG XR:
JxrEncApp -d 1 -q 1 -o [output] -i [input(PPM)]
JxrEncApp -d 1 -q $q -o [output] -i [input(PPM)]
MozJPEG:
cjpeg -rgb -quality 100 [input(PNG)] > [output]
cjpeg -quality $q [input(PNG)] > [output]
Pik:
cpik [input(PNG)] [output] --distance $q
VP9:
aomenc --cpu-used=2 --tile-columns=4 --passes=2 --lossless=1 -o [output] [input(Y4M)]
aomenc --cpu-used=2 --tile-columns=4 --passes=2 --end-usage=q --cq-level=$q -o [output] [input(Y4M)]
WebP:
cwebp -mt -z 9 -lossless -o [output] [input(PNG)]
cwebp -mt -q $q -o [output] [input(PNG)]
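To make the "range of qualities" concrete, here is a minimal sketch of how the lossy command lines above could be expanded for several values of $q. It is not the study's script: the template dictionary covers only three of the codecs, and the quality values and output pattern in the usage note are hypothetical.

```python
# Lossy templates copied from the command lines above (subset only).
LOSSY_TEMPLATES = {
    "bpg": "bpgenc -m 8 -f 420 -q {q} -o {output} {input}",
    "flif": "flif -Q {q} {input} {output}",
    "webp": "cwebp -mt -q {q} -o {output} {input}",
}


def build_commands(codec, src, qualities, out_pattern):
    """Return one argument list per quality value for the given codec.

    `out_pattern` is a format string containing '{q}', e.g. 'img_q{q}.webp'.
    """
    tokens = LOSSY_TEMPLATES[codec].split()
    commands = []
    for q in qualities:
        out = out_pattern.format(q=q)
        # Substitute per token so that paths containing spaces stay intact.
        commands.append([t.format(q=q, input=src, output=out) for t in tokens])
    return commands
```

For example, `build_commands("webp", "in.png", [10, 50], "out_q{q}.webp")` yields two cwebp argument lists that can be handed to `subprocess.run`.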
The Python script used to generate the compressed images is available in the Git repository.
The images which will be displayed on the website are then chosen among all compressed images, using the following criteria:
The Python script used to select the compressed images is available in the Git repository.
For each codec and image, the encoding and decoding speeds for lossless compression are sampled using Python timeit.
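A timing sample of this kind might look like the sketch below. The helper names are hypothetical, and the choice of one run per sample is an assumption; the text only states that timeit is used.

```python
import subprocess
import timeit


def sample_seconds(action, repeat=3):
    """Call `action` once per sample and return the list of timings in seconds."""
    return timeit.repeat(action, number=1, repeat=repeat)


def run_quietly(cmd):
    """Return a zero-argument callable that runs `cmd` and discards its output."""
    def action():
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return action
```

Calling `sample_seconds(run_quietly(cmd))` with one of the lossless encoder command lines would then give the encoding samples for one image.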
The arithmetic means of the encoding and decoding speeds are calculated over the entire image set. We then determine a Weissman score for each codec using the following formula:
$$W = \alpha \, \frac{r}{\bar{r}} \, \frac{\log \bar{T}}{\log T}$$
where $r$ is the compression ratio over the PPM file size, $T$ the time required to compress, $\bar{r}$ and $\bar{T}$ the same metrics for the standard compressor, and $\alpha$ a scaling constant.
The standard compressor is mozjpeg, producing a JPEG image.
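The score can be sketched in a few lines of Python. Two points here are assumptions rather than statements from the text: the scaling constant is taken as 1, and times are expressed in milliseconds before taking the logarithm. With those choices, the figures in the lossless results table for daala (ratio 2.798, encode time 0.8049) against mozjpeg (ratio 1.137, encode time 8.7385) give a score of about 3.338, close to the tabulated value.

```python
import math


def weissman_score(r, t, r_ref, t_ref, alpha=1.0):
    """Weissman score: alpha * (r / r_ref) * (log t_ref / log t).

    r, r_ref: compression ratios of the codec and of the standard compressor.
    t, t_ref: compression times in the same unit (milliseconds assumed here,
              so that both logarithms stay positive).
    """
    return alpha * (r / r_ref) * (math.log(t_ref) / math.log(t))


# daala vs. mozjpeg, table times multiplied by 1000 (assumed seconds -> ms).
daala = weissman_score(2.798, 804.9, 1.137, 8738.5)
```

Because the score is a ratio of logarithms, the base of the logarithm does not matter, but the time unit does.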
For each codec and image, we apply the following metrics over 15 image samples of increasing quality: Y-SSIM, RGB-SSIM, Y-MSSSIM, PSNR-HVS-M and VMAF. For VMAF, we use the trained model nflxall_vmafv4.pkl provided by Netflix.
For each sample, we first decode the compressed image, then export the resulting file to 4:2:0 Y4M and YUV formats using ffmpeg (ffmpeg -loglevel quiet -y -i [input] -pix_fmt yuv420p [output]). Finally, we apply the metrics to each sample, comparing it to the original image.
For each codec, we calculate the arithmetic mean of each metric over the entire set of images, weighted by the area of the corresponding picture, for the 15 samples of increasing quality:
We also determine the average bits per pixel for each quality sample:
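A minimal sketch of both averages follows. The area weighting is as described above. The bits-per-pixel definition (total bits divided by total pixels, which is the same as an area-weighted mean of per-image values) is an assumption, and the function names are hypothetical.

```python
def area_weighted_mean(scores, areas):
    """Mean of per-image metric scores, each weighted by its image area in pixels."""
    return sum(s * a for s, a in zip(scores, areas)) / sum(areas)


def mean_bits_per_pixel(file_sizes_bytes, areas):
    """Total compressed bits divided by total pixels (assumed definition)."""
    return sum(8 * size for size in file_sizes_bytes) / sum(areas)
```

Here `areas` would hold width times height for each image, as reported by gm identify.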
The following archives contain the raw data in CSV format for subset 1 and subset 2:
codec | avg. compression ratio | avg. space saving | wavg. encode time | wavg. decode time | Weissman score |
---|---|---|---|---|---|
daala | 2.798 | 64.26% | 0.8049 | 0.7280 | 3.3381 |
vp9 | 2.905 | 65.58% | 3.9375 | 0.4193 | 2.8011 |
av1-20160930 | 2.912 | 65.66% | 4.7511 | 0.4838 | 2.7455 |
av1-20170809 | 2.984 | 66.49% | 20.4157 | 0.6900 | 2.4003 |
kdu | 1.564 | 36.06% | 0.3875 | 0.2892 | 2.0946 |
jxr | 1.560 | 35.89% | 0.4058 | 0.3628 | 2.0730 |
av1-20180222 | 2.943 | 66.02% | 97.6492 | 1.0156 | 2.0444 |
flif | 2.473 | 59.57% | 22.1088 | 4.3042 | 1.9732 |
openjpeg | 1.564 | 36.06% | 2.1181 | 1.3794 | 1.6300 |
webp | 2.124 | 52.93% | 37.9675 | 2.4799 | 1.6079 |
bpg | 1.150 | 13.03% | 3.7503 | 3.5251 | 1.1151 |
mozjpeg | 1.137 | 12.05% | 8.7385 | 0.4144 | 1.0000 |
For each comparison algorithm, we plot the quality in dB as a function of the mean bits per pixel on a logarithmic scale. We can then see which codec gives the best quality at a given number of bits per pixel (top left is better).