Video transcoding: Web and native playback overview (April 2020)
Embedding videos on a website is very easy: add a <video> tag to your source code and it just works. Most of the time.
The catch: both the operating system and the Browser of your client must support the container and codecs of your video. To ensure playback on every device, you have to transcode your videos to one or more versions that, taken together, are supported by every device out there.
In this card, I'll explore the available audio and video standards we have right now. The goal is to build a pipeline that transcodes unknown videos (of any kind) to a set of suitable versions that can be embedded in the Browser and played back on all devices.
This card is heavily linked to provide additional resources on each topic. Keep in mind that this is a "snapshot" as of early 2020 and might change quickly as new technologies evolve.
If you are new to this topic, I strongly recommend reading the MDN guide on audio and video formats.
TL;DR: To ensure that the playback for your video just works™ everywhere, you should offer an MP4 video file with the H.264 and AAC codecs to your user. A second version in a modern format like WebM can be served to compatible clients.
Video Formats and Codecs
When talking about digital videos, people often just refer to the container format, as this is what they “see” when they see the file. But digital videos consist of three parts:
- container format (e.g. MP4)
- encoded video stream(s) (e.g. H.264)
- encoded audio stream(s) (e.g. MP3)
A video file is thus represented through a container which in turn contains some metadata and one or more audio and video streams.
The main aspects we are interested in are the license and device support for each of these parts.
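To see which streams a given file actually contains, ffprobe (which ships with ffmpeg) can list them. The following is a minimal sketch, assuming ffprobe is installed; the function name list_streams is my own and not part of any tool.

```shell
# Print the codec name and stream type of every stream in a media file.
# Requires ffprobe, which is installed together with ffmpeg.
# list_streams is a hypothetical helper name used for illustration.
list_streams() {
  ffprobe -v error \
    -show_entries stream=codec_name,codec_type \
    -of csv=p=0 \
    "$1"
}

# Usage: list_streams input.mov
```

Each printed line describes one stream inside the container, so a typical video file yields one line for its video stream and one for its audio stream.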
Video Container Format Matrix
A broader comparison of many container formats and codecs can be found on Wikipedia.
|Container format||License||Supported Video Codecs||Supported Audio Codecs|
|MP4||patent encumbered, "nobody charges license fees for software or content."||H.264, VP8, VP9, AV1||MP3, AAC, Opus|
|MOV / "QuickTime File Format"||proprietary, but this seems to affect only the software, not the container||H.264, H.265||MP3 (as MP1), AAC|
|AVI / "Audio Video Interleave"||proprietary, again maybe nothing to worry about||H.264, H.265, VP8, VP9||MP3, AAC, FLAC, Opus|
|MKV||freely licensed||H.264, H.265, VP8, VP9||MP3, AAC, FLAC, Opus, Vorbis|
|WEBM||CC BY 3.0 / BSD-like||VP8, VP9||Opus, Vorbis|
Further reading: MDN: Media container formats (file types)
Video Codec Format Matrix
A broader comparison of many video codecs can be found on Wikipedia.
|H.264 / AVC||patented||"The MPEG LA patent pool does not require license fees for streaming internet video in AVC format as long as the video is free for end users." (MDN)|
|H.265 / HEVC||see H.264||(successor, better compression than H.264)|
|VP9||royalty free||(successor, better bit rate than VP8)|
|AV1||royalty free||(planned successor of VP9 with better compression rates)|
Further reading: MDN: Web video codec guide
Audio Codec Format Matrix
A broader comparison of many audio codecs can be found on Wikipedia.
|MP3||all patents have expired|
|AAC||no license required for distributing AAC files||ffmpeg may not have native AAC support|
|FLAC||GPL/BSD||the only listed lossless codec|
Further reading: MDN: Web audio codec guide
Now that we know which formats are available, we have to find a combination that will actually play in the native player or in a well-known Browser.
Browser Playback Support
To test the Browser support, I created a <video> tag with a WebM/VP8 and an MP4/H.264 <source>. The Browser is in charge of choosing the version that suits it best. This way, modern Browsers are likely to play the WebM video while Browsers like IE11 and Safari fall back to H.264.
CanIUse gives us a quick insight into the Browser support for an MP4 and a WebM version:
|Container / Codecs||Chrome 80||FF 74||Safari 13||IE 11||Android Browser 80||iOS Safari||Samsung Internet|
|MP4, H.264 + AAC||✓||✓||✓||✓||✓||✓||✓|
|WebM, VP8 + Vorbis||✓||✓||⚠||⚠||✓||⚠||✓|
Every available combination listed in this table was manually tested on BrowserStack. The chosen video codec is annotated in every cell, audio playback was verified unless stated otherwise.
|Operating system||Chrome 80||FF 74||Safari 13/9||IE 11||Mobile Chrome||Mobile Safari / Samsung||Mobile Firefox|
|OSX El Capitan||webm||webm||mp4¹³||/||/||/||/|
¹ Requires webserver support for HTTP range requests.
² Audio playback could not be confirmed, although that may be an issue regarding BrowserStack.
³ MP4 is being used as a fallback, but for a short time in the beginning "Video failed to load." is shown.
I was unable to get the video playback working on iOS <= 9, maybe because of a faulty Range header setup. However, the market share for this version is below 2%.
Native OS Playback Support Matrix
This table is only relevant to you if you are offering a download version of your video. In this case, the native player of each device must support the container and codecs you are using.
|Windows||✓||10 1903+||10 1903+||✓||10 1809+|
An up-to-date list may be found here:
MP4 + H.264 + AAC seems to work everywhere.
Reprocessing with ffmpeg
Once you have decided on one or more target formats, you need to build a pipeline that transforms user-provided video files to your formats. ffmpeg is a great CLI tool for this purpose; take your time to understand its syntax and options.
A quick introduction to ffmpeg's CLI options
Let's get started with a minimal example. The following command takes the video/audio streams from "input.avi" and generates a transcoded video "output.mp4".
-i is used to specify an input source.
ffmpeg -i input.avi output.mp4
Options are only applied to the next specified file. In this example, the option -r specifies a frame rate of 24, but only for the input file.
ffmpeg -r 24 -i input.avi output.avi
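Because the position of an option decides which file it applies to, the same -r flag can mean two different things. The sketch below wraps both placements in small functions; the function names are made up for illustration, the ffmpeg flags themselves are the ones discussed above.

```shell
# Input option: -r before -i makes ffmpeg ignore the timestamps stored in
# the file and read the input as a constant 24 fps source.
read_as_24fps() {
  ffmpeg -r 24 -i "$1" "$2"
}

# Output option: -r after -i (directly before the output file) makes ffmpeg
# drop or duplicate frames so that the output has a constant 24 fps.
write_as_24fps() {
  ffmpeg -i "$1" -r 24 "$2"
}

# Usage:
#   read_as_24fps  input.avi output.avi
#   write_as_24fps input.avi output.avi
```

For transcoding uploads of unknown origin, the output variant is usually the one you want, since it leaves the interpretation of the source untouched.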
Let's have a look at the internal transformation process of ffmpeg. It includes (de)muxing packets and de/encoding frames to keep everything in sync. The ASCII illustration is taken from ffmpeg's man page.
 _______              ______________
|       |            |              |
| input |  demuxer   | encoded data |   decoder
| file  | ---------> | packets      | -----+
|_______|            |______________|      |
                                           v
                                       _________
                                      |         |
                                      | decoded |
                                      | frames  |
                                      |_________|
 ________             ______________       |
|        |           |              |      |
| output | <-------- | encoded data | <----+
| file   |   muxer   | packets      |   encoder
|________|           |______________|
For our purpose of transcoding a single video into multiple versions, these options might be of use:
||adds an input source|
||changes the codec for the audio stream (e.g.
||changes the codec for the video stream (e.g.
||changes the bitrate for the audio stream (e.g.
||Adds a "simple filtergraph". One input, one output, same type. (e.g. deinterlace, scale, change frame count)¹|
||Adds a "complex filtergraph". Multiple files or different types.|
||sets the output's frame size (e.g.
||specifies which input stream is used for each output stream, in the order of the definition of output streams.|
||the audio stream from the first file is used³|
||the video stream from the first file is used³|
¹ A note on filters: They can be used to add a conversion step between decoding and encoding.
² To keep the aspect ratio of the input source, set the variable dimension to -1. (e.g.
³ If you are interested in only the audio or only the video stream, use -map to select just that one.
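As an illustration of stream selection with -map, here is a small sketch that keeps only the audio or only the video of a source file. The function names and the choice of target codecs are assumptions for this example; -map 0:a and -map 0:v are regular ffmpeg stream specifiers.

```shell
# Keep only the audio of the first input and encode it as AAC.
# -map 0:a selects all audio streams of input 0. Once -map is used,
# streams that are not mapped are left out of the output.
extract_audio() {
  ffmpeg -i "$1" -map 0:a -c:a aac "$2"
}

# Keep only the video of the first input and encode it as H.264.
extract_video() {
  ffmpeg -i "$1" -map 0:v -c:v libx264 "$2"
}

# Usage:
#   extract_audio input.mov audio.m4a
#   extract_video input.mov video_only.mp4
```

If the source might lack the requested stream type, a trailing question mark (for example -map 0:a?) makes the mapping optional instead of letting ffmpeg abort.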
Transcoding a video using the CLI
This section aims to convert an unknown video source ("input.mov") to a set of versions that are known to play nicely on every device (see the "Video Playback" section). The result can be verified with
ffmpeg -i output.mp4 and should include lines similar to these:
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'output.mp4':
  Stream #0:0(eng): Video: h264 (Main) (avc1 / 0x31637661), yuv420p(tv, bt709), 1920x1080 [SAR 1:1 DAR 16:9], 9998 kb/s, 25 fps, 25 tbr, 25k tbn, 50k tbc (default)
  Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 317 kb/s (default)
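Reading this output by eye works for a one-off check. Inside a pipeline, the same verification could be scripted with ffprobe. This is only a sketch: the function names are hypothetical, and it checks just the first video and first audio stream against the codec names "h264" and "aac" seen in the output above.

```shell
# Print the codec name of the first stream of the given type.
#   $1: media file, $2: stream type ("v" for video, "a" for audio)
first_codec() {
  ffprobe -v error \
    -select_streams "$2:0" \
    -show_entries stream=codec_name \
    -of default=noprint_wrappers=1:nokey=1 \
    "$1"
}

# Succeed only if the file carries H.264 video and AAC audio.
is_h264_aac() {
  [ "$(first_codec "$1" v)" = "h264" ] && [ "$(first_codec "$1" a)" = "aac" ]
}

# Usage: is_h264_aac output.mp4 && echo "codecs look fine"
```

The check deliberately ignores the container; it only answers whether the two codecs match the recommended combination.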
Step 1: Convert the input video to an MP4 container containing an H.264 video (1080p) and an AAC audio stream (48 kbit/s)
ffmpeg -i input.mov -filter:v scale=-1:1080 -c:v libx264 -b:a 48k -c:a aac output.mp4
Step 2: Convert the input video to a WebM container containing a VP8 video (720p) and a Vorbis audio stream (48 kbit/s)
ffmpeg -i input.mov -filter:v scale=-1:720 -c:v libvpx -b:a 48k -c:a libvorbis output.webm
Or, do it all at the same time:
ffmpeg -i input.mov \
  -filter:v scale=-1:1080 -c:v libx264 -b:a 48k -c:a aac output_1080.mp4 \
  -filter:v scale=-1:720 -c:v libx264 -b:a 48k -c:a aac output_720.mp4 \
  -filter:v scale=-1:720 -c:v libvpx -b:a 48k -c:a libvorbis output.webm
Et voilà. We have one version that is solely offered for downloads (1080p MP4), one for modern Browsers (720p WebM) and one for legacy Browsers (720p MP4).
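For a pipeline, the combined command can be wrapped in a reusable function. The sketch below is one possible shape, not a definitive setup. Three things in it are my own assumptions and not part of the commands above: the function name and output naming scheme, scale=-2 instead of -1 (this keeps the computed width even, which libx264 needs for the common yuv420p pixel format), and -movflags +faststart (this moves the MP4 index to the start of the file so playback can begin before the download has finished).

```shell
# Transcode one source file into the three versions used in this card.
#   $1: input file
#   $2: basename for the outputs (hypothetical naming scheme)
transcode_for_web() {
  ffmpeg -i "$1" \
    -filter:v scale=-2:1080 -c:v libx264 -b:a 48k -c:a aac \
      -movflags +faststart "${2}_1080.mp4" \
    -filter:v scale=-2:720 -c:v libx264 -b:a 48k -c:a aac \
      -movflags +faststart "${2}_720.mp4" \
    -filter:v scale=-2:720 -c:v libvpx -b:a 48k -c:a libvorbis \
      "${2}_720.webm"
}

# Usage: transcode_for_web input.mov output
```

Since each group of options ends with its own output file, every version gets its own scaling and codec settings while the input is decoded only once.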
You might consider using a gem to interact with ffmpeg in Ruby. As of 2020, I did not find a gem that fits my needs. These are your options:
- active_encode looks solid, but is still in a 0.x version and has quite a few development dependencies. It has the Apache license.
- streamio-ffmpeg seems to be perfect, but had its last release in 2016. It offers an intuitive DSL under a permissive license, even a Carrierwave wrapper gem is available.
To conclude my research, I recommend offering your users two different video versions for Browser playback. If the video should be downloadable, offer the MP4 version for the download.
Along the lines of the MDN documentation, use a <video> tag and embed the versions like this:
%video(controls controlslist="nodownload" preload="metadata")
  %source(type='video/webm; codecs="vp8, vorbis"' src="video.webm")
  %source(type='video/mp4' src="fallback.mp4")
In this case, the video files should match these specifications:
video.webm: WebM Container, VP8 Video, Vorbis Audio
fallback.mp4: MP4 Container, H.264 Video, AAC Audio
You can read about how to create these versions earlier on in this card.
Don't forget to configure your webserver to support HTTP range requests for Safari. Nginx seems to support them out of the box.
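Whether a server honors range requests can be checked from the command line. The following is a rough sketch using curl; the function name and the example URL are placeholders. A server with range support answers a Range header with status 206 (Partial Content), while a server without it typically sends the whole file with status 200.

```shell
# Succeed if the server answers a byte-range request with 206 Partial Content.
#   $1: URL of a video file on your server
supports_range_requests() {
  status=$(curl -s -o /dev/null -w '%{http_code}' -H 'Range: bytes=0-1' "$1")
  [ "$status" = "206" ]
}

# Usage (placeholder URL):
#   supports_range_requests https://example.com/video.mp4 && echo "range requests OK"
```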