OMAF file creation using Nokia tools
Read Nokia's wiki page first.
The profile considered here for viewport-dependent video streaming is “MCTS-based approach for achieving 6K effective ERP resolution with HEVC-based viewport-dependent OMAF video profile”. More information about this profile can be found in Annex D.6.3 of the OMAF specification.
Prerequisites:
- Nokia omaf tool: https://github.com/nokiatech/omaf
- Kvazaar encoder: https://github.com/ultravideo/kvazaar
- ffmpeg: https://ffmpeg.org/
- mp4box: https://gpac.wp.imt.fr/mp4box/
One can use the following steps to create the necessary files for viewport-dependent 360-degree video streaming with Equi-Rectangular Projection (ERP).
YUVs preparation -> Encoding -> Write JSON file
Convert the 360-degree YUV file to the resolutions required by the media profile. One can use ffmpeg (an open-source project) to generate the YUVs at the required resolutions. For example, to create a version of a 6K YUV file downscaled by a factor of two, the following command can be used (replace <fps> with the frame rate of the source):
ffmpeg -s:v 6144x3072 -r <fps> -i original_6K_resolution.yuv -vf scale=3072x1536 -c:v rawvideo -pix_fmt yuv420p downscaled_3K_resolution.yuv
According to Annex D.6.3, three different resolutions of the same YUV sequence are needed: 6144x3072, 3072x1536 and 1536x768.
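The downscaling step can be sketched as a small shell helper that prints the ffmpeg command for each lower resolution. This is only an illustration: the frame rate of 30 and the helper name `downscale_cmd` are assumptions, and the script prints the commands rather than running them, so they can be reviewed first.

```shell
#!/bin/sh
# Print (not run) the ffmpeg commands that derive the lower resolutions
# from the 6K source. FPS is an assumed value; adjust it to the source.
FPS=30
SRC=original_6K_resolution.yuv
SRC_W=6144
SRC_H=3072

# Hypothetical helper: $1 = downscaling factor, $2 = output file name
downscale_cmd() {
    w=$((SRC_W / $1))
    h=$((SRC_H / $1))
    echo "ffmpeg -s:v ${SRC_W}x${SRC_H} -r ${FPS} -i ${SRC} -vf scale=${w}x${h} -c:v rawvideo -pix_fmt yuv420p $2"
}

downscale_cmd 2 downscaled_3K_resolution.yuv    # 3072x1536
downscale_cmd 4 downscaled_1_5K_resolution.yuv  # 1536x768
```

Piping the output to `sh` would execute the printed commands.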
To encode the YUV files into HEVC bitstreams, one can use the Kvazaar encoder. According to Annex D.6.3, two different tiling structures, each at two resolutions, are required, which results in four encoded streams. For example, the streams can be encoded with QP values of 27 (foreground) and 33 (background) and a segment size of 9 frames (-p 9). The configurations are as follows:
a. ‘fg’ : resolution 6144x3072, non-uniform 8x3 tiles. To encode:
kvazaar -i original_6K_resolution.yuv -q 27 --gop 8 --no-open-gop --bipred -p 9 --mv-constraint frametilemargin --level 6.2 --tiles-width-split u8 --tiles-height-split 512,2560 --set-qp-in-cu --slices tiles --preset ultrafast -o original_6K_resolution.265
b. ‘bg’ : resolution 3072x1536, non-uniform 8x3 tiles. To encode:
kvazaar -i downscaled_3K_resolution.yuv -q 33 --gop 8 --no-open-gop --bipred -p 9 --mv-constraint frametilemargin --level 6.2 --tiles-width-split u8 --tiles-height-split 256,1280 --set-qp-in-cu --slices tiles --preset ultrafast -o downscaled_3K_resolution.265
c. ‘fg_polar’ : resolution 3072x1536, non-uniform 4x3 tiles. To encode:
kvazaar -i downscaled_3K_resolution.yuv -q 27 --gop 8 --no-open-gop --bipred -p 9 --mv-constraint frametilemargin --level 6.2 --tiles-width-split u4 --tiles-height-split 256,1280 --set-qp-in-cu --slices tiles --preset ultrafast -o downscaled_3K_resolution_4x3.265
d. ‘bg_polar’ : resolution 1536x768, non-uniform 4x3 tiles. To encode:
kvazaar -i downscaled_1_5K_resolution.yuv -q 33 --gop 8 --no-open-gop --bipred -p 9 --mv-constraint frametilemargin --level 6.2 --tiles-width-split u4 --tiles-height-split 128,640 --set-qp-in-cu --slices tiles --preset ultrafast -o downscaled_1_5K_resolution.265
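The values given to --tiles-height-split are pixel coordinates of the tile row boundaries, so the three row heights follow from the frame height. The sketch below derives the row heights and checks that each boundary falls on a CTU boundary, as HEVC tiles consist of whole CTUs. The helper names `tile_rows` and `splits_aligned` are hypothetical, and the 64-pixel CTU size is an assumption about the encoder configuration.

```shell
#!/bin/sh
CTU=64  # assumption: 64x64 CTUs

# Hypothetical helper: $1 = frame height, $2 = comma-separated split positions.
# Prints the height of each tile row.
tile_rows() {
    prev=0
    rows=""
    for pos in $(echo "$2" | tr ',' ' ') "$1"; do
        rows="$rows $((pos - prev))"
        prev=$pos
    done
    echo $rows
}

# Hypothetical helper: succeeds only if every split position is a multiple of CTU.
splits_aligned() {
    for pos in $(echo "$1" | tr ',' ' '); do
        [ $((pos % CTU)) -eq 0 ] || return 1
    done
    return 0
}

tile_rows 3072 512,2560   # 512 2048 512
tile_rows 1536 256,1280   # 256 1024 256
tile_rows 768 128,640     # 128 512 128
```

Each layout keeps the same proportions: narrow polar rows at the top and bottom and a middle row four times as tall.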
After encoding the streams, the HEVC bitstreams are converted to mp4 files using MP4Box. For example, one can use the following command:
MP4Box -add original_6K_resolution.265 -new original_6K_resolution.mp4 -fps 30
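Since all four bitstreams need the same conversion, the step can be sketched as a loop. The base names below are taken from the file names used on this page, the frame rate of 30 matches the example command, and the helper name `mp4box_cmd` is hypothetical. The script prints the commands instead of running them.

```shell
#!/bin/sh
FPS=30  # same frame rate as in the example command; adjust to the source

# Hypothetical helper: $1 = base name of the bitstream (without extension)
mp4box_cmd() {
    echo "MP4Box -add $1.265 -new $1.mp4 -fps ${FPS}"
}

for name in original_6K_resolution \
            downscaled_3K_resolution \
            downscaled_3K_resolution_4x3 \
            downscaled_1_5K_resolution; do
    mp4box_cmd "$name"
done
```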
The JSON configuration file contains the paths to the encoded streams. The cropping explained in Annex D.6.3 is preconfigured in the omafvd tools from Nokia and applied to the videos automatically. The video resolutions and tile counts are used as keys to identify the streams; the labels are only used for output filenames. More information on generating the files can be found at the link below: https://github.com/nokiatech/omaf/wiki/Usage-instructions-for-OMAF-Creator---Viewport-dependent-mode/
One such example of a configuration is presented below:
{
    // "video" is required
    "video": {
        "common": {
            "projection": "equirectangular",
            "output_mode": "6K"
        },
        "fg": {
            "filename": "original_6K_resolution.mp4",
            "quality": 1
        },
        "bg": {
            "filename": "downscaled_3K_resolution.mp4",
            "quality": 6
        },
        "fg_polar": {
            "filename": "downscaled_3K_resolution_4x3.mp4",
            "quality": 1
        },
        "bg_polar": {
            "filename": "downscaled_1_5K_resolution.mp4",
            "quality": 6
        }
    },
    "dash": {
        "output_name_base": "Example_6K",
        "mpd": {
            "filename": "$Name$.mpd"
        },
        "media": {
            "subsegments_per_segment": 1,
            "segment_name": {
                "video": "$Name$.video.$Segment$.mp4",
                "audio": "$Name$.audio.$Segment$.mp4",
                "extractor": "$Name$.extractor.$Segment$.mp4"
            }
        }
    }
}
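Before running the creator, it may help to confirm that every input file named in the configuration exists. The following is a rough sketch, not part of the Nokia tools: `list_inputs` and `check_inputs` are hypothetical helpers that pull the "filename" entries ending in .mp4 out of the configuration with grep and sed (the "$Name$" output templates are skipped), and `grep -o` is assumed to be available.

```shell
#!/bin/sh
# Hypothetical helper: list the input .mp4 files named in a config.
# $1 = path to the JSON configuration
list_inputs() {
    grep -o '"filename"[[:space:]]*:[[:space:]]*"[^"$]*\.mp4"' "$1" |
        sed 's/.*"\([^"]*\)"$/\1/'
}

# Hypothetical helper: report inputs that are missing on disk.
# Returns non-zero if at least one file is missing.
check_inputs() {
    missing=0
    for f in $(list_inputs "$1"); do
        [ -f "$f" ] || { echo "missing: $f"; missing=1; }
    done
    return $missing
}

# Usage (config.json is an assumed file name):
#   check_inputs config.json && echo "all inputs present"
```

The text-based match is a simplification that assumes file names without spaces; a JSON-aware tool would be more robust if the configuration grows.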