Deutsche Telekom Presents World-First 6K VR Live Stream
A behind-the-scenes look at the demonstration in Bonn of another attempt to get VR off the ground at higher resolution and lower bitrates
By Adrian Pennington
Posted on December 10, 2018
The jury may still be out on virtual reality (VR) as a live event experience, but that's not stopping service providers intent on proving it can offer viewers the best seat in the house.
The latest to try is German telco Deutsche Telekom, which pulled off a world-first over the weekend by live streaming VR to consumers in 6K resolution.
"VR has to move beyond technical gimmicks and beyond a showcase," explains Stephan Heininger, Deutsche Telekom's head of virtual reality. "From a business point of view, our goal is to gain market share in VR/AR. We want to move beyond proof of concept into regular monthly or weekly live VR productions for multiple sports and music events."
It broadcast a basketball match between Bonn and Oldenburg from the Telekom Dome in Bonn to several thousand viewers using the free OTT app Magenta VR on Android smartphones, Samsung Gear VR, Oculus Go, and Daydream headsets.
The broadcast gave users the ability to switch between views from two 360° cameras, one positioned above one of the baskets and another at the half-court line.
"Most current live streamed VR is output as an Equirectangular Projection, which typically sends a single 4K 360° picture to every device," explained Carl Furgusson, VP portfolio strategy, MediaKind. "You would need 25-30Mbps in the home to realistically receive it any kind of decent HD resolution"
The telco's aim, with the help of MediaKind, is to cut the bit rate whilst improving the picture quality.
"Part of the problem with equirectangular views is that you are encoding and sending the whole 360° image when a user is only ever looking at 12% of the total picture at any one time," says Furgusson. "This is a waste of bits."
Instead, DT's R&D team T-Labs is using a process devised by Rotterdam-based Tiledmedia called ClearVR. This software segments the 360° video into 'tiles' of around 500x500 pixels and transmits only the tiles that are actually visible in the user's direct field of view. At the same time, a lower-resolution (2K) copy of the panorama is transmitted, essentially ensuring there are no black holes when you turn your head.
All this has slashed bitrates to 8-12Mbps.
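To make the idea concrete, here is a minimal sketch of viewport-dependent tile selection in the spirit of ClearVR; the grid dimensions and centre-based visibility test are illustrative assumptions, not Tiledmedia's actual algorithm:

```python
TILE_PX = 500                  # approximate tile size cited above
PANO_W, PANO_H = 6000, 3000    # assumed 6K-class panorama dimensions

def visible_tiles(yaw_deg, pitch_deg, fov_deg=90):
    """Return (col, row) indices of tiles whose centres fall inside a
    fov_deg x fov_deg viewport centred on the given view direction."""
    cols, rows = PANO_W // TILE_PX, PANO_H // TILE_PX
    half = fov_deg / 2
    tiles = []
    for row in range(rows):
        for col in range(cols):
            tile_yaw = (col + 0.5) / cols * 360 - 180    # tile centre, yaw
            tile_pitch = 90 - (row + 0.5) / rows * 180   # tile centre, pitch
            d_yaw = (tile_yaw - yaw_deg + 180) % 360 - 180  # wrap-aware
            if abs(d_yaw) <= half and abs(tile_pitch - pitch_deg) <= half:
                tiles.append((col, row))
    return tiles

total = (PANO_W // TILE_PX) * (PANO_H // TILE_PX)
print(len(visible_tiles(0, 0)), "of", total, "tiles fetched at full quality")
```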
Diving into more detail about the workflow: The courtside live VR cameras were small Z Cam S1s, each recording four 4K feeds at 30Hz, one from each of the unit's four wide-angle lenses. These individual streams are fed to live production software from Imerve (developed by former members of Nokia's Ozo VR team) running on Nvidia GPUs, which stitches them into a 6K equirectangular image.
That stream is HEVC encoded and contributed over 1Gbps fibre at 200Mbps from the stadium to a local ISP peering point, and from there to Google Cloud. There it undergoes cubemap conversion, then is segmented into tiles, encoded, and packaged in five-second bursts for sending to Akamai, the origin server and CDN.
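The heart of cubemap conversion is a per-pixel remapping between cube faces and the equirectangular panorama. A minimal sketch of that mapping, using generic face conventions rather than MediaKind's actual pipeline (which performs the same mapping for every pixel, GPU-accelerated):

```python
import math

def cube_to_equirect(face, u, v):
    """face: one of +x,-x,+y,-y,+z,-z; u, v in [-1, 1] on that face.
    Returns (longitude, latitude) in radians on the equirect panorama.
    The face orientations here are an illustrative convention."""
    x, y, z = {
        "+x": (1.0, v, -u), "-x": (-1.0, v, u),
        "+y": (u, 1.0, -v), "-y": (u, -1.0, v),
        "+z": (u, v, 1.0),  "-z": (-u, v, -1.0),
    }[face]
    lon = math.atan2(x, z)                            # yaw, -pi..pi
    lat = math.asin(y / math.sqrt(x*x + y*y + z*z))   # pitch, -pi/2..pi/2
    return lon, lat

# Centre of the front (+z) face looks straight ahead: lon = 0, lat = 0.
print(cube_to_equirect("+z", 0.0, 0.0))
```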
A ClearVR SDK in the Magenta VR client decodes the stream and retrieves new tiles from the cloud, adapting to the local bit rate. It also buffers a number of tiles at the client.
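In outline, that client loop might look something like the following sketch; the quality ladder and function names are assumptions for illustration, not the ClearVR SDK's real interface:

```python
# Choose a quality the measured throughput supports, fetch the viewport's
# missing tiles, keep them buffered. All figures are illustrative.
QUALITY_LADDER_MBPS = [2, 5, 8, 12]    # assumed per-quality bitrates

def pick_quality(throughput_mbps):
    """Highest quality tier that fits within measured throughput."""
    fitting = [q for q in QUALITY_LADDER_MBPS if q <= throughput_mbps]
    return fitting[-1] if fitting else QUALITY_LADDER_MBPS[0]

def client_tick(viewport_tiles, throughput_mbps, buffer, fetch):
    """One pass of the loop; fetch(tile, quality) stands in for an HTTP
    request for that tile's next segment from the CDN."""
    quality = pick_quality(throughput_mbps)
    for tile in viewport_tiles:
        if tile not in buffer:             # only fetch what's missing
            buffer[tile] = fetch(tile, quality)
    return buffer
```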
"The original Tiled Streaming technology was designed to support adaptive streaming, with multiple layers to allow zooming and panning in ultra-high-resolution imagery," explains Frits Klok, Tilemedia founder and CEO. "We applied these principles to VR streaming, where the 'client' has the logic and the flexibility to retrieve the layer that best suits the viewport and network conditions."
In the demo I witnessed, on both a tablet and an Oculus over Wi-Fi in the arena, the gap between turning your head and the full-resolution tiles arriving was barely discernible; indeed, it happens within 20-40 msec, although there were frequent pauses in the live stream. The VR stream used an audio mix taken from the 2D live broadcast, produced by NEP.
The 2K base layer is transmitted at 2Mbps. This fallback layer ensures there are no black holes while new tiles are fetched, and also keeps the 'motion-to-photon' delay (the delay that will make you sick if it's too long) incredibly short.
"That delay is as low as it can possibly be, because it only depends on the local processing," says Klok. "Typically, there will be over a hundred tiles. These are independently coded and stored on CDN, where the client can find them. The client has the logic to request the tiles it needs, decode them, and then rearrange them for rendering on the device."
All of which brings glass-to-glass latency in this demo to around 30 seconds, although the team believes it can halve that by working with 2-second chunks of video and experimenting with protocols like SRT rather than HLS.
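A rough latency budget shows why chunk duration dominates; the stage estimates below are assumptions for illustration, not measured figures from the demo:

```python
def glass_to_glass(chunk_s, buffered_chunks=3, capture_encode_s=2,
                   cloud_processing_s=3):
    # A chunk can't be published until it's complete, the player typically
    # holds a few chunks of buffer, and fixed costs sit on top.
    return (chunk_s                       # wait for the chunk to fill
            + buffered_chunks * chunk_s   # player-side buffer
            + capture_encode_s + cloud_processing_s)

print(glass_to_glass(5))   # -> 25s, in the ballpark of the observed ~30s
print(glass_to_glass(2))   # -> 13s, roughly half
```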
"It's a trade-off between building in fault tolerance throughout chain and taking the latency down," says Fergusson.
MediaKind's main role in the demonstration was to transfer the streams in and out of the cloud and manage the parallel encoding in between.
"We want to offer VR encoding as a service," says Fergusson. "The workflow demonstrated here is inherently scalable since as you go up in resolution you can scale up the massively parallel encode process."
He said that Google Cloud was not only cheaper than AWS on this occasion, but that on a practical level Google-hosted GPUs were nearest, in a data centre in Amsterdam.
As for the HMD experience itself, it was decent and showed a maturing of this type of live event workflow. The cubemap image aggregated from the multiple 4K lenses may have been 6K, but the final output was HD.