Distributed Systems and Virtual Reality
I’ve been thinking a lot lately about networking and how we use it for our co-located immersive experiences. What are the pressing questions as one designs such a system for shared experiences like these?
To deliver compelling experiences that exceed the capabilities of stand-alone, inside-out tracked devices such as the recent Oculus Quest, we need to leverage the 'fog' (a local cloud) or the cloud proper - once 5G conquers the world and delivers on its promises of low-latency connectivity. By using a local, co-located server to maintain synchronized state across all devices, we avoid the high round-trip latency of remote hosts.
This requirement for low latency and consistent packet delivery times also drives the need for an in-memory datastore, with semantics common to what we encounter across different projects. There's a balance to strike between your specific needs and what might generalize to other projects. If you also want to be robust to server host failures, you could consider consensus algorithms (see the links below, if you've got overhead to spare) and run, say, three replicated instances of the datastore. You could also track the lifecycle of each synced object: whether it's dynamic, static, or a steady stream; active or inactive; created or deleted.
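As a concrete (if minimal) sketch of what those datastore semantics might look like - all names here, such as SyncedObject and SyncMode, are hypothetical, and this assumes a single authoritative server process rather than the replicated setup above:

```python
import time
from dataclasses import dataclass, field
from enum import Enum, auto

class SyncMode(Enum):
    STATIC = auto()    # sent once on join, never updated
    DYNAMIC = auto()   # updated on change events
    STREAM = auto()    # updated every tick (e.g. head/hand poses)

@dataclass
class SyncedObject:
    object_id: int
    mode: SyncMode
    active: bool = True
    state: dict = field(default_factory=dict)
    version: int = 0   # monotonically increasing, lets clients drop stale updates
    updated_at: float = field(default_factory=time.monotonic)

class Datastore:
    """Single-process, in-memory authoritative store."""
    def __init__(self):
        self._objects: dict[int, SyncedObject] = {}

    def create(self, object_id: int, mode: SyncMode, state: dict) -> SyncedObject:
        obj = SyncedObject(object_id, mode, state=state)
        self._objects[object_id] = obj
        return obj

    def update(self, object_id: int, **changes) -> SyncedObject:
        obj = self._objects[object_id]
        obj.state.update(changes)
        obj.version += 1
        obj.updated_at = time.monotonic()
        return obj

    def delete(self, object_id: int) -> None:
        self._objects.pop(object_id, None)
```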
You also need to consider your object-ownership semantics: how to acquire a lock, how to release it, whether the world owns objects by default, and so on - and then you might add synced physics to those objects. You can run all your physics server-side and broadcast the results to clients, run physics on each client independently, use the server tick to drive a deterministic simulation, or run it on both and periodically correct each client against an authoritative source (in this case, the server).
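One common answer to the ownership question is a lease: the owner must keep renewing it, so a crashed or disconnected client can't hold an object hostage. A minimal sketch, assuming server-authoritative locking - the OwnershipRegistry name and the five-second lease are my inventions:

```python
import time

class OwnershipRegistry:
    """Lease-based ownership: a client owns an object for a fixed lease;
    if it stops renewing (e.g. it disconnected), ownership reverts to the world."""
    LEASE_SECONDS = 5.0

    def __init__(self):
        self._owner: dict[int, tuple] = {}  # object_id -> (client_id, lease_expiry)

    def try_acquire(self, object_id: int, client_id: str) -> bool:
        now = time.monotonic()
        owner = self._owner.get(object_id)
        # Grant if unowned, already ours (lease renewal), or the lease expired.
        if owner is None or owner[0] == client_id or owner[1] < now:
            self._owner[object_id] = (client_id, now + self.LEASE_SECONDS)
            return True
        return False  # someone else holds a live lease

    def release(self, object_id: int, client_id: str) -> None:
        owner = self._owner.get(object_id)
        if owner and owner[0] == client_id:
            del self._owner[object_id]  # reverts to world ownership
```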
Once you have a datastore and object sharing, you can divide your communication into a control plane and a data plane. Here the control plane is a TCP stream that manages connections, world-state changes, and ownership events, while positions and other high-frequency state stream over UDP - since getting an update for an avatar or object late is rather useless, isn't it?
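To make the data plane concrete, here's a sketch of a UDP pose channel with per-object sequence numbers, so late or reordered datagrams are simply dropped. The wire format (POSE_FMT) is an assumption for illustration, not a standard:

```python
import socket
import struct

# Assumed wire format for the UDP data plane: little-endian sequence number,
# object id, position (3 floats), rotation quaternion (4 floats).
POSE_FMT = "<IH3f4f"

def send_pose(sock: socket.socket, addr, seq: int, object_id: int,
              pos, rot) -> None:
    sock.sendto(struct.pack(POSE_FMT, seq, object_id, *pos, *rot), addr)

class PoseReceiver:
    """Keeps only the newest sequence number per object: a pose that
    arrives late is useless, so older datagrams are discarded."""
    def __init__(self):
        self._latest_seq: dict[int, int] = {}

    def on_datagram(self, data: bytes):
        seq, object_id, *rest = struct.unpack(POSE_FMT, data)
        if seq <= self._latest_seq.get(object_id, -1):
            return None  # older than what we already have: discard
        self._latest_seq[object_id] = seq
        pos, rot = tuple(rest[:3]), tuple(rest[3:])
        return object_id, pos, rot
```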
Do you stream each frame's data? Do you perform view interpolation? Do you perform the interpolation client-side or server-side? Do you stream everything to everyone? Can you use frustum culling server-side to avoid streaming to clients that can't see an object or avatar?
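For the interpolation question, a common client-side approach is snapshot interpolation: render slightly in the past and blend between buffered snapshots. A sketch under assumptions (a fixed ~100 ms delay, positions only, monotonically increasing timestamps):

```python
import bisect

class SnapshotBuffer:
    """Client-side view interpolation: buffer timestamped snapshots and
    render the pose at (now - delay), blending between the two snapshots
    that bracket that time. The 100 ms delay is a tunable assumption."""
    INTERP_DELAY = 0.1

    def __init__(self):
        self._times: list[float] = []
        self._positions: list[tuple] = []

    def push(self, t: float, position: tuple) -> None:
        self._times.append(t)
        self._positions.append(position)

    def sample(self, now: float):
        target = now - self.INTERP_DELAY
        i = bisect.bisect_left(self._times, target)
        if i == 0:
            return self._positions[0] if self._positions else None
        if i >= len(self._times):
            return self._positions[-1]  # no newer snapshot yet: hold the last one
        t0, t1 = self._times[i - 1], self._times[i]
        a = (target - t0) / (t1 - t0)  # blend factor in [0, 1]
        p0, p1 = self._positions[i - 1], self._positions[i]
        return tuple(p0[k] + a * (p1[k] - p0[k]) for k in range(3))
```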
You might also consider compressing messages, or quantizing your datastore, to further improve your experience - keeping in mind the trade-offs between bandwidth and compute cycles.
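As one example of quantization: if you know your play-space bounds, a float32 position can shrink from 12 bytes to 6. A sketch, assuming a 20 m world extent per axis:

```python
import struct

# Quantize a position within known world bounds to 16 bits per axis:
# 12 bytes of float32 become 6 bytes, at the cost of spatial resolution.
WORLD_MIN, WORLD_MAX = -10.0, 10.0  # assumed play-space bounds, in meters

def quantize(v: float) -> int:
    v = max(WORLD_MIN, min(WORLD_MAX, v))
    return round((v - WORLD_MIN) / (WORLD_MAX - WORLD_MIN) * 0xFFFF)

def dequantize(q: int) -> float:
    return WORLD_MIN + (q / 0xFFFF) * (WORLD_MAX - WORLD_MIN)

def pack_position(pos) -> bytes:
    return struct.pack("<3H", *(quantize(v) for v in pos))

def unpack_position(data: bytes):
    return tuple(dequantize(q) for q in struct.unpack("<3H", data))
```

At these bounds the step size is about 0.3 mm per axis, comfortably below typical tracking noise.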
The topology of the network is another consideration: should you go purely peer-to-peer? Client/server? A combination of the two? Peer-to-peer might remove a hop from your communication, but then each peer sends quite a few more messages when broadcasting. With current hardware, it's typically advisable to push that responsibility to the server if your load is non-trivial.
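The back-of-the-envelope arithmetic behind that advice, counting sends per device per tick, with numbers for an assumed eight-headset room at 60 Hz:

```python
def per_client_sends(n_peers: int, topology: str) -> int:
    """Sends per client per tick for one full state broadcast."""
    if topology == "p2p":
        return n_peers - 1  # each peer uploads directly to every other peer
    if topology == "client-server":
        return 1            # one upload; the server handles the fan-out
    raise ValueError(topology)

# With 8 co-located headsets at a 60 Hz send rate:
#   p2p:           7 sends/tick -> 420 uploads/s per device
#   client-server: 1 send/tick  ->  60 uploads/s per device,
#                  while the server fans out 8 * 7 * 60 = 3360 sends/s
```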
You might also consider asset streaming, either at load time or as streamed world chunks.
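A small sketch of the chunked variant, assuming a square grid keyed by position - the 32 m chunk size and two-chunk radius are placeholders to tune:

```python
import math

CHUNK_SIZE = 32.0  # assumed world-chunk edge length, in meters

def chunk_key(x: float, z: float) -> tuple[int, int]:
    """Map a world position to the grid key of the chunk containing it."""
    return (math.floor(x / CHUNK_SIZE), math.floor(z / CHUNK_SIZE))

def chunks_to_stream(player_pos, radius_chunks: int = 2):
    """Chunks within a square radius of the player, nearest first, so the
    server can prioritize what to stream as the player moves.
    player_pos is assumed to be an (x, y, z) tuple."""
    cx, cz = chunk_key(player_pos[0], player_pos[2])
    keys = [(cx + dx, cz + dz)
            for dx in range(-radius_chunks, radius_chunks + 1)
            for dz in range(-radius_chunks, radius_chunks + 1)]
    keys.sort(key=lambda k: abs(k[0] - cx) + abs(k[1] - cz))
    return keys
```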
Can we also integrate a generative blockchain? :) It might provide a way of distributing the state of seeded world chunks or entities that can't be deleted.
Links
https://raft.github.io/
https://en.wikipedia.org/wiki/Paxos_(computer_science)