MCU: the veteran, and the most efficient in terms of bandwidth
MCUs (Multipoint Control Units) have been in use for a long time, since the H.323 era. An MCU receives a stream from each participant, decodes the streams, and composes them into a single new stream that is sent to all the participants. Each user therefore sends one stream and receives one.
This is efficient in terms of bandwidth usage and well suited to endpoints with limited computational resources, as they only need to handle one stream. It is also the best approach when interoperability is a concern, as this is the industry's standard approach.
The MCU architecture is also especially well suited to interconnecting with other networks: because all media passes through one point, that point can act as a single gateway, optionally running on dedicated hardware. It is a more mature technology, but also a computationally expensive one.
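The decode-and-compose step described above can be illustrated for audio with a minimal sketch. It assumes the incoming streams have already been decoded to equally long frames of 16-bit PCM samples, and it mixes them by summing and clipping. The function name `mix_pcm16` is hypothetical and for illustration only; a real MCU would also handle resampling, timing, video composition, and re-encoding.

```python
from typing import List

PCM16_MIN = -32768
PCM16_MAX = 32767


def mix_pcm16(frames: List[List[int]]) -> List[int]:
    """Mix decoded 16-bit PCM frames (one per participant) into one frame.

    Illustrative assumption: every frame has the same length and sample rate.
    """
    if not frames:
        return []
    length = len(frames[0])
    if any(len(frame) != length for frame in frames):
        raise ValueError("all frames must have the same length")

    mixed = []
    for samples in zip(*frames):
        total = sum(samples)
        # Clip to the 16-bit range so the mixed sample stays valid PCM.
        mixed.append(max(PCM16_MIN, min(PCM16_MAX, total)))
    return mixed


if __name__ == "__main__":
    # Three participants, two samples each.
    print(mix_pcm16([[1000, -2000], [500, 500], [30000, -32000]]))
```

Whatever the number of participants, the result is a single frame, which is why each endpoint only has to receive and decode one stream.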
SFU: the champion in security and CPU consumption
SFUs (Selective Forwarding Units) are relatively recent compared with MCUs. An SFU does not mix media; it simply relays the voice/video received from each participant to all the others. This means each user sends a single stream with their own voice/video but receives at least n-1 streams, where n is the number of participants, in order to see everyone else.
Compared with MCUs, SFUs are lighter on server CPU, as they do not decode or mix media. SFUs also allow end-to-end encryption, so security can be preserved. Receiving each video separately also makes the UI more flexible, as it is easier to build ad-hoc layouts on the client.
Many solutions are built on SFU architectures because of their two main strengths: low CPU consumption (ideal for cloud deployments) and the UI flexibility they grant. However, all these platforms are OTT or private-oriented solutions.
This means that there is no single best solution for multi-party conferencing. The best option depends on the number of participants (with SFUs, each endpoint receives one stream per remote user), the type of devices involved, customer expectations for security and user interface, interconnection with other solutions, etc.
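To make the dependence on the number of participants concrete, the following sketch counts streams for both architectures. It assumes the simplest case, in which every participant sends exactly one stream and, with an SFU, subscribes to every other participant. The function names and dictionary keys are hypothetical.

```python
def mcu_stream_counts(n: int) -> dict:
    """Stream counts for an MCU call with n participants."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return {
        "sent_per_user": 1,
        "received_per_user": 1,   # one composed stream
        "server_inbound": n,
        "server_outbound": n,     # the composed stream, once per user
    }


def sfu_stream_counts(n: int) -> dict:
    """Stream counts for an SFU call with n participants."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return {
        "sent_per_user": 1,
        "received_per_user": n - 1,      # one stream per remote user
        "server_inbound": n,
        "server_outbound": n * (n - 1),  # each stream relayed to all others
    }


if __name__ == "__main__":
    for n in (3, 5, 10):
        print(n, mcu_stream_counts(n), sfu_stream_counts(n))
```

With 10 participants, an MCU endpoint still receives one stream, while an SFU endpoint receives 9 and the SFU relays 90 outbound streams in total. The SFU forwards those streams without decoding them, so the larger count costs bandwidth rather than server CPU.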
Hybrid architecture based on MCUs and SFUs
Quobis Communication Platform uses a mixed (hybrid) approach to take advantage of the benefits of both SFUs and MCUs: SFUs for video and MCUs for voice/audio. This approach keeps the ability to handle the video streams separately (e.g. building ad-hoc layouts, subscribing to only some of the streams, etc.) while mixing the audio on an MCU.
This approach has several advantages, including:
- Endpoints with limitations (CPU or connection quality) can subscribe only to voice/audio.
- A slight reduction in the computing capacity needed at the endpoints.
- Easier interconnection with SIP-based third-party voice solutions.
- Decoupled audio and video can easily be recorded in different elements (e.g. legacy voice recorders, SIPREC elements, etc.).
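As a rough illustration of the hybrid model, the sketch below counts the streams an endpoint would receive when audio arrives as a single MCU mix and video arrives as separate SFU streams. It is a sketch under stated assumptions, not a description of the Quobis implementation: the function name and the `wants_video` and `max_video` parameters are hypothetical.

```python
from typing import Optional


def hybrid_downlink_streams(
    n: int,
    wants_video: bool = True,
    max_video: Optional[int] = None,
) -> dict:
    """Streams received by one endpoint in a call with n participants.

    Assumptions (illustrative only): audio is always one mixed stream,
    and video is one stream per remote participant unless the endpoint
    opts out or caps the number of video subscriptions.
    """
    if n < 1:
        raise ValueError("n must be at least 1")

    video = n - 1 if wants_video else 0
    if max_video is not None:
        # A constrained endpoint subscribes to only part of the streams.
        video = max(0, min(video, max_video))
    return {"audio": 1, "video": video}


if __name__ == "__main__":
    print(hybrid_downlink_streams(8))                     # full video
    print(hybrid_downlink_streams(8, max_video=3))        # partial video
    print(hybrid_downlink_streams(8, wants_video=False))  # audio only
```

In this model an audio-only endpoint receives one stream no matter how large the call is, which corresponds to the first advantage listed above.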