This is a read-only archive of the Mumble forums.

This website archives a historical state and keeps it accessible. It receives no updates or corrections. It is provided only to keep the information accessible as-is, under its old address.

For up-to-date information please refer to the Mumble website and its linked documentation and other resources. For support please refer to one of our other community/support channels.

RTP or not RTP? That's the question...


raul2r2

Hi...

I've been looking through the source code and data structures, and I'm a bit surprised that there is nothing resembling an RTP implementation. The voice packet data type doesn't seem to carry any kind of timestamp, and as far as I can tell the server doesn't mix all the client voice streams into a single stream to redistribute to the clients and save bandwidth. So in theory the server is doing a kind of multicasting, resending every voice stream separately to every connected client, which means clients consume a lot more download bandwidth than upload bandwidth when the two could be the same. Please correct me if I am wrong...
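To put rough numbers on that download/upload asymmetry, here is a minimal sketch. The function name and the 40 kbit/s per-stream bitrate are assumptions made up for illustration; they are not taken from the Mumble source.

```python
def download_kbps(active_talkers: int, stream_kbps: float, mixed: bool) -> float:
    """Download rate one listener needs while `active_talkers` other users speak."""
    if active_talkers == 0:
        return 0.0
    # Forwarding: one stream per active talker. Mixing: a single combined stream.
    return stream_kbps if mixed else active_talkers * stream_kbps


STREAM_KBPS = 40.0  # assumed per-stream bitrate, illustrative only

for talkers in (1, 2, 5):
    fwd = download_kbps(talkers, STREAM_KBPS, mixed=False)
    mix = download_kbps(talkers, STREAM_KBPS, mixed=True)
    print(f"{talkers} talker(s): forwarded={fwd:.0f} kbit/s, mixed={mix:.0f} kbit/s")
```

With a single talker the two approaches cost the same; the difference only appears while several audible users speak at once.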

I was thinking of implementing something similar to RTP/RTCP for the server, mixing all voice streams into a single one to send to each client. But I'm not clever enough to tell whether the developers already considered this and discarded it for some reason I can't see at the moment, for example because it would carry a big latency penalty.

What do you think?

Regards,

  • Administrators

One implication this would have is heavily increasing server load, as the server would have to process all audio into a stream.

Not only that, but potentially each user receives different audio (local muting, linked channels, talk, shout and listen permissions) thus the server would have to process … a lot. Too much.
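A rough sketch of why "one mixed stream" turns into many: each listener hears everyone except themselves, minus whoever they muted locally. The user names and the `audible_sets` helper are hypothetical, and channel links, whisper/shout targets and permissions are left out entirely.

```python
def audible_sets(users, local_mutes):
    """Map each listener to the frozenset of talkers that listener should hear."""
    result = {}
    for listener in users:
        muted = local_mutes.get(listener, set())
        # A listener never hears their own voice, nor anyone they muted locally.
        result[listener] = frozenset(
            u for u in users if u != listener and u not in muted
        )
    return result


users = ["alice", "bob", "carol", "dave"]   # hypothetical users in one channel
local_mutes = {"alice": {"dave"}}           # alice has muted dave locally

sets = audible_sets(users, local_mutes)
distinct_mixes = len(set(sets.values()))
print(distinct_mixes)  # 4
```

Even with no mutes at all, every listener needs their own mix because their own voice has to be left out, so with everyone talking the server would end up producing one mix and one encode per listener.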

OK... but processing power is a lot cheaper than bandwidth on a server. Unless what you are saying is that this processing is not only a server scalability problem but also a latency problem. A scalability problem can be solved by throwing more servers, CPUs or a load-balancing solution at it; a latency problem is much harder to solve.

  • Moderators

Mixing on the server adds a lot of processing code and complexity (jitter buffer etc.), which also adds latency and might impact server stability. Another problem is that encoding the audio data twice reduces the audio quality.


By the way, the audio packets do have a kind of "timestamp": they contain a sequence number, but it is only per transmission and per user.
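As an illustration of what a per-user sequence number does and does not give you, here is a small sketch. The `(user, seq)` tuples are an assumed stand-in and not the real packet layout.

```python
from collections import defaultdict


def find_gaps(packets):
    """packets: iterable of (user, seq). Return {user: sorted missing seq numbers}."""
    seen = defaultdict(set)
    for user, seq in packets:
        seen[user].add(seq)
    gaps = {}
    for user, seqs in seen.items():
        # Every number between the lowest and highest seen should have arrived.
        expected = set(range(min(seqs), max(seqs) + 1))
        gaps[user] = sorted(expected - seqs)
    return gaps


# Packets may arrive out of order; each user counts independently.
arrived = [("alice", 0), ("alice", 2), ("bob", 7), ("alice", 1), ("bob", 9)]
print(find_gaps(arrived))  # {'alice': [], 'bob': [8]}
```

This is enough to reorder packets and spot loss within one user's transmission, but since each user counts independently it says nothing about how two different users' packets line up in time, which is what a server-side mixer would need to know.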

  • Administrators

You also have to realise that multiple users who can hear each other (i.e. same/linked channel or whisper) talking at the same time is a very rare case to begin with. People usually don't talk over each other :lol:


All in all, server-side mixing does not have many benefits. It requires jitter buffering -> decoding -> mixing -> re-encoding, which degrades quality, increases latency, increases resource usage and increases the server's complexity for little to no real-world benefit.
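For reference, the mixing step on its own is the cheap part of that chain. Below is a minimal sketch, assuming the streams have already been jitter-buffered and decoded to equal-length frames of 16-bit PCM samples; `mix_frames` is a made-up helper, and the decode and re-encode steps are not shown.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def mix_frames(frames):
    """Sum equal-length 16-bit PCM frames sample by sample, clipping to int16 range."""
    if not frames:
        return []
    length = len(frames[0])
    if any(len(f) != length for f in frames):
        raise ValueError("all frames must have the same length")
    mixed = []
    for samples in zip(*frames):
        total = sum(samples)
        # Hard clipping; a real mixer would more likely attenuate or limit instead.
        mixed.append(max(INT16_MIN, min(INT16_MAX, total)))
    return mixed


print(mix_frames([[1000, -2000, 30000], [500, -500, 10000]]))  # [1500, -2500, 32767]
```

The costs listed above come from everything around this loop: waiting in a jitter buffer until all senders' frames are available, decoding every incoming stream, and re-encoding the result for each listener.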

It becomes even more complex when positional audio is wanted. The server would need to know every client's positional audio settings to mix the different audio streams correctly. Bandwidth would also increase, because the server would have to send one audio stream per loudspeaker channel; for a 7.1 speaker system that means multiplying by 8.
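The multiplication can be sketched as follows, assuming every loudspeaker channel is encoded as its own stream at the same bitrate (no joint multichannel coding). The channel counts are the standard ones for each layout; the 40 kbit/s figure and the function name are illustrative assumptions.

```python
# Discrete channels per speaker layout (5.1 = 5 + LFE, 7.1 = 7 + LFE).
SPEAKER_CHANNELS = {"mono": 1, "stereo": 2, "5.1": 6, "7.1": 8}


def premixed_kbps(layout: str, per_channel_kbps: float) -> float:
    """Bandwidth of a server-side positional mix sent as one stream per channel."""
    return SPEAKER_CHANNELS[layout] * per_channel_kbps


for layout in SPEAKER_CHANNELS:
    print(f"{layout}: {premixed_kbps(layout, 40.0):.0f} kbit/s")
```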

Computer specs: AMD FX-8320, 8GB DDR3-SDRAM, AMD Radeon HD 7950, Asus Xonar D1, Windows 7 Ultimate 64bit/Debian Jessie AMD64.

Sorry raul2r2,


I did not mean to kick you while you were already on the ground. :(


When I wrote my post I forgot to take the original starter of the topic into account. My sincere apologies for that.

And from the tomb, the zombie raises his hand and says... arghhh... this is painless... after a bit of testing... arghhh... the increase in latency wasn't noticeable, CPU power hurts a little more... arghhh... whisper, links... a bit more memory... arghhhh I'm not interested in positional audio... arghhh... and my last words... The problem is the fucking codecs: none of them gives me good quality when compressing multiple voices in the same stream. Now he can rest in peace.
