Normally doing audio/video over a network is a matter of downloading a file.
As Ethernet is a packet-switched network, the time of delivery is not guaranteed.
This is solved by buffering: the player simply waits until a large buffer is filled and then starts playback.
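
To put a number on what buffering costs, here is a minimal sketch of the arithmetic: the delay a buffer adds is its size in frames divided by the sample rate. The function name and the example figures (48 kHz, a 2-second buffer, a 64-frame buffer) are illustrative assumptions, not values taken from any particular player or protocol.

```python
def buffer_latency_ms(buffer_frames: int, sample_rate_hz: int) -> float:
    """Return the time needed to fill a buffer, in milliseconds.

    buffer_frames  -- buffer size in audio frames (one sample per channel)
    sample_rate_hz -- frames per second, e.g. 48000
    """
    return 1000.0 * buffer_frames / sample_rate_hz


if __name__ == "__main__":
    # Hypothetical large playback buffer: 2 seconds of audio at 48 kHz.
    playback = buffer_latency_ms(96_000, 48_000)
    # Hypothetical small buffer of the kind used for live monitoring.
    live = buffer_latency_ms(64, 48_000)
    print(f"large playback buffer: {playback:.0f} ms")
    print(f"small live buffer:     {live:.2f} ms")
```

A large buffer rides out irregular packet delivery, but every frame in it is delay added before the sound is heard. That is harmless for playback of a file and a problem as soon as someone has to play along.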


This won’t work in a recording studio. You have multiple audio streams, e.g. 10 mics, and all these streams must stay in sync. Likewise, if you feed the monitor mix back to the musicians, you can’t have high latency, because it will throw off their timing.
You need real-time streaming with very low latency.

This requires dedicated protocols such as AVB, CobraNet, Dante, EtherSound, or RAVENNA.


This is covered in more detail here.