
Playback - Transfer and Latency Issues



schamane
Posted

Hm, maybe I can try OpenVPN via TCP tomorrow. And yes, I also tried without the VPN, but didn't notice any difference.

OPNsense is running on its own hardware.

 

Emby is running on its own hardware as well.

I will be heading home in a few days anyway.

 

Posted
5 minutes ago, schamane said:

I will be heading home in a few days anyway.

That's good because this allows for the ultimate differential diagnosis:

  1. Sitting (literally) at your doorstep and connecting via your VPN from outside (minimal hops, perfect connection)
  2. From inside your home, connecting via cable to the BSD machine but still going through the VPN
  3. Connecting to your local network directly

Having the results of these three tests to compare should get you much closer to the cause of that mystery. 

schamane
Posted

I don't have to do those tests.

 

Sounds weird, but I usually use that stick in my bedroom with the VPN enabled, because I'm too lazy to deactivate it.

 

Works like a charm.

 

It's just an issue abroad, when going to other continents.

Posted

"Works like a charm" is not really a useful test result. Of course you have better bandwidth than abroad. 

What I meant is whether the effect exists where multiple TCP connections provide more bandwidth than a single one.

And that specific aspect compared between tests 1, 2 and 3.

 

  • 5 months later...
Posted (edited)

The single-stream restriction is no mystery. 
I've been dealing with it for a year. I have changed several providers and made countless measurements.
Model:
WAN access to the source server.
Source server: 1Gbit optical fiber; real speed to the backbone 200Mbit up/down, 7-12ms. Real speed to a neighbouring continent (Asia to Europe): 70Mbit up/down, 90ms.
Emby server on Windows, HW acceleration with an RTX 4060 Ti, AMD 5950X CPU, 64GB RAM, transcoding cache on a RAM disk.
A nice and fast source.

Clients: various - Xiaomi 4K stick, Android, Windows 10 notebook, Windows app or browsers (Firefox, Opera), etc.

Connectivity measurement: server to backbone network - Speedtest, single and multi-stream.
Clients to server - speed by number of threads: Network Traffic Speed Test Tool.
https://www.softether.org/4-docs/1-manual/4._SoftEther_VPN_Client_Manual/4.8_Measuring_Effective_Throughput
Clients to server - routing and packet loss: wintmr.exe
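
(Side note: the same single-stream vs. multi-stream comparison can also be made with iperf3, if you have it available; the host name below is only a placeholder, not the actual server. Run it in server mode on the Emby machine and in client mode at the client.)

    # on the Emby server
    iperf3 -s

    # on the client: one TCP stream, server sends to client for 30 seconds
    iperf3 -c emby.example.net -R -t 30

    # on the client: six parallel TCP streams for comparison
    iperf3 -c emby.example.net -R -t 30 -P 6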

Conclusions for one case we are measuring right now:
Client on a 100Mbit down line.
CDN servers: 70Mbit connectivity.
A YouTube 4K video at 53Mbit runs fine. Other commercial video-on-demand services run without issue with clients on the same subnet.
Private server with Emby:
6 streams: 21/33Mbit down/up (3 streams down, 3 streams up, full duplex)
2 streams: 10/15Mbit down/up (1 stream down, 1 stream up, full duplex)
I have been dealing with this problem for a long time; I have measured different sites and different providers.

The results are similar.
Two providers showed strong single-stream performance (paradoxically, one of them a mobile provider). The other four showed the problems described above. The problems are not on the server or router side - those are higher-end models, tested.

In an environment without net neutrality, please pay attention to multi-stream TCP technology. It is starting to be a problem.
 

Edited by archecon
Posted
9 hours ago, archecon said:

Clients to server - speed by number of threads: Network Traffic Speed Test Tool.
https://www.softether.org/4-docs/1-manual/4._SoftEther_VPN_Client_Manual/4.8_Measuring_Effective_Throughput
Clients to server - routing and packet loss: wintmr.exe

Did you measure against speedsoftether.com or did you use the tool on both sides?

 

Is it mostly intercontinental connections where you are seeing this?


90ms of latency should not be a problem for achieving high saturation with a single TCP connection ------ IF...

...that value is persistent and continuously available - both in latency and throughput.
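
As a rough rule of thumb, the bandwidth-delay product tells you how much data must be "in flight" to keep a single TCP connection saturated at a given latency. A minimal sketch of that arithmetic, reusing the 70Mbit / 90ms figures quoted above (purely illustrative, not a measurement):

    # Bandwidth-delay product for a single TCP connection (illustrative only)
    link_mbit_s = 70          # target throughput from the figures above, Mbit/s
    rtt_s = 0.090             # round-trip time, seconds

    bdp_bytes = (link_mbit_s * 1_000_000 / 8) * rtt_s
    print(f"window needed: {bdp_bytes / 1024:.0f} KiB")          # ~769 KiB

    # If the effective window is smaller (e.g. a 256 KiB receive window,
    # or a congestion window shrunk by packet loss), the stream is capped:
    window_bytes = 256 * 1024
    cap_mbit_s = window_bytes * 8 / rtt_s / 1_000_000
    print(f"cap with 256 KiB window: {cap_mbit_s:.0f} Mbit/s")   # ~23 Mbit/s

Whether something like that is actually happening here would need packet captures to confirm; the numbers are only meant to show why a single stream can plateau well below the line rate when the window or loss behaviour isn't ideal at 90ms.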


That's something I had thought about after our earlier discussion: if routers on the path were "chunking" packets into groups and forwarding them in a chopped manner (maybe because some intercontinental link is operating at its limit), then this could indeed have an effect on TCP connections. Because a TCP sender has to wait for acknowledgements from the receiver, such chunking could cause stalls in the TCP stream where one party is waiting for the other before continuing to send.
To illustrate this, let's consider a transfer of 10Mbit of data, transferred within 1 second (for simplicity).

Now, there are different ways this can happen. It could be 1Mbit within the first 100ms, another 1Mbit within the next 100ms, and so on...
But it could also be that nothing is transferred at all within the first 900ms and all 10Mbit are transferred within the last 100ms.
Or even that nothing happens for 990ms and all 10Mbit are transferred within the last 10ms.

In all three cases, we have an effective bandwidth of 10Mbit/s. But cases like the 2nd and 3rd are less ideal for TCP and can cause interruptions of the (ideally) continuous streams being sent.
(Note: The values above are just for illustration of the principle and purely fictional.)
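
A minimal sketch that just makes the arithmetic of those three fictional cases explicit (same average bandwidth, very different silent periods):

    # Three delivery patterns for the same 10 Mbit transferred within 1 second.
    # Each pattern is a list of (time_of_burst_s, mbit_delivered) pairs.
    patterns = {
        "steady, 1 Mbit every 100 ms":    [(i / 10, 1.0) for i in range(10)],
        "all 10 Mbit in the last 100 ms": [(0.9, 10.0)],
        "all 10 Mbit in the last 10 ms":  [(0.99, 10.0)],
    }

    for name, bursts in patterns.items():
        total_mbit = sum(mbit for _, mbit in bursts)
        silent_ms = bursts[0][0] * 1000   # time before the first data arrives
        print(f"{name}: {total_mbit:.0f} Mbit/s average over 1 s, "
              f"nothing received for the first {silent_ms:.0f} ms")

All three print the same 10 Mbit/s average, but in the last two cases the receiver (or a TCP sender waiting for ACKs) sits idle for 900 or 990 ms before anything moves.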

 

10 hours ago, archecon said:

In an environment without net neutrality, please pay attention to multi-stream TCP technology.

I don't buy this at all. 

Most transfers these days are TLS encrypted, so the range of possibilities for deep packet inspection is rather limited. A provider can see ports, endpoints (IPs, and usually the hostname via the TLS SNI field, but not full URLs) and transfer sizes (depending on the HTTP version). Sure, they can prioritize certain combinations over others and limit specific kinds of traffic.

But if some provider were really doing that, you wouldn't be able to achieve more throughput with multiple TCP connections - because that's exactly what such a measure would be working against... 

Posted (edited)

Intercontinental characteristics were cited as one indication of server connection quality, which I consider good.
We will not make any judgement on the intercontinental connection now. It is not important in this model.

In our case, the Emby server is located in the same city as the clients, only at a different provider. Simply put, the server's provider is premium class (and expensive), while the clients' providers are regular, standard ones.

Regarding the measurement method of the Network Traffic Speed Test Tool:
Of course, the Emby server side ran an instance of the Network Traffic Speed Test Tool in server mode and the client side ran an instance in client mode.
Otherwise it would not make sense :-)

The CDN and commercial service preference method works differently than recognition at the single-packet level. Preference is done on the backbone. CDN servers have a known range of IP addresses, and the connection to them is carried over dedicated connectivity at the backbone level. For the ISP, the only task is to provide the last mile from the client to its server connected to the backbone network. If it manages to secure a realistic 70Mbit of connectivity on a 100Mbit line, hooray - CDN services work for multiple TVs at the client, and even for 1 or 2 movies in 4K.

The same method of parallel dedicated connectivity is used by Cloudflare for their paid services such as WARP. They have a colocation center near you that carries the traffic you paid for over their infrastructure - for example, the distribution of your streams all over the planet (for those who pay a lot, not only for connectivity but also for local caching of their content).

Some VPNs work the same way, for example NordVPN. They also use dedicated connectivity. For example, while a regular client gets a 2Mbit connection to the next continent - and even that depends on the load, the time of day and whether it's the weekend - the VPN gets a real 200Mbit.

The result is that for regular, undifferentiated internet traffic, only a gnawed bone is left. Speeds fluctuate with the time of day and the load. Providers effectively sell you the residual connectivity that remains after their contracts for dedicated lines have been fulfilled (CDN, corporate clients, etc.).

A favorite trick of providers is to show customers the speed of their connection to the provider's first server on the backbone network - the so-called last mile. But that's not where the differentiation takes place yet.
The contract says 100Mbit, the speedtest shows 95Mbit... and shut up, customer.
Nobody cares that the trunk network prioritizes CDNs and dedicated circuits.
And I haven't even mentioned the possibility that some providers may deliberately limit single-stream traffic for regular clients as a way of limiting overall load.


 

Edited by archecon
Posted

As a matter of interest, here is the state of the Emby server after a year of experimentation.
The Emby server is accessible:
via public IP address, via DDNS domain, via a paid national domain, via a free domain and via Cloudflare WARP.

Cloudflare WARP handled access for a client with a particularly nasty provider. The provider routed the connection to a server on a neighbouring street of the same city via Frankfurt (from Asia via Europe and back) - a disastrous 3Mbit speed and a ping of 380ms.
Using Cloudflare we bypassed the provider's routes and achieved a much better and usable result, at least for decent 720p.

However, Cloudflare WARP configuration is not for newbies; it required installing a Linux virtual machine and the appropriate configuration on the server side. For this we had to bring in a specialist. Also, it's not free if you want to have their colocation center near you.

The sad part of all this is that we only went through the whole anabasis to distribute the material to our own closest family. It wasn't some Pirate Bay.
And yet the material and time commitment is unbearable in an environment where net neutrality is not respected. And maybe that's the point...

The trend is clearly set. Please think about the direction of the future and how to resist the trend. One option is transcoding to more economical formats, but that raises many questions. The other option is to think about the technological possibilities of transmission over multiple streams.
 

Posted

Which country are you located in? I have heard reports similar to yours from the US - not about TCP stream counts, but about Cloudflare having better paths.
Here in Germany, you'll rarely encounter situations where you can get a faster path via Cloudflare (domestically).

Posted (edited)

Yes, I've heard about the problem with internet neutrality in the US too (and about using Cloudflare). Apparently in the EU the situation is not so dramatic.

I want to try to improve client loading times by changing the number of I-frames in NVENC. Where can I find the configuration file? Is user configuration possible?
In addition to the RTX 4060 Ti I have an Intel A380 in my server, but unfortunately I can't use both cards at the same time. If I set up h265 decoding via the Intel A380 and encoding via Nvidia, the system decodes using the CPU.
Is there any way to use both cards for decoding and encoding?

Edited by archecon
Posted
15 hours ago, archecon said:

I want to try to improve client loading times by changing the number of I-frames in NVENC.

There's no point in doing so. When transcoding, Emby serves video via HLS segments. The transcoding is done in a way that each segment starts with an IDR frame, which makes decoding independent of data in previous segments.
The segments are 3 seconds of video each. Creating additional IDR frames inside those segments won't improve loading time.
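
For illustration only - this is not Emby's actual transcoding command line, just a generic ffmpeg sketch of the same idea: 3-second HLS segments whose boundaries are forced to be keyframes, encoded with NVENC (file names are placeholders):

    ffmpeg -hwaccel cuda -i input.mkv \
        -c:v h264_nvenc -c:a aac \
        -force_key_frames "expr:gte(t,n_forced*3)" \
        -f hls -hls_time 3 \
        -hls_segment_filename "seg_%05d.ts" out.m3u8

Adding extra keyframes inside a segment (for example by shrinking the GOP) only inflates the bitrate; the client still fetches whole 3-second segments, so startup time doesn't improve.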

The clients wait until they have received two or three segments. There can be various reasons for long loading times; we would need to look at a specific case, including log files. Please see how to report a media playback issue.

 

 

15 hours ago, archecon said:

Where can I find the configuration file? Is user configuration possible?

Emby doesn't work with configuration files. But there are configuration options for encoding. You need to click on the gear icon next to the encoder on the transcoding settings page (only in the latest beta versions).

 

15 hours ago, archecon said:

In addition to the RTX 4060 Ti I have an Intel A380 in my server, but unfortunately I can't use both cards at the same time. If I set up h265 decoding via the Intel A380 and encoding via Nvidia, the system decodes using the CPU.
Is there any way to use both cards for decoding and encoding?

We do not allow this, because it's the worst possible thing you could do. It is usually better to do the decoding on the CPU than to do only the decoding on a GPU. The reason is that memory transfer often plays a larger role than the CPU processing for decoding. You need to consider that after decoding, video frames are uncompressed. A single uncompressed 4K video frame is 100-120MB in size. At 30 fps, this can mean 3.5 GB to transfer per second of video (from GPU to system memory), and when you want to encode with a different GPU, you need to copy the frames again - from system memory to the other GPU's memory - which makes 7 GB/s. And you want transcoding to be faster than video playback time - at least 2x or 3x - so we end up with 14 or 21 GB/s to transfer.
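
The arithmetic from the paragraph above, spelled out in a quick sketch (it reuses the ~100-120MB-per-frame figure quoted there; the exact size depends on resolution and pixel format):

    # Copy bandwidth needed when decode and encode happen on different GPUs.
    frame_mb = 110        # rough size of one uncompressed 4K frame, MB
    fps = 30              # frames per second of the source video
    speed_factor = 3      # transcoding should run ~2-3x faster than realtime

    gpu_to_ram = frame_mb * fps / 1000          # GB/s, decode GPU -> system RAM
    ram_to_gpu = gpu_to_ram                     # GB/s, system RAM -> encode GPU
    total = (gpu_to_ram + ram_to_gpu) * speed_factor

    print(f"one copy direction: {gpu_to_ram:.1f} GB/s")                  # ~3.3 GB/s
    print(f"both copies at {speed_factor}x realtime: {total:.1f} GB/s")  # ~20 GB/s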

The best and most efficient approach is when the video frame data stays in GPU memory after decoding. Then it can be processed (downscaling, deinterlacing, subtitle burn-in, color format changes, tone mapping) and finally encoded. This way, you have just compressed video going in and compressed video coming out again.
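
Again just a generic ffmpeg sketch (not Emby's command line, and it requires an ffmpeg build with CUDA filters) of what such a single-GPU pipeline looks like - decode, scale and encode all stay on the same NVIDIA card, so only compressed video crosses the PCIe bus:

    ffmpeg -hwaccel cuda -hwaccel_output_format cuda -i input.mkv \
        -vf scale_cuda=1920:1080 \
        -c:v h264_nvenc -c:a copy output.mkv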

