Lessaj 467 Posted April 30, 2025 Posted April 30, 2025 (edited) Hi, One of my remote users has reported to me that sometimes when they leave a series playing when the episode ends they end up on a backdrop screen with "next episode" highlighted, but it doesn't advance unless they interact with the button. They mentioned that sometimes this device loses its wifi connection but it often reconnects quickly on its own, despite this from a playback perspective there is no problem as it continues to play the rest of the episode. In reviewing my logs I can see that the device stops reporting /Sessions/Playing/Progress like it usually does every 10 seconds along with a 206 for original.mkv shortly after, but I can also see the GET request for original.mkv is being made a second time for the rest of the length of the episode very shortly after this occurs, so the wifi connection would have only dropped out for a few seconds in this instance. The player seems to be resilient to the wifi connection dropping, but the playback status reporting does not, and the player does not advance to the next episode as a result. When watching the playback progress on the dashboard the time continues to increase until it reports playback stop - but I don't actually see a /Sessions/Playing/Stopped call being made at that point, it looks like the server idle timeout is actually ending the session and it's just the way it's logged. I do have a reverse proxy and there is nothing within my httpd logs to suggest that anything is being dropped, failed to connect, or blocked from my end, it seems to be occurring during the brief wifi drop out. It's unclear if this occurs with any other devices, this is the only one they would use for Emby on their wifi. In the attached server log: 23:07:18 The episode Stargate SG-1 - S3, Ep1 - Into the Fire (2) reports being played and the client makes a GET request for original.mkv 23:23:33 The last time the client sends a POST /Sessions/Playing/Progress 23:23:44 The GET for original.mkv concludes with a 206 (typical response code for this request) 23:23:47 The client makes another GET request for original.mkv 23:49:20 The GET request for original.mkv completes with a 206 23:50:20 The session is reported idle and a stop is reported 23:51:31 This episode is 44 min 13s so that should mean the episode would have ended at approximately this time I find that typically the request for original.mkv will conclude with about 2 minutes left of playback - not just for this situation, in general, all direct mkv playback, maybe more specifically 1080p content around this bitrate, but either way consistent with what I've seen. For example the previous episode Stargate SG-1 - S2, Ep22 - Out of Mind (1) the request for original.mkv ends at 23:05:10 with playback stop being reported at 23:07:16, and the episode before that was 22:20:48 and 22:23:00 respectively as well, so about 2 minutes between original.mkv completing and the episode ending is accurate for this content. This suggests the entire episode played from the client perspective, but it never reported that status beyond 23:23:33 before the session was seen as idle from the server perspective 60 seconds after the request for original.mkv concluded and thus playback stopped approximately 70 seconds early for which I don't see the POST /Sessions/Playing/Stopped request being sent by the client. The playback reporting plugin reports that 00:43:01 of the episode was played, which would be the time between the playback start and end being reported, but from a client perspective it didn't report more than at most 00:16:15 worth of playback - not faulting the plugin for the way it's calculating that, just a data point that is in line with the evidence. I can try to reproduce this locally to get an app log as well, not really sure how I could do that right now, I will try to ask them to enable logging to get an app log as well to have for next time it happens but I'm not sure how easy it will be for them to get it to me. They are using Emby for Android 3.4.36 on an Nvidia Shield and should have auto update in their play store so that would be the latest available to them at the moment. I am running server version 4.8.11.0. embyserver-63881567999.txt Edited April 30, 2025 by Lessaj Added server version
Luke 42077 Posted April 30, 2025 Posted April 30, 2025 Quote but the playback status reporting does not, and the player does not advance to the next episode as a result. HI, it's true that there's no retry, but it's not connected to advancing to the next episode, meaning it won't prevent that. It may delay it though until that request succeeds or fails.
Lessaj 467 Posted April 30, 2025 Author Posted April 30, 2025 59 minutes ago, Luke said: HI, it's true that there's no retry, but it's not connected to advancing to the next episode, meaning it won't prevent that. It may delay it though until that request succeeds or fails. Okay I can see how it's not directly related. I see at 02:23:02, 02:23:12, and 02:23:22 that the client finally reported playback status and 02:23:26 there was finally a report of Stopped from the client. I will ask them if they woke up and continued playing or if it continued on its own - from what I know so far it will never advance on its own. I do see the same thing happening this morning but this time it did not continue playing beyond about 15 minutes. I've tried to replicate this locally but I don't think my approach is similar enough. I tried blocking the device from within my Unifi network controller and then quickly unblock it, but I haven't been able to get the client to reconnect fast enough. However this does seem to highlight that once the wifi connection is dropped the app can become "stale", I can exit playback to the item detail screen even after reconnecting to wifi and try to play again but I only see a spinner. It's also possible that this action may reset the device connection states, I need to look further into that. I'll continue to do some testing locally and get an app log, it may reveal some information. It does seem like there would be a benefit to some kind of retry here, in my experience switching networks is generally seamless - IE toggling off my VPN and connecting to wifi when I get home, ultimately I'm connecting to the same internal IP but my client IP changes - but reconnecting to the same network doesn't seem to attempt to retry connecting in the same way.
Luke 42077 Posted April 30, 2025 Posted April 30, 2025 Quote I can exit playback to the item detail screen even after reconnecting to wifi and try to play again but I only see a spinner. It should eventually recover. It may just take time for requests to time out and fail, and then the app will try switching to the remote address.
Lessaj 467 Posted April 30, 2025 Author Posted April 30, 2025 7 minutes ago, Luke said: It should eventually recover. It may just take time for requests to time out and fail, and then the app will try switching to the remote address. In the case of my setup clients will always use the remote address because I block access to the LAN address on 8096/8920, I want to force all traffic through my reverse proxy even internally. Having said that I reviewed my rule and it's block instead of reject, so it may still be trying to connect to the LAN address before it fails since block drops it silently. I will include testing around this and if block/reject makes any difference in trying to recover with a reconnection to the wifi. I treat all remote connections as local since I have the bandwidth and they don't need to touch any of their settings to direct play everything, that may also be playing a role here if it's trying the LAN address but they wouldn't have anything in their network at that address anyway so it should still fail and fallback in a similar manner. 1
Lessaj 467 Posted May 1, 2025 Author Posted May 1, 2025 I spoke with them and found out that it was actually their partner came into the room, saw the TV was off (the shield went to sleep) so they turned it on and relaunched the app then resumed playback of the same episode (SG-1 - S3, Ep1 - Into the Fire (2)) which resumed with 00:04:25 remaining, so it played the rest of the episode and continued on from there. Not sure how it recorded the resume point there if there was no status being reported. I was mistaken earlier in missing that they resumed playback starting at 02:19:00, so it was actually reporting playback progress since then. In the instance this morning they put the shield to sleep, so that was a normal case. I don't currently have a log from their device, I have asked them to enable app logging, not sure if they actually have or not. I have managed to reproduce the problem for myself, it's not 100% the same but I think it's basically the same - the only difference I noticed was the client would send a second GET request before I see the 206 of the first request, but it does the same thing where it stops reporting the playback progress. I didn't notice the option on my Unifi at first that I can actually tell the device to reconnect, so it loses connection for about 6 seconds based on my pings when I do this. Once I was able to reproduce the results I repeated the test within the same session, however I had different results on this second attempt. When I told the device to reconnect a second time I instantly received a playback error - this only seemed to happen every other time trying to do this, normally it wouldn't happen back to back like that anyway since it's random and many hours and even sessions apart, so I think the second playback observations specifically are essentially moot, but worth mentioning how it behaved at that point. In trying to reproduce this I also found a way to make it report the playback progress again. If I seek forward or back, so that the thumbnail timeline would appear, the thumbnails actually weren't appearing as I'm seeking but I found watching my httpd logs a short time after I complete the seek that a request is made for ThumbnailSet on the item, and after that request comes through the playback progress begins to report again. Jumping forward or back worked but did not seem to make it report the playback again, it continued to play as normal and since it wasn't reporting the playback position it no longer matched the time on the dashboard. I made notes of what I was doing and the times and reviewed the app log from my device during my testing which can be correlated to the timestamps in the logs. Note: I don't mention it below for brevity of events but I could see the client making the GET request for original.mkv after reconnecting to the wifi every time, so as long as it was playing during the reconnect and not erroring out, it always continued and I didn't need to wait to see if it would stop playing or monitor network traffic to see it pulling data, the player was resilient to playback but the playback status reporting was not. 00:29:03.966 Playback start of Stargate SG-1 - S3, Ep1 - Into the Fire (2) 00:31:39.094 Seek to near the end of the episode, little less than 5 minutes left 00:31:59.582 Last playback progress Issue a reconnect to the device from Unifi interface 00:32:08.306 Pings stop responding 00:32:15.495 Pings started to respond again 00:36:31 I let it play until the episode ended at approximately this time, at this point it was sitting on a rotating backdrop screen with a "Next episode" button highlighted 00:38:00 I tried to hit next episode, I wasn't sure if it accepted me hitting it so I tried again a few seconds later and a spinner came up which spun for a little while before it continued 00:38:44.514 Playback start of Stargate SG-1 - S3, Ep2 - Seth I tell Unifi to make the device reconnect, instantly get a playback error message 00:40:11.123 Pings stop responding 00:40:17.274 Pings start to respond I hit okay on the playback error and I'm on a backdrop screen which does not change. There's no buttons 00:42:10 I hit back, now on episode details screen 00:42:30 I hit play, a spinner comes up which spins for a short while and disappears, probably 10-15 seconds or so 00:43:00 I hit play again, this time it plays 00:43:02.746 Playback start of Stargate SG-1 - S3, Ep1 - Into the Fire (2) Issue a reconnect to the device from Unifi interface 00:43:44.562 Pings stop responding 00:43:51.744 Pings start responding 00:44:20 Started to seek - no thumbnails appear 00:44:35 Finished seeking forward 00:44:52.723 a request for the thumbnail set comes through 00:44:58.134 Player progress starts to be reported again 00:45:34.835 Playback stopped It looks like when I tell the device to reconnect that a "java.net.SocketException: Software caused connection abort" error is thrown along with "UnknownHostException (no network)" until it finishes reconnecting, then it tries to fall back to the HTTP LAN address - which it would never be able to connect to due to my firewall rules, but I don't always see timeout errors when it makes the switch to this address. When the episode ends at 00:36:31 it tries to make a request to the LAN address, which times out as expected, and I can see the blocks on my firewall as well. At 00:38:41 it changes back to the remote address. The second attempt at playback has some different results, but I think the first playback is equivalent to their experience, and the third playback is as well while also including a way to reestablish the connection. I do see it flips back and forth between trying the local and remote addresses a few times through this test period. I have not tested if using reject instead of block makes any differences, but I was thinking more about this and if their device is trying to connect to what is my LAN address that they would experience the same timeout anyway, it is very unlikely that they would have a device at that address listening on that port to even connect to, so it would be the equivalent of the packet being dropped silently because there's no reply rather than an immediate reject. I have Emby for Android 3.4.64 at the moment, I can downgrade the app and repeat the tests if necessary since I can reproduce it. I've anonymized the app log as best as I can based on things I know to look for, like my domain name or API keys. embyserver - 2025-05-01T011646.214.txt emby_android_1746073677440.txt
Lessaj 467 Posted May 7, 2025 Author Posted May 7, 2025 Hi, I received an app log from their device and it looks similar to my device's app log when the issue happens, so their device is similarly falling off and reconnecting to the wifi in the way that I reproduced and getting hung up at that point. It falls back to the LAN address which it can't use and never tries to go back to the remote address on its own once in this state based on observation. The main difference I see between the logs is that on theirs it times out the LAN address in 30 seconds and mine is in 10 but that could be down to my own network issuing a reset or something like that faster. I have control of every network device that's part of my network, theirs is provided by their building and their wired network is on a different (but neighboring) subnet than their wifi - so I don't know why the device falls off briefly, but it happens fairly frequently. It's probably more of an issue with the WAP than the client device. We are both running app version 3.4.67 at this point. I'm attempting to get them to hard wire the device instead to try to work around the issue, they need to get a small unmanaged switch to split the available ethernet jack to both the shield and their computer before they can do that though. While this may work to resolve their issue I still believe that improvements can be made to make this more resilient when the network connection is briefly lost, it should not be required to hard wire the device to resolve this. A phone may be more resilient to this situation with a data connection, that's definitely something I can try since I know how to reproduce it, but the shield or any other Android TV-like device wouldn't have a mobile network to fall back on so improvements are needed for those devices to be resilient. Please let me know if I can help in any way from either an investigation or testing standpoint.
Luke 42077 Posted May 7, 2025 Posted May 7, 2025 On 5/1/2025 at 2:21 AM, Lessaj said: I spoke with them and found out that it was actually their partner came into the room, saw the TV was off (the shield went to sleep) so they turned it on and relaunched the app then resumed playback of the same episode (SG-1 - S3, Ep1 - Into the Fire (2)) which resumed with 00:04:25 remaining, so it played the rest of the episode and continued on from there. Not sure how it recorded the resume point there if there was no status being reported. I was mistaken earlier in missing that they resumed playback starting at 02:19:00, so it was actually reporting playback progress since then. In the instance this morning they put the shield to sleep, so that was a normal case. I don't currently have a log from their device, I have asked them to enable app logging, not sure if they actually have or not. I have managed to reproduce the problem for myself, it's not 100% the same but I think it's basically the same - the only difference I noticed was the client would send a second GET request before I see the 206 of the first request, but it does the same thing where it stops reporting the playback progress. I didn't notice the option on my Unifi at first that I can actually tell the device to reconnect, so it loses connection for about 6 seconds based on my pings when I do this. Once I was able to reproduce the results I repeated the test within the same session, however I had different results on this second attempt. When I told the device to reconnect a second time I instantly received a playback error - this only seemed to happen every other time trying to do this, normally it wouldn't happen back to back like that anyway since it's random and many hours and even sessions apart, so I think the second playback observations specifically are essentially moot, but worth mentioning how it behaved at that point. In trying to reproduce this I also found a way to make it report the playback progress again. If I seek forward or back, so that the thumbnail timeline would appear, the thumbnails actually weren't appearing as I'm seeking but I found watching my httpd logs a short time after I complete the seek that a request is made for ThumbnailSet on the item, and after that request comes through the playback progress begins to report again. Jumping forward or back worked but did not seem to make it report the playback again, it continued to play as normal and since it wasn't reporting the playback position it no longer matched the time on the dashboard. I made notes of what I was doing and the times and reviewed the app log from my device during my testing which can be correlated to the timestamps in the logs. Note: I don't mention it below for brevity of events but I could see the client making the GET request for original.mkv after reconnecting to the wifi every time, so as long as it was playing during the reconnect and not erroring out, it always continued and I didn't need to wait to see if it would stop playing or monitor network traffic to see it pulling data, the player was resilient to playback but the playback status reporting was not. 00:29:03.966 Playback start of Stargate SG-1 - S3, Ep1 - Into the Fire (2) 00:31:39.094 Seek to near the end of the episode, little less than 5 minutes left 00:31:59.582 Last playback progress Issue a reconnect to the device from Unifi interface 00:32:08.306 Pings stop responding 00:32:15.495 Pings started to respond again 00:36:31 I let it play until the episode ended at approximately this time, at this point it was sitting on a rotating backdrop screen with a "Next episode" button highlighted 00:38:00 I tried to hit next episode, I wasn't sure if it accepted me hitting it so I tried again a few seconds later and a spinner came up which spun for a little while before it continued 00:38:44.514 Playback start of Stargate SG-1 - S3, Ep2 - Seth I tell Unifi to make the device reconnect, instantly get a playback error message 00:40:11.123 Pings stop responding 00:40:17.274 Pings start to respond I hit okay on the playback error and I'm on a backdrop screen which does not change. There's no buttons 00:42:10 I hit back, now on episode details screen 00:42:30 I hit play, a spinner comes up which spins for a short while and disappears, probably 10-15 seconds or so 00:43:00 I hit play again, this time it plays 00:43:02.746 Playback start of Stargate SG-1 - S3, Ep1 - Into the Fire (2) Issue a reconnect to the device from Unifi interface 00:43:44.562 Pings stop responding 00:43:51.744 Pings start responding 00:44:20 Started to seek - no thumbnails appear 00:44:35 Finished seeking forward 00:44:52.723 a request for the thumbnail set comes through 00:44:58.134 Player progress starts to be reported again 00:45:34.835 Playback stopped It looks like when I tell the device to reconnect that a "java.net.SocketException: Software caused connection abort" error is thrown along with "UnknownHostException (no network)" until it finishes reconnecting, then it tries to fall back to the HTTP LAN address - which it would never be able to connect to due to my firewall rules, but I don't always see timeout errors when it makes the switch to this address. When the episode ends at 00:36:31 it tries to make a request to the LAN address, which times out as expected, and I can see the blocks on my firewall as well. At 00:38:41 it changes back to the remote address. The second attempt at playback has some different results, but I think the first playback is equivalent to their experience, and the third playback is as well while also including a way to reestablish the connection. I do see it flips back and forth between trying the local and remote addresses a few times through this test period. I have not tested if using reject instead of block makes any differences, but I was thinking more about this and if their device is trying to connect to what is my LAN address that they would experience the same timeout anyway, it is very unlikely that they would have a device at that address listening on that port to even connect to, so it would be the equivalent of the packet being dropped silently because there's no reply rather than an immediate reject. I have Emby for Android 3.4.64 at the moment, I can downgrade the app and repeat the tests if necessary since I can reproduce it. I've anonymized the app log as best as I can based on things I know to look for, like my domain name or API keys. embyserver - 2025-05-01T011646.214.txt 659.96 kB · 0 downloads emby_android_1746073677440.txt 222.96 kB · 0 downloads Hi, there is no fallback to the LAN address, but rather it tries all known addresses simultaneously to see which one is reachable.
Lessaj 467 Posted May 7, 2025 Author Posted May 7, 2025 2 minutes ago, Luke said: Hi, there is no fallback to the LAN address, but rather it tries all known addresses simultaneously to see which one is reachable. My wording could be more precise in that regard, however the app log is only showing the LAN address as being attempted at that point and when the episode ends it fails to connect to it, and it does not continue from there on its own. In the log they sent me today it fell off wifi at 14:25:55, threw a couple no network errors, was back on by 14:25:58 and mentions onServerAddressChanged with the LAN address and when the episode ended at 14:46:12 it sat there until they backed out while talking to me asking how to get the log for me and at that point I see onServerAddressChanged with the remote address. There's no evidence of reattempting the remote address during the 20 minutes of playback that remained after the disconnect, and of course in my server log I stopped seeing the progress being reported again through that duration. Only the request for the video seems to be reattempted. 1
Lessaj 467 Posted May 23, 2025 Author Posted May 23, 2025 Thanks Luke that would be appreciated. I did end up testing with my phone and the same thing happens, it doesn't try to switch to the mobile network when the wifi network is dropped - maybe because the wifi reconnected so quickly and was not actually turned off - but either way it stopped sending POST requests with player progress. My friend did end up getting a network switch and wired the shield with ethernet and hasn't reported any issues since so I'm happy that there was a relatively inexpensive way to resolve their issue. Feel free to contact me if you have any test builds you'd like to get more data from if you're not able to reproduce this scenario with your own network equipment. 1
Recommended Posts
Create an account or sign in to comment
You need to be a member in order to leave a comment
Create an account
Sign up for a new account in our community. It's easy!
Register a new accountSign in
Already have an account? Sign in here.
Sign In Now