adminExitium 355 Posted August 26, 2024 Posted August 26, 2024 Due to various reasons, I’ve been trying out a Jellyfin instance recently, and similar to Emby, one of the biggest problems there is search. Emby is quite a bit better in the search performance, but it still takes quite a lot of time when searching across large libraries (10-15 seconds on average for me) and also lacks proper fuzzy matching and typo tolerance. A user created an external search utility that can be used to answer all searches while using meilisearch (https://www.meilisearch.com/) as its backend: https://gitlab.com/DomiStyle/jellysearch. This solves all my problems with search like fuzzy matching, and the results are almost instantaneous (<100 ms on average). I would like to try porting something similar for Emby, but the problem I’m facing is that I can’t access the DB directly to update the item list in meilisearch because of the exclusive lock taken by Emby. Is there a specific reason for this exclusive lock, and if not, can it be removed to allow other applications to access the DB concurrently, which is safe anyway because of the WAL mode?
adminExitium 355 Posted August 28, 2024 Author Posted August 28, 2024 @LukeWould it be possible to remove the EXCLUSIVE lock on the DB so other processes can access it?
rbjtech 5284 Posted August 28, 2024 Posted August 28, 2024 I guess it depends on your library size, but every search I perform is returned in a few hundred ms - what hardware are you running ? Also what DB cache size vs the library.db file ?
adminExitium 355 Posted August 29, 2024 Author Posted August 29, 2024 5.5 GB library DB and 16GB as the cache size. The item count is 2169299 and it is running on an Intel Optane Drive. The average search timings are in the 4-6 seconds range. I understand this is a huge DB compared to the average DB sizes here, which is why I want to try working on a solution myself rather than bugging @Lukerepeatedly for performance improvements. The only thing blocking me is the EXCLUSIVE lock on the DB file, which I don't understand the need for at all when the DB is already opened in WAL mode, which is already safe for concurrent access from multiple processes. 1
rbjtech 5284 Posted August 29, 2024 Posted August 29, 2024 wow ok - so yes, that's big mine is 0.5GB .. Errmm over to Luke as agreed it looks like an indexing issues rather than raw underlying performance.
adminExitium 355 Posted August 30, 2024 Author Posted August 30, 2024 Luke seems to be diligently ignoring this thread, so @softworkz, do you have any idea why it's an EXCLUSIVE lock and whether it would be easy to change it back to the default locking mode of NORMAL?
Solution softworkz 5067 Posted September 1, 2024 Solution Posted September 1, 2024 The Emby database is not an independent component. This means that the database in itself does not guarantee consistency. A part of the logic for this is in the application code and only that code is able to determine and ensure that retrieved data is valid and consistent. The WAL logic alone does not suffice, and that's why multi-process access to the database cannot be allowed. For the case when we would allow this, I have a pretty clear picture of the things that people would start trying to do and all the bad things that would happen as a consequence. And all those bad things would land right on our table - not on the table of those who have caused it. The really bad things would be of course when changes are being made, but even reads at a wrong point in time can cause trouble, so that's some of the reasons why we don't allow concurrent access. You have two options: Create a plugin and access the DB from that plugin (inside the Emby Server process) Create a point-in-time snapshot and work with that (taking the risk of inconsistency) By "snapshot", I mean to simply copy the database file. On Windows, that's possible even while Emby is running. I have never tried on Linux, but it should be working in the same way. 1
adminExitium 355 Posted September 3, 2024 Author Posted September 3, 2024 Thanks for the detailed reply. It almost seems like Emby is not using transactions correctly to make any changes to the DB, thus losing the ACID benefits of databases in general. On 01/09/2024 at 06:13, softworkz said: I mean to simply copy the database file I am not sure this would result in a sane DB because sqlite may be in the process of flushing a transaction from the WAL file to the main DB file when I try to copy it. I’ve had similar issues with other applications using SQLite with WAL mode, which is why I prefer using the SQLite backup API instead, if available. Anyway, I can atleast fallback to my backup method as a combination of periodic pulls from the complete /Items API + monitoring the LibraryChanged websocket event for any intermittent changes between pulls.
softworkz 5067 Posted September 3, 2024 Posted September 3, 2024 8 hours ago, adminExitium said: It almost seems like Emby is not using transactions correctly to make any changes to the DB, thus losing the ACID benefits of databases in general. Not quite. In general, it does follow ACID principles, but in cases, it favors performance over consistency. The bottom line is rather that it's simply not designed for being accessed in parallel - primarily it's about write operations, not that much about reading. Also, the DB schema can change from one version to another without notice. The exclusive lock makes it clear that directly accessing the database is not a supported scenario. 9 hours ago, adminExitium said: Anyway, I can atleast fallback to my backup method as a combination of periodic pulls from the complete /Items API + monitoring the LibraryChanged websocket event for any intermittent changes between pulls. That sounds like a sane and safe way. 1
Nebula_01 1 Posted November 14, 2024 Posted November 14, 2024 I'm trying the same and would be interested if you found a solution for this @adminExitium Or any other solutions to speed up the search?
softworkz 5067 Posted November 14, 2024 Posted November 14, 2024 (edited) You can try the scoped search in the WMC UI (https://wmc.emby.media/) You can constrain the search by type or by library. I've never tried it with large libraries, so it would be interesting to see how that compares. PS: It might also be worse. Focus wasn't performance but better control over the results... Edited November 14, 2024 by softworkz
Nebula_01 1 Posted November 14, 2024 Posted November 14, 2024 The search by type or by library is only working on android as far as my understanding goes. At least it isn't working for me in the web browser unfortunately.
softworkz 5067 Posted November 14, 2024 Posted November 14, 2024 Do you mean the search doesn't work or the you can't connect to your server from a browser?
softworkz 5067 Posted November 14, 2024 Posted November 14, 2024 If you cannot connect from a browser, you need to change the URL from https://wmc.emby.media/app to http://wmc.emby.media/app
Nebula_01 1 Posted November 14, 2024 Posted November 14, 2024 The "search by library" feature doesnt work in the web browser. It will still seanot rch in all libraries, only the one you were trying to search in. But Jellysearch or something similar for Emby would be the best, a true dream coming true. So hope @adminExitiumhas some tips for me
softworkz 5067 Posted November 14, 2024 Posted November 14, 2024 9 minutes ago, Nebula_01 said: The "search by library" feature doesnt work in the web browser. Can you be more specific? I've never gotten a report that it's broken and it's working here: 20241114-2302-54.6905035.mp4
adminExitium 355 Posted November 15, 2024 Author Posted November 15, 2024 7 hours ago, Nebula_01 said: if you found a solution for this Nothing at the moment. I know the changes that need to be done to Jellysearch or an equivalent app to make it compatible with Emby based on this discussion but I haven't had the time to actually make them.
adminExitium 355 Posted November 17, 2024 Author Posted November 17, 2024 Btw, any thoughts on how filtering could be supported with the external tool approach? I guess I could implement the filtering in the external tool, but that seems to be a lot of work and continuous maintenance for any possible API or tag changes. Another possibility would be to just fetch the IDs via the external tool and re-do the API call to Emby but using the IDs obtained from the app as an additional filter (via the IDs param in the API), which should narrow down the result scope considerably. The problem with this is how many IDs can be passed in a single API and if I will need to split it up into multiple calls and combine them later before returning.
Nebula_01 1 Posted November 22, 2024 Posted November 22, 2024 On 11/15/2024 at 12:04 AM, softworkz said: Can you be more specific? I've never gotten a report that it's broken and it's working here: 20241114-2302-54.6905035.mp4 10.96 MB · 0 downloads I meant the "normal" Emby website. Not this weird looking Windows Media Stuff
Recommended Posts
Create an account or sign in to comment
You need to be a member in order to leave a comment
Create an account
Sign up for a new account in our community. It's easy!
Register a new accountSign in
Already have an account? Sign in here.
Sign In Now