Jump to content

Exclusive DB Lock


Go to solution Solved by softworkz,

Recommended Posts

adminExitium
Posted

Due to various reasons, I’ve been trying out a Jellyfin instance recently, and similar to Emby, one of the biggest problems there is search. Emby is quite a bit better in the search performance, but it still takes quite a lot of time when searching across large libraries (10-15 seconds on average for me) and also lacks proper fuzzy matching and typo tolerance.

A user created an external search utility that can be used to answer all searches while using meilisearch (https://www.meilisearch.com/) as its backend: https://gitlab.com/DomiStyle/jellysearch. This solves all my problems with search like fuzzy matching, and the results are almost instantaneous (<100 ms on average).

I would like to try porting something similar for Emby, but the problem I’m facing is that I can’t access the DB directly to update the item list in meilisearch because of the exclusive lock taken by Emby. Is there a specific reason for this exclusive lock, and if not, can it be removed to allow other applications to access the DB concurrently, which is safe anyway because of the WAL mode?

adminExitium
Posted

@LukeWould it be possible to remove the EXCLUSIVE lock on the DB so other processes can access it?

rbjtech
Posted

I guess it depends on your library size, but every search I perform is returned in a few hundred ms - what hardware are you running ?   Also what DB cache size vs the library.db file ?

adminExitium
Posted

5.5 GB library DB and 16GB as the cache size. The item count is 2169299 and it is running on an Intel Optane Drive. The average search timings are in the 4-6 seconds range.

I understand this is a huge DB compared to the average DB sizes here, which is why I want to try working on a solution myself rather than bugging @Lukerepeatedly for performance improvements.

The only thing blocking me is the EXCLUSIVE lock on the DB file, which I don't understand the need for at all when the DB is already opened in WAL mode, which is already safe for concurrent access from multiple processes.

  • Like 1
rbjtech
Posted

wow ok - so yes, that's big 🤪 mine is 0.5GB .. Errmm over to Luke as agreed it looks like an indexing issues rather than raw underlying performance.

adminExitium
Posted

Luke seems to be diligently ignoring this thread, so @softworkz, do you have any idea why it's an EXCLUSIVE lock and whether it would be easy to change it back to the default locking mode of NORMAL?

  • Solution
Posted

The Emby database is not an independent component. This means that the database in itself does not guarantee consistency. A part of the logic for this is in the application code and only that code is able to determine and ensure that retrieved data is valid and consistent. The WAL logic alone does not suffice, and that's why multi-process access to the database cannot be allowed. 

For the case when we would allow this, I have a pretty clear picture of the things that people would start trying to do and all the bad things that would happen as a consequence. And all those bad things would land right on our table - not on the table of those who have caused it. The really bad things would be of course when changes are being made, but even reads at a wrong point in time  can cause trouble, so that's some of the reasons why we don't allow concurrent access.

You have two options:

  1. Create a plugin and access the DB from that plugin (inside the Emby Server process)
  2. Create a point-in-time snapshot and work with that (taking the risk of inconsistency)
    By "snapshot", I mean to simply copy the database file. On Windows, that's possible even while Emby is running. I have never tried on Linux, but it should be working in the same way. 
  • Thanks 1
adminExitium
Posted

Thanks for the detailed reply. It almost seems like Emby is not using transactions correctly to make any changes to the DB, thus losing the ACID benefits of databases in general.

On 01/09/2024 at 06:13, softworkz said:

I mean to simply copy the database file

I am not sure this would result in a sane DB because sqlite may be in the process of flushing a transaction from the WAL file to the main DB file when I try to copy it. I’ve had similar issues with other applications using SQLite with WAL mode, which is why I prefer using the SQLite backup API instead, if available.

Anyway, I can atleast fallback to my backup method as a combination of periodic pulls from the complete /Items API + monitoring the LibraryChanged websocket event for any intermittent changes between pulls.

Posted
8 hours ago, adminExitium said:

It almost seems like Emby is not using transactions correctly to make any changes to the DB, thus losing the ACID benefits of databases in general.

Not quite. In general, it does follow ACID principles, but in cases, it favors performance over consistency. 

The bottom line is rather that it's simply not designed for being accessed in parallel - primarily it's about write operations, not that much about reading. Also, the DB schema can change from one version to another without notice. The exclusive lock makes it clear that directly accessing the database is not a supported scenario.

9 hours ago, adminExitium said:

Anyway, I can atleast fallback to my backup method as a combination of periodic pulls from the complete /Items API + monitoring the LibraryChanged websocket event for any intermittent changes between pulls.

That sounds like a sane and safe way.

  • Like 1
  • 2 months later...
Posted

I'm trying the same and would be interested if you found a solution for this @adminExitium
Or any other solutions to speed up the search?

Posted (edited)

You can try the scoped search in the WMC UI (https://wmc.emby.media/)

You can constrain the search by type or by library.
I've never tried it with large libraries, so it would be interesting to see how that compares.

 

PS: It might also be worse. Focus wasn't performance but better control over the results...

Edited by softworkz
Posted

The search by type or by library is only working on android as far as my understanding goes.
At least it isn't working for me in the web browser unfortunately.

Posted

Do you mean the search doesn't work or the you can't connect to your server from a browser?

Posted

The "search by library" feature doesnt work in the web browser.
It will still seanot rch in all libraries, only the one you were trying to search in.

But Jellysearch or something similar for Emby would be the best, a true dream coming true. 😆
So hope @adminExitiumhas some tips for me

Posted
9 minutes ago, Nebula_01 said:

The "search by library" feature doesnt work in the web browser.

Can you be more specific? I've never gotten a report that it's broken and it's working here:

 

adminExitium
Posted
7 hours ago, Nebula_01 said:

if you found a solution for this

Nothing at the moment. I know the changes that need to be done to Jellysearch or an equivalent app to make it compatible with Emby based on this discussion but I haven't had the time to actually make them.

adminExitium
Posted

Btw, any thoughts on how filtering could be supported with the external tool approach?

I guess I could implement the filtering in the external tool, but that seems to be a lot of work and continuous maintenance for any possible API or tag changes.

Another possibility would be to just fetch the IDs via the external tool and re-do the API call to Emby but using the IDs obtained from the app as an additional filter (via the IDs param in the API), which should narrow down the result scope considerably. The problem with this is how many IDs can be passed in a single API and if I will need to split it up into multiple calls and combine them later before returning.

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
×
×
  • Create New...