MySQL Support

November 25, 2022

On 23/11/2022 at 20:55, softworkz said:

@cd01 - We have recently introduced settings to control the RAM usage for SQLite. If you haven't done so, please try to set this to something like 1.5x the size of your library.db. This may still greatly improve the experience with a high number of rows, even when it won't be the same like for 100k items.

Thanks for the tip mate, i have actually removed my photos from my library and i can live without it, it would just "be nice" to be able to utilize emby for this purpose as well, but not really required. It takes weeks if not months to rebuild the db to where it was when everything was indexing and i was trying it out so i rather not do it again, but i am thankful for your input Happy thanksgiving everyone

December 12, 2022

In case somebody would wonder what happened to the SQLite MMIO discussion - it's been split out to a new topic:

February 18

I know omthis topic is old but wanted to see if any options for external db are avaliable, and if options exist to contribute to its development.

In my case looking into using emby with postgres.

February 18

20 minutes ago, jose said:

I know omthis topic is old but wanted to see if any options for external db are avaliable, and if options exist to contribute to its development.

In my case looking into using emby with postgres.

I am not on the Emby team but having following this for years, this is my take.

- No there is no current implementation for external DB available.

- No there is no work being done on this currently.

- No there is no plan in place to do this work or add this.

Until the Emby team state otherwise, this is the current status.

February 19

7 hours ago, jose said:

I know omthis topic is old but wanted to see if any options for external db are avaliable, and if options exist to contribute to its development.

In my case looking into using emby with postgres.

What @TeamBsaid above is all correct at this point, even though it doesn't mean that it couldn't change in the future.

@jose- What do you want to achieve?

March 4

mysql is a much better database than sqlite.

Can bring higher performance and stability.

March 4

6 hours ago, zhijiandev said:

mysql is a much better database than sqlite.

And "MS SQL is much better than MySQL" and "Oracle is much better than MS SQL".
Without differentiation and consideration of the specific use case, such generalized statements are easy to make.

You may familiarize yourself with the concept of in-process databases and the pros and cons vs. external RDBMS first, before making such comparisons. If you read back in this conversation, you will find tons of details on this subject.

As it has been laid out already, there would undoubtedly be certain advantages in using an external RDBMS but also disadvantages.
Performance would be none of such advantages. In fact, performance would be rather somewhat worse, not better (in most cases and unless having extreme library sizes, like >10^6 records).
Also, there aren't any known stability issues.

March 4

As far as in memory, keep in mind mysql can do that also (referencing longago post) though it really isn't needed with a real server as it's likely all in memory anyway. It can also be run on the same machine avoiding the network latency penalty. In those cases, any in process advantage is minimal except maybe on slower cpus. But yes there is good and bad. For me, the good far outweighs the bad.

March 4

Simply put, a single internal database is the predominant limiting factor for scalability. That was my primary concern, not performance. I would love to run Emby in my kubernetes cluster but currently cannot do so.

March 5

1 hour ago, sfatula said:

As far as in memory, keep in mind mysql can do that also

I said in-process, not in-memory. This is a very different thing.

In-Memory

Means that the database system keeps all or parts of the data in memory while writing only changes to disk (simplified). This does not only apply to data itself but to all kinds of structures (indexes, precompiled/cached queries, tempdb, locks, etc.). Both, SqLite and MySQL (same like and other RDBMS) can do that. Without doubt, MySQL (and others) have much more sophisticated ways for handling this, but these are rather for use cases which are out of scope for SqLite (e.g. multi-user and other concurrency scenarios) and in case of Emby it's the most simple case anyway since - normally - all data fits into memory with ease (for which he had added the configuration a while ago).

In-Process

This is a whole different story, even though it also has to do with memory: Generally, no process is able to directly access memory of another process. When two processes want to exchange data, they need to use a way of IPC (Inter-Process Communication). There are many such mechanisms (e.g. RPC or DCOM/COM+ on Windows, Binders on Android, of course things like shared-memory, shared-file access, etc.). The probably most typical on unix based systems is using local loopback network access, which is the most used communication channel for RDBMS in general (besides named pipes on Windows), since it does not only cover local but also remote communication.
So, when an application communicates with MySQL for example - even when on the same machine - communication goes through the network loopback.
The application creates an SQL query as a string, sends it as a string via network, which means it gets sent, then received by the RDBMS, which parses it, executes it and once it has the data ready, it needs to send the data back via network. To do so, it needs to format the data as a streeam in a certain (defined) wire format. At the other end, the application needs to parse that wireformat first and cut out all the single pieces and put it into some structure first - which is not even the final structure that the application needs. It's just an interim format and the application iterates over these structures to read out and put the data into its own structures.
SqLite is an in-process database. It runs in the same process like Emby Server and both have access to the same memory.. Sql-queries can be pre-created on startup already, so there's no need to create an sql query string, you just reference a pre-compiled query when requesting data. The sqlite code can start directly to gather the data as needed without any network operation in-between. When the sqlite engine has the data ready, it doesn't need to create a stream of data to send back. Instead it provides the strucures right away as the query result and ideally doesn't even need to copy the data into these structures. These can (ideally) just point to the place in memory where the data is located, which means no copying happens here, whereas in the MySQL scenario, the result data has already been copied several times at this stage. Only the final step (copying into the application's own structures) is the same.

That's the big difference. It's technicaly big, in practical scenarios, the performance difference may vary from case-to-case, but in no case can an external db be better in this regard.

Edited March 5 by softworkz

March 5

17 minutes ago, rsvg said:

Simply put, a single internal database is the predominant limiting factor for scalability. That was my primary concern, not performance. I would love to run Emby in my kubernetes cluster but currently cannot do so.

Yup, that's one of the valid reasons.

"better in performance" is a fairytale story, though (in the specific use case of Emby).

March 5

2 hours ago, softworkz said:

(in most cases and unless having extreme library sizes, like >10^6 records).

*grin*

March 5

1 minute ago, Dibbes said:

2 hours ago, softworkz said:

(in most cases and unless having extreme library sizes, like >10^6 records).

*grin*

Yup. If most Emby users would have library sizes in your range, then SqLite would clearly be a sub-optimal choice..

March 5

1 minute ago, softworkz said:

Yup. If most Emby users would have library sizes in your range, then SqLite would clearly be a sub-optimal choice..

And that's just it...folks generating tons of rows are very much the exception. In addition to having to continue to maintain the SQLite code that works for the vast majority of users, you'd have to build a new back-end with a schema that probably bears little resemblance to what you already have, and probably have to maintain both of those for far longer than you could possibly imagine...and possibly worse, as you'd probably have to cook up an abstraction layer that would almost certainly would impact performance (and definitely complicate code maintenance).

March 5

11 minutes ago, softworkz said:

Yup. If most Emby users would have library sizes in your range, then SqLite would clearly be a sub-optimal choice..

When back home, I'll do an export and see what it's at... Responsiveness is still fairly ok though

March 5

2 minutes ago, sydlexius said:

And that's just it...folks generating tons of rows are very much the exception. In addition to having to continue to maintain the SQLite code that works for the vast majority of users, you'd have to build a new back-end with a schema that probably bears little resemblance to what you already have, and probably have to maintain both of those for far longer than you could possibly imagine...and possibly worse, as you'd probably have to cook up an abstraction layer that would almost certainly would impact performance (and definitely complicate code maintenance).

Different backend structures are a no-go for this. And migrating to a "compatible" DB layer (which can be used with multiple backend DBs) comes with performance penalties, specifically for use with SqLite.

Yet, the current structure is far from ideal and employs quite a number of anti-patterns, which well - it is like it is - don't want to elaborate on that, but that - at least bears - some potential for improvement.

The small chance on the horizon for this would be when structural changes could compensate the performance penalties of migrating to a universal db layer. Then it might find acceptance (nobody wants to publish an update with degraded overall performance), but the elephant in the room is still the amount of effort it takes and the trouble involved in updating all installations (automatically).

March 5

5 hours ago, softworkz said:

I said in-process, not in-memory. This is a very different thing.

Of course it is, and I said, in case you missed it: "(referencing long ago post)". Guess what that means?

i.e., I was not referencing the post you are requoting. I am well aware of the difference and how it works, but nothing wrong with documenting your thoughts.

I am not going to stop asking for mysql though, and it's not for performance reasons. I have documented previously why I want it but it's likely they are not big reasons for a lot of people admittedly. I'll add another one though, backups and logging, quite different. There are recovery advantages to MySQL in addition to all the things I previously pointed out. I don't think that's a big advantage in Emby's case though.

Does SqLite as you are using it use any database enforcement of rules to prevent bad data, various techniques there, are you actually using referential integrity constraints and using triggers? Curious

I will say again, as much as I'd love to see MySQL today, I am not convinced it is likely at all. That's really too bad, for me at least. Using SqLite causes me so many issues (well, limitations is more what I mean here). But it does work in it's intended narrow usage. When I used MythTV there was so much data I could get out of it.

But please, adding a few lines of extra code to be a little more flexible is not a performance penalty on modern systems. I don't know the state the code is in though, maybe it's just straight line code (awfully efficient for sure). In that case, moving from current code to supporting at least 2 databases not a small task. Given that Emby is older software, unless all that has been rewritten, it's likely it's not in the best shape for such a task.

Edited March 5 by sfatula

March 5

The biggest advantage of MySQL and equivalent databases that I would see is the possibility of a proper query cache for any kind of item lists like the dashboard latest items (which take 10+ seconds on my setup), collections, playlists etc. rather than the page cache of sqlite which needs to recalculate those rarely changing query results repeatedly.

Another advantage would be the possibility of creating materialized views of any time taking queries, so whenever the items change, it takes very little time to update the results.

March 5

I have no idea why people are still debating this, it's like talking about if we should use fairies vs unicorns, it's just not going to happen.

If you are just comparing and sqlite vs mysql vs PostgreSQL vs mssql etc for the fun of it then yeah go ahead but at no point is it worthwhile talking about how this will affect Emby cos it is not going to happen.

March 5

2 hours ago, sfatula said:

Does SqLite as you are using it use any database enforcement of rules to prevent bad data, various techniques there, are you actually using referential integrity constraints and using triggers? Curious

Yes, foreign key constraints, sometimes coluimn data constraints, indexes, also partial indexes. Triggers only for full-text index updating IIRC.

2 hours ago, sfatula said:

But please, adding a few lines of extra code to be a little more flexible is not a performance penalty on modern systems. I don't know the state the code is in though, maybe it's just straight line code (awfully efficient for sure). In that case, moving from current code to supporting at least 2 databases not a small task. Given that Emby is older software, unless all that has been rewritten, it's likely it's not in the best shape for such a task.

It is using many patterns which are SqLite specific and non-standard sql, but what makes this a difficult task is not the db layer alone - it's the way how it is being used from everywhere, and that's in many cases based on the paradigm that (the non-complex) db queries are extremely cheap (which they are in fact), so when for example the same data is needed multiple times in a certain context, there's no local caching done to avoid multiple requests, since that doesn't matter. It would matter though, with a remote db. When you have the db in your own memory, you also don't need to think about using bulk updates for certain tasks - and that's just two examples to illustrate what I mean.

2 hours ago, sfatula said:

Given that Emby is older software, unless all that has been rewritten, it's likely it's not in the best shape for such a task.

It's not about old or new. It's just optimized in this way. And the required changes are reaching far out from the db layer.

March 5

1 hour ago, adminExitium said:

for any kind of item lists like the dashboard latest items (which take 10+ seconds on my setup),

That sounds wrong. How many items do you have in the mediaItems table?

March 5

It's probably one of the largest DBs on this forum atm, so I expect it to be quite slow

The count is 2169299.

Edited March 5 by adminExitium

March 5

1 hour ago, adminExitium said:

Another advantage would be the possibility of creating materialized views of any time taking queries, so whenever the items change, i

Materialized views can improve querying in certain specific cases only. Sometimes it's a great feature indeed. But when you have self-referencing tables with variable hierarchy relations (like Emby has), you can quickly end up with crazy figures. Just hypothetical example: for case with 1 Million base records you might quickly end up having a billion of records when materializing, which is not helpful at all.

March 5

7 minutes ago, adminExitium said:

It's probably one of the largest DBs on this forum atm, so I expect it to be quite slow

The count is 2169299.

Maybe not that large as you might think... @Dibbes?

March 5

6 minutes ago, softworkz said:

having a billion of records when materializing

Yeah, but you can always split it up into multiple material views based on various parameters so no single view should have those many rows. Even so, storage is cheap enough these days that I can easily put in another NVMe (or even Optane) disk if it will save the 10 seconds or so that it takes to load the dashboard and the library listings every time.

Anyway, the major advantage is the query cache, with material views a distant second.

MySQL Support

Recommended Posts

cd01 8

Link to comment

Share on other sites

softworkz 3324

Link to comment

Share on other sites

jose 73

Link to comment

Share on other sites

TeamB 2352

Link to comment

Share on other sites

softworkz 3324

Link to comment

Share on other sites

zhijiandev 4

Link to comment

Share on other sites

softworkz 3324

Link to comment

Share on other sites

sfatula 185

Link to comment

Share on other sites

rsvg 13

Link to comment

Share on other sites

softworkz 3324

Link to comment

Share on other sites

softworkz 3324

Link to comment

Share on other sites

Dibbes 430

Link to comment

Share on other sites

softworkz 3324

Link to comment

Share on other sites

sydlexius 238

Link to comment

Share on other sites

Dibbes 430

Link to comment

Share on other sites

softworkz 3324

Link to comment

Share on other sites

sfatula 185

Link to comment

Share on other sites

adminExitium 170

Link to comment

Share on other sites

TeamB 2352

Link to comment

Share on other sites

softworkz 3324

Link to comment

Share on other sites

softworkz 3324

Link to comment

Share on other sites

adminExitium 170

Link to comment

Share on other sites

softworkz 3324

Link to comment

Share on other sites

softworkz 3324

Link to comment

Share on other sites

adminExitium 170

Link to comment

Share on other sites

Create an account or sign in to comment

Create an account

Sign in