Jump to content

MySQL Support


dcrdev
Abobader
Message added by Abobader,

Last warning: Attacking other members & team will cause a ban for your account.

Recommended Posts

It's not really a particular flavor, it only allows one concurrent connection. It can hardly be considered a db server. I used it once for a game in college and I always considered it a stop gap tbh

Your roadmap is your consideration, I'm fine with that I just hope you prioritize it before someone comes out with a solution that allows it, so I don't have to switch and waste the money i spent on this. 

nayr's comment above perfectly highlights why this will continue to be a requested feature and I urge everyone that reads this that wants a real database to comment on this post.

Link to comment
Share on other sites

15 minutes ago, rsvg said:

It can hardly be considered a db server

I acknowledged that in my response.  But, our major use-case did not require a true server (and the complexities that come with that).  So the simple solution serves our current user base better.

Link to comment
Share on other sites

Dibbes
4 hours ago, ebr said:

But, are you having an issue that can be directly linked to the internal storage implementation? 

Yes, Eric, I do... Adding a few hundred thousand music tracks to Emby slows the whole server down to a point it becomes a real pain and it's getting worse with newer server versions. I've been through the whole "get rid of plugins", is Emby on separate storage, where is the cache folder, new installs, etc. with you guys a few times already, but somehow it always dies off... This is happening on various flavours of Emby (NAS, Windows, Linux and on various pieces of hardware) and only when adding a large amount of music tracks this really becomes an issue...
Would this happen with another "flavour" of SQL? Possibly... but it's less likely that if I have Emby installed on 1 server and the database on my dedicated SQL machine with a 2.5Gbit connection, that I have these issues...

  • Like 1
Link to comment
Share on other sites

22 minutes ago, Dibbes said:

Yes, Eric, I do... Would this happen with another "flavour" of SQL? Possibly...

I don't think that situation definitively indicates that your issue would be solved by a different storage engine.

6 minutes ago, Spaceboy said:

"few hundred thousand" could be quite a range but i'm in that ball park and i don't see this at all.

 

Link to comment
Share on other sites

Dibbes
5 minutes ago, Spaceboy said:

"few hundred thousand" could be quite a range but i'm in that ball park and i don't see this at all.

I have about 8k movies, 40k series episodes. No issues with this whatsoever... Issues start when adding around 150k tracks to that. Even loading the webpage get slow. Getting rid of 20k tracks then restores functionality completely. Funnily enough it doesn't make any difference what the hardware is. I've tried this on an old Intel i7 Gen 2 mobile CPU, an AMD FX8370 and an AMD 3700 with sufficient RAM, 16GB plus, Emby on SSD's and more.
 

4 minutes ago, ebr said:

I don't think that situation definitively indicates that your issue would be solved by a different storage engine.

Agreed. I'm not saying it definitely will. Funnily enough though it's the only thing that has not changed over the last 5 years and the upper limit has gone up a little (it was around 120k with the FX CPU) but not a lot...

Link to comment
Share on other sites

I personally don't believe performance to be as big of an issue  regarding SQLite alternatives, when compared to concurrent connections/externally hosted gains. SQLite is performant enough to be considered negligible unless you have an insane amount of content and even then I'd wager that any slowdowns could be caused by a multitude of other factors before SQLite itself, even possibly disk read speeds when querying the db.

You may see (probably undetectably) small performance gains using something other than SQLite, but my guess is that you could see much larger gains by hosting it on another machine entirely.  But please correct me if I'm wrong. All this is obviously still recognizing, as stated, that this falls out of scope of current user interests. Just thinking about it aloud.

Edited by rsvg
  • Like 1
Link to comment
Share on other sites

sfatula
2 hours ago, ebr said:

I don't think that situation definitively indicates that your issue would be solved by a different storage engine.

 

Perhaps not. But I used to manage a server and a process just a couple years ago. On that machine every night, we'd process around 400,000,000 rows. That was across around 100 tables. The process ran for about 4 hours. Not a super fast server. So.... properly done I would say it's quite likely, assuming that's the issue of course. That was Mariadb on Centos, and it was even on hardware Raid6.

Edited by sfatula
Link to comment
Share on other sites

15 hours ago, sfatula said:

So.... properly done I would say it's quite likely

Unless you also ran the same exact process with the same exact data and hardware with the only difference being SQLite instead of the other DB, I don't understand the above conclusion...

Link to comment
Share on other sites

sfatula

And the same can be said to your point. What you seem to keep implying is there would not be any difference and there is no issue the way things are. And, when someone says it's slow, it likely isn't SQLite, etc.

My point was, there is a huge performance difference between a properly tuned Mariadb database and SQLite, there's simply no debating that. The capabilities are different as well. As are specific features of specific statements. This in itself makes comparing the exact same query on the exact same hardware, etc pointless as why would you even do things the same way necessarily. I honestly don't understand your point.

Link to comment
Share on other sites

34 minutes ago, sfatula said:

What you seem to keep implying is there would not be any difference and there is no issue the way things are.

No, what I'm saying is that a lot of people are assuming that changing the underlying storage engine would have a large impact on performance and I'm just saying that is not a valid assumption at this point.   There could be a huge number of other reasons for any particular performance issue and we should look at those and see what can be determined from them.

Link to comment
Share on other sites

pwhodges
38 minutes ago, sfatula said:

My point was, there is a huge performance difference between a properly tuned Mariadb database and SQLite, there's simply no debating that.

You have measurements, when undertaking the kind of work that is required in Emby?

It cannot be assumed that the more capable package is always faster; sometimes the code necessary to do more advanced things (like transactional stuff) gets in the way of the simple usage and maybe slows things up a bit.

Paul

  • Like 1
Link to comment
Share on other sites

sfatula

I would agree with that statement, and one of those things to look at would be...... SQLite right? That's of course one potential reason for performance issues for people with larger databases.

There are numerous other advantages to a "real" database. I am sure they have all been mentioned previously. I'd love to be able to get some data out of the database into my own (read only) processes. And no, I don't want to divulge what those might be. Multi user would sure be nice by itself!

Edited by sfatula
Link to comment
Share on other sites

sfatula
9 minutes ago, pwhodges said:

You have measurements, when undertaking the kind of work that is required in Emby?

It cannot be assumed that the more capable package is always faster; sometimes the code necessary to do more advanced things (like transactional stuff) gets in the way of the simple usage and maybe slows things up a bit.

Paul

Didn't say that. My point about Mariadb vs SQLite was there seemed to be suggestions by several that they are somehow in the same league IN GENERAL. That being said, with 30 years of DBA experience on a load of different types of systems and understanding internals, I'd have to say yes, I know that Mariadb would be faster for Emby and require zero proof, experience does provide that. Faster that makes any difference? Maybe not and don't know as that would require access to the internals of Emby and a large database of data. Does this mean anything to you? Likely not. I agree with ebr on that point, specific examples need to be looked at. The whole performance angle is moot unless an example install is tested and shows issues with performance.

Edited by sfatula
Link to comment
Share on other sites

I too am not buying the performance implications, I've got 80TB of media and my data folder is only 530MB, I've got databases elsewhere a couple orders of magnitude in this size.

6.3M	./config
3.9G	./cache
5.2M	./plugins
1.0K	./.gnupg
393K	./root
 31K	./.dotnet
 18K	./sync
 33G	./metadata
366K	./.nv
 21K	./localization
529M	./data
4.0M	./fonts
 40K	./.cache
273K	./transcoding-temp
 14M	./logs
 37G	.

You are not really going to gain anything performance wise out of having an external database with a pittance of a dataset like this.

You will however, as I pointed out before.. gain the ability to configure high availability.. and potentially performance gains from distributing transcoding jobs across machines. These are not enterprise desires, we can spin up highly available setups with COTS hardware and recycled gear.. 

I understand this is not a priority, and there are other more important features I'd like to see first (like HEVC output on transcode) that potentially have a bigger impact on more users first.. but this is definitely a feature I'd like to see work its way out sooner than later. If the underlying design always had this in mind then the effort required to implement this should not be as technically hard as some of the other things on the todo list, low hanging fruit. I believe this would make a fine premium feature, none of your competition supports highly available configurations and those evaluating the options available will see that and it will stand out among the crowd, even if they dont have immediate plans to use it  its one of those things I believe if you build it, they will come.. People will keep it in the back of their minds and once they have a highly impactful outage the chances are high they will consider building towards a HA goal.. its unlikely to be a feature widely used by the masses, but it is a feature that I believe will be attractive to those whom use Emby as the primary/only source of media.

I can pretty much guarantee soon after you finally implement this feature you will have several users publish quick/simple/easy to follow guides for spinning up a HA k3s config on whatever hardware they can scrape together, from lil arm boards to big xeon recycler servers... The community will start writing charts, other automation and dramatically simplify this for everyone else, it will then take off as people see the advantages container orchestration brings and the vail of technical debt is lifted, it will only take a few days before less technical Emby admins can understand and deploy their own HA setup too with a few simple commands.. this is not a bespoke feature request that will languish such as the LDAP implementation, k8s is a major technology player just gaining more and more adoption, You are going to have a ton of HomeLab users jump on this so they can run personal production workloads at home simply for the exposure and experience they will use in their professional careers.. The Staff must see this, how much uptick have you seen in your docker container downloads since you released it.

Edited by nayr
  • Like 1
Link to comment
Share on other sites

sfatula

My library.db is 75MB, so, easily all cached and no way that is an issue for me. I have no idea how large this can get though. But I have other uses including HA as you say. Replication, database triggers would be sooo useful to me for some automation, etc.

The previous media server I was using before Emby was mythtv, and they do use MySQL. That was nice! Just didn't have very many clients on many platforms. And of course Kodi can use MySQL as well.

Edited by sfatula
Link to comment
Share on other sites

2 hours ago, nayr said:

I can pretty much guarantee soon after you finally implement this feature you will have several users publish quick/simple/easy to follow guides for spinning up a HA k3s config on commodify hardware, from lil arm boards to big xeon recycler servers

Me being one of them, was planning on putting this into a chart anyway. 

Link to comment
Share on other sites

This is simply asking for better micro-services architecture really, emby's monolithic approach is not always the best strategy for everyone and scaling, for example many people like me dont see things like Raspberry Pi's as viable right now, I would crush one of those with my needs, those only seem suitable for very light users yet Emby targets em.. but if I could scale them out horizontally as demands grow instead of building a huge mega server you know what that would be a really viable option given its got good hardware support.

Would really complement alot of features emby already provides, like being one of tho only ones to support bespoke transcoding hardware on lil arm boards.. Three lil Pi's with 2-4gig of ram and a NAS doing netboot, running the database server on flash, and feeding em media would do rather well for most power users I suspect, and then you can scale out to 5 nodes, then 7 and etc if demands grow.. also makes using cheap lil hardware like that in a production a bit more comfortable knowing a hardware failure will have a minimal impact.. For the price of video cards now days we could get a Pi Cluster up and going for a fraction of that.. $350 for a Nvidia P2000 right now, I paid $280 for mine almost 4 years ago wtf.. I could buy 3x oDroid H2+ which have modern QuickSync, sata, nvme and 2.5GigE nic's for that and end up with whole high performance transcoding cluster for not much more than the cost of a basic video card. Scaling out lets people start off cheap and grow if needed, instead of overbuilding and then hoping to grow into it now.. this reduces costs for people wanting to get into Emby.

Emby's Docker container that writes to a sqlite db through a bind mount is not really adopting the stateless container philosophy at all, its a poor shoe horn like most other container based services out there.. despite having all inputs and outputs going over the network its not portable if I've got to transport local bind mounts that could also be a standard network transport. 

One of the features I want most, the HEVC output requires pretty good hardware backing it.. some of the excuses to implementing it have been few have access to the hardware given the cost.. well if we can scale on cheap little arm/x86 boards w/hevc encoders that can handle only 1 job then there's a great solution.. <$50 in hardware per 4k output stream is a heck of a deal compared to discrete ~$400+ cards that can do mebe a half dozen of em.

You can look all over github and find distributed transcoding projects for all the major players in this field, emby included, that people try to shoe horn in, the demand has always existed if people are willing to put that much effort into trying to make it happen. However all these implementations are not robust and still have a single point of failure, the main emby database so the extra complexity just makes it more fragile and less scalable IMO.

Decoupling the database from a single instance to a multi instance would really, really be a great leap in capabilities that is going to make Emby even more attractive, we're not asking for Emby to support k8's today with s3 storage backends and an emby operator and all that.. hopefully one day, but lets not drag our feet on this thinking containers are a fad, I assure you they are not.

I'm not even asking for a particular backend, support em all or just pick one you think is best for the job.. as long as I can fire up multiple containers against it we can start to really scale horizontally and setup HA. 

Edited by nayr
  • Like 1
Link to comment
Share on other sites

Dibbes

Without music I have a database of just over 1.4GB and I don't see performance issues. Somewhere on this forum though, and I'm too lazy to go and look for these posts, I have posted screenshots of databases I had in the past when part of my music collection was added. These were well over 12GB after a vacuum and were only about a week or two old, started from scratch. 

Also, when on Windows the Emby process reaches 1.5GB of RAM utilization, it slow down a lot as well. The server has 32GB, so it shouldn't really be an issue...

I don't care about micro-services in the slightest as I'm not going to be using K8s for Emby as that will give me issues with hardware transcoding, I don't care about high-availability as it's a home server, so if it's down, then it's a quick app to the family groups and that's it. What I really would like though is something robust enough that I can add all my media to. So next to movies, series, documentaries and recordings, I'd love to add my audiobooks, normal books and comics as well... currently I'm not even getting half way with music and am running various services in Docker for Audiobooks and Music and have a separate instance for comics...

Edited by Dibbes
Link to comment
Share on other sites

Still 12GB is not really all that big, especially for a read heavy dataset like Emby has.. SQLite can handle databases 100GB in size for read heavy workloads just fine.. its not til you get write heavy that it really struggles with its single write lock strategy, but by this size most are onto BerkleyDB or other better options with fantastic concurrent write performance.. but nothing emby does needs concurrent write performance as you'll see a few orders of magnitude more selects than inserts. 

What storage backend are you using for the database? If your not on SSD already you should be. 

I've got a sneaky suspicion you are bottlenecking somewhere else and if Emby handed you MySQL today it'd not get you anywhere.. 

Link to comment
Share on other sites

I still stand by my conclusion, and the staff responses.. there may be a poor design choice in the data structure its self complicating issues for you, but the fix for that is actually root causing your problem and finding the issue.. moving the same data structure to another backend would not improve the situation if something is not indexed properly and a simple query is capable of trashing the disk IO while it has to read in the whole dataset to find your rows.

I've got production databases that are serverless and writing to filesystem directly much like SQLite that are an order of magnitude larger than what you are having troubles with.. ifact we use DQlite in k3s as default storage backend, along with etcd and others all over the kubernetes realm in large production environments and they all scale perfectly fine to datasets bigger than you are dealing with.

Speaking of Emby could just drop in DQLite and I'd have what I'm looking for. 

Link to comment
Share on other sites

sfatula
1 hour ago, nayr said:

Still 12GB is not really all that big, especially for a read heavy dataset like Emby has.. SQLite can handle databases 100GB in size for read heavy workloads just fine.. its not til you get write heavy that it really struggles with its single write lock strategy, but by this size most are onto BerkleyDB or other better options with fantastic concurrent write performance.. but nothing emby does needs concurrent write performance as you'll see a few orders of magnitude more selects than inserts.

This is true, 100Gb can be handled. That being said, the server emby is running on can be limited, perhaps it has low memory, not using SSD, etc. Why should the Emby server need to have those. This isn't necessarily a server, which is the idea behind storing the DB elsewhere from the emby server. This makes SQLite not the best choice under those conditions. But sure, if you have a decent spec server, SQLite can do that size. The fact that it may perform well in most instances (at a minimum) doesn't negate the other uses of a better database though, performance was merely possibly one of those reasons, many others in this thread. SQLite is indeed used all over the place. I even use it on my router, to store power usage information over time from various devices. In the case of Emby, there are numerous other uses however. You'll find many examples (and reasons) where people use MySQL and Kodi for instance.

Many well designed apps have objects/classes to deal with the database code. In those cases, it's pretty simple to replace a backend with another backend.

Edited by sfatula
Link to comment
Share on other sites

Emby is far more compute and bandwidth intensive than its on the IO front, if you have a situation where decoupling the DB storage you also have a situation where Emby is already pretty heavily crippled in the first place.

Tho fundamentally I agree with you, this is where microservices architecture comes in to play.. you can accomplish alot more with much less once things are capable of scaling horizontally instead of vertically.. a monolithic approach only scales up, the monolith just has to get bigger as its only way to move.. thats why so many users here are running mega servers to support our personal needs, that sit mostly idle. 

Many of us suffer same problems any other provider has, just like the big gaming services have to build out absurd capacity for the day after xmas only to let it sit idle the rest of the year.. this is what microservices architecture and kubernetes was designed to solve. The largest video providers on the net use it, pretty interesting to see hundreds of nodes spin up just to handle the extra load at the top of the hour when everyone changes LiveTV channels, then spin back down a few mins after the hour when the load drops back down.

We can have the same cool things, run your Emby on a tiny lil cheap and power efficient server and once it starts to reach its capacity, another node is powered up and brought in to take care of the work.. now you dont have to build a big server just to handle once a month Sunday Movie night when the whole gang is transcoding.

Link to comment
Share on other sites

Dibbes
42 minutes ago, nayr said:

I still stand by my conclusion, and the staff responses.. there may be a poor design choice in the data structure its self complicating issues for you, but the fix for that is actually root causing your problem and finding the issue.. moving the same data structure to another backend would not improve the situation if something is not indexed properly and a simple query is capable of trashing the disk IO while it has to read in the whole dataset to find your rows.

I've got production databases that are serverless and writing to filesystem directly much like SQLite that are an order of magnitude larger than what you are having troubles with.. ifact we use DQlite in k3s as default storage backend, along with etcd and others all over the kubernetes realm in large production environments and they all scale perfectly fine to datasets bigger than you are dealing with.

Speaking of Emby could just drop in DQLite and I'd have what I'm looking for. 

I'm fairly sure this can be tested, I think I've also offered access to my systems once or twice over the years. I'd love to get this resolved, but it also seems I'm one of the very few with these issues or other people are just not reporting this. I'm not saying you're wrong either. I sincerely doubt it's a hardware issue though, as the SSD's (Emby is on a different SSD than the OS and the Cache folder on a different one again) are reporting less than 900 IOPS peak over 24hours. Also the CPU is barely even registering activity and transcoding is offloaded to the NVidia 1660. I know it's a midrange card, but more than sufficient for purpose. Also the server runs on 32GB of RAM, so that should be enough...

Again, I'm not saying that all my troubles will be over migrating to MySQL, MSSQL, MariaDB or any of the others mentioned, but since everything else has been swapped out, replaced and/or upgraded over the years from the Server OS to all the hardware, I have a sneaky suspicion that it will make quite a difference... and yes, I'm also aware I'm making assumptions here about concurrent read/write, concurrent access, etc.
 

Link to comment
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
×
×
  • Create New...