
MySQL Support


dcrdev
Abobader
Message added by Abobader,

Last warning: Attacking other members & team will cause a ban for your account.


sfatula
10 minutes ago, nayr said:

Emby is far more compute- and bandwidth-intensive than it is on the IO front. If you have a situation where decoupling the DB storage would help, you also have a situation where Emby is already pretty heavily crippled in the first place.

 

Except for the other dozen reasons in this thread to use something other than SQLite. I absolutely agree with the statement here. Maybe someone will take the time to summarize the reasons why it would be useful, based on all the previous posts in this and other threads, with performance being an unproven reason. So many good things could come out of it. Looking at Emby as a simple single-server application is not thinking of other possibilities and markets. Just try to take MySQL away from the Kodi users of it; not going to happen.

  • Like 1
Link to comment
Share on other sites

I agree, the performance aspect is not going to win over whoever the product manager is. Like they said, they've got more pending matters to attend to than squeezing a few microseconds out of the database for 99% of the user base.

I believe parallel servers sharing the same dataset is the most justified use case for this change. Now, it seems some of the staff here believe that would be such a rarely implemented feature that it would only be desirable at the "service provider" level. I don't know if their perspective makes them think this is a technically complex or financially prohibitive solution for the average user, but I'm here to dispel that notion.

If implemented, it could have a dramatic, quick, and tangible impact on Emby admins' total operating costs for this service. If I'm currently tapping out my existing hardware and looking to upgrade, or going to a dedicated computer and proper setup for the first time, having the option to scale out of my problems more cheaply, without throwing my existing stuff on the garbage heap, will make converting to premium and learning how to set up a cluster seem like a heck of a deal.

Just sayin.

  • Like 2
Link to comment
Share on other sites

sfatula

There are others, some of which may even reduce dev time. For example, there would be more and likely better plugins / addons, as data not currently available would be exposed in an easier-to-get-to manner, perhaps even negating the need to develop some APIs. I know even code I've developed for Emby for my own uses would be more functional than it is today, as I am limited to what the API exposes, which isn't all the data I need. Details not to be provided in this thread. But I can see a number of the requests being asked for in the feature requests section becoming possible and not too difficult for coders outside of Emby to tackle.

All that being said, the downside is of course more people messing with the database. Read-only is one thing, but writing would cause support headaches, though those could be mitigated with good techniques. I do agree that it's a minority of people that would benefit. But since I am one of them, I want it. 😇
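To illustrate the read-only versus write distinction, here is a minimal Python sketch of opening an SQLite file in read-only mode. It uses the standard SQLite `mode=ro` URI parameter; the file name, the `items` table, and the helper names are made up for the example and say nothing about Emby's real schema.

```python
import os
import sqlite3
import tempfile


def open_read_only(path: str) -> sqlite3.Connection:
    """Open an SQLite file so that any write attempt fails."""
    # "file:...?mode=ro" is the standard SQLite URI form; uri=True enables it.
    return sqlite3.connect(f"file:{path}?mode=ro", uri=True)


def demo() -> tuple[int, bool]:
    # Build a throwaway database. Table and column names are hypothetical.
    folder = tempfile.mkdtemp()
    path = os.path.join(folder, "example.db")
    writer = sqlite3.connect(path)
    writer.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
    writer.execute("INSERT INTO items (name) VALUES ('Some Movie')")
    writer.commit()
    writer.close()

    # Reads work on the read-only handle...
    reader = open_read_only(path)
    count = reader.execute("SELECT COUNT(*) FROM items").fetchone()[0]

    # ...but writes are rejected by SQLite itself.
    try:
        reader.execute("INSERT INTO items (name) VALUES ('oops')")
        write_blocked = False
    except sqlite3.OperationalError:
        write_blocked = True
    reader.close()
    return count, write_blocked
```

A tool built this way can query the data but cannot corrupt it, which is the kind of mitigation the post hints at. It does not protect against someone who simply opens the file in read-write mode.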

Link to comment
Share on other sites

Yeah, but I know that from a developer standpoint the thought of having users mucking about in the dataset is a recipe for disaster. That's why most like using things like SQLite in the first place: it's just enough abstraction to keep them from messing it up, and they don't even ship with a SQLite CLI client. I agree though, I would love to be scraping the database directly and building tools off it, but I dunno if that's helping our sales pitch here, heh. At least with an external dataset I can have a read-only replica that has no impact on an instance, so there's that, but we're definitely into bespoke feature territory.

If requiring an external server, making too many changes to core code, or users mucking about in the dataset and creating additional false bug reports are all concerns of the staff, then they have this option they can implement: https://dqlite.io/docs/architecture but this requires Emby to put more effort into basic clustering instead of offloading all that work to other layers using existing tooling. Personally I would prefer to just spin up a SQL cluster, but this would be a better path if clustering is actually something they get behind.

It's not going to give you guys who really want an external DB what you want, unfortunately, but it would open up clustering options, which I believe is what many are seeking out of this FR.

Edited by nayr
Link to comment
Share on other sites

shocker

# du -hs library.db
1.5G    library.db

And Emby is working great. I do believe there will be an improvement in performance from switching to MySQL/MariaDB/Percona for PRO users, but the most valuable thing will be the network DB to play with :)

It would be easier to create/integrate third-party plugins without adding overhead to Emby via the API (take Ombi as an example), plus load balancing and many, many more good things :)

Link to comment
Share on other sites

Ombi is really no strain on the APIs; the clients use the APIs way harder than anything we've come up with. APIs are good: they ensure supportability, because if you manage to mess it up via the API, it's a bug. Also consider that API performance scales horizontally too, so more instances behind a load balancer will handle more API hits.

Emby's APIs and plugin interface are largely adequate and well documented. I've been able to work around any limitations just fine; it might be a little more work, but it's not been a major roadblock on anything I wanted to do. Not like this DB backend, which makes it nearly impossible to scale any way but up.

Having my Emby in k8s gives me visibility into all the performance metrics one needs, and nothing on the API front or database IO seems to be the least bit heavy or likely to be the performance bottleneck. Outside of tossing it on a solid state drive for the low latency (the throughput is never going to get used), there's not a lot to be gained here. You will be saturating GigE interfaces long before you find the limits of the DB and web traffic any modern COTS server w/SSD is capable of.

Compared to the other media solutions I've used, Emby's doing a rather good job of keeping that dataset sane. After coming over from another solution 4 or so years ago, I found the storage allocation I made specifically for my media server's metadata and databases was grossly oversized for Emby: I came to Emby ready for 200GB of metadata and databases and only ended up needing ~40GB when all was said and done. TBH I can't see how they can squeeze anything else out of that, and it is a strong indication that database performance was something they baked right into the current design.

I was designing a system last night as a thought experiment and found nothing else in the current Emby setup that would be an issue with a parallel setup using existing FOSS tooling; all the configuration options needed are already exposed:

  • You would need to turn off many of the automated tasks like library scans so you don't end up with all nodes scanning the library in unison, but that can be replaced with a simple script that performs the same actions via the API and only triggers them on a single instance.
  • It supports being behind a proxy and terminating SSL externally, yet you don't really need load balancers, as simpler segregation may be desirable: one node handles remote users likely to transcode, while local users direct streaming everything run on another and are not impacted by the compute-heavy users.
  • Transcoding files could be cached on a shared filesystem so that if a node goes away another can take over seamlessly. I can already restart containers quickly enough that it has no impact on clients, which shows the clients can handle getting kicked off an overloaded node within reason; thanks to client caches this is not hard to accomplish.
  • TV recording can be offloaded to external plugins like TVHeadend.
  • Webhook extensions can be developed to handle a lot of automation and coordination, like overloaded nodes having a path to inform the load balancer they've had enough, allowing a mishmash of nodes w/varying capacity without it being a pain to distribute work effectively.
  • It has no problems working off a read-only media collection, so there are no collisions from people changing the media out from under other users.
  • Basically all file inputs (scraping/downloading/etc.) into Emby have long been fully automated and are ready to run as pods in the same namespace.

All of the foundations for Emby clusters are already in place; it's just that darn SQLite database in my way.
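The "simple script" from the first bullet could look roughly like the Python sketch below: pick one node deterministically, then send the scan request only there. The node URLs, the API key, and the helper names are hypothetical, and the `/Library/Refresh` path with an `api_key` query parameter is an assumption that should be checked against your server's API documentation.

```python
import urllib.parse
import urllib.request


def pick_scan_node(nodes: list[str]) -> str:
    """Choose one node deterministically so every scheduler run agrees."""
    if not nodes:
        raise ValueError("no nodes configured")
    return sorted(nodes)[0]


def build_scan_request(base_url: str, api_key: str) -> urllib.request.Request:
    # ASSUMPTION: endpoint path and api_key parameter; verify before use.
    query = urllib.parse.urlencode({"api_key": api_key})
    url = f"{base_url.rstrip('/')}/Library/Refresh?{query}"
    return urllib.request.Request(url, data=b"", method="POST")


def trigger_scan(nodes: list[str], api_key: str, timeout: float = 10.0) -> int:
    """Send the scan request to exactly one node; returns the HTTP status."""
    request = build_scan_request(pick_scan_node(nodes), api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Run from a single cron job or Kubernetes CronJob, something like `trigger_scan(["http://emby-0:8096", "http://emby-1:8096"], "YOUR_KEY")` would replace the built-in scheduled scan that has been disabled on every node.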

Edited by nayr
  • Like 1
Link to comment
Share on other sites

If you don't have multiple workloads that need to access the same GPU at the same time, it's even easier than that: https://rancher.com/blog/2020/get-up-and-running-with-nvidia-gpus

TLDR: Helm install Nvidia's GPU Operator with all default values, and sit back while it finds all the nodes with hardware, builds drivers for them, tags the nodes and just works. You do nothing on the nodes, except maybe blacklist the open source nouveau driver so it doesn't try to grab the GPU.

 

 

Screen Shot 2021-03-05 at 11.21.55 AM.png

Link to comment
Share on other sites

I have a question to all of you, who are interested in advances in this area.

If we were to move in a direction like that, we surely wouldn't go just half a step. I mean that it should also be able to achieve true high availability. For high availability, you can't go with a single database server - you need a failover cluster setup, and that doesn't come for free - not even with MySQL (you won't get far with 'community versions').

Such a setup would require at least $1000-$3000 for the database licenses alone. Assuming that licenses for a professional edition of Emby (with a lot more pro features than just the database access) would be in a similar range - Who would still be interested?

(this doesn't reflect any actual plans, but it could help making plans in that area)

Link to comment
Share on other sites

Externally, we can deploy self-healing, replicated Postgres clusters using the PGO Kubernetes operator with a few easy commands at no extra cost, and it will run on commodity hardware. Installing it and setting it up in an existing cluster is ridiculously easy: like 5 minutes if it's your first time, seconds once you know what you're doing.

https://access.crunchydata.com/documentation/postgres-operator/4.6.1/

And as mentioned earlier, if this is a route you want to go and you want to KISS for your users, you can keep the SQLite backend you've got now and wrap it in dqlite: https://dqlite.io/ but then you are going to need to put more effort into making Emby more cluster-aware, adding APIs and all that so we can add new nodes, since the database is still managed by Emby.

I would not be willing to pay anything for a database backend that can do this; there are tons of free solutions out there already, from etcd to dqlite to Postgres, you name it. You've clearly got the wrong idea about high availability backends, and judging by that post you just made, you have not been paying attention to the technology available.

I've been willing to pay the monthly premium cost for the last several years, and avoided converting to lifetime, to support y'all, but if you think I'd pay what you are considering to offer my family some high availability, you're not going to get anywhere with me. Simply not going to happen. I'll run it in my free Kubernetes cluster with no licensing and get as close to what I need as I can within the limitations before me.

Edited by nayr
Link to comment
Share on other sites

@nayr  Thanks, but you missed the point of my question. Development of such pro features will have a significant cost. And you'll surely understand that it would be unfair when we would fund the development for this feature in a way that it would be paid by the 99% of Emby users that have no need for those features.

So let me re-phrase the question: Who would still be interested in these pro features when it would involve license cost in a range from $1000 to $3000?

  • Like 1
Link to comment
Share on other sites

7 minutes ago, softworkz said:

Who would still be interested in these pro features when it would involve license cost in a range from $1000 to $3000?

PS: Anybody who can answer that question with 'YES', feel free to contact me via PM if you like.

 

For all others, I still got some great news: we will soon have some significant performance improvements for large SqLite databases! 🙂 

  • Like 3
Link to comment
Share on other sites

Thanks, but I think you missed the point of my post. I see now that I've completely wasted my time here.

Hey guys, who here wants to pay $1-3k to run a pi-cluster? mahaha, absurd.. yet you guys spend development time supporting those jankity lil heaps that individually are worthless but together in a cluster might be something worthwhile.

1 minute ago, softworkz said:

PS: Anybody who can answer that question with 'YES', feel free to contact me via PM if you like.

 

For all others, I still got some great news: we will soon have some significant performance improvements for large SqLite databases! 🙂 

go take a long walk off a short dock.

Link to comment
Share on other sites

2 minutes ago, nayr said:

go take a long walk off a short dock.

I don't know what that means, but I consider it as a 'No'.

No need to feel offended - I wouldn't pay that price for private use myself, but for business use it's something different. And that's the only way how development for this could get funded.

Link to comment
Share on other sites

Q-Droid
5 hours ago, softworkz said:

I don't know what that means, but I consider it as a 'No'.

No need to feel offended - I wouldn't pay that price for private use myself, but for business use it's something different. And that's the only way how development for this could get funded.

I'm not surprised you got that response though it's one we could have seen coming. Taking Emby from its current design to seamless HA would indeed take real development time and effort. And getting it partway there with some "jankity" half-assed HA (HA/HA?) would lead to more support requests and you guys already have your hands full. Many enterprise software products started as customer funded efforts. Taking Emby from a consumer product to commercial grade should be no different.

 

Your response in this other thread was the correct one but too many people don't get that.

 

Link to comment
Share on other sites

roaku

The original feature request just asks for external database support, specifically MySQL.

The high availability stuff is really another request altogether.

  • Like 2
Link to comment
Share on other sites

29 minutes ago, roaku said:

The original feature request just asks for external database support, specifically MySQL.

The high availability stuff is really another request altogether.

Yes that's correct. It belongs to the same work context, but it's not necessarily dependent.

Link to comment
Share on other sites

To be clear, no one on this thread wants, or is trying, to run an enterprise version of Emby, especially if it costs anything more than Premiere. That is not of interest to us, and it's never been our goal to convince you all to work on something for corporations to buy (which I can't imagine there is a market for).

 

9 hours ago, softworkz said:

Such a setup would require at least $1000-$3000 for the database licenses alone. Assuming that licenses for a professional edition of Emby (with a lot more pro features than just the database access) would be in a similar range - Who would still be interested?

This is incorrect. This is a pre-container orchestration mentality of what it takes for a DB to be HA. In K8s, Mariadb, Postgres, or cockroachdb offer free self healing, fault tolerant, highly available deployments that (with only three nodes) provide all the SQL you'd need and scale horizontally. This is the new norm.

It seems some people here are confusing an HA version of emby with architecting an application to be container friendly/optimized to be run as containers.

But for now, all this boils down to is that softworkz, Luke, et al are simply not interested in building a modern container friendly application in regards to emby, and that's their choice even if we consider it a mistake. It was built to be run on an OS. Yes there is a docker container, but that doesn't mean the container is anything more than a wrapper around their application.

It doesn't follow container friendly application design. Particularly SQLite is stopping Emby from being horizontally scalable. https://medium.com/faun/5-tips-for-building-container-friendly-applications-9920db4f3dc9

I've looked into Jellyfin again recently because of this conversation, and it's come a long way, I'm really hoping it succeeds. But even k8s @ home's Jellyfin helm chart simply provides a highly available SINGLE INSTANCE of Jellyfin with persistent config and media. That's great, and a step in the right direction. I suggest Emby pursue this at a minimum. 

The next obvious step is to separate Jellyfin (or Emby if they were interested at all) into individual responsibilities, starting with an external DB, and allow scaling of those micro-services independently (or automatically).

I think my personal goal is to have 3 instances of a media server with load balanced requests. And yes that means I'll also have pods managing a highly available DB, and maybe someday something like this will be involved https://github.com/joshuaboniface/rffmpeg. But,..

Just the discussion of allowing a configuration option to use an external DB has caused this much pushback and mention of it being "enterprise" with tons of costs. Now take that a step further and imagine them investing time in a proper micro-service architecture or a k8s deployment, and you'll realize, as I have, that the chances of that are about as good as Hillary winning in 2024.

 

edit: Clarifying

I feel like I wrote this poorly, but the point I'm trying to make here, is that Emby was not architected to be a horizontally scalable container based application. 

When we ask for the option of an external db, it's because THAT is the primary obstacle of us running Emby in that way.

There may be other issues, but they may be trivial. Things such as cron jobs may run too often, etc. We've already used iSCSI or NFS for the storage separation obviously. But we may be unaware of other issues with the architecture that would prevent multiple instances from running.

Also, it may constitute a separate project/effort entirely. But the only way to really know is to try, and I think that's all we have been looking to do. Thanks for all the consideration and it should go without saying that the only reason we're interested in this is because we already like emby so much.

Edited by rsvg
  • Like 1
Link to comment
Share on other sites

@rsvg - Please see my sentence above about the 99%. That is the problem here. If there would be some commercial funding that would make it much easier to go into that direction where to some extent also regular Premiere users would benefit from. I'm all for it, not against, just need to find a feasible way.

Edited by softworkz
Link to comment
Share on other sites

Thuzad

Although I would like to have high availability, I am not ready to pay this price for the few users I have. Although I understand that a high availability solution is expensive to develop. 

I think that open source solutions could do the job very well (Patroni, which is based on Postgres, keepalived, HAProxy, etc.). But I think there are some limitations that I don't know about.

  • Like 1
Link to comment
Share on other sites

kingy444
On 14/03/2021 at 02:12, softworkz said:

@rsvg - Please see my sentence above about the 99%. That is the problem here. If there would be some commercial funding that would make it much easier to go into that direction where to some extent also regular Premiere users would benefit from. I'm all for it, not against, just need to find a feasible way.

Definitely not trying to suggest this in a negative way, but a lot of features are minority use cases. This one has some potential to open the floodgates, so to speak, and there are a lot of people out there who simply don't realise it (end users, not devs).

I assume the current backend code is using a DB class to pump data into SQLite and not making SQLite calls directly, so extending to MySQL wouldn't be a stretch.

It's important to note that, as a few others have suggested, the lack of an external DB (MariaDB, MySQL, whatever you want to use) is the primary roadblock for a lot of use cases. Yes, with an external DB comes talk about high availability etc., but at its essence that is not what we are asking for here - we want the flexibility an external DB adds.
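The assumption in that paragraph, a storage class sitting between the application and the database, can be sketched in Python as below. This is purely illustrative: Emby's actual code is not Python, and the `ItemStore` interface, the `items` table, and the method names are invented for the example. Whether Emby is really structured this way is unknown.

```python
import sqlite3
from typing import Optional, Protocol


class ItemStore(Protocol):
    """Hypothetical storage interface the rest of the app would depend on."""

    def add_item(self, name: str) -> int: ...

    def get_name(self, item_id: int) -> Optional[str]: ...


class SqliteItemStore:
    """One backend for the interface; a MySQL one would expose the same methods."""

    def __init__(self, path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS items "
            "(id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
        )

    def add_item(self, name: str) -> int:
        cursor = self._conn.execute("INSERT INTO items (name) VALUES (?)", (name,))
        self._conn.commit()
        return cursor.lastrowid

    def get_name(self, item_id: int) -> Optional[str]:
        row = self._conn.execute(
            "SELECT name FROM items WHERE id = ?", (item_id,)
        ).fetchone()
        return row[0] if row else None


def store_and_fetch(store: ItemStore) -> Optional[str]:
    # Application code only sees the interface, never the SQL dialect.
    item_id = store.add_item("Some Movie")
    return store.get_name(item_id)
```

If the code base looks like this, adding a backend means writing one more class. If SQL strings and SQLite-specific behaviour are spread throughout the code, the effort is considerably larger, which is the part nobody outside the team can judge.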

For instance...

I currently run a 'backup server' that is mainly used when no transcoding is required (shared library), and another server for when it is. I have written some API commands to keep them in sync in terms of user data, libraries, etc., using the backup/restore functionality, as they both use the same library (UNRAID).

If there was external DB support, I could host the DB in an UNRAID Docker container and then would not need to worry about stupid scheduled tasks, changes to the API, etc. Just one example - there are plenty of more complex uses than mine out there.

Link to comment
Share on other sites

On 3/13/2021 at 8:12 AM, softworkz said:

@rsvg - Please see my sentence above about the 99%. That is the problem here. If there would be some commercial funding that would make it much easier to go into that direction where to some extent also regular Premiere users would benefit from. I'm all for it, not against, just need to find a feasible way.

So then propose a bounty for all and not a per-license deal. Ask me for a $1k license to make this happen and I'll tell you to walk into the ocean. Throw up a $1k bounty to make this happen for the whole community and I might just give you a big chunk of a stimulus check. You'll catch more flies w/sugar, but I'm getting a feeling analogies are not your strong point.

I don't think you guys have insight into what 99% of your users are doing, unless you are spying on everyone; I didn't see any option to opt out of any telemetry. How many features have been implemented already that 1% of people use? Webhooks? LDAP Auth? Legacy MediaBrowser XML Metadata? IPTV? STRM Files? I'd wager the average Emby user has no idea wtf any of that is, but you guys implemented it.

Clearly Emby has been catering to power users with its feature set: you guys made it easy to run behind an SSL-terminating load balancer, you give us fine-tuned control over our bandwidth and users, and you let us do creative things you didn't intend in many areas of the application. I simply don't get the resistance to an external database, especially since all of us know this is not a technically challenging request that would require massive man-hours to implement. This is not a feature that is going to be used by accident; people are not going to opt out of the bundled SQLite backend for an external database without intention and understanding.

  • Like 2
Link to comment
Share on other sites

kingy444
1 hour ago, nayr said:

So then propose a bounty for all and not a per-license deal. Ask me for a $1k license to make this happen and I'll tell you to walk into the ocean. Throw up a $1k bounty to make this happen for the whole community and I might just give you a big chunk of a stimulus check. You'll catch more flies w/sugar, but I'm getting a feeling analogies are not your strong point.

I don't think you guys have insight into what 99% of your users are doing, unless you are spying on everyone; I didn't see any option to opt out of any telemetry. How many features have been implemented already that 1% of people use? Webhooks? LDAP Auth? Legacy MediaBrowser XML Metadata? IPTV? STRM Files? I'd wager the average Emby user has no idea wtf any of that is, but you guys implemented it.

Clearly Emby has been catering to power users with its feature set: you guys made it easy to run behind an SSL-terminating load balancer, you give us fine-tuned control over our bandwidth and users, and you let us do creative things you didn't intend in many areas of the application. I simply don't get the resistance to an external database, especially since all of us know this is not a technically challenging request that would require massive man-hours to implement. This is not a feature that is going to be used by accident; people are not going to opt out of the bundled SQLite backend for an external database without intention and understanding.

Go one step further even - make it for advanced users only

 

Don't include any installer or anything for MySQL; let an ‘advanced user’ simply manage the DB themselves, so they just need to punch the host, user account, etc. into the setup wizard.
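As a rough idea of how small that "advanced users only" surface could be, here is a Python sketch of the settings such a wizard might collect. Every field name, the default database name, and the URL format are hypothetical; nothing here reflects an existing Emby option.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import quote


@dataclass(frozen=True)
class DatabaseSettings:
    """Hypothetical wizard fields; the bundled SQLite stays the default."""

    backend: str = "sqlite"
    host: Optional[str] = None
    port: int = 3306
    user: Optional[str] = None
    password: Optional[str] = None
    database: str = "emby"


def connection_url(settings: DatabaseSettings) -> str:
    """Turn the settings into a connection URL (format is illustrative)."""
    if settings.backend == "sqlite":
        return "sqlite:///library.db"
    if settings.backend == "mysql":
        if not (settings.host and settings.user):
            raise ValueError("mysql backend needs at least host and user")
        # Percent-encode credentials so characters like '@' do not break the URL.
        password = f":{quote(settings.password, safe='')}" if settings.password else ""
        user = quote(settings.user, safe="")
        return f"mysql://{user}{password}@{settings.host}:{settings.port}/{settings.database}"
    raise ValueError(f"unknown backend: {settings.backend}")
```

Someone who never touches the option gets `connection_url(DatabaseSettings())`, which is the bundled SQLite file. Only a deliberate choice of `backend="mysql"` with a host and user switches to the external server, and the user remains responsible for running it.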

Link to comment
Share on other sites
