chef 3746 Posted March 18, 2019 (edited)

I was browsing GitHub and came across the repo for ML.NET. What d'ya know, there is an entire machine learning sample page and project based specifically on movie recommendation.

I realize that media recommendation is written quite well in Emby already; however, adding a machine learning aspect to the server kind of puts the server in a whole new league. I don't think any of the other media server platforms have jumped on the bandwagon.

This is the link to the project:
https://github.com/dotnet/machinelearning-samples/tree/master/samples/csharp/getting-started/MatrixFactorization_MovieRecommendation

It is currently utilizing a MovieLens dataset, but I'm pretty sure a more personalized dataset built from user libraries would be neat, or at least unique.

That's cool, I think...
Luke 37096 Posted March 19, 2019

Possibly, but given that we're dealing with personal media, the number of possible suggestions is not that large. It would be a nice buzzword to throw around though.
chef 3746 Posted March 22, 2019 Author (edited)

I have been doing quite a bit of research on how ML.NET could be used to better the experience of the Emby server. I have started a proof-of-concept .NET Core console app, which is kind of interesting.

I read through both the ML.NET GitHub and this MSDN page:
https://docs.microsoft.com/en-us/dotnet/machine-learning/tutorials/movie-recommmendation

I think it is possible to do something really cool with these libraries. Basically, the machine learning algorithm starts by calculating whether a user should be recommended an item based on other users' acceptance of the same item.

       | Incredibles 2 (2018)    | The Avengers (2012)     | Guardians of the Galaxy (2014)
User 1 | Watched and liked movie | Watched and liked movie | Watched and liked movie
User 2 | Watched and liked movie | Watched and liked movie | Has not watched -- RECOMMEND movie

Not super smart, but things can get really interesting when applying larger datasets for a deeper learning algorithm.

But in order to make something like that happen, there would need to be a control group of Emby users who would be interested in submitting some viewer information. Specifically, they would run a console app which gathers anonymous like/favorite information and creates a CSV for the learning program ("cringy", I know).

Once that information was gathered, ML.NET would be able to work magic on the outcome and create a fairly deep AI to recommend movies. These recommendations would be based on more than just a shallow calculation: actors, genres, titles, years, and my favorite -> media overview sentiments. Fairly deep learning. It's pretty nifty stuff.

But the submission of likes/favorites is where I understand people would be hesitant.
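The table above can be read as a simple co-occurrence rule. Below is a minimal sketch of that rule in Python, assuming likes are stored as a set of titles per user. This is not the matrix factorization approach used in the linked ML.NET sample, only an illustration of the "other users liked it too" idea; the `likes` data and the `recommend` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample data mirroring the table above: user -> set of liked titles.
likes = {
    "User 1": {"Incredibles 2 (2018)", "The Avengers (2012)", "Guardians of the Galaxy (2014)"},
    "User 2": {"Incredibles 2 (2018)", "The Avengers (2012)"},
}

def recommend(likes, user):
    """Rank items the user has not liked by overlap with users who did like them."""
    mine = likes[user]
    scores = defaultdict(int)
    for other, theirs in likes.items():
        if other == user:
            continue
        overlap = len(mine & theirs)  # shared likes act as a similarity weight
        if overlap == 0:
            continue  # ignore users with nothing in common
        for item in theirs - mine:
            scores[item] += overlap
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda item: (-scores[item], item))

print(recommend(likes, "User 2"))  # ['Guardians of the Galaxy (2014)']
```

With only two users this is trivial, which is the point made in the post: the rule only becomes useful once many users' likes are available.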
BillOatman 501 Posted March 22, 2019 (edited)

Could you not use IMDB as your data source? They have an API that I believe would let you get data at the level you are talking about.

Oh, maybe not. To train a model you need individual ratings on a movie per user. It doesn't look like the IMDB APIs give you that detail. Too bad, it would have been a perfect source.
chef 3746 Posted March 23, 2019 Author (edited)

This just got more interesting. I was able to use my TMDb developer keys to create a really nice dataset for the ML CSV.

The survey app will find users' favorite media, then get the TMDb ID for each item. This way the machine can learn the best way it knows how: by comparing numbers.

The output looks like this:

user, mediaId,
2,10681
0,49047
0,20526
0,196
0,300668
0,155
0,299536
0,127380
0,335984
0,353081
0,369972
0,330459
0,75612
0,209112
0,12
0,339403
0,363088
0,102899
0,13475
0,24428
0,284053
0,324849
0,118340
0,10138
0,474395
0,10681
2,10437
2,260514
2,862
2,10681

The above CSV covers three users, but "user 1" doesn't have any Favorite or Liked content, so only user 0 and user 2 appear.

If there were a bunch of volunteers, then the dataset could get big enough for the learning algorithm to actually predict things.
chef 3746 Posted March 23, 2019 Author (edited)

Well, if anyone wants to participate in this little survey, just run the attached exe and post back the CSV file it creates in this thread.

ML Recommendation Survey.zip

It doesn't save any personal info; it just looks for items you liked or favorited, and then finds the TMDb code for each item. It then places it in a CSV file.

Note: Make sure you have actually "Liked" items in your library or else it won't work. LOL
PenkethBoy 2063 Posted March 23, 2019

Chef - interesting stuff.

Question re the exe above: how is it getting data, how do you tell it which server to use, how is it authenticating? Tad light on details.

Not keen on downloading and running an exe without some idea of what it will do, for obvious reasons.
chef 3746 Posted March 23, 2019 Author (edited)

Yes, I completely understand.

Using the Emby API:

1. It does a UDP broadcast, "whoIsEmbyServer?", to locate the server on the network and get the connection IP.
2. The console app will ask you to log in. (Authenticating as an admin user would probably be best, ideally on a server with lots of views and "likes/favorites".)

It will then scan the Emby database for each user's items marked as "favorite" or "liked". Nothing else is saved; everything will be completely anonymous.

Each user is assigned a number (0 to user.count, no names), and if the app sees that "user[0]" likes a particular movie or series, it goes online to TMDb and gets the TMDb ID for the item.

It then creates a CSV file which looks like this:

2,10681
0,49047
0,20526
0,196
0,300668
0,155
0,299536
0,127380
0,335984
0,353081
0,369972
0,330459
0,75612
0,209112
0,12
0,339403
0,363088
0,102899
0,13475
0,24428
0,284053
0,324849
0,118340
0,10138
0,474395
0,10681
2,10437
2,260514
2,862
2,10681

The first integer is the user number (user[int]), and the second integer is the TMDb ID.

Afterward, that CSV file can be attached here, and I can create a master CSV file (dataset) that we can feed to the machine learning algorithm. The more information there is, the better the predictions will be.

If we can create a massive dataset and show someone (like Luke, for instance) that a machine learning algorithm might predict and recommend media items better than some conditional statements, Emby could be the first "smart" media server with an actual AI inside it.

I get giddy just thinking about that!
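One detail of the master CSV step is that every submission numbers its users from 0, so the files cannot simply be concatenated. Below is a rough Python sketch of one way to merge them, assuming each submission arrives as CSV text in the `user,tmdbId` layout shown above. The `merge_submissions` helper and the two sample submissions are hypothetical, not part of the survey app.

```python
import csv
import io

def merge_submissions(submissions):
    """Merge per-server CSV texts into one list of (global_user, tmdb_id) rows.

    Each submission numbers its users from 0, so users are renumbered to keep
    people from different servers distinct.
    """
    merged = []
    next_user = 0
    for text in submissions:
        local_to_global = {}  # per-submission mapping: local user number -> global number
        for row in csv.reader(io.StringIO(text)):
            if len(row) < 2 or not row[0].strip().isdigit():
                continue  # skip blank lines and a header such as "user, mediaId,"
            local_user, tmdb_id = int(row[0]), int(row[1])
            if local_user not in local_to_global:
                local_to_global[local_user] = next_user
                next_user += 1
            merged.append((local_to_global[local_user], tmdb_id))
    return merged

# Hypothetical submissions from two servers (IDs borrowed from the sample above).
server_a = "user, mediaId,\n0,49047\n0,196\n2,10437\n"
server_b = "0,155\n1,862\n"
print(merge_submissions([server_a, server_b]))
# [(0, 49047), (0, 196), (1, 10437), (2, 155), (3, 862)]
```

Renumbering in order of first appearance also closes gaps such as the missing "user 1" in the sample, which keeps the user index compact.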
PenkethBoy 2063 Posted March 23, 2019

Thanks. OK, a couple of follow-up questions:

1. I have four servers in my network. Will the exe allow me to choose which server to run this against? (I could turn three off but would prefer not to.)
2. Why go online for a TMDB ID when it already exists in the DB? I can understand going online if it's missing.
3. Curious: what's the minimum dataset size that's needed for the ML to work? Obviously the more the better, but I'm wondering if a server with a few users could benefit from a "local" instance of ML rather than a central DB of "likes".
PenkethBoy 2063 Posted March 23, 2019 (edited)

Ha, got me thinking now.

I have more than one movie library, "Movies" and "Movies 4K", so a user's likes in "Movies" can be duplicated in the other library. This is likely to produce double likes for some movies. Is that a problem?
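Whether double likes actually hurt depends on the training algorithm, which the thread leaves open. One way to sidestep the question is to drop repeated (user, TMDb ID) pairs before training. A minimal Python sketch, assuming rows are already parsed into integer pairs; the `dedupe_likes` helper is hypothetical.

```python
def dedupe_likes(rows):
    """Drop repeated (user, tmdb_id) pairs while keeping first-seen order."""
    seen = set()
    unique = []
    for pair in rows:
        if pair not in seen:
            seen.add(pair)
            unique.append(pair)
    return unique

# Hypothetical rows: user 2 liked TMDb item 10681 in two libraries.
rows = [(2, 10681), (2, 10437), (2, 10681)]
print(dedupe_likes(rows))  # [(2, 10681), (2, 10437)]
```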
chef 3746 Posted March 23, 2019 Author (edited)

1. That's a good question. I'm not sure how the API handles the UDP broadcast when there is more than one server. I think it will return first or default.

2. I couldn't find the TMDB IDs in the Emby API. I realize that they are saved somewhere, but I wasn't sure which namespace they were located in. I'll look again, because I agree that requesting them from TMDB is an added expense for the survey utility, and on my API key too.

3. I'm not exactly sure what a minimum dataset should be. There are a couple of examples on GitHub that are pretty large: a couple thousand lines.

From what I gather reading the GitHub on ML, it will use a "regressive algorithm" to compare likes between a bunch of people. So even if a person were to submit a short list, it could still be used to add nodes in a neural network, which would help the regression better predict a recommendation.
BillOatman 501 Posted March 23, 2019 (edited)

I don't keep movies around that others in my family won't watch, so I don't like or favorite anything on my own server. But it is a cool idea. I'll go through the ones on my server now, like the ones I did like, and get at least a little data to you.

It could be expanded to TV series as well once you get it dialed in.
BillOatman 501 Posted March 23, 2019

I sent the file to you in a private message.
PenkethBoy 2063 Posted March 23, 2019

@chef Sent you a DM.

The TMDB IDs etc. are returned as "Provider IDs", and an API call can return them easily, but you usually have to request them explicitly with "Fields=ProviderIds", as most of the time they are not returned by default.
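As a rough illustration of the "Fields=ProviderIds" tip, the Python sketch below builds an items query and pulls TMDb IDs out of a response. Only "Fields=ProviderIds" comes from the post above. The `/emby/Users/{id}/Items` path, the other query parameters, the `api_key` parameter, and the `Items`/`ProviderIds`/`Tmdb` response shape are assumptions about the Emby API and should be checked against its documentation. The helper names and the sample response are hypothetical.

```python
from urllib.parse import urlencode

def favorites_url(server, user_id, api_key):
    """Build an items query for a user's favorites that also requests provider IDs.

    The path and every parameter except Fields=ProviderIds are assumptions.
    """
    query = urlencode({
        "Recursive": "true",
        "Filters": "IsFavorite",
        "IncludeItemTypes": "Movie,Series",
        "Fields": "ProviderIds",
        "api_key": api_key,
    })
    return f"{server}/emby/Users/{user_id}/Items?{query}"

def tmdb_ids(response):
    """Collect TMDb IDs from a decoded JSON response (assumed shape)."""
    ids = []
    for item in response.get("Items", []):
        for key, value in item.get("ProviderIds", {}).items():
            # Compare case-insensitively, since the exact key casing is an assumption.
            if key.lower() == "tmdb" and value:
                ids.append(int(value))
    return ids

# Hypothetical response: one item with a TMDb ID, one without.
sample = {
    "Items": [
        {"Name": "Item A", "ProviderIds": {"Tmdb": "10681", "Imdb": "tt0000000"}},
        {"Name": "Item B", "ProviderIds": {}},
    ]
}
print(favorites_url("http://localhost:8096", "USER_ID", "API_KEY"))
print(tmdb_ids(sample))  # [10681]
```

Reading the IDs from the server this way would avoid the extra TMDb lookups raised earlier in the thread, with only items that lack a TMDb ID needing an online request.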
chef 3746 Posted March 23, 2019 Author

Thank you so much for participating in this. Guess we'll see how big the dataset can get.
PenkethBoy 2063 Posted March 23, 2019

Final question for now, about TV: the series has a TVDB ID but the seasons and episodes do not. Could the data be broadened to include, say, season and episode numbers in its analysis?
chef 3746 Posted March 23, 2019 Author

Absolutely. In fact, that would give a larger number of outcomes.
BillOatman 501 Posted March 24, 2019

This might have some data that could be used: http://ai.stanford.edu/~amaas/data/sentiment/
chef 3746 Posted March 25, 2019 Author

This is great for sentiment analysis! Thanks, Bill!

I think this will come in handy when it comes time to attempt a really in-depth neural network. Once there's a bit more survey information, it can be broken down into genres, actors, and overview sentiments... pretty much feed all the movie information into the network and see what happens.

That's the coolest part: you don't even know what the computer will come up with on its own... It's going to be pretty cool.
Vicpa 559 Posted March 26, 2019

Hi Chef,

This is cool stuff. This may be a stupid question, but how do I mark something as Liked? Or Unliked, for that matter? I can make something a Favorite, but that is a whole other thing. I think that we used to be able to do that.

-vicpa
chef 3746 Posted March 26, 2019 Author

Hey Vicpa! Hooray, another volunteer!

Just go through your Emby library and mark the "heart" icon. Whatever you mark as "liked/favorite" in the Emby library is what we use for the dataset.

Then run the survey app above and PM me the CSV!

Judging by the number of volunteers, I have a feeling it's going to take a while to put together a dataset. But I'm hopeful!
chef 3746 Posted March 27, 2019 Author (edited)

Well, the more I read about machine learning, the more I am convinced that this is a worthwhile project.

I think in order to make this work, a plug-in for Emby should be created to take the survey. This would make people feel better about submitting results than running an exe on their machine.

So I'll build a plug-in.
chef 3746 Posted March 28, 2019 Author

In the meantime I have created the beginning of a dataset for my home automation.

I wonder how long it would take for a neural network to learn my family's routine? I wonder if a light will turn on one day when I walk into a room? Or the TV will just turn on and log into Emby Theater on my Xbox when I sit down on the couch.

Well... I'm excited to find out! LOL
PenkethBoy 2063 Posted March 28, 2019

Lol - that would more than likely freak you out.

"Morning Chef, what do you want to watch on Emby today?"
"Maybe something based on those weird favourite choices you keep feeding me?"
horstepipe 356 Posted March 28, 2019

Hey chef,

Will the plugin allow an opt-in for every Emby user, or will it simply collect data of all users?

Is there a way to like a movie with Emby for Kodi? @Angelblue05