Ever thought about (ML.Net) Machine Learning Media Recommendations?


chef

I was browsing GitHub and I came across the repo for ML.Net.

 

What d'ya know, there is an entire machine learning sample page and project which is based specifically on Movie Recommendation.

 

I realize that media recommendation is written quite well in Emby already; however, adding a machine learning aspect to the server kind of puts the server in a whole new league.

 

I don't think any of the other media server platforms have jumped on the bandwagon.

 

This is the link to the project below.

 

https://github.com/dotnet/machinelearning-samples/tree/master/samples/csharp/getting-started/MatrixFactorization_MovieRecommendation

 

It currently uses a MovieLens dataset, but I'm pretty sure a more personalized dataset based on user libraries would be neat, or at least unique.

 

That's cool, I think...

Edited by chef

Possibly, but given that we're dealing with personal media, the number of possible suggestions is not that large. It would be a nice buzzword to throw around though.


chef

I have been doing quite a bunch of research on how ML.Net could be used to better the experience of the Emby server.

 

I have started a proof of concept .NET Core console app, which is kind of interesting.

 

I read through both the ML.Net GitHub and this MSDN page:

 

https://docs.microsoft.com/en-us/dotnet/machine-learning/tutorials/movie-recommmendation

 

 

I think it is possible to do something really cool with these libraries.

 

Basically, the machine learning algorithm starts by calculating whether a user should be recommended an item based on other users' acceptance of the same item.

 

 

 

          Incredibles 2 (2018)       The Avengers (2012)        Guardians of the Galaxy (2014)
User 1    Watched and liked movie    Watched and liked movie    Watched and liked movie
User 2    Watched and liked movie    Watched and liked movie    Has not watched -- RECOMMEND movie
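
To make the idea in the table concrete, here is a minimal Python sketch of that kind of "people who liked the same things" calculation. It is only an illustration of the overlap idea, not the matrix factorization trainer that the ML.Net sample uses, and the `likes` dictionary and `recommend` function are hypothetical names made up for this example.

```python
from collections import Counter

# Toy "liked" sets mirroring the table above (illustrative data only).
likes = {
    "User 1": {
        "Incredibles 2 (2018)",
        "The Avengers (2012)",
        "Guardians of the Galaxy (2014)",
    },
    "User 2": {
        "Incredibles 2 (2018)",
        "The Avengers (2012)",
    },
}


def recommend(target, likes):
    """Rank items the target has not liked yet.

    Each candidate item is scored by how many likes its owner shares
    with the target user, so items from similar users rank higher.
    """
    seen = likes[target]
    scores = Counter()
    for user, items in likes.items():
        if user == target:
            continue
        overlap = len(seen & items)
        if overlap == 0:
            continue  # no shared taste, ignore this user
        for item in items - seen:
            scores[item] += overlap
    return [item for item, _ in scores.most_common()]


print(recommend("User 2", likes))
```

With only two users this is trivial, but the same scoring works unchanged on a larger dictionary of users.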

 

 

Not super smart, but things can get really interesting when applying larger datasets for a deeper learning algorithm.

 

But in order to make something like that happen, there would need to be a group of Emby users who would be interested in submitting some viewing information.

 

Specifically, running a console app which gathers anonymous like/favorite information and creates a CSV for the learning program ("cringy", I know).

 

 

Once that information was gathered, ML.Net would be able to work magic on the outcome and create a fairly deep AI to recommend movies.

 

These recommendations would be based on more than just a shallow calculation.

 

Actors, Genres, Titles, Years, and my favorite  -> Media Overviews Sentiments.

 

Fairly Deep Learning.

 

It's pretty nifty stuff.

 

But, the submission of Likes/Favorites is where I understand people would be hesitant.

Edited by chef

BillOatman

Could you not use IMDB as your data source?  They have an API that I believe would let you get data at the level you are talking about.

Oh maybe not.  To train a model you are needing individual ratings on a movie per user.  It doesn't look like IMDB APIs give you that detail.  Too bad, would have been a perfect source :)

Edited by BillOatman

chef

This just got more interesting.

 

I was able to use my TMDb developer keys to create a really nice dataset for the ML CSV.

 

The survey app will find each user's favorite media, then get the TMDb ID for each item.

 

This way the Machine can learn the best way it knows how, by comparing numbers.

 

The output looks like this:

user,mediaId
2,10681
0,49047
0,20526
0,196
0,300668
0,155
0,299536
0,127380
0,335984
0,353081
0,369972
0,330459
0,75612
0,209112
0,12
0,339403
0,363088
0,102899
0,13475
0,24428
0,284053
0,324849
0,118340
0,10138
0,474395
0,10681
2,10437
2,260514
2,862
2,10681

The above CSV shows three users, but "user 1" doesn't have any Favorite or Liked content, so only user 0 and user 2 appear.
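
As a rough illustration of how a file in this format could be read back, here is a small Python sketch that groups the rows into one set of media IDs per user. The `load_likes` function and the shortened `SAMPLE` text are hypothetical and only assume the two-column layout shown above; this is not the survey app's own code. Using a set also collapses repeated rows, such as the 2,10681 pair that appears twice in the output.

```python
import csv
import io
from collections import defaultdict

# A shortened copy of the survey output, used only for illustration.
SAMPLE = """user,mediaId
2,10681
0,49047
0,20526
0,10681
2,10437
2,10681
"""


def load_likes(text):
    """Parse 'user,mediaId' rows into {user: set of media IDs}.

    Header rows, blank lines and malformed rows are skipped.
    """
    likes = defaultdict(set)
    for row in csv.reader(io.StringIO(text)):
        cells = [cell.strip() for cell in row if cell.strip()]
        if len(cells) != 2 or not all(cell.isdigit() for cell in cells):
            continue
        user, media_id = int(cells[0]), int(cells[1])
        likes[user].add(media_id)
    return dict(likes)


for user, ids in sorted(load_likes(SAMPLE).items()):
    print(user, sorted(ids))
```

A user with no likes, like "user 1" here, simply never shows up as a key.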

 

If there were a bunch of volunteers, then the dataset could get big enough for the learning algorithm to actually predict things.

Edited by chef

chef

Well, if anyone wants to participate in this little survey:

 

Just run the attached exe, and post back the CSV file it creates in this thread.

 

ML Recommendation Survey.zip

 

It doesn't save any personal info; it just looks for items you liked or favorited, and then finds the TMDb ID for each item.

 

It then places it in a CSV file.

 

Note: Make sure you have actually "Liked" items in your library or else it won't work. LOL

Edited by chef

PenkethBoy

Chef - Interesting stuff :)

 

Question - re exe above - how is it getting data, how do you tell it which server to use etc, how is it authenticating - tad light on details  :)

 

Not keen on d/l and running an exe without some idea what it will do etc - for obvious reasons  :P


chef

Yes, I completely understand.

 

 

Using the Emby Api:

 

1. It does a UDP broadcast: "whoIsEmbyServer?" to locate the server on the network, and get the Connection IP.

 

2. The console app will ask you to log in. (Authenticating an admin user would probably be best, and someone with lots of views and "likes/favorites").

 

It will then scan the Emby database for each user's items marked as "favorite" or "liked".

 

Nothing else is saved, everything will be completely anonymous.

 

 

Each user is assigned a number (0 to user.count, no names), and if the app sees that "user[0]" likes a particular Movie or Series, it goes online to TMdb and gets the TMdb.id for the item.

 

It then creates a CSV file which looks like this:

2,10681
0,49047
0,20526
0,196
0,300668
0,155
0,299536
0,127380
0,335984
0,353081
0,369972
0,330459
0,75612
0,209112
0,12
0,339403
0,363088
0,102899
0,13475
0,24428
0,284053
0,324849
0,118340
0,10138
0,474395
0,10681
2,10437
2,260514
2,862
2,10681

Where the first integer is the user[int], and the second integer is the Tmdb.Id.

 

Afterward, that CSV file can be attached here, and I can create a master CSV file (Dataset) that we can feed to the Machine Learning algorithm.
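
One detail to watch when building the master file: since every submission numbers its users starting from 0, simply concatenating the files would merge unrelated people into the same "user 0". Here is a minimal Python sketch of one way to combine them, assuming each submission has already been read into a list of (user, mediaId) pairs. The `merge_submissions` function is a hypothetical helper, not part of the survey app.

```python
def merge_submissions(submissions):
    """Combine several survey files into one master list of rows.

    `submissions` is a list with one entry per CSV file, each entry being
    a list of (user, media_id) pairs. Users are renumbered so that user 0
    from one file never collides with user 0 from another, and duplicate
    rows are dropped.
    """
    master = set()
    next_id = 0
    for rows in submissions:
        remap = {}  # local user number -> global user number
        for user, media_id in rows:
            if user not in remap:
                remap[user] = next_id
                next_id += 1
            master.add((remap[user], media_id))
    return sorted(master)


server_a = [(0, 155), (2, 862), (0, 155)]
server_b = [(0, 155), (1, 12)]
for row in merge_submissions([server_a, server_b]):
    print(*row, sep=",")
```

The renumbering keeps everything anonymous while still letting the algorithm tell the households apart.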

 

The more information, the better the predictions will be.

 

If we can create a massive dataset, and show someone (like Luke, for instance) that a machine learning algorithm might predict and recommend media items better than some conditional statements, Emby could be the first "Smart" Media Server with an actual AI inside it.

 

I get giddy just thinking about that!  

Edited by chef

PenkethBoy

Thanks

 

Ok - couple of follow up questions :)

 

1. I have four servers on my network. Will the exe allow me to choose which server to run this against? (I could turn three off but would prefer not to.)

 

2. Why go online for a TMDB id when it already exists in the db? I can understand going online if it's missing.

 

3. Curious - what's the min dataset size that's needed for the ML to work? Obvs more is better, but I'm wondering if a server with a few users could benefit from a "local" instance of ML rather than a central DB of "likes".


PenkethBoy

Ha got me thinking now  :)

 

I have more than one movie library ("Movies" and "Movies 4K"), so a user's like for a movie can be duplicated across them, and this is likely to produce double likes for some movies. Is that a problem?

Edited by PenkethBoy

chef

1. That's a good question. I'm not sure how the API handles the UDP broadcast when there is more than one server. I think it will return first or default.

 

2. I couldn't find the TMDB IDs in the Emby API. I realize that they are saved somewhere, but I wasn't sure which namespace they were located in.

 

I'll look again because I agree requesting that from TMDB is an added expense for the survey utility and on my API key too :).

 

3. I'm not exactly sure what a minimum dataset should be. There are a couple of examples on GitHub that are pretty large: a couple thousand lines.

 

From what I gather reading the GitHub on ML, it will use a "regressive algorithm" to compare likes between a bunch of people. So even if a person were to submit a short list, it could still be used to add nodes in a neural network, which would help the regression better predict a recommendation.

Edited by chef

BillOatman

I don't keep movies around that others in my family won't watch.  So I don't like or favorite anything on my own server.

But it is a cool idea, I'll go through the ones on my server now and like the ones I did like and get at least a little data to you :)

 

Could be expanded to TV Series as well once you get it dialed in.

Edited by BillOatman

PenkethBoy

@chef

 

Sent you a DM

 

The TMDB IDs etc. are returned as "Provider IDs", and an API call can return them easily, but you usually have to request them with "Fields=ProviderIds", as most of the time they are not returned by default.
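
For anyone following along, here is a rough Python sketch of what such a request and its parsing might look like. Only the "Fields=ProviderIds" parameter comes from this thread; the `/Users/{id}/Items` path, the other query parameters, the `Items` / `ProviderIds` / `Tmdb` response keys and both function names are assumptions for illustration, so check them against the Emby API docs before relying on them. The code only builds the URL and parses a sample response; it does not contact a server.

```python
from urllib.parse import urlencode


def favorites_url(server, user_id):
    """Build a query URL for a user's favorite items with provider IDs.

    The endpoint path and every parameter except Fields=ProviderIds
    are assumptions, not confirmed in this thread.
    """
    query = urlencode({
        "Recursive": "true",
        "Filters": "IsFavorite",
        "IncludeItemTypes": "Movie,Series",
        "Fields": "ProviderIds",
    })
    return f"{server.rstrip('/')}/Users/{user_id}/Items?{query}"


def tmdb_ids(response):
    """Pull numeric TMDB IDs out of a parsed JSON response (assumed shape)."""
    ids = []
    for item in response.get("Items", []):
        providers = item.get("ProviderIds") or {}
        for key, value in providers.items():
            # Key casing is an assumption, so compare case-insensitively.
            if key.lower() == "tmdb" and str(value).isdigit():
                ids.append(int(value))
    return ids


sample = {
    "Items": [
        {"Name": "A", "ProviderIds": {"Tmdb": "155", "Imdb": "tt0468569"}},
        {"Name": "B"},  # no provider IDs returned for this item
    ]
}
print(favorites_url("http://localhost:8096/", "abc"))
print(tmdb_ids(sample))
```

Reading the IDs this way would avoid the extra round trip to TMDB for every liked item.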


chef

Thank you so much for participating in this.

 

Guess we'll see how big the dataset can get.


PenkethBoy

Final Question for now - with TV - the Series has a TVDB id but not the seasons or Episode - so could the data be broadened to include say season and episode numbers in its analysis?


chef

Absolutely. In fact, that would give a larger number of outcomes.


chef

This might have some data that could be used,  

http://ai.stanford.edu/~amaas/data/sentiment/

 

This is great for sentiment analysis! Thanks, Bill! I think this will come in handy when it comes time to attempt a really in-depth neural network.

 

Once there's a bit more survey information, it can be broken down into,  genres, actors and overview sentiments... pretty much feed all the movie information into the network, and see what happens. 

 

That's the coolest part, you don't even know what the computer will come up with on its own...  :blink:

 

It's going to be pretty cool.


Vicpa

Hi  Chef

 

This is cool stuff.  :)

 

This may be a stupid question but how do I mark something as Liked? or Unliked for that matter. I can make something a Favorite but that is a whole other thing. 

 

I think that we used to be able to do that

 

-vicpa


chef

Hey Vicpa! Hooray, another volunteer!

 

Just go through your Emby library and click the "heart" icon.

 

Whatever you mark as "liked/favorite" in the Emby library is what we use for the dataset.

 

Then run the survey app above and PM me the CSV!

 

Judging by the number of volunteers, I have a feeling it's going to take a while to put together a dataset. But I'm hopeful!


chef

Well, the more I read about machine learning, the more I am convinced that this is a worthwhile project.

 

I think in order to make this work, a plug-in for Emby should be created to take the survey.

 

This would make people feel better about submitting results, since they would not be running an exe on their machine.

 

So I'll build a plug-in.

Edited by chef

chef

In the meantime, I have created the beginning of a dataset for my home automation. I wonder how long it would take for a neural network to learn my family's routine?

 

I wonder if a light will turn on one day when I walk into a room?

 

Or the TV will just turn on and log into Emby Theater on my Xbox when I sit down on the couch.

 

Well... I'm excited to find out! LOL


PenkethBoy

Lol - that would more than likely freak you out

 

"Morning Chef, what do you want to watch on Emby today?"

 

"Maybe something based on those weird favourite choices you keep feeding me?"

 

:)


horstepipe

Hey chef,

 

Will the plugin allow an opt-in for every Emby user or will it simply collect data of all users?

 

 

Is there a way to like a movie with Emby for Kodi? @Angelblue05
