Ever thought about (ML.Net) Machine Learning Media Recommendations?

March 18, 2019

I was browsing GitHub and I came across the repo for ML.Net.

What d'ya know, there is an entire machine learning sample page and project which is based specifically on Movie Recommendation.

I realize that the media recommendation aspect is written quite well in Emby already; however, adding a machine learning aspect to the server kind of puts the server in a whole new league.

I don't think any of the other media server platforms have jumped on the bandwagon.

This is the link to the project below.

https://github.com/dotnet/machinelearning-samples/tree/master/samples/csharp/getting-started/MatrixFactorization_MovieRecommendation

It is currently utilizing a MovieLens Dataset, but I'm pretty sure a more personalized Dataset based on user libraries would be neat, or at least unique.

That's cool, I think...

Edited March 18, 2019 by chef

March 19, 2019

Possibly, but given that we're dealing with personal media, the number of possible suggestions is not that large. It would be a nice buzzword to throw around though.

March 22, 2019

I have been doing quite a bit of research on how ML.Net could be used to better the experience of the Emby server.

I have started a proof of concept .NET Core console app which is kind of interesting.

I read through both the ML.Net GitHub and this MSDN page:

https://docs.microsoft.com/en-us/dotnet/machine-learning/tutorials/movie-recommmendation

I think it is possible to do something really cool with these libraries.

Basically, the machine learning algorithm starts by calculating whether a user should be recommended an item based on other users' acceptance of the same item.

          Incredibles 2 (2018)       The Avengers (2012)        Guardians of the Galaxy (2014)

User 1    Watched and liked movie    Watched and liked movie    Watched and liked movie

User 2    Watched and liked movie    Watched and liked movie    Has not watched -- RECOMMEND movie

Not super smart, but things can get really interesting when applying larger datasets for a deeper learning algorithm.
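The table above can be sketched as a toy "people who liked the same things" lookup. This is only a minimal Python illustration of the idea, not what ML.Net does internally; the `likes` dictionary, the `recommend` function and the `min_overlap` parameter are made-up names for the example.

```python
from collections import Counter

# Hypothetical toy data mirroring the table above: user -> set of liked titles.
likes = {
    "User 1": {"Incredibles 2 (2018)", "The Avengers (2012)", "Guardians of the Galaxy (2014)"},
    "User 2": {"Incredibles 2 (2018)", "The Avengers (2012)"},
}


def recommend(target, likes, min_overlap=2):
    """Suggest titles liked by users who share at least min_overlap likes with target."""
    seen = likes[target]
    votes = Counter()
    for user, liked in likes.items():
        if user == target:
            continue
        overlap = len(seen & liked)
        if overlap >= min_overlap:
            # Each unseen title gets a vote weighted by how similar the other user is.
            for title in liked - seen:
                votes[title] += overlap
    return [title for title, _ in votes.most_common()]
```

Calling `recommend("User 2", likes)` gives back the one title User 2 has not watched, while User 1 gets an empty list because there is nothing left to suggest.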

But in order to make something like that happen, there would need to be a control group of Emby users who would be interested in submitting some viewer information.

Specifically, running a console app which gathers anonymous like/favorite information and creates a CSV for the learning program ("cringy" I know).

Once that information was gathered, ML.Net would be able to work magic on the outcome and create a fairly deep AI to recommend movies.

These recommendations would be based on more than just a shallow calculation.

Actors, Genres, Titles, Years, and my favorite -> Media Overviews Sentiments.

Fairly Deep Learning.

It's pretty nifty stuff.

But, the submission of Likes/Favorites is where I understand people would be hesitant.

Edited March 22, 2019 by chef

March 22, 2019

Could you not use IMDB as your data source? They have an API that I believe would let you get data at the level you are talking about.

Oh, maybe not. To train a model you need individual ratings on a movie per user. It doesn't look like the IMDB APIs give you that detail. Too bad, it would have been a perfect source.

Edited March 22, 2019 by BillOatman

March 23, 2019

This just got more interesting.

I was able to use my TMDb Developer keys to create a really nice DataSet for the ML CSV.

The Survey app will find each user's favorite media, then get the TMDb ID for the item.

This way the Machine can learn the best way it knows how, by comparing numbers.

The output looks like this:

user, mediaId,
2,10681
0,49047
0,20526
0,196
0,300668
0,155
0,299536
0,127380
0,335984
0,353081
0,369972
0,330459
0,75612
0,209112
0,12
0,339403
0,363088
0,102899
0,13475
0,24428
0,284053
0,324849
0,118340
0,10138
0,474395
0,10681
2,10437
2,260514
2,862
2,10681

The above CSV shows three users, but "user 1" doesn't have any Favorite or Liked content. Only user 0 and user 2 do.

If there were a bunch of volunteers, then the Dataset could get big enough for the Learning Algorithm to actually predict things.
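For anyone who wants to poke at a file like this, here is a minimal Python sketch of reading it back into per-user sets. The `load_likes` name and the shortened `sample` (a few rows copied from the output above) are just for illustration; the survey app itself may write things differently.

```python
import csv
import io
from collections import defaultdict


def load_likes(csv_text):
    """Parse 'user,mediaId' rows into {user_index: set_of_tmdb_ids}."""
    likes = defaultdict(set)
    for row in csv.reader(io.StringIO(csv_text)):
        cells = [c.strip() for c in row if c.strip()]
        # Skip blank lines and the header row (anything that is not two integers).
        if len(cells) != 2 or not all(c.isdigit() for c in cells):
            continue
        user, media_id = int(cells[0]), int(cells[1])
        likes[user].add(media_id)
    return dict(likes)


# A shortened excerpt of the rows shown above.
sample = """user, mediaId,
2,10681
0,49047
0,10681
2,10437
2,10681
"""
```

Keeping a set per user means a repeated row, such as the two `2,10681` lines, only counts once, and a user with no likes simply never shows up in the result.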

Edited March 23, 2019 by chef

March 23, 2019

Well, if anyone wants to participate in this little survey:

Just run the attached exe, and post back the CSV file it creates in this thread.

ML Recommendation Survey.zip

It doesn't save any personal info, it just looks for items you liked or favorited, and then finds the TMdb code for the item.

It then places it in a CSV file.

Note: Make sure you have actually "Liked" items in your library or else it won't work. LOL

Edited March 23, 2019 by chef

March 23, 2019

Chef - Interesting stuff

Question - re exe above - how is it getting data, how do you tell it which server to use etc, how is it authenticating - tad light on details

Not keen on d/l and running an exe without some idea what it will do etc - for obvious reasons

March 23, 2019

Yes, I completely understand.

Using the Emby Api:

1. It does a UDP broadcast: "whoIsEmbyServer?" to locate the server on the network, and get the Connection IP.

2. The console app will ask you to log in. (Authenticating an admin user would probably be best, and someone with lots of views and "likes/favorites").
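Step 1 above could look roughly like the following Python sketch. Treat the details as assumptions: the port number is the one commonly cited for Emby discovery and should be checked against your own server, the message text is copied from the description above and the real wire format may differ, and `discover_server` is a made-up name.

```python
import socket

# Assumption: 7359 is the UDP port commonly cited for Emby server discovery;
# confirm it against your own server before relying on it.
DISCOVERY_PORT = 7359
# Message text as quoted in the post above; the exact wire format may differ.
DISCOVERY_MESSAGE = b"whoIsEmbyServer?"


def discover_server(address="255.255.255.255", port=DISCOVERY_PORT,
                    message=DISCOVERY_MESSAGE, timeout=2.0):
    """Broadcast a discovery message and return (reply_bytes, sender_ip), or None on timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # Allow sending to the broadcast address.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.settimeout(timeout)
        sock.sendto(message, (address, port))
        try:
            reply, (sender_ip, _sender_port) = sock.recvfrom(4096)
        except socket.timeout:
            return None
        return reply, sender_ip
```

The function returns after the first reply it receives, so on a network with several servers it would only report whichever one answers first.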

It will then scan the Emby Database for each user's items marked as "favorite" or "liked".

Nothing else is saved, everything will be completely anonymous.

Each user is assigned a number (0 to user.count, no names), and if the app sees that "user[0]" likes a particular Movie or Series, it goes online to TMdb and gets the TMdb.id for the item.

It then creates a CSV file like the one posted earlier in this thread, where the first integer is the user[int] and the second integer is the Tmdb.Id.

Afterward, that CSV file can be attached here, and I can create a master CSV file (Dataset) that we can feed to the Machine Learning algorithm.
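One wrinkle when building the master file: every submission numbers its users from 0, so "user 0" from one server would collide with "user 0" from another. A minimal Python sketch of one way to handle that, assuming each CSV has already been read into a list of `(user, tmdb_id)` pairs; `merge_submissions` is a hypothetical helper, not part of the survey app.

```python
def merge_submissions(submissions):
    """Combine per-server row lists into one dataset with globally unique user numbers.

    submissions: list of lists of (user_index, tmdb_id) tuples, one list per CSV.
    """
    master = []
    next_user = 0
    for rows in submissions:
        # Map this submission's local user numbers onto fresh global ones.
        remap = {}
        for user, tmdb_id in rows:
            if user not in remap:
                remap[user] = next_user
                next_user += 1
            master.append((remap[user], tmdb_id))
    return master
```

For example, merging `[(0, 10681), (2, 862)]` with `[(0, 155)]` yields users 0, 1 and 2 in the master list, so the two different "user 0" entries stay separate.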

The more information, the better a prediction will occur.

If we can create a massive Dataset, and show someone (like Luke, for instance) that a Machine Learning algorithm might predict and recommend media items better than some conditional statements, Emby could be the first "Smart" Media Server with an actual AI inside it.

I get giddy just thinking about that!

Edited March 23, 2019 by chef

March 23, 2019

Thanks

Ok - couple of follow up questions

1. I have four servers in my network. Will the exe allow me to choose which server to run this against? (I could turn three off but would prefer not to.)

2. Why go online for a TMDB id when it already exists in the db? I can understand going online if it's missing.

3. Curious - what's the min dataset size that's needed for the ML to work - obvs more the better - but wondering if a server with a few users could benefit from a "local" instance of ML rather than a central DB of "likes"

March 23, 2019

Ha got me thinking now

I have more than one Movie Library - "Movies" and "Movies 4K" - so likes for "Movies" have the possibility of being duplicated for a user so this is likely to produce double likes for some movies - is that a problem?

Edited March 23, 2019 by PenkethBoy

March 23, 2019

1. That's a good question. I'm not sure how the API handles the UDP broadcast when there is more than one server. I think it will return first or default.

2. I couldn't find the TMDB IDs in the Emby API. I realize that they are saved somewhere, but I wasn't sure which namespace they were located in.

I'll look again, because I agree that requesting them from TMDB is an added expense for the survey utility, and on my API key too.

3. I'm not exactly sure what a minimum dataset should be. There are a couple examples on GitHub that are pretty large. A couple thousand lines.

From what I gather reading the GitHub on ML, it will use a "regressive algorithm" to compare likes between a bunch of people. So even if a person were to submit a short list, it could still be used to add nodes in a neural network, which would help the regression better predict a recommendation.
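For reference, the sample linked at the top of the thread is named MatrixFactorization_MovieRecommendation. Below is a minimal pure-Python sketch of the general matrix factorization idea: learn a small vector per user and per item so that their dot product approximates the known ratings. It is not ML.Net's implementation; it assumes explicit ratings like the MovieLens data rather than plain likes, and the function names and hyperparameters are made up for illustration.

```python
import random


def factorize(ratings, n_factors=2, epochs=200, lr=0.05, reg=0.02, seed=0):
    """Learn user and item vectors whose dot product approximates each known rating.

    ratings: list of (user, item, rating) tuples.
    """
    rng = random.Random(seed)
    users = sorted({u for u, _, _ in ratings})
    items = sorted({i for _, i, _ in ratings})
    # Start from small random vectors.
    p = {u: [rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for u in users}
    q = {i: [rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for i in items}
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(a * b for a, b in zip(p[u], q[i]))
            for k in range(n_factors):
                pu, qi = p[u][k], q[i][k]
                # Stochastic gradient step with L2 regularization.
                p[u][k] += lr * (err * qi - reg * pu)
                q[i][k] += lr * (err * pu - reg * qi)
    return p, q


def predict(p, q, user, item):
    """Predicted rating for a user/item pair seen during training."""
    return sum(a * b for a, b in zip(p[user], q[item]))
```

With real data, the interesting predictions are for user/item pairs that were not in the training rows, which is where a bigger dataset starts to matter.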

Edited March 23, 2019 by chef

March 23, 2019

I don't keep movies around that others in my family won't watch. So I don't like or favorite anything on my own server.

But it is a cool idea, I'll go through the ones on my server now and like the ones I did like and get at least a little data to you

Could be expanded to TV Series as well once you get it dialed in.

Edited March 23, 2019 by BillOatman

March 23, 2019

I sent the file to you in a private message.

March 23, 2019

@chef

Sent you a DM

The TMDB IDs etc. are returned as "Provider IDs", and an API call can return them easily, but usually you have to ask for them with "Fields=ProviderIds", as most times they are not returned by default.
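A rough Python sketch of what such a request and its handling might look like. Only `Fields=ProviderIds` comes from the post above; the `/Users/{id}/Items` path, the other query parameters, the `api_key` parameter and the shape of the JSON response are assumptions to check against the Emby API docs, and both function names are hypothetical.

```python
from urllib.parse import urlencode


def favorites_url(server, user_id, api_key):
    """Build a request URL for one user's favorite movies, including provider IDs."""
    query = urlencode({
        "Recursive": "true",
        "IncludeItemTypes": "Movie",
        "Filters": "IsFavorite",
        "Fields": "ProviderIds",  # ask for the provider IDs explicitly
        "api_key": api_key,
    })
    return f"{server.rstrip('/')}/Users/{user_id}/Items?{query}"


def tmdb_ids(response):
    """Pull TMDb IDs out of a decoded JSON response, skipping items without one."""
    ids = []
    for item in response.get("Items", []):
        tmdb = item.get("ProviderIds", {}).get("Tmdb")
        if tmdb:
            ids.append(int(tmdb))
    return ids
```

Reading the IDs straight from the server this way would avoid the extra round trip to TMDb for every liked item.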

March 23, 2019

Thank you so much for participating in this.

Guess we'll see how big the dataset can get.

March 23, 2019

Final Question for now - with TV - the Series has a TVDB id but not the seasons or Episode - so could the data be broadened to include say season and episode numbers in its analysis?

March 23, 2019

Absolutely. In fact, that would give a larger number of outcomes.

March 24, 2019

This might have some data that could be used,

http://ai.stanford.edu/~amaas/data/sentiment/

March 25, 2019

This is great for sentiment analysis! Thanks, Bill! I think this will come in handy when it comes time to attempt a really in-depth neural network.

Once there's a bit more survey information, it can be broken down into genres, actors and overview sentiments... pretty much feed all the movie information into the network, and see what happens.

That's the coolest part, you don't even know what the computer will come up with on its own... :blink:

It's going to be pretty cool.

March 26, 2019

Hi Chef

This is cool stuff.

This may be a stupid question but how do I mark something as Liked? or Unliked for that matter. I can make something a Favorite but that is a whole other thing.

I think that we used to be able to do that

-vicpa

March 26, 2019

Hey Vicpa! Hooray, another volunteer!

Just go through your Emby library and mark the "heart" icon.

Whatever you mark as "liked/favorite" in the Emby library is what we use for the dataset.

Then run the survey app above and PM me the CSV!

Judging by the number of volunteers, I have a feeling it's going to take a while to put together a dataset. But I'm hopeful!

March 27, 2019

Well, the more I read about machine learning, the more I am convinced that this is a worthwhile project.

I think in order to make this work, a plug-in for Emby should be created to take the survey.

This would make people feel better about submitting results, rather than running an exe on their machine.

So I'll build a plug-in.

Edited March 27, 2019 by chef

March 28, 2019

In the meantime I have created the beginning of a dataset for my home automation. I wonder how long it would take for a neural network to learn my family's routine?

I wonder if a light will turn on one day when I walk into a room?

Or the TV will just turn on and log into Emby Theater on my Xbox when I sit down on the couch.

Well... I'm excited to find out! LOL

March 28, 2019

Lol - that would more than likely freak you out

"Morning Chef, what do you want to watch on Emby today?"

"Maybe something based on those weird favourite choices you keep feeding me?"

March 28, 2019

Hey chef,

Will the plugin allow an opt-in for every Emby user or will it simply collect data of all users?

Is there a way to like a movie with Emby for Kodi? @Angelblue05
