
Plugin to control results displayed in "More Like This"



nxenos83

Being able to recommend similar items based on an item or on past user behavior would greatly increase user satisfaction, and there are many ways to approach it. An interface that lets a plugin override the default behavior would make it possible to offload the work of creating this mapping to external services (like tmdb), or let devs experiment with AI-based, user-specific recommendation algorithms. It would also give system admins (and potentially end users) better control over the output: show/hide watched items, use a different method depending on the user/device/content type, create curated recommendations, etc.
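To make the idea concrete, here is a minimal sketch of what such an override point could look like. It is written in Python purely for illustration; `SimilarItemsProvider`, `DefaultProvider`, `CuratedProvider` and `resolve_similar` are hypothetical names, not anything that exists in the server today.

```python
from typing import Protocol, Sequence


class SimilarItemsProvider(Protocol):
    """Hypothetical plugin contract; not an existing server interface."""

    def get_similar(self, user_id: str, item_id: str, limit: int) -> Sequence[str]:
        ...


class DefaultProvider:
    """Stand-in for the server's built-in behavior (placeholder results)."""

    def get_similar(self, user_id, item_id, limit):
        return ["default-1", "default-2"][:limit]


class CuratedProvider:
    """Example plugin: returns hand-picked recommendations when it has any."""

    def __init__(self, curated):
        self.curated = curated  # maps item id -> list of related item ids

    def get_similar(self, user_id, item_id, limit):
        return self.curated.get(item_id, [])[:limit]


def resolve_similar(providers, fallback, user_id, item_id, limit=10):
    """Ask each plugin in order; use the default if none returns anything."""
    for provider in providers:
        result = list(provider.get_similar(user_id, item_id, limit))
        if result:
            return result
    return list(fallback.get_similar(user_id, item_id, limit))
```

With something like this, a plugin only answers for the items it knows about and everything else falls through to the default behavior.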

If there is already a method for a plugin to take control of this function,  can you provide some pointers on how to leverage it? 


Hi, there is no way to do this, but once we have smart views and playlists then users will be able to create these kinds of views on their own, and perhaps plugins will also be able to create them.


nxenos83

Thanks @Luke and @ebr. Excited to see what you all come out with in smart playlists. I was more interested in being able to fiddle with the underlying algorithm that pairs "similar" media content, and in predicting content that a user might enjoy based on previous viewing patterns. MS has some free-to-use .NET packages aimed at this type of work: https://dotnet.microsoft.com/apps/machinelearning-ai/ml-dotnet. User-specific modeling will be difficult since we don't have a user rating system in Emby, and even if we did, I doubt there are many Emby instances running with enough different users to build a model. But with some external data sources plus watch history I think we might have some success.

If I demonstrated a working ML model in .NET, would you consider either 1. merging it into the core code base, or 2. pulling the "similar" endpoint out of the core server and making it controllable by a plugin?

Or do you already have AI recommendations planned as a future Supporter-only feature :)


On 9/13/2021 at 9:44 PM, nxenos83 said:

Thanks @Luke and @ebr. Excited to see what you all come out with in smart playlists. I was more interested in being able to fiddle with the underlying algorithm that pairs "similar" media content, and in predicting content that a user might enjoy based on previous viewing patterns. MS has some free-to-use .NET packages aimed at this type of work: https://dotnet.microsoft.com/apps/machinelearning-ai/ml-dotnet. User-specific modeling will be difficult since we don't have a user rating system in Emby, and even if we did, I doubt there are many Emby instances running with enough different users to build a model. But with some external data sources plus watch history I think we might have some success.

If I demonstrated a working ML model in .NET, would you consider either 1. merging it into the core code base, or 2. pulling the "similar" endpoint out of the core server and making it controllable by a plugin?

Or do you already have AI recommendations planned as a future Supporter-only feature :)

What I would suggest doing is looking at the existing database tables. It would have to be something that can be handled at a low level in a database query, so that filtering, sorting, and any other supplied params keep working. So if you can stay within that and find some improvements, then yes, certainly we'd take a look at them.


nxenos83

Makes sense. To stay within that boundary, and to avoid introducing high-cost queries every time the similarTo parameter is passed, I'm thinking we would need a staging table. A scheduled task (after a library run?) can run the model and populate a staging table covering all items. The record count would end up being #Items * #Items * #Users. Just 4 columns (UserId, ItemId, RelatedItemId, SimilarityScore). The QueryBuilder then just needs to join to this table to get the similarity score; the rest of the query builder remains intact.
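A rough sketch of how the staging table and the join might look, using Python's `sqlite3` in place of the server's real data layer. The `Items` table, its columns, the sample rows and the `similar_to` helper are all made up for illustration; only the four staging columns come from the description above (with the score column spelled `SimilarityScore`).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-in for the server's item table.
    CREATE TABLE Items (Id INTEGER PRIMARY KEY, Name TEXT, IsMovie INTEGER);

    -- Staging table with the four proposed columns.
    CREATE TABLE SimilarityScores (
        UserId          INTEGER NOT NULL,
        ItemId          INTEGER NOT NULL,
        RelatedItemId   INTEGER NOT NULL,
        SimilarityScore REAL    NOT NULL,
        PRIMARY KEY (UserId, ItemId, RelatedItemId)
    );
""")

conn.executemany(
    "INSERT INTO Items VALUES (?, ?, ?)",
    [(1, "Movie A", 1), (2, "Movie B", 1), (3, "Movie C", 1), (4, "Album D", 0)],
)
# In the proposal a scheduled task would write these rows; hard-coded here.
conn.executemany(
    "INSERT INTO SimilarityScores VALUES (?, ?, ?, ?)",
    [(7, 1, 2, 0.91), (7, 1, 3, 0.64), (7, 1, 4, 0.05)],
)


def similar_to(user_id, item_id, limit=10):
    """Join the staging table into an otherwise ordinary item query."""
    return conn.execute(
        """
        SELECT i.Name, s.SimilarityScore
        FROM Items AS i
        JOIN SimilarityScores AS s ON s.RelatedItemId = i.Id
        WHERE s.UserId = ? AND s.ItemId = ?
          AND i.IsMovie = 1            -- example of an unrelated, existing filter
        ORDER BY s.SimilarityScore DESC
        LIMIT ?
        """,
        (user_id, item_id, limit),
    ).fetchall()


def staging_row_count(items, users):
    """Worst-case row count: #Items * #Items * #Users."""
    return items * items * users
```

The row count grows quickly: `staging_row_count(5000, 4)` is already 100,000,000 rows, so in practice the task would probably need to keep only the best few related items per item.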

Any issues with this approach?

For the first pass I will stick with just the local dataset already available in the user libraries. Enhancements over the current algorithm that I hope to accomplish:

  • Make some personalized tuning available to the admin user, allowing them to select which features to use and how much weight to give each one
  • Create Term Frequency-Inverse Document Frequency (TF-IDF) vectors from plot summaries and taglines
  • Incorporate recent user watch activity to personalize results
  • Determine similarity using cosine similarity (and perhaps other measures like Manhattan distance)
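To illustrate the text-based part of that list, here is a small pure-Python sketch of TF-IDF vectors, cosine similarity, Manhattan distance, and an admin-weighted combination of per-feature scores. This is only an assumption about how it could be done, not the planned implementation: the tokenizer is deliberately naive, the IDF uses one common smoothed variant, and all function names are hypothetical. Watch history is not covered here.

```python
import math
from collections import Counter


def tokenize(text):
    """Very naive tokenizer: lowercase, whitespace split, letters only."""
    return [w for w in text.lower().split() if w.isalpha()]


def tfidf_vectors(docs):
    """Return one sparse {term: weight} dict per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        total = len(tokens) or 1
        vectors.append({
            t: (c / total) * (math.log((1 + n) / (1 + df[t])) + 1)  # smoothed IDF
            for t, c in tf.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity of two sparse vectors (0.0 if either is empty)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def manhattan(a, b):
    """Manhattan (L1) distance; lower means more similar."""
    return sum(abs(a.get(t, 0.0) - b.get(t, 0.0)) for t in set(a) | set(b))


def weighted_score(feature_scores, weights):
    """Weighted average of per-feature similarities, using admin-chosen weights."""
    total = sum(weights.get(f, 0.0) for f in feature_scores)
    if total == 0:
        return 0.0
    return sum(s * weights.get(f, 0.0) for f, s in feature_scores.items()) / total
```

For example, `weighted_score({"plot": 0.8, "genre": 0.4}, {"plot": 3, "genre": 1})` weights plot similarity three times as heavily as genre similarity.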

 

 


No issues aside from possibly being heavy handed, e.g. suggestions won't react right away to metadata updates and a lot of processing will potentially be done on items that won't be seen.


horstepipe
On 9/25/2021 at 6:37 PM, nxenos83 said:

Makes sense. To stay within that boundary, and to avoid introducing high-cost queries every time the similarTo parameter is passed, I'm thinking we would need a staging table. A scheduled task (after a library run?) can run the model and populate a staging table covering all items. The record count would end up being #Items * #Items * #Users. Just 4 columns (UserId, ItemId, RelatedItemId, SimilarityScore). The QueryBuilder then just needs to join to this table to get the similarity score; the rest of the query builder remains intact.

Any issues with this approach?

For the first pass I will stick with just the local dataset already available in the user libraries. Enhancements over the current algorithm that I hope to accomplish:

  • Make some personalized tuning available to the admin user, allowing them to select which features to use and how much weight to give each one
  • Create Term Frequency-Inverse Document Frequency (TF-IDF) vectors from plot summaries and taglines
  • Incorporate recent user watch activity to personalize results
  • Determine similarity using cosine similarity (and perhaps other measures like Manhattan distance)

 

 

That would be really nice to have!

It would also be nice if common tags were taken into account.
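One simple way to score shared tags would be the Jaccard index (size of the intersection divided by size of the union). That choice is only an assumption for illustration, and `tag_overlap` is a hypothetical helper:

```python
def tag_overlap(tags_a, tags_b):
    """Jaccard index of two tag lists, ignoring case and surrounding spaces."""
    a = {t.strip().lower() for t in tags_a}
    b = {t.strip().lower() for t in tags_b}
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)
```

The result is between 0.0 (no tags in common) and 1.0 (identical tag sets), so it could be fed in as one more weighted feature.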
