Offline IMDB Data Parser


So I've looked around a bit, and it is quite clear that the Emby project does not condone scraping websites to acquire data, in keeping with those websites' terms of use.


However, I was wondering whether the non-commercial IMDb data dump that they make available for download via FTP would be a viable source of metadata. It is free for non-commercial personal use as long as you don't put it onto the WWW as a competitor to IMDb itself.


The data dump is rather large, especially when decompressed, but it should be manageable, especially if you parse it once and put it into a proper database rather than plain text files (which you may do for non-commercial personal use).
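
To illustrate what I mean by parsing once into a database, here is a minimal sketch in Python. It assumes a hypothetical tab-separated extract with one title and one year per line; the real dump files have their own layout, which this does not try to reproduce. The table name, column names, and both function names are made up for the example.

```python
import sqlite3


def load_titles(lines, db_path=":memory:"):
    """Parse 'title<TAB>year' lines once and store them in SQLite.

    The input layout is an assumption for illustration only; it is not
    the actual format of the IMDb dump files.
    """
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS titles (title TEXT NOT NULL, year INTEGER)"
    )
    rows = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        parts = line.split("\t")
        title = parts[0].strip()
        year_text = parts[-1].strip() if len(parts) > 1 else ""
        # Store NULL when the year is missing or not a plain number.
        year = int(year_text) if year_text.isdigit() else None
        rows.append((title, year))
    conn.executemany("INSERT INTO titles (title, year) VALUES (?, ?)", rows)
    # Index the title column so later lookups do not scan the whole table.
    conn.execute("CREATE INDEX IF NOT EXISTS idx_titles_title ON titles (title)")
    conn.commit()
    return conn


def find_year(conn, title):
    """Return the stored year for an exact title match, or None."""
    row = conn.execute(
        "SELECT year FROM titles WHERE title = ?", (title,)
    ).fetchone()
    return row[0] if row else None
```

Passing a file path as `db_path` instead of the in-memory default would keep the parsed data on disk, so the slow parsing step only has to happen again when a new dump is downloaded.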


Is there a specific argument against using this data, or has no one bothered so far to write a plugin that leverages it? If it is the latter, would such a plugin be appreciated?

Writing a plug-in for our system that used this data would be the same thing as creating a competitor to IMDb and would not fall under non-commercial personal use.


I had a line of communication with IMDb a while back, and they said they hoped to provide some less-expensive options for products like ours, but they haven't done it yet. I'll ping my contact again and check on the status.


But, I'm curious as to what data you feel is missing from our capabilities right now that this would fill in?

Why would that be the same thing? I think there is a big difference between providing a central source that everyone accesses and having a sanctioned local copy that is accessed by a plugin.


I guess this is a matter of opinion, as is often the case, but to me the personal use clause would cover something like this, especially seeing as they already provide software to set up a local mediadb instance.


What about this would fall outside non-commercial personal use? It's not like you would host the data itself online for everyone; rather, every user would have to download the data themselves and set it up personally.


As to why I'm asking about this: I personally don't find the capabilities lacking at this point, but after reading a couple of threads it appeared to me that at least some people would be interested in such a feature, and I was simply curious why the route I laid out wasn't taken to implement IMDb as a content provider.


Don't get me wrong, I'm not complaining about the current state; I'd just like to contribute and help improve things.

Edited by Blackclaws
I interpreted what you wanted as having a plug-in that accessed a centralized copy of that data - thus providing basically 1,000s of people access to that data through us.


If, instead, each individual had to manually download and set up the data, keep it up to date, and then point the plug-in to where it was, then theoretically that would be okay.


However, I see a couple of issues with that as well:


1) That seems like a lot of trouble for an end-user for an undefined amount of benefit and


2) It would then be very easy for someone to point the plug-in at some sort of centralized copy of the data, which would then, IMO, start to violate the spirit of the IMDb terms.

