
Rendering a HTML Page in a Plugin


DaveClarkUK


DaveClarkUK

I have been looking at the BBC iPlayer plugin, trying to adapt it to scrape the BBC iPlayer site.

 

In order to do this I need to render the raw page through a browser so that the embedded JavaScript runs and generates the content links. Is there any way to do this within Emby, or will I have to build a headless browser into the plugin?


DaveClarkUK

  • "Using 'web scraping' techniques to obtain data that would otherwise be only available via subscription or other pay-for means": there is currently no other way to obtain this data, i.e. Nitro is not available, and even when it is, it will not be a pay service.


I think that description was changed at some point and needs to be re-visited because scraping any site, pay or not, is technically copyright infringement in many places.


http://www.bbc.co.uk/terms/business.shtml#2

 

It's not really a question of web scraping. Web scraping is legal; this is why websites have robots.txt.

 

The issue arises when a website claims copyright over its content and issues terms of service that you are legally bound by. It is illegal to scrape BBC websites, or those of affiliates under its umbrella, because of the terms of service.

 

Emby is a middleware product. Emby also has terms of service on the use of its middleware, but this is moot since the data is already protected by the BBC and its terms.

 

You can still publish the plugin on your own, host it somewhere and let users install it manually. You then assume all the burden of any legal action from the BBC. Emby doesn't advertise your plugin, allow its download, or otherwise make it available to others. @radeon has done this with his YouTube trailer download plugin, and YouTube hasn't issued him any cease and desist orders. It's all in what you do with it and how you attempt to monetize it, if at all.

 



  • 2 weeks later...
dcrdev

You'd probably want to use HtmlAgilityPack to parse the DOM tree, if looking to create an addon like this.

 

It's been about a year since I've worked with C# (change of job) but I've started to brush up, looking to create exactly this addon.

 

Looking at the source for the ITV plugin: is that not scraping, exactly as this would be? Assuming I'm mistaken on that front, if an individual were to create an addon that was not illegal but was contrary to the development terms, would they be able to reference it on these forums?


As far as this particular site is concerned there are several restrictions that would make such a plug-in fall outside our allowable development policy.

 

1) The site pages contain a copyright statement which means scraping them for the data on them is technically copyright infringement.  Whether this holds up in court is not really our concern as we don't really wish to go to court :).

 

2) The TOS restrict the use of the service to people physically in the UK with a valid UK Television license.

 

3) The TOS further restrict use in this manner:

"You agree to use BBC Online Services and access, download, view and/or listen to BBC Content as supplied to you by the BBC and you may not, and you may not assist anyone to, or attempt to, reverse engineer, decompile, disassemble, adapt, modify, copy, reproduce, lend, hire, rent, perform, sub-license, make available to the public, create derivative works from, broadcast, distribute, commercially exploit, transmit or otherwise use in any way BBC Online Services and/or BBC Content in whole or in part except to the extent permitted in these Terms of Use, any relevant Additional Terms and at law."

dcrdev

1) Yes, but so does the ITV website, and we have a plugin for that?

 

2) Again, this applies to ITV as well. The point regarding a TV License only applies to live streams; this is a well-known loophole here in the UK: you can essentially avoid paying for a license by watching all your content after the original broadcast.


DaveClarkUK


The problem is that the iPlayer website renders the channel content via JavaScript (document.write), which means the page needs to be rendered by a browser to generate the HTML. HtmlAgilityPack can't do this.
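
To illustrate the limitation, here is a minimal sketch in Python rather than C#, using only the standard library's html.parser as a stand-in for any static parser that does not execute scripts. The page content, the /hypothetical/ URLs and the names LinkCollector and static_links are all made up for the example; this is not the real iPlayer markup. The parser finds the link that is present in the raw markup but never sees the one that document.write would produce in a browser.

```python
from html.parser import HTMLParser

# Hypothetical page (not real iPlayer markup): one link is in the static
# markup, the other only exists after a browser runs the inline script.
PAGE = r"""
<html><body>
<a href="/hypothetical/episode/1">Episode 1</a>
<script>
document.write('<a href="/hypothetical/episode/2">Episode 2<\/a>');
</script>
</body></html>
"""


class LinkCollector(HTMLParser):
    """Collects href values from <a> elements found in the static markup."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Script bodies are passed through as plain text and never executed,
        # so markup written by document.write does not reach this handler.
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self.links.append(href)


def static_links(html):
    """Return the hrefs a non-rendering parser can see in the raw HTML."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


if __name__ == "__main__":
    # Only the statically present link is reported.
    print(static_links(PAGE))
```

Getting at the second link would need something that actually runs the script, such as a headless browser, or locating the underlying data the script works from.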
