Ebook metadata fetcher?

ebook epub metadata fetch plugin feature identify images automate

9 replies to this topic

#1 computerprep OFFLINE  

computerprep

    Advanced Member

  • Members
  • 342 posts
  • Local time: 05:57 PM
  • Location: Central Florida

Posted 14 March 2016 - 11:21 PM

So I wish I had the time or pre-existing skills to write a plugin here.

 

I know several languages, but alas... nobody wants my life story. My hope is that someone already involved in plugin development or development in the main server will be inspired by this idea.

 

If ebooks can be "identified" through Emby by their ISBN, I can logically see a plugin/feature of Emby that would take the following information, generate the appropriate URL, parse the page at that URL for a specific div, and copy the contents of that div into various metadata fields.

 

For instance... The Book Thief by Markus Zusak...

 

BOOK OVERVIEW (see attached screenshot)

http://www.barnesandnoble.com/w/book-thief-markus-zusak?ean=9780375842207

URL format:

 "http://www.barnesandnoble.com/w/" + title + "-" + author-first-last + "?ean=" + isbn13

Inside this page, copy contents of div.overview-desc, stripping out the h2 tags and their contents

note: isbn13 from above is stripped of hyphens
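To make the "copy the div, drop the h2" step concrete, here is a minimal sketch using only Python's standard library. It works on an HTML string that is already in hand and does no fetching. The class name `overview-desc` comes from the post; `OverviewExtractor` and `extract_overview` are hypothetical names, and the real page markup may well differ from what this assumes.

```python
from html.parser import HTMLParser


class OverviewExtractor(HTMLParser):
    """Collect text inside <div class="overview-desc">, skipping <h2> blocks."""

    def __init__(self):
        super().__init__()
        self.div_depth = 0  # > 0 while inside the target div (counts nested divs)
        self.h2_depth = 0   # > 0 while inside an <h2> within the target div
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            classes = (dict(attrs).get("class") or "").split()
            if self.div_depth > 0:
                self.div_depth += 1          # nested div inside the target
            elif "overview-desc" in classes:
                self.div_depth = 1           # entering the target div
        elif tag == "h2" and self.div_depth > 0:
            self.h2_depth += 1

    def handle_endtag(self, tag):
        if tag == "div" and self.div_depth > 0:
            self.div_depth -= 1
        elif tag == "h2" and self.h2_depth > 0:
            self.h2_depth -= 1

    def handle_data(self, data):
        # Keep text only when inside the target div and outside any h2.
        if self.div_depth > 0 and self.h2_depth == 0:
            self.parts.append(data)


def extract_overview(html):
    """Return the overview text with runs of whitespace collapsed."""
    parser = OverviewExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())
```

Calling `extract_overview(page_html)` on a saved copy of the page would return the description as one plain-text string, ready to drop into a metadata field.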

 

BOOK COVER IMAGE

http://prodimage.images-bn.com/pimages/9780375842207.jpg

simple url format: 

"http://prodimage.images-bn.com/pimages/" + isbn13 + ".jpg"

note: isbn13 from above is stripped of hyphens
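The two URL patterns above can be sketched in a few lines of Python. This is only an illustration built from the single example in this post: the slug rule (lowercase, hyphen-separated) and the dropping of a leading "The" are guesses inferred from "book-thief-markus-zusak", and the function names are made up.

```python
import re

BN_PAGE = "http://www.barnesandnoble.com/w/"
BN_COVER = "http://prodimage.images-bn.com/pimages/"
# Assumption: the example URL has no leading "the", so leading articles are dropped.
LEADING_ARTICLES = ("the ", "a ", "an ")


def slugify(text):
    """Lowercase the text and join alphanumeric runs with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def clean_isbn13(isbn):
    """Strip hyphens and spaces, then check that 13 digits remain."""
    digits = isbn.replace("-", "").replace(" ", "")
    if len(digits) != 13 or not digits.isdigit():
        raise ValueError(f"not a 13-digit ISBN: {isbn!r}")
    return digits


def overview_url(title, author, isbn):
    """Build the book page URL from title, author, and ISBN-13."""
    lowered = title.lower()
    for article in LEADING_ARTICLES:
        if lowered.startswith(article):
            lowered = lowered[len(article):]
            break
    return f"{BN_PAGE}{slugify(lowered)}-{slugify(author)}?ean={clean_isbn13(isbn)}"


def cover_url(isbn):
    """Build the cover image URL from the ISBN-13 alone."""
    return f"{BN_COVER}{clean_isbn13(isbn)}.jpg"
```

With title "The Book Thief", author "Markus Zusak", and ISBN 978-0-375-84220-7, these reproduce the two URLs shown above. Other titles may not follow the same slug rule.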

 
------------------------------------
 
Of course, this may need to be built into Emby Server in order to integrate the "identify" feature as seen in movies, television, etc. But the ISBN could always be manually input into the "website" part of the metadata, or the "comic vine volume id" since these ebooks aren't going to be in their database.
 
This process can be duplicated for other information as well, such as star ratings (see attached screenshot) and reviews, and for most other sites. As long as those sites are database-driven, we can find that pattern. I'd love to help, but time is in short supply.

Edited by computerprep, 14 March 2016 - 11:22 PM.


#2 ebr OFFLINE  

ebr

    Chief Bottle Washer

  • Administrators
  • 49818 posts
  • Local time: 05:57 PM

Posted 15 March 2016 - 08:26 AM

If there is a publicly available data source with an API then, yes, we can do this.  However, scraping a company's web page for their data is not something our development policy will allow.  It is also incredibly difficult to maintain as any change in format can break it.



#3 computerprep OFFLINE  

computerprep

    Advanced Member

  • Members
  • 342 posts
  • Local time: 05:57 PM
  • Location: Central Florida

Posted 15 March 2016 - 03:49 PM

ebr, thanks for that update.

I started looking at the developers wiki last night... Just to see what there was to see, and I saw that policy. Makes sense. I'll be on the lookout for an online service that will work for this type of project. If I find one, I'll post it here for any other devs interested in taking a crack at something.

I'll start working on a plugin for this, too... If I find a source that doesn't conflict with the policies. My progress is likely to be slow. One question about that, though.

After someone puts all the time into coding a fully functional plugin, how easy or hard is it to port that over into a core feature? Is there a lot of recoding that would need to be done at that point?

#4 Redshirt OFFLINE  

Redshirt

    Android Adept

  • Alpha Testers
  • 5078 posts
  • Local time: 03:57 PM
  • Location: British Columbia, Canada

Posted 15 March 2016 - 03:56 PM

When I first wrote the plugin, the only online book source I could find at the time was Google Books, and they had a daily lookup limit that was per-app instead of per-user. 2000 lookups per day was exhausted pretty quickly, since it's 2-3 lookups per book multiplied by the number of users.
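To put rough numbers on that, here is a back-of-the-envelope sketch. The 2000/day quota and the 2-3 lookups per book come from the post; the user count in the usage note is purely hypothetical.

```python
DAILY_QUOTA = 2000  # shared per-app limit quoted in the post


def books_per_day(lookups_per_book, quota=DAILY_QUOTA):
    """Total books that can be identified per day across ALL users."""
    return quota // lookups_per_book


def books_per_user(users, lookups_per_book, quota=DAILY_QUOTA):
    """Books each user could identify per day if the quota were split evenly."""
    return books_per_day(lookups_per_book, quota) // users
```

At 3 lookups per book the whole community gets about 666 books a day; with a hypothetical 100 active users that is 6 books each, so a single library scan could use up the quota for everyone.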



#5 CashMoney OFFLINE  

CashMoney

    Advanced Member

  • Members
  • 252 posts
  • Local time: 10:57 PM
  • Location: England, UK

Posted 19 March 2016 - 03:56 PM

Does goodreads have an api?

.... yes it does.

https://www.goodreads.com/api

#6 Luke OFFLINE  

Luke

    System Architect

  • Administrators
  • 148755 posts
  • Local time: 05:57 PM

Posted 19 March 2016 - 04:00 PM

That's great, looks promising. Thanks for the info. The main thing the plugin really needs now is a community member to take some ownership of it and help enhance it with new features.



#7 mediacowboy OFFLINE  

mediacowboy

    Advanced Member

  • Alpha Testers
  • 1828 posts
  • Local time: 05:57 PM
  • Location: Texas, United States

Posted 22 March 2016 - 12:17 PM

Rules for Goodreads API.

"Developer Terms of Service
In order to use the Goodreads API, you agree to:
Not request any method more than once a second. Goodreads tracks all requests made by developers.
Clearly display the Goodreads name or logo on any location where Goodreads data appears. For instance if you are displaying Goodreads reviews, they should either be in a section clearly titled "Goodreads Reviews", or each review should say "Goodreads review from John: 4 of 5 stars..."
Link back to the page on Goodreads where the data appears. For instance, if displaying a review, the name of the reviewer and a "more..." link at the end of the review must link back to the review detail page. You may not nofollow this link.
Not use the API to harvest or index Goodreads data without our explicit written consent.
You may store information obtained from the Goodreads API for up to 24 hours. Goodreads needs the ability to modify, remove, and update the order of our data, which caching would prevent. An exception to this rule is if the data is from your own account or the OAuth-authenticated users account, in which case the data may be stored permanently.
Not sublicense or redistribute Goodreads data to any 3rd parties.
Not modify or change Goodreads data, including reviews, in any way. Reviews may be truncated for display purposes, but must link to the full review on Goodreads.
Obtain each user's explicit consent before adding, removing or otherwise changing book reviews or other data on their behalf - usually in the form of clicking a button or checking a box. A user authenticating with your application does not constitute consent.
Not use the Goodreads data as part of a commercial product without our explicit written consent. If you would like to include Goodreads data in a commercial product, please contact us.
Not name your application "Goodreads". Do not use "Goodreads" in your applications name. You may use the Goodreads logo to acknowledge your apps association with Goodreads, but not as the main logo for or within your app.
Acknowledge that your developer account can be suspended for any infraction of these terms.
Acknowledge that these terms may be updated or amended at any time without prior notice, and that your continued use of the API constitutes your acceptance of the new TERMS"

What does that mean for a program like Emby?

Edited by mediacowboy, 22 March 2016 - 12:19 PM.


#8 ebr OFFLINE  

ebr

    Chief Bottle Washer

  • Administrators
  • 49818 posts
  • Local time: 05:57 PM

Posted 22 March 2016 - 12:25 PM

 

 

Not request any method more than once a second. Goodreads tracks all requests made by developers

 

If that is per developer as opposed to app instance, then it probably makes it a no-go.
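For illustration, a one-request-per-second throttle inside a single server process could look roughly like the sketch below. The `Throttle` class is hypothetical and not part of Emby or any Goodreads client. It only spaces out calls made by one process, so it cannot coordinate many installations sharing one developer key, which is exactly the concern if the limit is per developer.

```python
import time


class Throttle:
    """Enforce a minimum interval between calls within one process."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock    # injectable so the logic can be tested without waiting
        self._sleep = sleep
        self._last = None      # time at which the previous request was released

    def wait(self):
        """Block until min_interval has passed since the last call.

        Returns the number of seconds slept.
        """
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self.min_interval - (now - self._last))
            if delay:
                self._sleep(delay)
        self._last = now + delay
        return delay
```

A caller would run `throttle.wait()` immediately before each API request.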

 

We probably also would qualify as a commercial product so we'd have to talk to them about how to proceed.



#9 mediacowboy OFFLINE  

mediacowboy

    Advanced Member

  • Alpha Testers
  • 1828 posts
  • Local time: 05:57 PM
  • Location: Texas, United States

Posted 22 March 2016 - 06:13 PM

Okay, so I don't know if this would be the right thread or not so please move it if needed.

 

Here is what I would like to see happen with the future development of an eBook fetcher or MBBookshelf.

 

1. Scrape Cover, Plot, Author, Release Date, and any other useful information

2. If a book is part of a series, automatically adjust the sort order to put the books in the right order
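As a rough idea of what item 2 could mean in practice, here is a small sketch that orders a shelf by series name and then by position in the series. The `Book` fields (`series`, `series_index`) are hypothetical and not Emby's actual data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Book:
    title: str
    series: Optional[str] = None          # hypothetical field: series name, if any
    series_index: Optional[float] = None  # hypothetical field: position in series


def sort_key(book):
    """Standalone books sort by title; series books by series name, then index."""
    if book.series is None:
        return (book.title.lower(), 0.0, book.title.lower())
    # Books with an unknown position go to the end of their series.
    index = book.series_index if book.series_index is not None else float("inf")
    return (book.series.lower(), index, book.title.lower())


def shelf_order(books):
    """Return the books in display order."""
    return sorted(books, key=sort_key)
```

Using a float index leaves room for entries like a "2.5" novella between books 2 and 3.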

 

*Bonus would be iOS and portable clients able to use this information and be able to sync page location and what not to the server*

 

Attached are snips of what I am kind of talking about. I know we need a developer and an API fetcher, but getting it out there is a start.




#10 softworkz ONLINE  

softworkz

    Advanced Member

  • Developers
  • 2348 posts
  • Local time: 11:57 PM

Posted 24 March 2016 - 09:22 AM

Unfortunately a global registry for ISBN numbers does not exist.

But there is a huge number of potential sources. Just see here:

 

https://en.wikipedia...rces/3453043243

 

Or here for German:

 

https://de.wikipedia...uche/3453043243
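Since those lookup pages are keyed by the 10-digit form (3453043243) while the first post uses the 13-digit form, a fetcher would probably need to validate and convert between the two. Below is a minimal sketch of the standard check-digit rules; the function names are illustrative only.

```python
def isbn10_is_valid(isbn10):
    """ISBN-10: weighted sum (weights 10 down to 1) must be divisible by 11."""
    s = isbn10.replace("-", "").upper()
    if len(s) != 10 or not s[:9].isdigit() or not (s[9].isdigit() or s[9] == "X"):
        return False
    values = [int(c) for c in s[:9]] + [10 if s[9] == "X" else int(s[9])]
    return sum(w * v for w, v in zip(range(10, 0, -1), values)) % 11 == 0


def isbn13_check_digit(first12):
    """ISBN-13: digits are weighted 1, 3, 1, 3, ... and the total is padded to a multiple of 10."""
    total = sum(int(c) * (3 if i % 2 else 1) for i, c in enumerate(first12))
    return str((10 - total % 10) % 10)


def isbn13_is_valid(isbn13):
    s = isbn13.replace("-", "")
    return len(s) == 13 and s.isdigit() and isbn13_check_digit(s[:12]) == s[12]


def isbn10_to_isbn13(isbn10):
    """Prefix 978, drop the old check digit, and compute the new one."""
    s = isbn10.replace("-", "").upper()
    if not isbn10_is_valid(s):
        raise ValueError(f"invalid ISBN-10: {isbn10!r}")
    core = "978" + s[:9]
    return core + isbn13_check_digit(core)
```

Validating the number before querying any source would at least catch typos in a manually entered ISBN.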






