Ebook metadata fetcher?

March 15, 2016

So I wish I had the time or pre-existing skills to write a plugin here.

I know several languages, but alas... nobody wants my life story. My hope is that someone already involved in plugin development or development of the main server will be inspired by this idea.

If ebooks can be "identified" in Emby by their ISBN, I can see a plugin or feature of Emby that would take the following information, generate the appropriate URL, parse the page at that URL for a specific div, and copy the contents of that div into the matching metadata fields.

For instance... The Book Thief by Markus Zusak...

BOOK OVERVIEW (see attached screenshot)

http://www.barnesandnoble.com/w/book-thief-markus-zusak?ean=9780375842207

URL format:

"http://www.barnesandnoble.com/w/" + full-title + "-" + author-first-last + "?ean=" + isbn13

Inside this page, copy contents of div.overview-desc, stripping out the h2 tags and their contents

note: isbn13 from above is stripped of hyphens

BOOK COVER IMAGE

http://prodimage.images-bn.com/pimages/9780375842207.jpg

Simple URL format:

"http://prodimage.images-bn.com/pimages/" + isbn13 + ".jpg"

note: isbn13 from above is stripped of hyphens
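
As a rough sketch of the two URL formats described above, the string building could look like this in Python. The function names and the `slug` parameter are made up for illustration, the patterns are only as reliable as the example URLs in this post, and nothing here fetches or parses a page.

```python
def normalize_isbn13(isbn: str) -> str:
    """Strip hyphens and spaces, then check that 13 digits remain."""
    digits = isbn.replace("-", "").replace(" ", "")
    if len(digits) != 13 or not digits.isdigit():
        raise ValueError(f"not an ISBN-13: {isbn!r}")
    return digits


def overview_url(slug: str, isbn: str) -> str:
    # slug is the title-and-author part of the URL, e.g. "book-thief-markus-zusak".
    # Assumption: the caller supplies it; how the site derives it is not known here.
    return f"http://www.barnesandnoble.com/w/{slug}?ean={normalize_isbn13(isbn)}"


def cover_url(isbn: str) -> str:
    # Cover image pattern from the post: base path + hyphen-free ISBN-13 + ".jpg"
    return f"http://prodimage.images-bn.com/pimages/{normalize_isbn13(isbn)}.jpg"


print(overview_url("book-thief-markus-zusak", "978-0-375-84220-7"))
print(cover_url("978-0-375-84220-7"))
```

The two print calls rebuild the example URLs given above for The Book Thief.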

------------------------------------

Of course, this may need to be built into Emby Server in order to integrate with the "identify" feature as seen in movies, television, etc. But the ISBN could always be entered manually into the "website" part of the metadata, or into the "comic vine volume id" field, since these ebooks aren't going to be in the Comic Vine database.

This process can be duplicated for other information as well, such as star ratings (see attached screenshot) and reviews, and for most other sites. As long as those sites are database driven, we can find the URL pattern. I'd love to help, but time is in short supply.

Edited March 15, 2016 by computerprep

March 15, 2016

If there is a publicly available data source with an API then, yes, we can do this. However, scraping a company's web page for their data is not something our development policy will allow. It is also incredibly difficult to maintain as any change in format can break it.

March 15, 2016

ebr, thanks for that update.

I started looking at the developer wiki last night... Just to see what there was to see, and I saw that policy. Makes sense. I'll be on the lookout for an online service that will work for this type of project. If I find one, I'll post it here for any other devs interested in taking a crack at something.

I'll start working on a plugin for this, too... If I find a source that doesn't conflict with the policies. My progress is likely to be slow. One question about that, though.

After someone puts all the time into coding a fully functional plugin, how easy or hard is it to port that over into a core feature? Is there a lot of recoding that would need to be done at that point?

March 15, 2016

When I first wrote the plugin, the only online book source I could find at the time was Google Books, and they had a daily lookup limit that was per-app instead of per-user. 2000 lookups per day were exhausted pretty quickly, since it's 2-3 lookups per book multiplied by the number of users.
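
To put rough numbers on that, here is a small back-of-the-envelope calculation. Only the 2000-per-day limit and the 2-3 lookups per book come from the post; the user counts are made up for illustration.

```python
DAILY_LIMIT = 2000  # per-app daily lookup limit mentioned above


def books_per_user(users: int, lookups_per_book: int,
                   daily_limit: int = DAILY_LIMIT) -> int:
    """Whole books each user can look up per day if the quota is shared evenly."""
    return daily_limit // (users * lookups_per_book)


for users in (1, 50, 500):  # hypothetical user counts
    print(users, "users:", books_per_user(users, 3), "books each per day")
```

With a shared quota, a few hundred users scanning even small libraries on the same day would be enough to use it up.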

March 19, 2016

Does Goodreads have an API?

.... yes it does.

https://www.goodreads.com/api

March 19, 2016

That's great, looks promising. Thanks for the info. The main thing the plugin really needs now is a community member to take some ownership of it and help enhance it with new features.

March 22, 2016

Rules for Goodreads API.

"Developer Terms of Service

In order to use the Goodreads API, you agree to:

Not request any method more than once a second. Goodreads tracks all requests made by developers.

Clearly display the Goodreads name or logo on any location where Goodreads data appears. For instance if you are displaying Goodreads reviews, they should either be in a section clearly titled "Goodreads Reviews", or each review should say "Goodreads review from John: 4 of 5 stars..."

Link back to the page on Goodreads where the data appears. For instance, if displaying a review, the name of the reviewer and a "more..." link at the end of the review must link back to the review detail page. You may not nofollow this link.

Not use the API to harvest or index Goodreads data without our explicit written consent.

You may store information obtained from the Goodreads API for up to 24 hours. Goodreads needs the ability to modify, remove, and update the order of our data, which caching would prevent. An exception to this rule is if the data is from your own account or the OAuth-authenticated user's account, in which case the data may be stored permanently.

Not sublicense or redistribute Goodreads data to any 3rd parties.

Not modify or change Goodreads data, including reviews, in any way. Reviews may be truncated for display purposes, but must link to the full review on Goodreads.

Obtain each user's explicit consent before adding, removing or otherwise changing book reviews or other data on their behalf - usually in the form of clicking a button or checking a box. A user authenticating with your application does not constitute consent.

Not use the Goodreads data as part of a commercial product without our explicit written consent. If you would like to include Goodreads data in a commercial product, please contact us.

Not name your application "Goodreads". Do not use "Goodreads" in your application's name. You may use the Goodreads logo to acknowledge your app's association with Goodreads, but not as the main logo for or within your app.

Acknowledge that your developer account can be suspended for any infraction of these terms.

Acknowledge that these terms may be updated or amended at any time without prior notice, and that your continued use of the API constitutes your acceptance of the new TERMS"

What does that mean for a program like Emby?

Edited March 22, 2016 by mediacowboy

March 22, 2016

"Not request any method more than once a second. Goodreads tracks all requests made by developers."

If that is per developer as opposed to app instance, then it probably makes it a no-go.

We probably also would qualify as a commercial product so we'd have to talk to them about how to proceed.
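
For illustration, here is a minimal sketch of what honoring the once-a-second rule could look like inside a single app instance. The `Throttle` class is hypothetical and is not part of Emby or of any Goodreads client. It also only limits one process, so it would not help if the limit is counted per developer across every installation.

```python
import time


class Throttle:
    """Block so that successive calls are at least min_interval seconds apart."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock      # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._last = None        # time of the previous permitted call

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


throttle = Throttle(min_interval=1.0)
throttle.wait()  # returns immediately on first use
# ... make one API request here, then call throttle.wait() before the next one
```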

March 22, 2016

Okay, so I don't know if this would be the right thread or not so please move it if needed.

Here is what I would like to see happen with the future development of an eBook fetcher or MBBookshelf.

1. Scrape Cover, Plot, Author, Release Date, and any other useful information

2. If a book is part of a series, automatically adjust the sort order to put the books in the right order

*Bonus would be iOS and portable clients able to use this information and be able to sync page location and whatnot to the server*
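
As a sketch of point 2, one common way to get series order is to build a sort name from the series name and a zero-padded position. The `Book` class and its fields are hypothetical, chosen only for this example, and are not Emby's data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Book:
    title: str
    series: Optional[str] = None
    series_index: Optional[float] = None  # position in the series, e.g. 2 or 2.5


def sort_name(book: Book) -> str:
    """Sort name that keeps a series together and in reading order."""
    if book.series is None or book.series_index is None:
        return book.title
    # Zero-padding makes plain string sorting put 2 before 10.
    return f"{book.series} {book.series_index:06.2f} - {book.title}"


books = [
    Book("Book Ten", "Example Saga", 10),
    Book("Standalone"),
    Book("Book Two", "Example Saga", 2),
]
for b in sorted(books, key=sort_name):
    print(sort_name(b))
```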

Attached are snips of what I am kind of talking about. I know we need a developer and an API fetcher, but getting it out there is a start.

March 24, 2016

Unfortunately, a global registry for ISBNs does not exist.

But there is a huge number of potential sources. Just see here:

https://en.wikipedia.org/wiki/Special:BookSources/3453043243

Or here for German:

https://de.wikipedia.org/wiki/Spezial:ISBN-Suche/3453043243
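
Note that the ISBN in these links is a 10-digit one, while the URL formats earlier in the thread use isbn13. A minimal sketch of the standard conversion follows (prefix "978", then recompute the check digit). The function names are made up for this example.

```python
def isbn13_check_digit(first12: str) -> str:
    """ISBN-13 check digit: weights alternate 1, 3 over the first 12 digits."""
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(first12))
    return str((10 - total % 10) % 10)


def is_valid_isbn10(isbn: str) -> bool:
    """ISBN-10 check: weights 10 down to 1, sum divisible by 11, 'X' means 10."""
    s = isbn.replace("-", "").upper()
    if len(s) != 10 or not s[:9].isdigit() or not (s[9].isdigit() or s[9] == "X"):
        return False
    values = [int(c) for c in s[:9]] + [10 if s[9] == "X" else int(s[9])]
    return sum(v * w for v, w in zip(values, range(10, 0, -1))) % 11 == 0


def isbn10_to_isbn13(isbn: str) -> str:
    if not is_valid_isbn10(isbn):
        raise ValueError(f"not a valid ISBN-10: {isbn!r}")
    core = "978" + isbn.replace("-", "")[:9]
    return core + isbn13_check_digit(core)


print(isbn10_to_isbn13("3453043243"))
```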

July 24, 2024

I think the problem isn't finding an API but getting started with the plugin.

Would it be an idea if Emby provided an official metadata plugin skeleton on GitHub? That way people could focus on the APIs and stuff and would have a starting point. Or the existing metadata plugins could be publicly released as read-only repos.

We have several media types that still lack metadata providers: books, audiobooks, games, etc. And there are APIs for all or most of these.

July 24, 2024

Hi, there are several existing plugins that could be used as examples, such as: https://github.com/MediaBrowser/Emby.Plugins.AniSearch


computerprep 148

ebr 15667

computerprep 148

Redshirt 1487

CashMoney 94

Luke 40079

mediacowboy 438

ebr 15667

mediacowboy 438

softworkz 4569

Gummibeer 5

Luke 40079
