FR: Server - Ebook, extract PDF front page


Posted (edited)

So, I've got several ebooks and papers in PDF format, but no *.jpg covers to go along with them. Consequently, what would otherwise be a neat-looking library is not so neat. Since the usual convention for PDF ebooks is to start with the book's front page, I was wondering whether the mediabrowser server could extract this front page and use it as the cover? Furthermore, the ISBN is usually located within the first few pages of the book and could be used to retrieve metadata (http://en.wikipedia.org/wiki/Wikipedia:ISBN is a good start).

 

Is this an idea worth looking into -- or too niche?

 

Thanks for all the good work, guys! So much potential with this software, loving it!

 

edit: I realize third party tools might automate what I wish to achieve, but the idea of simplicity and having it all under one interface is very tempting. One solution that handles all my media... all the time I could save, haha! Keep up progress!
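
To make the ISBN part of the request concrete, here is a minimal sketch in Python of how an ISBN could be picked out of text that has already been extracted from a book's first few pages. The function names and the candidate regex are my own assumptions, not anything from the server; only the ISBN-10 and ISBN-13 check-digit rules are standard. It is a heuristic and will miss an ISBN that is directly followed by other digits.

```python
import re

# Heuristic candidate pattern (an assumption): a digit, then 8-15 digits,
# hyphens or spaces, ending in a digit or X.
CANDIDATE = re.compile(r"\d[\d\- ]{8,15}[\dXx]")


def is_valid_isbn10(s):
    """ISBN-10: weights 10..1, 'X' counts as 10, sum must be divisible by 11."""
    if not re.fullmatch(r"\d{9}[\dX]", s):
        return False
    total = sum((10 - i) * (10 if c == "X" else int(c)) for i, c in enumerate(s))
    return total % 11 == 0


def is_valid_isbn13(s):
    """ISBN-13: alternating weights 1 and 3, sum must be divisible by 10."""
    if not re.fullmatch(r"\d{13}", s):
        return False
    total = sum(int(c) * (3 if i % 2 else 1) for i, c in enumerate(s))
    return total % 10 == 0


def find_isbns(text):
    """Return checksum-valid ISBNs found in text, separators stripped, in order."""
    found = []
    for match in CANDIDATE.finditer(text):
        digits = re.sub(r"[\- ]", "", match.group()).upper()
        if (is_valid_isbn10(digits) or is_valid_isbn13(digits)) and digits not in found:
            found.append(digits)
    return found
```

Validating the check digit keeps phone numbers and other long digit runs from being mistaken for an ISBN; getting the text out of the PDF in the first place is a separate step not shown here.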

Edited by metaman
Beardyname
Posted

I would like to see more functionality for ebooks and similar as well :) For me this is a low-priority feature, but still +1.

Posted

For example, lookupbyisbn com will let you search by ISBN and provides the details needed for a nice library structure. All the data is already in the HTML and only needs to be parsed. It would make the handling of ebooks more functional and fit well with how the server already behaves. With the relatively small amount of work needed compared to the benefit that follows, surely this is an idea worth moving high on the priority list? Yes, yes?!?

Posted


Do they have a public API?  Scraping data from web pages for use in a product like ours is technically illegal but kinda depends on the source.  Even with the legality figured out, scraping is a very hard paradigm to maintain because any change in structure on the page can break your scraping.

 

But, notwithstanding all of that, we don't really need to obtain anything externally for this particular request.  We just need to be able to read the first page of the book and generate an image from it.

Koleckai Silvestri
Posted

Ghostscript is an open-source PostScript and PDF interpreter. I've seen it used in PHP applications for this purpose. Not sure whether it can be integrated into MediaBrowser or a plugin, or whether end users would need to install a compiled version of their own.

 

http://www.ghostscript.com/
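
For illustration, a rough sketch in Python of rendering only the first page of a PDF to a JPEG by calling the Ghostscript command-line tool. It assumes Ghostscript is installed and reachable as `gs` (on Windows the console binary is typically `gswin64c`); the function names, file paths and the 150 dpi default are my own choices, and nothing here reflects how MediaBrowser itself does or would do it.

```python
import subprocess
from pathlib import Path


def build_gs_command(pdf_path, out_path, dpi=150, gs_binary="gs"):
    """Build a Ghostscript command that renders page 1 of a PDF as a JPEG."""
    return [
        gs_binary,
        "-q",                  # suppress startup messages
        "-dSAFER",             # restrict file access from within the document
        "-dBATCH",             # exit when processing is finished
        "-dNOPAUSE",           # do not prompt between pages
        "-sDEVICE=jpeg",       # JPEG output device
        "-dFirstPage=1",       # start at the first page...
        "-dLastPage=1",        # ...and stop there
        f"-r{dpi}",            # render resolution in dpi
        f"-sOutputFile={out_path}",
        str(pdf_path),
    ]


def extract_cover(pdf_path, out_path, dpi=150, gs_binary="gs"):
    """Run Ghostscript; raises CalledProcessError if rendering fails."""
    subprocess.run(build_gs_command(pdf_path, out_path, dpi, gs_binary), check=True)
    return Path(out_path)
```

Usage would be something like `extract_cover("book.pdf", "book.jpg")`, which writes the image next to the PDF so it can be picked up as the cover.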

Posted (edited)

Yes, you're of course right about the maintainability -- sorry. The good news is that there appears to be a well-developed ISBN API ecosystem. Without diving into the specifics of the terms of use: isbnDB (http://isbndb.com/api/v2/docs) and Google Books (https://developers.google.com/books/), for example, seem to offer options that would blend in well with the current interface. There are numerous others that look viable at first glance. I'd be happy to gather more information if it'd speed things up. An ISBN lookup solution would make Mediabrowser a more than acceptable choice for handling ebooks, with the potential of reaching a whole new group of people!

 

Dare I ask for an ETA, haha :D?

 

edit: adding API alternatives
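
As a rough idea of what an ISBN lookup against Google Books could look like, here is a small Python sketch using the public volumes endpoint with an `isbn:` query. The helper names and the choice of fields are assumptions of mine, the terms of use and quotas would still need checking, and this is not taken from any existing plugin.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

GOOGLE_BOOKS_VOLUMES = "https://www.googleapis.com/books/v1/volumes"


def build_lookup_url(isbn):
    """URL for a volume search restricted to one ISBN."""
    return GOOGLE_BOOKS_VOLUMES + "?" + urlencode({"q": "isbn:" + isbn})


def parse_volume(response):
    """Pick a few metadata fields from the first hit, or None if no hits."""
    items = response.get("items") or []
    if not items:
        return None
    info = items[0].get("volumeInfo", {})
    return {
        "title": info.get("title"),
        "authors": info.get("authors", []),
        "published": info.get("publishedDate"),
        "thumbnail": info.get("imageLinks", {}).get("thumbnail"),
    }


def lookup(isbn):
    """Perform the HTTP request (needs network access) and parse the result."""
    with urlopen(build_lookup_url(isbn), timeout=10) as resp:
        return parse_volume(json.load(resp))
```

Keeping the URL building and the parsing separate from the network call makes the parsing easy to check against a saved response.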

Edited by metaman
Redshirt
Posted

I tried using Google Books when I first created the plugin. They had a daily API limit of 1000 calls, which with their system meant I could look up 333 books before hitting the limit. That wasn't enough to service one user, never mind the whole community. What I really need is a free API without limits.
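
To spell out the arithmetic: 1000 calls giving 333 books implies roughly three calls per book (that figure is inferred from the numbers above, not stated). The Python sketch below shows that calculation plus a hypothetical cache wrapper that avoids spending quota on a book that was already looked up; `CachedLookup` and its `fetch` callback are made-up names for illustration only.

```python
def books_per_day(daily_call_limit, calls_per_book):
    """How many books fit into a daily call budget."""
    return daily_call_limit // calls_per_book


class CachedLookup:
    """Wrap a fetch function with a result cache and a simple call budget."""

    def __init__(self, fetch, daily_limit):
        self._fetch = fetch          # hypothetical callable: isbn -> metadata
        self._limit = daily_limit
        self._used = 0
        self._cache = {}

    def get(self, isbn):
        if isbn in self._cache:      # cached results cost no API calls
            return self._cache[isbn]
        if self._used >= self._limit:
            raise RuntimeError("daily quota exhausted")
        self._used += 1
        result = self._fetch(isbn)
        self._cache[isbn] = result
        return result
```

A cache like this only helps per installation; it does not change the fact that a shared key with a 1000-call limit cannot serve a whole community.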

Posted

I'll look into it, get an overview of the inner workings, and come back when it's clear. What initially strikes me are the many ethical/legal technicalities surrounding the retrieval of data, but I am under the impression that you strive to stay well inside the white and away from the greyer zones. If this were written as a plugin, would asking the user to sign up for a site, so the server could use those login details, come off as too impractical/clumsy a solution? Just wondering, so I know what to look for :).

  • 10 years later...
Posted

There is now an OpenLibrary plugin in the Emby Plugin catalog for books, and it supports ISBN lookup.

Please try it out and report your experience. Thanks !

  • 1 month later...
Posted

An update to the Emby Open Library plugin has gone out to support lookups using either ISBN or the Open Library Work Id. The Work Id should help make matching easier, although you can still look up using ISBN. Here is a clarification of the differences between Open Library works and editions:

https://openlibrary.org/help/faq/editing#work-edition
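
For anyone curious how an ISBN relates to a Work Id on the Open Library side, here is a small Python sketch built on Open Library's public JSON and cover URLs. It is only an illustration under my own assumptions (helper names, the sample record) and says nothing about how the Emby plugin is implemented internally.

```python
def edition_url(isbn):
    """JSON record for the edition that carries this ISBN."""
    return f"https://openlibrary.org/isbn/{isbn}.json"


def work_url(work_id):
    """JSON record for a work, e.g. a Work Id of the form OL...W."""
    return f"https://openlibrary.org/works/{work_id}.json"


def cover_url(isbn, size="L"):
    """Cover image by ISBN; size is S, M or L."""
    return f"https://covers.openlibrary.org/b/isbn/{isbn}-{size}.jpg"


def work_id_from_edition(edition):
    """An edition record links to its work via a key like '/works/OL45804W'."""
    works = edition.get("works") or []
    if not works:
        return None
    return works[0]["key"].rsplit("/", 1)[-1]


# Hand-written sample shaped like an edition record (not fetched data).
sample_edition = {"title": "Sample Title", "works": [{"key": "/works/OL45804W"}]}
work_id = work_id_from_edition(sample_edition)
```

An ISBN identifies one specific edition, while the Work Id groups all editions of the same book, which is why matching on the Work Id tends to be more forgiving.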
