pir8radio, posted April 12, 2015 (edited)
Looking through my server logs, I'm noticing that lots of spiders are crawling my Emby server. It looks like they are able to add all kinds of Media Browser content to the search engines, like the image below. Shouldn't EVERYTHING but the login page be blocked if you're not logged in? Now people doing image searches will come across my cover art. They could also hotlink to it and use my bandwidth if they wanted.
EDIT: Removed old link
Edited January 30, 2017 by pir8radio
pir8radio (author), posted April 12, 2015
Here are some of the spider stats for my Emby server:
Luke, posted April 12, 2015
We do this already with a meta tag on every HTML page:
<meta name="robots" content="noindex, nofollow, noarchive">
If there's a better way it can be configured, then by all means please research and present your findings. And no, the crawler won't be able to get past the login page unless you have public users without a password.
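One option that fits the question above: a robots meta tag only exists inside HTML pages, so it cannot cover image or subtitle responses. For non-HTML resources, the usual equivalent is the `X-Robots-Tag` HTTP response header, which the major crawlers honor in the same way. The snippet below is only a minimal sketch of that idea as a generic Python WSGI wrapper. It is not Emby's code, and `add_robots_header` and `demo_app` are hypothetical names made up for illustration.

```python
def add_robots_header(app, value="noindex, nofollow, noarchive"):
    """Wrap a WSGI app so every response carries an X-Robots-Tag header."""
    def wrapped(environ, start_response):
        def patched_start(status, headers, exc_info=None):
            # Drop any existing X-Robots-Tag so the header is not duplicated,
            # then append ours.
            headers = [(k, v) for k, v in headers if k.lower() != "x-robots-tag"]
            headers.append(("X-Robots-Tag", value))
            return start_response(status, headers, exc_info)
        return app(environ, patched_start)
    return wrapped


def demo_app(environ, start_response):
    # Hypothetical stand-in for an image endpoint (assumption, not Emby's API).
    start_response("200 OK", [("Content-Type", "image/jpeg")])
    return [b"fake image bytes"]


# The wrapped app behaves like the original but adds the header to each response.
protected_app = add_robots_header(demo_app)
```

The same header could instead be set in a reverse proxy in front of the server. Either way it only asks well-behaved crawlers not to index the content; it does not stop anyone from fetching it.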
pir8radio (author), posted April 12, 2015 (edited)
@Luke I guess my worry is that they can reach the images at all, whether or not they abide by the nofollow meta tag. Usually a web application only has a front door; until you are logged in (door opened) you can't get to anything inside the walls. I am behind a proxy, so I guess I could block anything that isn't being referred from my own site. I noticed you changed the title. Isn't it still a security hole if anyone can reach content without being logged in? But the crawler CAN reach my content directly, as in this example. Every account on my server has a password assigned.
EDIT: Removed old link
Edited January 30, 2017 by pir8radio
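The proxy idea mentioned above (only serve requests referred from your own site) comes down to a check on the `Referer` request header. Here is a minimal sketch of that check in Python, assuming the proxy can run a hook like this per request. `referer_allowed` and the host `media.example.com` are hypothetical, chosen only for illustration.

```python
from urllib.parse import urlsplit


def referer_allowed(referer, allowed_hosts):
    """Return True only if the Referer header points at one of our own hosts."""
    if not referer:
        # Missing or empty Referer: treat it as not coming from our site.
        return False
    host = urlsplit(referer).hostname  # lowercased by urlsplit, or None
    return host is not None and host in allowed_hosts


# Hypothetical host name for the server behind the proxy (assumption).
OWN_HOSTS = {"media.example.com"}

referer_allowed("https://media.example.com/web/index.html", OWN_HOSTS)  # True
referer_allowed("https://images.google.com/search?q=cover", OWN_HOSTS)  # False
referer_allowed(None, OWN_HOSTS)                                        # False
```

Keep in mind that the `Referer` header is trivial to spoof and some legitimate clients omit it entirely, so this curbs hotlinking and casual crawling rather than closing the hole.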
Luke, posted April 12, 2015
There are only a couple of API endpoints that aren't secured: images are one, along with subtitles. That was done to make things like device screen savers easier. We'll improve and secure them at some point, but those are the exception to the rule. The rest of the API is secure; everything else is going to return 401 responses.
pir8radio (author), posted April 12, 2015 (edited)
OK, thanks for the input, @Luke. I wasn't trying to upset you. It just surprised me how much stuff the spiders are actually grabbing, and also how much they are "trying" to get to. Like you say, they get 401s on the other items.
Edited April 12, 2015 by pir8radio