Question about data duplication and off-site backups


negativzeroe
Posted

I'm brainstorming the best way to have an off-site backup without paying cloud prices for multiple terabytes. I'm currently pondering sending a friend/relative a Raspberry Pi and HDD with a VPN set to connect back to my home upon boot. That way I can schedule copy jobs over the Internet, and we get the added benefit of that friend/relative having a local copy to stream from, which helps our data limits and the performance/quality of their stream.

So my question is this: Is there a way to make playback prefer the local copy without installing Emby independently on that device? Either by adding that device as a library source and having Emby choose the copy in his/her home, or by running an instance of Emby on that device that somehow relays the config of my main server? I'm thinking if I have to go that route, I'll need to copy the config directory over as well, shutting the instance down before the copy and spinning it up after. Then the apps they use would point to the local instance, but that brings up the concern that the server name will be the same when trying to manage it.

Granted, I'm not married to this idea. I'd welcome some insight of what others currently have set up.

 

Gilgamesh_48
Posted

The simple solution is to get a safe deposit box, buy a few multi-terabyte drives, and rotate them in and out of the box so that a full backup is always in the box. You can use any of the many backup programs to accomplish that and you will be about as safe as you can get. If some disaster happens that destroys both your in-home setup and the backup in the bank, then there are probably much greater problems you need to deal with.

I do not trust any "online" backup system and I trust even less other people to maintain my backup.

If you want reasonable safety and do not want to involve a bank, a good fireproof safe can be acquired and your backups can be kept in there. I have a safe that is designed to withstand any house fire short of something like a white phosphorus and magnesium fire. It did not cost too much, and my system and my backups are accessible at all times.

I just periodically hook the backup drives to my computer and run a backup program to get everything copied and I then rotate incremental backups to keep everything up to date. About once every couple of years I reset my entire backup and then I just keep to a schedule until the next time.

I do not really trust automation for backups and it takes very little time to do it the way I do.

negativzeroe
Posted

Oh I'd be managing the remote backup. That wouldn't be a problem. Luckily enough the individual I have in mind has a good job so there isn't much chance of not paying the power bill or anything and he doesn't care enough about electronics to begin to want to tinker. He also lives about 3.5 hours away so any catastrophe would be hard-pressed to hit them both. Though I guess if he got flooded it may be a pain to bring it back online and rebuild the data. But I just run rsync currently. That is from one drive to another. But I'm closing in on 4TB and will need to think about 8-10. But again I'm trying to find the most cost effective way to do this. I actually have a fireproof safe I can put the drive in though.

cayars
Posted

It's not really ideal to use any online backup.  First, the drives have to run constantly to be available to you.  They may spin down, but you are still putting a lot of wear on them.  Use external USB drives instead: connect them to your system once a week or month, back up anything new (files with the archive bit set), then put the drive away again, NOT connected to the computer.

If you want it offsite, a fire safe that you can lock is the way to go.  Just give the locked box to a friend or relative to hold for you or use a bank as already suggested.  Invite your friend over once a month for beers or coffee and do the incremental backup while they're with you and send them packing with your drives again when they leave.

The problem with trying to run your own "online backup" is connectivity issues and using both your bandwidth and your friend's bandwidth.  You also have to manage it, there is wear and tear on the equipment, and there is a very good chance the backup drives wear out before your main drives or in a similar time frame.

With an incremental backup you can use the command line or run a batch file that's super simple.  It only needs to copy things changed since the last backup, uses no proprietary software, and is essentially a clone of your current directories, which makes them super simple to restore if needed.  You don't need to go crazy after your first full backup and might only need to back up once a month or every three months, depending on how much new content you add and how much you care about losing a few files.
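As a rough illustration of the "only copy what changed since the last backup" idea, here is a minimal Python sketch. It is an assumption-laden stand-in, not the batch file described above: instead of the Windows archive bit it compares file size and modification time, and the function name `incremental_copy` and both directory arguments are made up for the example.

```python
import shutil
from pathlib import Path


def incremental_copy(source: Path, dest: Path) -> list[Path]:
    """Copy files that are missing from dest, or whose size differs or
    whose source copy is newer. Returns the relative paths copied.

    Assumption: size + modification time stand in for the archive bit.
    """
    copied = []
    for src_file in sorted(source.rglob("*")):
        if not src_file.is_file():
            continue
        rel = src_file.relative_to(source)
        dst_file = dest / rel
        if dst_file.exists():
            s, d = src_file.stat(), dst_file.stat()
            # Same size and backup copy is at least as new: nothing to do.
            if s.st_size == d.st_size and s.st_mtime <= d.st_mtime:
                continue
        dst_file.parent.mkdir(parents=True, exist_ok=True)
        # copy2 keeps the modification time, so the next run can skip it.
        shutil.copy2(src_file, dst_file)
        copied.append(rel)
    return copied
```

Because the result is a plain mirror of the source tree, restoring is just copying folders back. Note that this sketch never deletes anything from the backup, so files removed from the source stay on the backup drive.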

As an example, with well over 15K movies and 60K TV shows I don't sweat losing a couple of months of content, as I can easily add it back if needed (if I even want to); I've already got so much other content it almost doesn't matter.  So now I back up content every couple of months and call it a day, knowing I can restore 99+% of my system if needed.

Now besides the content you could use an online system or two and backup specific directories (ie Emby backup).  Heck you could zip them with a password and just email them to yourself so they would be available that way if you ever need them.

Keep in mind the nature of the beast.  This is just a personal entertainment system, not your household finances, work info or important information so you don't need to go crazy with solutions to keep up to date nightly or anything. 

Carlo

 

negativzeroe
Posted

I mean yeah, I guess I can just put my external drive in the safe. I have rsync set to run every few hours but I could do an off/on schedule. I just need to upgrade my storage all around and am trying to plan ahead and weigh options.

Gilgamesh_48
Posted
53 minutes ago, negativzeroe said:

I mean yeah, I guess I can just put my external drive in the safe. I have rsync set to run every few hours but I could do an off/on schedule. I just need to upgrade my storage all around and am trying to plan ahead and weigh options.

Get as much storage as you can. "Data grows to fill available space"

To ease management you can use a tool like StableBit's DrivePool. I use DrivePool and keep all my media files in a pool that is always attached to my computer. On those drives I have duplication turned on so I have redundancy, NOT BACKUP, always active. Also I keep a "pool" in my safe. When I connect those drives to my computer DrivePool recognizes the pool and it is a simple matter to run my backup software and when it completes I disconnect the drives and they go back into my safe.

Using DrivePool I do not have to worry about where the files are as I know they are in the pool. DrivePool also has the advantage of storing the files in normal Windows format so you do not even have to have DrivePool active to get the files from a pooled drive.

negativzeroe
Posted
1 hour ago, Gilgamesh_48 said:

Get as much storage as you can. "Data grows to fill available space"

To ease management you can use a tool like StableBit's DrivePool. I use DrivePool and keep all my media files in a pool that is always attached to my computer. On those drives I have duplication turned on so I have redundancy, NOT BACKUP, always active. Also I keep a "pool" in my safe. When I connect those drives to my computer DrivePool recognizes the pool and it is a simple matter to run my backup software and when it completes I disconnect the drives and they go back into my safe.

Using DrivePool I do not have to worry about where the files are as I know they are in the pool. DrivePool also has the advantage of storing the files in normal Windows format so you do not even have to have DrivePool active to get the files from a pooled drive.

Sounds like something I could use btrfs for if I had enough drives, but I don't have the space lol. I have a mini-ITX server with an i7 and 32 GB of RAM, 3 SSDs and one internal HDD, plus one external HDD.

Gilgamesh_48
Posted
33 minutes ago, negativzeroe said:

Sounds like something I could use btrfs for if I had enough drives, but I don't have the space lol. I have a mini-ITX server with an i7 and 32 GB of RAM, 3 SSDs and one internal HDD, plus one external HDD.

Yes. My kind of setup is really for those of us that have quite a large library. I have a library that occupies about 30 TB of real space, which means I need over 60 TB of space for my pool with duplication turned on. To accomplish that I have 13 active drives ranging from 3 TB to 8 TB. My backups have in the last year migrated to 4 10 TB drives. It took quite a while to get the initial setup completed for my backup, but keeping it up to date takes less than 2 hours every 3 or 4 weeks, and most of that time is spent waiting so I can do other things.

Actually it becomes kind of fun after a while. It is really just for me because I do not share with anybody. There is nobody I like well enough to share my library with. It's too much trouble for too little gain.

cayars
Posted

I used to do that with duplication and realized it's kind of dumb (for me) to do it that way, as all data is still in the same system and a fire, electrical misfortune or component failure could kill everything.  So I moved to external drives for my duplicates/backups.  I just use the archive bit of files and do incremental copies.  I have a complete copy of my media offsite.

As I typically replace smaller drives with bigger drives to add more space, the smaller drives get used for backup as I don't need them all online at once.  For example, a new 12 TB drive could replace 3 different 4 TB drives, which can now be used for backup.  I actually use internal drives for backup as I can pop one into a hot-swappable slot.  I'm using chassis that hold 8 drives, so each of those holds 8 12 TB drives or roughly 100 TB per box, and I've got a couple of them. I'm probably around 600 to 700 TB between main and backup space.

Gilgamesh_48
Posted
4 minutes ago, cayars said:

I used to do that with duplication and realized it's kind of dumb (for me) to do it that way, as all data is still in the same system and a fire, electrical misfortune or component failure could kill everything.  So I moved to external drives for my duplicates/backups.  I just use the archive bit of files and do incremental copies.  I have a complete copy of my media offsite.

As I typically replace smaller drives with bigger drives to add more space, the smaller drives get used for backup as I don't need them all online at once.  For example, a new 12 TB drive could replace 3 different 4 TB drives, which can now be used for backup.  I actually use internal drives for backup as I can pop one into a hot-swappable slot.  I'm using chassis that hold 8 drives, so each of those holds 8 12 TB drives or roughly 100 TB per box, and I've got a couple of them. I'm probably around 600 to 700 TB between main and backup space.

The duplication is for my local system, while the additional 4 10 TB drives (also configured as a pool, but without duplication) get connected long enough for a backup and are then disconnected and stored in my safe. I repeat that connect/backup/disconnect/store cycle every three or four weeks. I have local redundancy and I have a reliable backup. It is a bit like using a belt and suspenders while having a fairly tight elastic waistband. It is more protection than I need but it makes me feel secure. Real security is next to impossible but that is the next best thing.

DrivePool is what makes that setup possible because it just recognizes the pooled drives and the pool shows up just as if it was never disconnected. Without that my backup system would be much more complex. I also have the advantage that, in the event of a failure that destroys my computer and active drives, all I need to do is get a new computer, install DrivePool and Emby on it, connect the backup drives and point Emby at the correct folder, and I would be up and running. Emby would not even have to do much work before being usable because I store all artwork, metadata and bif files with my media.

Of course all that is really overkill but, like I said, belt and suspenders.

All my drives are external but, as you know, DrivePool (with duplication on) makes all drives in the pool effectively hot-swappable and an easy upgrade, as you can simply pull the old drive and plug in the new one, and DrivePool will use the duplicate files to refill the pool. In fact, you could be actively streaming when you pull the drive and the worst thing that would happen is your stream would stop and you would have to restart it. But it would pick up where it stopped. If the file you are streaming is not on the drive you pull, you would never know a drive was changed.

cayars
Posted

I use DrivePool as well on my main system but not for the backup drives, as I try to keep them as simple as possible.  I just copy to a \Movies or \TV Shows folder that matches the directory structure as seen from within my DrivePool pool.  My offsite backup drives are essentially my older, smaller drives that got replaced for more storage space.

I'd have no intention of ever connecting my smaller backup drives to the main system and using them that way.  For me they are only for a total-loss type event so I don't lose 20 years of hoarding movies/shows.  I also use SnapRAID with multiple drives to store parity information from my media drives.  So I can lose multiple drives and still be able to rebuild my media or replace a couple of drives.  So SnapRAID sort of works like a local backup (e.g. if I accidentally delete a directory or lose a drive).  I run SnapRAID a couple of times a day.

To me this is the best of all worlds.  Parity locally for drive failure and easy rebuild with offsite total incremental backup.

Gilgamesh_48
Posted
7 minutes ago, cayars said:

I use DrivePool as well on my main system but not for the backup drives, as I try to keep them as simple as possible.  I just copy to a \Movies or \TV Shows folder that matches the directory structure as seen from within my DrivePool pool.  My offsite backup drives are essentially my older, smaller drives that got replaced for more storage space.

I'd have no intention of ever connecting my smaller backup drives to the main system and using them that way.  For me they are only for a total-loss type event so I don't lose 20 years of hoarding movies/shows.  I also use SnapRAID with multiple drives to store parity information from my media drives.  So I can lose multiple drives and still be able to rebuild my media or replace a couple of drives.  So SnapRAID sort of works like a local backup (e.g. if I accidentally delete a directory or lose a drive).  I run SnapRAID a couple of times a day.

To me this is the best of all worlds.  Parity locally for drive failure and easy rebuild with offsite total incremental backup.

That shows there are at least two ways to accomplish nearly the same thing.

I feel that my way is safer because every drive holds the files in a form that can be read without any special software, and because I can so easily recreate my Emby setup just by plugging my backups into a new computer. But if things really get that bad then I will have much more serious issues than my media.

I just hope that neither of us ever really has to test just how good our backups are.

cayars
Posted

Absolutely always multiple ways to skin the cat (poor kitty).

Don't get me wrong, I'd love to have a "plug in" backup but it's just not possible for me at the size I'm at without throwing a lot of money at it.  Right now I typically don't add more drives but replace older smaller drives with bigger ones.  So 3 4TB drives replaced with 1 new 12 TB drive gives me the same space I had plus 2 open drive bays for more storage.

Obviously some of my backup is made up also of 12 TB drives as there is no way to replace only older drives with newer/bigger drives.  But I've got roughly 30 drives online for media while my offline backup probably consists of close to 70 drives.  I could never plug them all in at the same time.

Surely if I hit the lottery or something I could just purchase all new big drives and do that but it's not in the cards for me right now. :)

I make use of my parity setup with SnapRAID at least once a year on average as some drive always craps out each year.  Sort of the law of averages of stuff that happens when you have so many drives online at once.  I have had to resort to my master backup one time in the last 15 years of doing it this way due to a power supply that blew out and fried a bunch of stuff including 6 drives (I think it was).  So I had to replace those drives which I did over a few months time and copied back over the media from my backups.  Knock on wood, I've never had to do a total restore and likely couldn't at one time but would rebuild the media over a couple of years as I couldn't afford new drives and wouldn't want to use my backup drives as then I'd have no backup.

For me, if I did have a total crash and lost all main media, I'd probably go out and pick up 2 or 3 12 TB drives, then selectively restore a few thousand favorite movies and some of my favorite TV shows to binge-watch.  I use the Emby "favorite" feature, so I could easily pull these from the database to create a batch file to restore based on my preference and the size of the files.
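To make the "restore favorites based on preference and size" idea concrete, here is a hedged Python sketch. It does not talk to Emby at all: it assumes the favorites have already been exported somehow as (path, size) pairs, and the `Favorite` class and `pick_restore_set` function are hypothetical names for the example. The selection rule (smallest files first, to fit as many titles as possible) is just one possible preference.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Favorite:
    path: str        # location of the file on the backup drive
    size_bytes: int  # file size


def pick_restore_set(favorites: list[Favorite], capacity_bytes: int) -> list[Favorite]:
    """Choose favorites to restore without exceeding the new drives' capacity.

    Taking the smallest files first fits the largest number of titles.
    """
    chosen = []
    remaining = capacity_bytes
    for fav in sorted(favorites, key=lambda f: f.size_bytes):
        if fav.size_bytes > remaining:
            break  # everything after this is at least as large
        chosen.append(fav)
        remaining -= fav.size_bytes
    return chosen
```

The chosen paths could then be written out as the lines of a copy script, one per file.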

Another way I'd look at things is that fully losing everything would likely mean a fire or similar type of event.  At that point my media wouldn't be very high on the priority list of things to get back in order for at least a year or two.  I could use Netflix and/or Prime to get by if needed in such an event, assuming I even had anything to watch it on.  :)

 

BTW, I'm really not sure how your way is actually different.  I'm just copying stuff from my DrivePool pool to a drive that doesn't use DrivePool (less to go wrong), whereas you "require" DrivePool to read the drive, which makes that a bit funky on your main computer.  Don't know how much you know about how DrivePool works, but if you have a \Movies folder on the drive and add that drive to a DrivePool pool you'll now have something like this:

\Movies

\PoolPart.b9981451-1ef2-46da-ab52-e48465afdd3a

So DrivePool just creates a hidden directory on that drive.  I can then go into a file manager and drag \Movies into the PoolPart folder so it looks like this:

\PoolPart.b9981451-1ef2-46da-ab52-e48465afdd3a\Movies

Now these movies are in the pool.  So I just removed the complexity of having to have another mounted pool and can move the folder into the pool in 10 seconds.  Anything under "\PoolPart.b9981451-1ef2-46da-ab52-e48465afdd3a" becomes part of the pool as stored on that drive, and anything outside that directory is just part of the normal drive and not part of the pool.  You can move things in and out of the pool easily by drag and drop, or even from the command prompt.
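The folder move described above could also be scripted. The following Python sketch only does plain file-system operations and relies on the behavior as described in this post (a single hidden `PoolPart.*` directory at the drive root); the function names `find_poolpart` and `move_into_pool` are invented for the example, and nothing here calls any DrivePool API.

```python
import shutil
from pathlib import Path


def find_poolpart(drive_root: Path) -> Path:
    """Return the single PoolPart.* directory at the root of a drive."""
    matches = [p for p in drive_root.iterdir()
               if p.is_dir() and p.name.startswith("PoolPart.")]
    if len(matches) != 1:
        raise RuntimeError(f"expected one PoolPart.* folder, found {len(matches)}")
    return matches[0]


def move_into_pool(drive_root: Path, folder_name: str) -> Path:
    """Move <drive_root>/<folder_name> under the drive's PoolPart.* folder."""
    target = find_poolpart(drive_root) / folder_name
    if target.exists():
        # Refuse to merge into an existing folder; handle that case by hand.
        raise FileExistsError(target)
    shutil.move(str(drive_root / folder_name), str(target))
    return target
```

Since source and target sit on the same drive, the move is normally a rename and finishes almost instantly regardless of how much media the folder holds.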

BTW, I didn't mention this before, but once I fill up 7 drives in my backup set I put them all in a cabinet that holds 8 drives, add one more drive that's the same size as the largest drive (so I use matched drive sets), and run SnapRAID against just those 7 drives, with the 8th drive holding the parity.  So now I can lose a backup drive as well and still have redundancy on the backup.  Probably overkill, but it makes me feel better knowing I've actually got something of a backup to my backup.

Gilgamesh_48
Posted
6 hours ago, cayars said:

BTW, I'm really not sure how your way is actually different.

You are correct there is really not a lot of difference. I do not use any form of "Raid" as I do not trust them and I keep a fully redundant copy of all my media as a backup so, should a disaster happen, it is about as close to "Plug and play" as I need it to be.

One thing that might make a little difference to our approaches is that my health is "questionable" (I recently had to have a defective kidney removed) and I fill my days with my media. It would be pretty hard on me to not have my media for an extended period of time. I tried to be as complete as possible in my backup strategy. I even have a computer stored in a storage locker with Emby, DrivePool and the other needed programs already installed. I bought it really cheap and it is barely powerful enough for its projected duties, but it will work OK until I can get something better.

I have other things that I have done to minimize the effect of a disaster. It is probably incomplete but it eases my mind and that is important to my well being.

I hope I never need to implement my recovery plan as that would probably put too much stress on me and hurt my already fragile health but having that recovery plan makes me sleep better and that helps quite a bit.

BTW: I do not think having multiple ways to skin a cat is a bad thing. I think "ALF" had the right idea about cats.

"I love defenseless animals, especially in a good gravy."

 

cayars
Posted

Agree and that ALF comment was funny.  Been a while since I've seen that.

Check out SnapRAID as it's not what you think it is.  It's not RAID in the traditional sense at all, and it's not a real-time RAID product.  It's a "batch driven" program that only runs when you tell it to or schedule it.  In a nutshell, it dedicates one or more "extra" discs you have around (which can be external, USB, networked or whatever) to storing only parity info, not media.  So for example you could have 15 discs holding media and add one additional disc that is going to hold the parity.  Now whenever you run SnapRAID it will update the parity info it has on your drives.

The only time the SnapRAID parity drive is used is when you actually run SnapRAID to update its info or do a restore.  So now if you lost a drive you could replace it and rebuild just that one drive from the parity info you have on hand.  It gets really cool the bigger your system becomes, because your drives are all normally formatted drives that can be plugged into any computer and require no special hardware or software (just like DrivePool).  With SnapRAID you could have 25 drives in your system with just one parity drive that gets updated, say, once a day, and could then rebuild any single drive if there was a failure.  You could also dedicate 2, 3 or 4 drives to holding parity.  The number of "parity" drives equals the number of concurrently failed drives the system can rebuild from.  So if you have a 2-drive parity setup you could recover from a 2-drive failure.
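For anyone wondering how one parity drive can rebuild any single failed drive, here is a toy Python sketch of plain XOR parity over byte strings. It is only an illustration of the single-parity principle under simplifying assumptions (each "drive" is a short byte string, shorter ones padded with zeros); it is not SnapRAID's actual code or on-disk format, and the function names are made up.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR byte strings together, padding shorter ones with zero bytes."""
    size = max(len(b) for b in blocks)
    padded = [b.ljust(size, b"\x00") for b in blocks]
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*padded))


def rebuild_missing(parity: bytes, survivors: list[bytes], missing_length: int) -> bytes:
    """Recover one lost block from the parity and all surviving blocks."""
    return xor_blocks([parity, *survivors])[:missing_length]


# Three "data drives" and one parity computed from them.
drives = [b"movie-a", b"tv-show-bb", b"doc"]
parity = xor_blocks(drives)

# Pretend drive 1 died; rebuild it from the parity and the other two.
recovered = rebuild_missing(parity, [drives[0], drives[2]], len(drives[1]))
```

XOR-ing the parity with every surviving drive cancels their contributions and leaves the lost drive's bytes. Surviving more than one simultaneous failure needs additional, differently computed parity, which is why extra parity drives are required for that.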

So what I was saying is that I use SnapRAID on my main system and also create SnapRAID parity for my backups.
