Best way to back up files daily?

Hello man

I realize the thread title is generic, but I couldn't think of the best way to put this. Basically, I need to copy files from one drive to a network location once a day or so, but not duplicate them. I just want to keep a running offsite copy of files. I work with video/photo so really, it is just redundancy in case something were to happen to my main rig. This is funny because they are literally in the same room, but either way, I feel this system is a bit better than just running redundant drives in the system.

I have been trying to use File History to copy the contents of drive A (in my rig) to drive B (in the 900D based storage server I now own). In the future, I could see my storage needs meaning that my NAS/900D holds the bulk of my files and I just cache them locally, but for now I think keeping a running copy is fine.

The issue is that File History adds timestamps to file names and changes the organization of my files somewhat. The other problem is that I don't want some super clunky third-party program hogging resources just to copy files once a day.

Basically, I just don't know how to do this. I feel like it is probably pretty simple, so any help would be great! I am a full networking noob.
 
Use a backup program which supports differential or incremental backups. These back up only the files which have changed since previous backups, so they take a lot less time to run. There are a lot of free ones, but backing up to a network location is usually disabled in the free versions (though you should check, as there might be a free one with the feature enabled). The popular ones are Macrium Reflect and EaseUS Todo Backup, though I also hear AOMEI and Paragon Backup mentioned frequently. Backup programs store the backups in an archive, which preserves the timestamps. They can be scheduled to run overnight when you're not using your computer.

Your first backup will be a full backup of all your files. On subsequent days, you should run a differential or incremental backup.

Differential backups back up the changes since the full backup. This means a file which was added the day after the full backup will be backed up again every day in each differential backup. So each subsequent differential backup takes more space. The advantage is increased reliability: to restore the entire backup, you just need the full backup + the latest differential backup.

Incremental backups back up the changes since the previous backup. So they require the least space (new files are backed up only once). The drawback is that to do a full restore, you need the full backup + all the incremental backups made since the full backup. So it's a bit less reliable than a differential backup.
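To make the difference concrete, here is a rough Python sketch of which files each backup type would pick up, and which backup files a full restore would need. The `FileInfo` class, the day numbers, and the file names are all made up for illustration; real backup programs track changes in their own ways.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    path: str
    modified_day: int  # hypothetical: day number on which the file last changed


def files_changed_since(files, since_day):
    """Return the paths of files changed after the reference day."""
    return sorted(f.path for f in files if f.modified_day > since_day)


def differential(files, last_full_day):
    # Differential: everything changed since the last FULL backup.
    return files_changed_since(files, last_full_day)


def incremental(files, last_backup_day):
    # Incremental: everything changed since the previous backup of any kind.
    return files_changed_since(files, last_backup_day)


def restore_chain(kind, full, later_backups):
    # Differential restore: the full backup plus only the newest differential.
    # Incremental restore: the full backup plus every incremental since it.
    if not later_backups:
        return [full]
    if kind == "differential":
        return [full, later_backups[-1]]
    return [full, *later_backups]


files = [
    FileInfo("clip_a.mov", modified_day=0),  # already present at the full backup
    FileInfo("clip_b.mov", modified_day=1),  # added the day after the full backup
    FileInfo("clip_c.mov", modified_day=3),
]

# Both runs below happen on day 3; the full backup was taken on day 0.
print(differential(files, last_full_day=0))
print(incremental(files, last_backup_day=2))
print(restore_chain("differential", "full", ["d1", "d2", "d3"]))
print(restore_chain("incremental", "full", ["i1", "i2", "i3"]))
```

Note how clip_b.mov shows up again in the day-3 differential even though it has not changed since day 1, while the incremental run only picks up clip_c.mov.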

If you go with differential backups, I'd suggest doing a full backup once a week (most places do it on weekends), and differential backups every night. If you go with incremental backups, you can go a longer time between full backups, depending on how risk-averse you are. I'd recommend doing the full backup at least once a month.
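As a small illustration of that schedule, the sketch below picks the backup type for a given date. The specific choices (full backup on Sunday for the differential plan, full backup on the 1st of the month for the incremental plan) are assumptions made for the example; any backup program's scheduler would be configured through its own interface instead.

```python
from datetime import date


def backup_type_for(day, strategy="differential"):
    """Pick the backup type to run on a given date.

    Assumed schedule (illustrative only):
      - differential plan: full backup every Sunday, differential otherwise
      - incremental plan: full backup on the 1st of each month, incremental otherwise
    """
    if strategy == "differential":
        # date.weekday() returns 0 for Monday through 6 for Sunday.
        return "full" if day.weekday() == 6 else "differential"
    return "full" if day.day == 1 else "incremental"


print(backup_type_for(date(2024, 1, 7)))                 # a Sunday
print(backup_type_for(date(2024, 1, 8)))                 # a Monday
print(backup_type_for(date(2024, 2, 1), "incremental"))  # first of the month
```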

Assuming your computer and the network backup computer are connected via Gigabit ethernet, if you generate 100 GB of new files every day, an incremental backup should only need about 15 minutes for its daily run, though it can take longer if you want more compression. So the daily backups won't take much time.
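The 15 minute figure can be checked with some quick arithmetic. In the sketch below, the `efficiency` parameter (the fraction of line rate you actually get) is an assumed value, not a measurement; 1 Gbit/s is 125 MB/s at best.

```python
def transfer_minutes(data_gb, link_gbit_per_s=1.0, efficiency=0.9):
    """Estimate the minutes needed to move data_gb gigabytes over a network link.

    efficiency is an assumed fraction of the line rate achieved in practice.
    """
    bytes_per_second = link_gbit_per_s * 1e9 / 8 * efficiency
    return data_gb * 1e9 / bytes_per_second / 60


# 100 GB of new files over Gigabit ethernet
print(round(transfer_minutes(100, efficiency=1.0), 1))  # ideal line rate
print(round(transfer_minutes(100), 1))                  # with the assumed 90% efficiency
```

At the ideal line rate this comes to a bit over 13 minutes, and with some overhead it lands close to 15, assuming the disks on both ends can keep up.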
 


Cool, thanks for the breakdown. I literally didn't know there were different backup types that had names. The more you know.

I have two HP (Intel-based) quad-port gigabit NICs ordered on Amazon. Who knows when they will decide to ship, but I should have 4 gigabit ethernet through teaming between the two rigs. I want it set up for video editing, for when I inevitably need to pull files from the NAS quickly. Very quickly.

I am not too worried about compression. I basically just want the changes I make to files on the drive (an import, etc.) to make their way onto the backup drive sometime that week as raw files, with no special compression or encryption that would prevent me from popping them right into Premiere Pro, Photoshop, etc.

From what you said, I guess an incremental backup is essentially what I am looking for? Just the addition of the new files? Do you have a preferred program?

 
Unless your computer and NAS are both using SSDs, I'm not sure quad-port ethernet cards are necessary. The newest HDDs top out at around 250 MB/s peak. Their slowest speed will be about half that, and the average about 83% of it. Older HDDs are still below 200 MB/s. So bonding two Gigabit ethernet ports should be enough, since that'll give you 125 + 125 = 250 MB/s. If compression is enabled, you won't even come close to that.
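Put as a quick calculation, the transfer runs at the speed of whichever is slower, the disk or the network link. The sketch below assumes, as the figures above do, that bonded ports add up perfectly and that there is no protocol overhead; real teaming setups may not deliver the full sum to a single transfer.

```python
GIGABIT_MB_PER_S = 125  # 1 Gbit/s = 1000 Mbit/s / 8 = 125 MB/s, before overhead


def effective_rate(disk_mb_per_s, bonded_ports):
    """Return the bottleneck rate in MB/s for a disk behind bonded Gigabit ports.

    Assumes the bonded ports add up ideally (an optimistic simplification).
    """
    link_mb_per_s = bonded_ports * GIGABIT_MB_PER_S
    return min(disk_mb_per_s, link_mb_per_s)


for ports in (1, 2, 4):
    print(ports, "port(s):", effective_rate(250, ports), "MB/s")
```

With a 250 MB/s drive, going from two ports to four changes nothing in this model, because the drive is already the limit.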

Incremental is the smallest-size solution. I recommend it for most people because it's easier to predict how much space will be needed for the backups. Differential is a bit more reliable. If your latest differential backup file is somehow corrupted, a file can still be recovered from the previous day's differential backup file, unless it was only backed up in that latest differential backup file. You should keep 2-3 copies of the full backup around as archives too, so even corruption of a full backup file is not catastrophic, and most of your backed-up files will still be recoverable. Which you choose depends on how much backup storage space you have (or are willing to pay for), and how much risk you're willing to accept for backup failures. Most shops I know working with video or photo media back up to both HDD and optical media (Blu-ray or DVD), to avoid a single point of backup failure.
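For a rough feel of the space difference, here is a sketch under some simplifying assumptions that are mine, not measurements: the same amount of new data arrives every day, existing files are never modified or deleted, nothing is compressed, and every differential file is kept until the next full backup.

```python
def incremental_space_gb(daily_new_gb, days):
    # Each incremental holds only that day's new data.
    return daily_new_gb * days


def differential_space_gb(daily_new_gb, days):
    # The differential on day N holds all N days of new data since the full backup.
    return sum(daily_new_gb * day for day in range(1, days + 1))


# Six nightly backups between weekly fulls, 100 GB of new footage per day
print(incremental_space_gb(100, 6))   # 600
print(differential_space_gb(100, 6))  # 2100
```

If you only keep the latest differential, the gap shrinks, but you also lose the fallback to the previous day's file.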

I use Acronis True Image, but that's just because I got a copy for free with an SSD I bought a few years ago. I've installed Macrium Reflect and EaseUS Todo Backup on several clients' computers and they both seem to work fine. They all have free versions, so you can play around with them and see which interface and scheduler you like best. Just bear in mind what I said about network support often being disabled in the free version, so you may have to buy the paid version to get that feature. I didn't include Acronis in the list because they don't have a free version you can try out, though they might have a trial version.
 

Copying won't preserve the timestamps. If you run this script once a week, all the new files from that week will have their timestamps changed on the backup server to the time the script was run.

I almost recommended rsync (which is similar to a copy but preserves timestamps) but decided not to. The problem is with how it confirms a file is unchanged. Because rsync (and copy) operates at the file level, it treats a multiple-file sync as if it were a bunch of single-file syncs. It will read the checksum of a file on the source computer, read the checksum of the same file on the backup device, and compare the two. If they match, it concludes the file is unchanged and moves on. If they don't match, it knows the file has been changed, and copies it to the backup device.

This works fine if there aren't that many files, or if every file has changed. But if you have a lot of files and only a few files have changed, rsync still has to read the checksum of every single file on the backup device. If you've got a lot of small files, this can take a lot of time. When I used rsync to back up my NAS with 9 TB of data, it would take nearly two hours just to confirm there were no new files.
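Here is a minimal Python sketch of the file-by-file checksum comparison described above. It is only an illustration of the idea, not rsync's actual code, and rsync's default behaviour and options may differ; the function names and the temporary folders `drive_a` and `drive_b` are made up for the example.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path):
    """Hash a file's contents in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sync_by_checksum(source, backup):
    """Copy files that are new or whose contents differ.

    Returns (copied_paths, number_of_backup_files_read).
    """
    source, backup = Path(source), Path(backup)
    copied, read_on_backup = [], 0
    for src in sorted(p for p in source.rglob("*") if p.is_file()):
        rel = src.relative_to(source)
        dst = backup / rel
        if dst.exists():
            read_on_backup += 1  # the backup copy has to be read in full to hash it
            if sha256_of(src) == sha256_of(dst):
                continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also tries to keep the modification time
        copied.append(rel.as_posix())
    return copied, read_on_backup


with tempfile.TemporaryDirectory() as tmp:
    drive_a, drive_b = Path(tmp, "drive_a"), Path(tmp, "drive_b")
    drive_a.mkdir()
    (drive_a / "a.txt").write_text("one")
    (drive_a / "b.txt").write_text("two")
    first = sync_by_checksum(drive_a, drive_b)   # everything is new
    second = sync_by_checksum(drive_a, drive_b)  # nothing changed

print(first)
print(second)
```

The second run copies nothing, but it still reads every file on the backup side to confirm that, which is the cost that grows with the number of files.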

The backup programs seem to store these checksums all in one place in the backup file, so they can locate the checksum of any backed-up file much quicker than if they had to search the entire backup drive. They still have to search the source drive for new files, but there seem to be some optimizations there as well. Maybe they check folder timestamps and skip any folders which don't report changes? That's the type of comparison that a file-by-file copy can't do. rsync doesn't know when it was previously run, so it can't exclude folders from the search based on timestamp; it has to read every individual file and compare it to the matching file on the backup device. OTOH a real backup program can read the time of the last backup, and use that to quickly exclude folders which haven't been changed on the source device since then. In my experience the incremental backup programs complete much more quickly than a file-by-file comparison backup like rsync.
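The "everything in one place" idea can be sketched with a small manifest file. This is a guess at the general approach, not how any particular backup program works: the manifest here records each file's size and modification time (a stand-in for the checksums mentioned above), and the function name, the JSON format, and the temporary paths are all invented for the example.

```python
import json
import shutil
import tempfile
from pathlib import Path


def incremental_with_manifest(source, backup, manifest_path):
    """Copy files whose size or mtime differ from the manifest entry.

    The backup device is never read to decide what changed; only the
    source files and the single manifest file are consulted.
    """
    source, backup = Path(source), Path(backup)
    manifest_path = Path(manifest_path)
    manifest = json.loads(manifest_path.read_text()) if manifest_path.exists() else {}
    copied = []
    for src in sorted(p for p in source.rglob("*") if p.is_file()):
        rel = src.relative_to(source).as_posix()
        info = src.stat()
        signature = [info.st_size, info.st_mtime_ns]
        if manifest.get(rel) == signature:
            continue  # unchanged since the last run, according to the manifest
        dst = backup / rel
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
        manifest[rel] = signature
        copied.append(rel)
    manifest_path.write_text(json.dumps(manifest))
    return copied


with tempfile.TemporaryDirectory() as tmp:
    drive_a, drive_b = Path(tmp, "drive_a"), Path(tmp, "drive_b")
    manifest_file = Path(tmp, "manifest.json")
    drive_a.mkdir()
    (drive_a / "a.txt").write_text("one")
    (drive_a / "b.txt").write_text("two")
    run1 = incremental_with_manifest(drive_a, drive_b, manifest_file)
    run2 = incremental_with_manifest(drive_a, drive_b, manifest_file)
    (drive_a / "b.txt").write_text("two, edited")  # size changes, so it is detected
    run3 = incremental_with_manifest(drive_a, drive_b, manifest_file)

print(run1, run2, run3)
```

A size-and-mtime check is cheaper than hashing but can miss an edit that leaves both unchanged, so treat it as a trade-off rather than a drop-in replacement for checksums.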
 


Hey!

So I finally got this thing set up, sort of. I ended up using Synkron for now, with an hourly sync for changed files.

I also got a Dell PowerConnect 2816 off eBay. It is a 16-port L2 managed gigabit switch. My hope was that I would be able to use the quad-port gigabit NICs in my two rigs, creating two aggregate links on the switch and aggregate links on the rigs. So far I have been totally unable to make that work. I have spent about three hours trying to configure the switch. I created aggregate links, but something is VERY wrong: Windows shows about 150 Mbps transfers occurring all the time when the PCs are plugged in, but I can't even see the other computer on the network. I am super confused. If I try to use the quad-port Intel NICs to connect to the internet through the switch, nothing happens. Something very weird is afoot.