HPE admits a software bug that led to an accidental wipeout of 77TB of data.
77TB of Research Data Lost Because of HPE Software Update : Read more
What part of "offline backup" was unclear?
Since files that disappeared in the area where the backup ran could no longer be restored, in the future we will not only keep a mirrored backup but also retain incremental backups for a period of time. We will work to improve not only the functionality but also the operational management to prevent a recurrence.
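For what it's worth, here is a minimal sketch of what "retaining incremental backups for some time" can look like in practice: each run writes a new dated snapshot, hard-links files that are unchanged since the previous one, and only prunes snapshots inside the backup root once they age out. The paths and the 14-day retention are made up for illustration; this is not Kyoto University's or HPE's actual setup.

```python
# Sketch of mirroring plus retained incremental backups: each run creates a
# dated snapshot directory, hard-linking unchanged files from the previous
# snapshot, and only deletes snapshots older than the retention window.
# The source tree is never written to or deleted from.
import os
import shutil
import time
from pathlib import Path

SOURCE = Path("/data/research")        # hypothetical source tree
BACKUPS = Path("/backup/incremental")  # hypothetical backup root
RETENTION_DAYS = 14                    # keep two weeks of snapshots

def create_snapshot() -> Path:
    stamp = time.strftime("%Y-%m-%d_%H%M%S")
    snapshots = sorted(p for p in BACKUPS.iterdir() if p.is_dir())
    previous = snapshots[-1] if snapshots else None
    target = BACKUPS / stamp
    for src in SOURCE.rglob("*"):
        if not src.is_file():
            continue
        rel = src.relative_to(SOURCE)
        dst = target / rel
        dst.parent.mkdir(parents=True, exist_ok=True)
        prev = previous / rel if previous else None
        if prev and prev.exists() \
                and prev.stat().st_mtime == src.stat().st_mtime \
                and prev.stat().st_size == src.stat().st_size:
            os.link(prev, dst)      # unchanged since last run: hard link, no extra space
        else:
            shutil.copy2(src, dst)  # new or modified: copy the data
    return target

def prune_old_snapshots() -> None:
    cutoff = time.time() - RETENTION_DAYS * 86400
    for snap in BACKUPS.iterdir():
        # Only ever delete under the backup root, never under SOURCE.
        if snap.is_dir() and snap.stat().st_mtime < cutoff:
            shutil.rmtree(snap)

if __name__ == "__main__":
    BACKUPS.mkdir(parents=True, exist_ok=True)
    print("snapshot written to", create_snapshot())
    prune_old_snapshots()
```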
> What part of "offline backup" was unclear?

Offline backups won't save you when it is your broken backup script that is deleting files instead of actually backing them up.
> Offline backups won't save you when it is your broken backup script that is deleting files instead of actually backing them up.

True.
> Even if you delete data your snapshots should still have the data in them.
> I can only assume they weren't keeping snapshots.

It only says "days of work are gone" from December 14th to 16th. So it sounds like they didn't actually lose that much; they just store WAY too much data. Seriously, what research system puts 30TB in permanent storage per day?
> Even if you delete data your snapshots should still have the data in them.
> I can only assume they weren't keeping snapshots.

That only works when the backup script or software responsible for creating snapshots, or whatever backup strategy they were using, is actually doing its job as intended instead of destroying the files it was meant to preserve.
> It only says "days of work are gone" from December 14th to 16th. So it sounds like they didn't actually lose that much; they just store WAY too much data. Seriously, what research system puts 30TB in permanent storage per day?

Scope of the file loss:
- Target file system: /LARGE0
- File deletion period: December 14, 2021, 17:32 to December 16, 2021, 12:43
- Files affected: files that had not been updated since 17:32 on December 3, 2021
- Amount of data lost: approximately 77TB
- Number of lost files: approximately 34 million
- Affected groups: 14 (of which 4 cannot be restored from backup)
> What about multiple mirrors of your files when it's CRITICAL RESEARCH data? ¯\(ツ)/¯

If it is your backup management script that is destroying source files, having 100 backups wouldn't help, since the files are being deleted as you are attempting to back them up.
> If it is your backup management script that is destroying source files, having 100 backups wouldn't help, since the files are being deleted as you are attempting to back them up.

I am not referring to the usual backup methods here. Multiple mirrors, as in: when they upload a file, it gets uploaded simultaneously to multiple servers. The backup software won't have access to the paths of those files anyway, since it would back up the data on the main server. And again, a double or even triple backup would prevent that too, unless the backup software (for some n00b reason) is given the wrong permissions to do whatever it wants willy-nilly on multiple machines.
> I am not referring to the usual backup methods here. Multiple mirrors, as in: when they upload a file, it gets uploaded simultaneously to multiple servers. The backup software won't have access to the paths of those files anyway, since it would back up the data on the main server.

If HP could screw up a backup script, they could just as easily screw up a mirroring script.
> What about multiple mirrors of your files when it's CRITICAL RESEARCH data? ¯\(ツ)/¯

It doesn't really sound like it's "CRITICAL RESEARCH data". From what I can figure, they've probably already caught up on all the data they lost.
Do they really not have the resources, when even a random crappy file-hosting website has multiple mirrors to store files, and here we are talking about a university doing important research? They should have all those files stored at multiple locations/servers as they get uploaded, and of course the usual double backup of the main server on whatever schedule they have set.
> I am not referring to the usual backup methods here. Multiple mirrors, as in: when they upload a file, it gets uploaded simultaneously to multiple servers. The backup software won't have access to the paths of those files anyway, since it would back up the data on the main server. And again, a double or even triple backup would prevent that too, unless the backup software (for some n00b reason) is given the wrong permissions to do whatever it wants willy-nilly on multiple machines.

That's how you back up a major submission milestone, maybe, or especially finalized research data. But this is just daily intermittent stuff that will presumably get revised anyway in the next semester. A backup script is what you do with that stuff.
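As an aside, the "mirror on upload" idea being debated above amounts to roughly this: every incoming file is written to several independent destinations at upload time, with a checksum check per copy, so no later backup or cleanup job is responsible for the copies existing. The hostnames and paths below are invented for illustration only.

```python
# Sketch of fan-out-on-upload mirroring: one uploaded file is copied to
# several independent storage locations immediately, and each copy is
# verified against the original's checksum.
import hashlib
import shutil
from pathlib import Path

MIRRORS = [
    Path("/mnt/mirror-a"),   # hypothetical mount points for
    Path("/mnt/mirror-b"),   # independent storage servers
    Path("/mnt/mirror-c"),
]

def store_upload(upload: Path, relative_name: str) -> str:
    """Copy one uploaded file to every mirror and return its checksum."""
    digest = hashlib.sha256(upload.read_bytes()).hexdigest()
    for root in MIRRORS:
        dst = root / relative_name
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(upload, dst)
        # Verify each copy so a bad write is caught immediately.
        if hashlib.sha256(dst.read_bytes()).hexdigest() != digest:
            raise IOError(f"mirror copy failed on {root}")
    return digest

if __name__ == "__main__":
    checksum = store_upload(Path("results.csv"), "group14/results.csv")
    print("stored on", len(MIRRORS), "mirrors, sha256", checksum)
```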
> What part of "offline backup" was unclear?
> Offline backups won't save you when it is your broken backup script that is deleting files instead of actually backing them up.

But that costs extra...
> That's why you rotate tapes/backups. It is effective then.

Since it is two specific days of data that were lost, I'd guess it was the NEW DATA that was being destroyed by the borked script. So that data wouldn't exist in any other backups, since it didn't exist yet when the older backups were made and no longer existed for subsequent backups to pick up.
> If it is your backup management script that is destroying source files, having 100 backups wouldn't help, since the files are being deleted as you are attempting to back them up.

The article is unclear, however, on whether it's all their data or just the new data that was created in those couple of days.
> Data isn't safe unless it's in at least 3 different physical locations using at least 2 different automated backup processes.

Data isn't safe when it is your backup system that is destroying data before it can be backed up.
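One hedge against exactly that failure mode is to make any delete step in a backup job refuse to touch anything outside its own destination tree. A minimal sketch with made-up paths; this says nothing about how the actual HPE script was structured.

```python
# Defensive pattern: a cleanup helper that can only delete inside the backup
# root, so a bug cannot turn the backup job into a deleter of source data.
import shutil
from pathlib import Path

BACKUP_ROOT = Path("/backup").resolve()   # the only tree we may delete from

def guarded_delete(path: Path) -> None:
    resolved = path.resolve()
    if BACKUP_ROOT not in resolved.parents and resolved != BACKUP_ROOT:
        raise PermissionError(f"refusing to delete outside {BACKUP_ROOT}: {resolved}")
    if resolved.is_dir():
        shutil.rmtree(resolved)
    else:
        resolved.unlink()

if __name__ == "__main__":
    try:
        guarded_delete(Path("/data/research"))   # outside the backup root
    except PermissionError as err:
        print(err)                               # bug trapped instead of data lost
```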
> That's what a differential rolling tape backup is for. When the backup starts, you take an image of all the file time/date stamps and lock those files until they are backed up (compared against the previous backup). If a file modification is requested, the old file gets copied to special storage until it is archived on tape. With a rolling tape differential backup you are set this way.

If it is the backup script that is screwing up files, your new files are still screwed.
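For illustration only, the copy-on-modify staging idea in that quote could look roughly like this: record every file's timestamp when the backup cycle starts, and park the old version of any file that is about to change in a staging area until it has gone to tape. Real systems do this in the filesystem or the backup agent rather than in application code, and the paths here are invented.

```python
# Sketch of timestamp imaging plus copy-on-modify staging for a rolling
# differential backup cycle.
import shutil
import time
from pathlib import Path

SOURCE = Path("/data/research")      # hypothetical source tree
STAGING = Path("/backup/staging")    # old versions wait here for the tape run

def snapshot_timestamps(root: Path) -> dict:
    """Image of all file modification times at the start of a backup cycle."""
    return {p: p.stat().st_mtime for p in root.rglob("*") if p.is_file()}

def modify_file(path: Path, new_data: bytes, cycle_start: float) -> None:
    """Stage the old version before overwriting, if it predates this cycle."""
    if path.exists() and path.stat().st_mtime <= cycle_start:
        staged = STAGING / f"{path.name}.{int(time.time())}"
        staged.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, staged)   # old contents survive until archived to tape
    path.write_bytes(new_data)

if __name__ == "__main__":
    if SOURCE.exists():
        cycle_start = time.time()
        timestamps = snapshot_timestamps(SOURCE)
        print(f"{len(timestamps)} files recorded at cycle start")
        modify_file(SOURCE / "results.csv", b"new numbers\n", cycle_start)
```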