Rebuild fails to complete

This case is mostly about maintenance: a Seagate BlackArmor 220 fails to complete a resync.

I have a Seagate BlackArmor 220 which keeps trying to repair a volume after a hard drive failed as part of RAID1. I replaced the disk, added the member drive, ran the S.M.A.R.T. checks on the drives, and started to recover … it gets to around 80% and then restarts. There are no error messages, and this has been looping for about a week now…

Any ideas as to why it won't complete the recovery process and rebuild the RAID volume? Or how I can troubleshoot it?

This one is easy. The NAS is still readable, so the troubleshooting consists of five steps (a sketch of the backup step follows the list):

  1. Back up data from NAS to some other location.
  2. Destroy RAID1 in NAS.
  3. Replace disk which is still not replaced.
  4. Reconfigure RAID1 and wait for resync.
  5. Restore from backup.
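
Step 1 is the critical one; everything after it is routine. As an illustration only, here is a minimal Python sketch of the backup step, assuming the NAS share is already mounted at /mnt/nas and /backup/nas is a local target with enough free space (both paths are placeholders):

    # Minimal backup sketch: mirror the mounted NAS share to local storage.
    # SRC and DST are placeholders -- adjust for your setup.
    import shutil
    import sys

    SRC = "/mnt/nas"      # mounted NAS share (placeholder)
    DST = "/backup/nas"   # local backup target (placeholder)

    try:
        # dirs_exist_ok lets an interrupted copy resume into the same tree
        shutil.copytree(SRC, DST, dirs_exist_ok=True)
    except shutil.Error as errors:
        # copytree collects per-file failures instead of stopping at the
        # first; a disk with bad sectors typically shows up here as read errors
        for src, dst, why in errors.args[0]:
            print(f"failed: {src}: {why}", file=sys.stderr)

If read errors pile up during this copy, that by itself points at unreadable sectors on the remaining original disk.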

The most likely problem is that the second (non-replaced) disk has developed bad sectors of its own; a SMART test can sometimes miss a bad sector, or the interpretation of SMART data, either by the customer or by the NAS firmware, is excessively tolerant. The inability to complete the rebuild is a more significant observation than a SMART test result.
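
To verify that suspicion directly, a full sequential read of the raw device will stumble over bad sectors that a short SMART self-test can miss. A rough sketch, assuming a Linux host and a placeholder device node /dev/sdb (read-only, but requires root):

    # Full-surface read test: any I/O error along the way marks an
    # unreadable region. DEVICE is a placeholder for the suspect disk.
    import sys

    DEVICE = "/dev/sdb"     # placeholder device node
    CHUNK = 1024 * 1024     # read 1 MiB at a time

    bad = 0
    with open(DEVICE, "rb", buffering=0) as disk:
        offset = 0
        while True:
            try:
                data = disk.read(CHUNK)
            except OSError as e:
                # an I/O error here usually means a bad sector
                print(f"read error at offset {offset}: {e}", file=sys.stderr)
                bad += 1
                offset += CHUNK
                disk.seek(offset)   # skip past the bad region and continue
                continue
            if not data:
                break
            offset += len(data)

    print(f"scan finished, {bad} unreadable chunk(s)")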


Rebuild goes wrong

Everything looked nominal. This case is a rather unexpected example of a routine operation going belly up: the NAS owner ends up looking for recovery options after a RAID 5 crash.

Recently our quite old NAS, an Iomega StorCenter 200rl with a RAID 5 out of 4x500GB, reported a hard disk failure, and our administration replaced the disk that was marked. Then the StorCenter reported that it was synchronizing the data. After synchronization the NAS said that 100% of the storage capacity is free, and so it says until now… I am interested in options to recover the data.

It certainly does not look like something was done wrong; if the NAS had sensed that the rebuild could not be completed, it would have refused to even start the process. Likewise, with a wrong disk replaced, the rebuild does not happen at all. In this case, however, the rebuild completed with no reported anomaly – something else went wrong.

With something unknown going wrong, it is difficult to predict whether the case is recoverable. One can give our Home NAS Recovery a spin, but there is no guarantee of success. Cases where a rebuild goes wrong for no apparent reason are always dicey.


IX4 falling apart

This documents a typical sequence of multiple drives failing in a RAID5, this time in an Iomega ix4-200d.

I’ve received an automatic email from the dashboard saying

Data protection is being reconstructed. Data is available during this operation, however performance may be degraded.

After that, the NAS started ‘Data recovery procedure’ [and then came another message]

Drive number 1 encountered a recoverable error.

[and the] NAS started the recovery procedure from scratch. Even though that mentioned drive had failed, everything worked fine until yesterday [when a] new message … said

Storage failed and some data loss may have occurred. Multiple drives may have either failed or been removed from your storage system. Visit the Dashboard on the management interface for details.

Is there any way to recover at least some data if it is the NAS that failed?

Depending on the condition of the disks, Home NAS Recovery may or may not be able to extract data from them.

Maybe there was a spare involved, as “data protection is being reconstructed” in a RAID5 can only refer to a rebuild of the array. The rebuild happens either when a defective disk is replaced or when a hot spare kicks in after one of the active array disks fails. There is another variation on that tune, not really obvious: a transient failure causes one of the disks to drop out of the array momentarily, then report back online and be accepted back into the array.

Anyhow, while the rebuild is in progress, it turns out that a second drive in the array is unreadable. This halts the rebuild. The error is at first deemed recoverable, and the rebuild is retried. However, the error recovery is not successful, and the second disk (#1) drops offline. With two drives offline, the data is no longer accessible.
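
As a toy illustration of why the second failure is fatal: a RAID5 rebuild recomputes the missing member as the XOR of the surviving members, stripe by stripe. This is the underlying arithmetic, not the NAS firmware's actual code:

    # RAID5 parity arithmetic in miniature: any single missing block
    # is the XOR of all the others.

    def xor_blocks(blocks):
        """XOR a list of equal-length byte blocks together."""
        out = bytearray(len(blocks[0]))
        for block in blocks:
            for i, b in enumerate(block):
                out[i] ^= b
        return bytes(out)

    # three data blocks and their parity, as on a 4-disk RAID5 stripe
    d0, d1, d2 = b"AAAA", b"BBBB", b"CCCC"
    parity = xor_blocks([d0, d1, d2])

    # lose any single member, say d1, and it is recoverable...
    assert xor_blocks([d0, d2, parity]) == d1

    # ...but with a second member unreadable mid-rebuild, the XOR no
    # longer has enough inputs -- the "two drives offline" state above.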

For our data recovery software, the logical reconstruction is not a problem, provided the disks are still readable enough. Readability, though, may be a problem. In the worst case, the disks need to be cloned to blank new disks, and the clones then used for recovery. The NAS will not accept the clones because it has already recorded the original disks as “failed”, and cloning the entire disk content also clones the “failed” marks for the respective disks.
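
Cloning here means a raw, block-level copy that skips unreadable regions, the idea behind tools such as GNU ddrescue. A bare-bones sketch of that loop, with placeholder device paths (reversing source and destination would overwrite the failing disk, so treat this strictly as an illustration):

    # Error-skipping clone sketch: copy a failing disk to a blank disk of
    # equal or larger size, zero-filling regions that cannot be read so
    # that all offsets stay aligned. SRC and DST are placeholders.
    import sys

    SRC = "/dev/sdb"    # failing member disk (placeholder)
    DST = "/dev/sdc"    # blank destination disk (placeholder)
    CHUNK = 64 * 1024

    with open(SRC, "rb", buffering=0) as src, open(DST, "wb", buffering=0) as dst:
        offset = 0
        while True:
            try:
                data = src.read(CHUNK)
            except OSError:
                # unreadable region: substitute zeros and move on
                print(f"skipping bad region at {offset}", file=sys.stderr)
                data = b"\x00" * CHUNK
                offset += CHUNK
                src.seek(offset)
                dst.write(data)
                continue
            if not data:
                break
            dst.write(data)
            offset += len(data)

A real tool also retries, keeps a log of bad regions, and handles the end of the device gracefully; the sketch only shows the core idea.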