Fixing an array with mdadm goes wrong

Now this is complex.

I have an mdadm-created RAID5 array consisting of 4 discs. One of the discs was dropping out, so I decided to replace it. Somehow, this went terribly wrong and I foolishly succeeded in marking two of the (wrong) drives as faulty, and then re-adding them as spares.

Now the array is (logically) no longer able to start:

mdadm: Not enough devices to start the array.

Degraded and can’t create RAID,auto stop RAID [md1]

As I don’t want to ruin the maybe small chance I have left to rescue my data…

RAID5 tolerates the loss of only one member, so if you fail two, the array goes down. Worse yet, once this happens, it stays down: you can't tell mdadm to accept the spares back as active members in the normal way. In theory, some more fiddling with mdadm can force the array back into shape, but I doubt that is safe to attempt in a DIY environment. If your unit is still under warranty (this particular case was with Thecus), then by all means open a ticket and ask them to fix the issue; they are pretty good with mdadm. If the case is beyond Linux repair, fall back on our Home NAS Recovery; we are pretty good too.
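To see why two failed members are one too many, here is a minimal Python sketch of RAID5-style XOR parity on a single stripe. It is a toy model, not how mdadm lays out data on disc: the block contents, the fixed parity position and the helper names xor_blocks and rebuild are all assumptions made for illustration.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe of a hypothetical 4-disc RAID5: three data blocks plus parity.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)
stripe = data + [parity]

def rebuild(stripe, missing):
    """Rebuild the block at index `missing` from the three surviving blocks."""
    survivors = [blk for i, blk in enumerate(stripe) if i != missing]
    return xor_blocks(survivors)

# One member lost: XOR of the remaining three blocks gives it back.
recovered = rebuild(stripe, 1)

# Two members lost (indices 1 and 2): the survivors only reveal the XOR
# of the two missing blocks, not the blocks themselves.
two_lost = xor_blocks([stripe[0], stripe[3]])
```

With one member missing, recovered equals the original block. With two missing, two_lost is only the XOR of the two lost blocks, which is not enough to separate them. That is the arithmetic behind "Not enough devices to start the array".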
