In its current iteration, my home lab consists entirely of recycled equipment, apart from the rack itself and the two UPSes. (I will be redoing my cable management soon.)
One of the 1TB SAS drives in my NAS had started to report a number of SMART errors. I ordered two drives (one replacement, one cold spare) from the ERA (retail.era.ca). Once they arrived, I put the problematic drive into an offline state, swapped it for the replacement drive, and found that the system did not like the new drive. When I rebooted the system into the controller's config menu, the only option offered for the drive was to locate it.
My next step was to boot back into the system and use the CLI software for the controller (MegaRAID/StorCLI). In StorCLI, the drive came up as Unknown Bad Unsupported. With Google's help, I tried to sanitise/format/delete the drive. Every attempt was met with an “operation not allowed” type of response. I put all of the information into DeepSeek to see if I had missed any options. It guided me through a few more things, to no avail.
I put the drive into my sandbox server and booted it into its controller config menu. The results were the same. Booting back into the OS and trying the same commands that I had tried on the NAS gave the same result. Whenever I read the size of the drive, it reported 0.
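To make the symptom above easier to spot at a glance, here is a minimal Python sketch that scans a StorCLI-style drive listing and flags any drive that is in an unexpected state or reports a size of 0. The sample listing, the model name, and the set of "healthy" states are my own illustrative assumptions, not output captured from my controller; the real column layout may differ between StorCLI versions.

```python
import re
from typing import NamedTuple


class Drive(NamedTuple):
    enclosure: int
    slot: int
    state: str
    size: float
    unit: str


# Matches rows such as "252:1  9 UBUnsp  -  0.000 KB ..." (layout assumed).
ROW = re.compile(
    r"^\s*(\d+):(\d+)\s+\d+\s+(\S+)\s+\S+\s+([\d.]+)\s+([KMGT]B)\b"
)

# Assumption: states treated as "fine" for this sketch; adjust to taste.
HEALTHY_STATES = {"Onln", "UGood", "JBOD", "GHS", "DHS"}


def parse_drives(listing: str) -> list[Drive]:
    """Extract one Drive per data row; header and legend lines are skipped."""
    drives = []
    for line in listing.splitlines():
        m = ROW.match(line)
        if m:
            drives.append(
                Drive(int(m[1]), int(m[2]), m[3], float(m[4]), m[5])
            )
    return drives


def suspect_drives(listing: str) -> list[Drive]:
    """Return drives in an unexpected state or reporting zero capacity."""
    return [
        d for d in parse_drives(listing)
        if d.state not in HEALTHY_STATES or d.size == 0
    ]


# Hypothetical listing, shaped like a per-drive "show" table.
SAMPLE = """\
EID:Slt DID State  DG       Size Intf Med SED PI SeSz Model        Sp Type
252:0     7 Onln    0 930.390 GB SAS  HDD N   N  512B EXAMPLE-1TB  U  -
252:1     9 UBUnsp  -   0.000 KB SAS  HDD N   N  512B EXAMPLE-1TB  U  -
"""

if __name__ == "__main__":
    for d in suspect_drives(SAMPLE):
        print(f"{d.enclosure}:{d.slot} state={d.state} size={d.size} {d.unit}")
```

In practice the text would come from saving the controller's drive listing to a file and passing its contents to `suspect_drives`; the parsing is kept separate from any command invocation so it can be tried without touching the hardware.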
At this point, I was running out of ideas, and I was contemplating contacting the ERA for a replacement. I had also tried the same commands on both drives.
As I was working with DeepSeek to troubleshoot, it kept saying that the problem could be metadata on the drive or in the controller that was bad or left over from the drive's former host. The commands I had been running were all attempts to wipe the drive fully clean.
In my pondering, I remembered that I had flashed one of the controllers on my offline server from IR mode to IT mode. I powered on the server with the drives in it and went into the controller's config menu. Both drives came up as 0 in size. This time, though, one of the options available was format. I formatted the first drive, and about 6 hours later, the drive came up with its proper size. I removed the first drive and put it into my NAS. The controller accepted the drive, I added it back to the ZFS pool, and the resilvering process started. My data was safe, and I had learned a lot from this process.
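For anyone who wants to keep an eye on the resilver without staring at the terminal, here is a small Python sketch that classifies the `scan:` line of `zpool status` output. The pool name `tank` and the sample text are hypothetical, and the exact wording of the `scan:` line is an assumption that may vary between ZFS versions.

```python
def resilver_state(status_text: str) -> str:
    """Return 'in_progress', 'done', or 'none' based on the scan: line."""
    for line in status_text.splitlines():
        line = line.strip()
        if not line.startswith("scan:"):
            continue
        if "resilver in progress" in line:
            return "in_progress"
        if "resilvered" in line:
            return "done"
    return "none"


# Hypothetical excerpts shaped like `zpool status` output.
RUNNING = """\
  pool: tank
 state: DEGRADED
  scan: resilver in progress since Sun Jan  1 10:00:00 2023
"""

FINISHED = """\
  pool: tank
 state: ONLINE
  scan: resilvered 512G in 05:10:22 with 0 errors on Sun Jan  1 15:10:22 2023
"""

if __name__ == "__main__":
    print(resilver_state(RUNNING))
    print(resilver_state(FINISHED))
```

The function only reads text, so the status output can be captured however is convenient and passed in as a string.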
My takeaways:
Even with a format, a drive can still maintain some metadata.
For a drive to be a part of a JBOD array, it has to be FULLY wiped clean.
For MegaRAID controllers, you can use the StorCLI command to administer the drives/controllers/enclosures.
You can gain a lot of information about the controller/enclosure/drives from the StorCLI command.
There are times when a controller in IT mode can perform better than a card in IR mode.
AI can be a good assistant when diagnosing issues, especially when you are getting tired and/or in need of a second opinion (while being very diligent about not sharing personal/confidential information).
Some issues take some persistence to figure out the solution.
I enjoy figuring out technical issues, and learning at the same time.