You might have heard of Intel Volume Management Device (Intel VMD). It’s a technology built into the processor silicon that enables hot-swapping of NVMe SSDs and management of those drives’ status LEDs.
Big deal, right? Well, if you have more than one NVMe drive, it is a big deal. This post explains why.
Most drives, whether SSDs or HDDs, have two LEDs: an “activity” light and a “status” light. The activity light blinks when the drive is reading or writing. The status light has four states: OK, Fault, Rebuilding, and Locate. The LED blinks or changes color in different patterns to indicate the current state.
With SATA and SAS drives, the host bus adapter (HBA) serves as the control point that manages the LEDs. With NVMe SSDs, the control point is inside the drives. Removing an NVMe SSD therefore removes that drive’s control point, so the system has to rely on other means (such as the PCIe bus driver, the BIOS, the operating system, or the system firmware) to handle storage-bus events. Without a dedicated control point outside the drive, status LED management for NVMe SSDs is unreliable: if a drive fails, it might not operate its LED properly, or at all.
Intel VMD places a control point in the PCIe root complex of servers powered by Intel Xeon Scalable processors. Because that control point sits outside the drives, NVMe SSDs can be hot-swapped and the status LEDs stay reliable even when a drive fails.
Here’s why all of that matters. In business server rooms and data centers, data is stored on multiple drives in a RAID configuration to protect against single-drive failures. Say you have six NVMe drives in a server and need to replace a failed drive. You haven’t lost any data yet because of the RAID. But if you pull the wrong drive, you introduce a second failure, and that can cost you data. How do you know which one to pull if the status LEDs are unreliable or inoperative?
Sure, the system will tell you the number of the drive that has failed (say, drive 5). But which one is drive 5? Did the OEM number the drives from the left or from the right? Starting with 1 or with 0? When you think about it, there are several plausible answers to which one is drive 5, and pulling the wrong drive becomes a real risk. Now compound the problem: you have multiple servers in a rack and multiple racks in an aisle, potentially hundreds of drives, and the status LED takes on new significance.
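To make the ambiguity concrete, here is a minimal Python sketch that lists every physical bay a reported drive number could point to. It assumes a single row of bays and only the two open questions raised above (numbering direction and starting index); the function names `physical_bay` and `candidate_bays` are invented for this illustration and do not come from any vendor tool.

```python
from itertools import product


def physical_bay(reported, bays, left_to_right, zero_based):
    """Return the physical bay (1 = leftmost) that a reported drive number means.

    Hypothetical helper: assumes one row of `bays` slots.
    """
    index = reported if zero_based else reported - 1  # 0-based logical index
    if not 0 <= index < bays:
        raise ValueError("reported number does not exist in this scheme")
    return index + 1 if left_to_right else bays - index


def candidate_bays(reported, bays):
    """Collect the bays the number could mean across all four schemes."""
    found = set()
    for left_to_right, zero_based in product((True, False), repeat=2):
        try:
            found.add(physical_bay(reported, bays, left_to_right, zero_based))
        except ValueError:
            pass  # this numbering scheme has no such drive number
    return found


# "Drive 5" in a six-bay server, counting bays from the left starting at 1.
print(sorted(candidate_bays(5, 6)))  # [1, 2, 5, 6]
```

In this simplified model, "drive 5" could be any of four of the six bays, which is exactly why a number alone is not enough to pull the right drive.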
There’s a simple solution: Intel VMD makes NVMe SSDs’ status LEDs reliable. Want to know which drive has failed? Look for the flashing amber status LED.
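On Linux, status LEDs are commonly driven with the `ledctl` utility from the open-source ledmon package. The Python sketch below only builds such a command line and prints it; it does not run anything. It assumes ledmon is installed and supports your platform, and the device path `/dev/nvme0n1` is a placeholder. The mapping from this post's four states to `ledctl` pattern names is my own assumption for illustration, so check the `ledctl` man page on your system before relying on it.

```python
import shlex

# Assumed mapping from the four states described in this post to ledctl
# pattern names; verify against the ledctl man page on your system.
STATE_TO_PATTERN = {
    "ok": "normal",
    "fault": "failure",
    "rebuilding": "rebuild",
    "locate": "locate",
}


def ledctl_command(state, device):
    """Build (but do not run) a ledctl argument list for one drive."""
    pattern = STATE_TO_PATTERN[state.lower()]
    return ["ledctl", f"{pattern}={device}"]


# Placeholder device path; substitute the drive you actually want to find.
cmd = ledctl_command("locate", "/dev/nvme0n1")
print(shlex.join(cmd))  # ledctl locate=/dev/nvme0n1
```

Running the resulting command typically requires root privileges, and whether the LED responds depends on the server's backplane and firmware.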
With Intel VMD, NVMe SSDs are just as serviceable as SAS and SATA drives, which makes it possible to deploy NVMe drives more strategically across the infrastructure. Intel makes Intel VMD available across the ecosystem, which means that any server vendor can build systems that support the technology.