NVMe Blurs The Lines Between Memory And Storage

The history of storage devices is quite literally a race between the medium and the computing power as the bottleneck of preserving billions of ones and zeros stands in the way of computing nirvana. The most recent player is the Non-Volatile Memory Express (NVMe), something of a hybrid of what has come before.

The first generations of home computers used floppy disk and compact cassette-based storage, but gradually, larger and faster storage became important as personal computers grew in capabilities. By the 1990s hard drive-based storage had become commonplace, allowing many megabytes and ultimately gigabytes of data to be stored. This would drive up the need for a faster link between storage and the rest of the system, which up to that point had largely used the ATA interface in Programmed Input-Output (PIO) mode.

This led to the use of DMA-based transfers (UDMA interface, also called Ultra ATA and Parallel ATA), along with DMA-based SCSI interfaces over on the Apple and mostly server side of the computer fence. Ultimately Parallel ATA became Serial ATA (SATA) and Parallel SCSI became Serial Attached SCSI (SAS), with SATA being used primarily in laptops and desktop systems until the arrival of NVMe along with solid-state storage.

All of these interfaces were designed to keep up with the attached storage devices, yet NVMe is a bit of an odd duck considering the way it is integrated in the system. NVMe is also different for not being bound to a single interface or connector, which can be confusing. Who can keep M.2 and U.2 apart, let alone which protocol the interface speaks, be it SATA or NVMe?

Let’s take an in-depth look at the wonderful and wacky world of NVMe, shall we?

Deceiving Appearances

NVMe Blurs The Lines Between Memory And Storage
Diagram of the elements in SATA Express, functionally similar to M.2.

Ask anyone what an NVMe slot on a computer mainboard looks and they’ll be inclined to show you a picture of an M.2 slot, as this has become the most used standard within consumer electronics for solid-state storage devices. Yet even an M.2 slot with a solid-state drive (SSD) in it may not even be an NVMe slot or SSD, since SATA also uses this interface.

Often the mainboard’s silkscreen next to the M.2 slot will mention what technology an M.2 slot can accept. Checking the manual for the board in question is a good idea too. The reason for this confusion is that originally there was a Mini-SATA (mSATA) standard for SSDs which used the PCIe Mini Card form factor, which evolved into the M.2 form factor as well as the U.2 interface. The latter is more akin to the SATA and SAS interfaces, combining both SATA and PCIe channels into a single interface for connecting SSDs.

Meanwhile, the M.2 standard (after a brief detour with the short-lived SATA Express standard) was extended to support not only SATA, but also AHCI and NVMe. This is why M.2 slots are often (improperly) referred to as ‘NVMe slots’ when in reality NVMe is a PCIe-based protocol and defines no physical form factor or connector type.

M.2 interface in ‘B’ and ‘M’ keying.

Meanwhile the M.2 form factor is by itself rather versatile, or convoluted depending on how you look at it. In terms of physical size it can have a width of 12, 16, 22 or 30 millimeters, while supporting lengths between 16 and 110 mm. On its edge connector interface a pattern of notches is used to indicate its functionality, which is matched on the M.2 slot itself. Most common are the ‘B’ and ‘M’ key ID notches in the Key ID list, which contains e.g.:

  • A: 2x PCIe x1, USB 2.0, I2C and DP x4.
  • B: PCIe x2, SATA, USB 2.0/3.0, audio, etc.
  • E: 2x PCIe x1, USB 2.0, I2C, etc.
  • M: PCIe x4, SATA and SMBus.

This means that the physical dimensions of an M.2 expansion card can be any of 32 different permutations, before the 12 possible Key ID notch permutations are included. Fortunately in mainstream use the industry appears to have standardized for storage cards on 22 mm wide connectors with any of a fairly limited number of lengths, resulting in NVMe SSD identifiers such as ‘2242’, which would be a 22 mm wide and 42 mm long card. Meanwhile SSD cards can be keyed for B, M or both.

Important to note here is that by now M.2 slots are commonly used as essentially PCIe expansion slots for situations where space matters. This is why WiFi cards are also commonly found in M.2 form factor.

Defining NVMe

This then lands us at the essential definition of what NVMe is: a standard interface for storage devices that are directly connected to PCIe. What makes NVMe so different from SATA is that the latter translates the PCIe protocol to the SATA protocol, which then has to be interpreted by a chip on the storage device that speaks SATA before any storage-related commands can be executed.

Instead, NVMe defines the interface which can be used directly by any operating system which has an NVMe driver. Commands are sent to the NVMe storage device, which in turn executes those commands to read or write data, or perform certain maintenance operations, like TRIM. As one can safely assume that any storage device that identifies itself as an NVMe device is a solid-state storage device (NAND Flash, 3D XPoint, etc.), this means that the NVMe protocol is designed around the assumption of a low-latency, high burst rate storage device.

Intel’s 3D XPoint-based Optane SSD has very consistent performance regardless of the task load. (Credit: Tom’s Hardware)

Recently, a feature of NVMe called Host Memory Buffer (HMB) has become popular as a way to skip the need for a DRAM buffer on NAND Flash-based SSD. This feature uses part of the system RAM as a buffer, with relatively little loss in performance, with the buffer used primarily for an address mapping table cache.

As storage solutions keep evolving, new storage technologies such as 3D XPoint have made even such features already irrelevant in the long term, as 3D XPoint’s access speed is closer to that of a DRAM buffer than that of NAND Flash. As 3D XPoint SSDs do not have a DRAM buffer, the increased use of such SSDs may result in NVMe being optimized for such storage devices instead.

Hacking NVMe

64×64 (4 kb) magnetic core memory.

At some point one has to wonder what you can actually do with NVMe, beyond buying NVMe SSDs and putting them into an M.2 B and/or M keyed slot on a mainboard. Here you should probably consider whether you’re more interested in hacking some solid-state storage together (if even just in the form of some DRAM or SRAM), or whether the M.2 slot itself is more intriguing.

Where full-sized PCIe slots are fairly big and the expansion cards offer a lot of space for big, clunky components like massive BGA chips and gigantic cooling solutions, M.2 expansion cards are intended to be small and compact, allowing them to fit in laptops. One could, say, combine an FPGA with the requisite SerDes and PCIe hardware blocks with an M.2 form factor PCB to create a compact PCIe expansion card for use in laptops and embedded applications.

Some recent hacks promise to add NVMe support to Raspberry Pi compute modules, replacing the SSD in a Pinebook Pro with a WiFi card, and reading out an iPhone’s NVMe Flash storage module using a ZIF adapter for PCIe.

Which is not to say that there is anything against someone trying to combine something as silly as making an NVMe storage drive using, say, core memory 🙂

Wrapping Up

Looking back on a few decades of computing, there has always been this distinction between system memory and storage, with the former being fast SRAM or DRAM volatile RAM. This distinction has decreased significantly in recent years. While NAND Flash-based storage along with NVMe means that we have the potential for extremely low latency and gigabytes per second of data transfer (especially with PCIe 4.0 NVMe), this is not the end of the story.

The newest thing appears to be the use of so-called ‘Persistent Memory’ DIMMs in regular system memory slots. These use Intel’s Optane solid-state storage, allowing for system RAM to be increased by up to 512 GB per module. Naturally, these modules currently only work in Intel (server) boards which work with these special memory modules. Their use is in buffering for example databases, where the large size would make an in-memory buffer prohibitively expensive or impractical (e.g. multiple TBs of DDR4 DIMMs).

By having what is essentially very fast, persistent storage directly on the CPU’s memory controller, latency is reduced to the absolute minimum. Even though 3D XPoint (as a form of Phase Change Memory) is as of yet not as fast as DDR SDRAM, it may show us a glimpse of what may lie beyond NVMe, with conceivably the difference between ‘system RAM’ and ‘storage’ completely obliterated or changed beyond recognition.

By admin