A growing number of businesses whose data centers demand high throughput and low latency have relied on hard disk drives (HDDs) in their servers and now face performance bottlenecks. Today, they see solid-state drives (SSDs) as a viable storage solution that can increase the performance, efficiency, and reliability of their data centers while lowering operating costs (OpEx).
To understand the differences between SSD classes, we first need to distinguish between the two key components of an SSD: the flash storage controller and the non-volatile NAND flash memory used to store data.
In today's market, SSD and NAND flash memory consumption is divided into three main groups:
- Consumer devices (tablets, cameras, mobile phones)
- Client systems (netbook, notebook, ultrabook, AIO, desktop PCs), embedded / commercial (gaming kiosk, purpose-built systems, digital signage)
- Enterprise computing platforms (HPC, data center server)
Choosing the right SSD storage device for a company's data center can be a long and tedious learning process, in which SSDs from a variety of vendors and of different product types need to be tested for suitability, because not all SSDs and NAND flash memories are created equal.
SSDs are manufactured as easy-to-install replacements or supplements for magnetic hard disk drives (HDDs). They come in many different form factors, including 2.5 inches, and with various communication protocols/interfaces, including Serial ATA (SATA), Serial Attached SCSI (SAS) and, more recently, PCIe, to transfer data to and from the central processing unit (CPU) of a server.
Although SSDs are easy to install, there is no guarantee that every SSD will remain suitable over the long term for the application the company selected it for. If SSDs wear out prematurely because they are written to too heavily, deliver significantly lower sustained write performance over their expected lifetime, or introduce extra latency into the storage array and therefore need to be replaced early, the cost of a poorly chosen SSD can often negate all of its original cost savings and performance benefits.
To help you decide on your next purchase of replacement or additional storage for a corporate data center, this article looks at the three key features that distinguish an enterprise-class SSD from a client-class SSD: performance, reliability, and endurance.
Performance
By using a multi-channel architecture and parallel access from the SSD controller to the NAND flash chips, SSDs can achieve very high read and write speeds for both sequential and random data requests from the CPU.
A typical data center workload involves processing large volumes of random company data, such as technical CAD drawings and seismic analysis data (e.g., Big Data), or serving customers around the world who access banking transactions (e.g., OLTP). Storage devices must be accessed with the lowest possible latency, and many customers may need access to the same data at the same time without any increase in response times. The user experience depends on low latency, which in turn increases user productivity.
A client application, by contrast, involves only a single user or application, and the tolerable spread between the minimum and the maximum response time (or latency) for any user or system action is wider.
SSDs with mismatched performance can adversely affect complex storage arrays (such as Network Attached Storage, Direct Attached Storage, or Storage Area Networks) and wreak havoc on the array's latency, its sustained performance, and ultimately the quality of service perceived by users.
Unlike client SSDs, enterprise-class SSDs are optimized not only for peak performance in the first few seconds of access but also for higher sustained performance over longer periods, which they achieve by reserving a larger over-provisioning (OP) area. This ensures that the performance of the storage array stays consistent with the organization's expected quality of service (QoS), even at peak loads. For more information about each drive, visit the Kingston website under Enterprise SSDs.
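To make the over-provisioning idea concrete, the sketch below computes OP as spare capacity relative to user-visible capacity, which is a commonly used definition. The capacities are hypothetical and are not taken from any Kingston datasheet; the function name is also an assumption made for illustration.

```python
def over_provisioning_pct(physical_gb: float, user_gb: float) -> float:
    """Return spare capacity as a percentage of user-visible capacity."""
    if user_gb <= 0 or physical_gb < user_gb:
        raise ValueError("capacities must satisfy 0 < user_gb <= physical_gb")
    return (physical_gb - user_gb) / user_gb * 100.0


# Hypothetical drives built on the same 512 GB of raw NAND (assumed values).
client_op = over_provisioning_pct(512, 500)
enterprise_op = over_provisioning_pct(512, 400)

print(f"client-style OP:     {client_op:.1f}%")
print(f"enterprise-style OP: {enterprise_op:.1f}%")
```

The larger spare area gives the controller more free blocks to work with during sustained writes, which is the mechanism the paragraph above credits for steadier long-term performance.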
Reliability
NAND flash memory has a number of inherent issues. The two most important are its finite life expectancy, because NAND flash cells wear out with repeated writes, and its naturally occurring error rate.
During the manufacturing process, NAND flash is tested at the silicon wafer level and rated with a raw bit error rate (BER or RBER). The BER defines the rate at which naturally occurring bit errors appear in the NAND flash before any Error Correction Code (ECC) is applied. The SSD controller corrects these errors on the fly using advanced ECC (which different SSD controller manufacturers usually call BCH ECC, Strong ECC, or LDPC) without interrupting user or system access.
The ability of the SSD controller to correct these bit errors is expressed by the Uncorrectable Bit Error Ratio (UBER), "a data corruption rate metric corresponding to the number of data errors per bit read after the usage of certain error correction methods". [1]
As defined and standardized by the industry standards body JEDEC in 2010 in the documents JESD218A: Solid State Drive (SSD) Requirements and Endurance Test Method and JESD219: Solid State Drive (SSD) Endurance Workloads, enterprise-class SSDs differ from client-class SSDs in a number of ways, including, but not limited to, their ability to support heavier write workloads, withstand more extreme environmental conditions, and recover from a higher BER than a client SSD. [2] [3]
| Application Class | Workload (see JESD219) | Active Use (powered on) | Data Retention (powered off) | UBER Requirement |
|---|---|---|---|---|
| Client | Client | 40°C, 8 hours/day | 30°C, 1 year | ≤10^-15 |
| Enterprise | Enterprise | 55°C, 24 hours/day | 40°C, 3 months | ≤10^-16 |
Table 1 - JESD218A Solid State Drive (SSD) Requirements and Endurance Test Method
Copyright JEDEC. Reprinted with permission from JEDEC
Under the UBER requirements that JEDEC specifies for SSDs, an enterprise SSD is expected to experience only 1 unrecoverable bit error for every 10 quadrillion bits read (10^16 bits, ~1.11 petabytes), whereas a client SSD is expected to experience 1 unrecoverable bit error for every 1 quadrillion bits read (10^15 bits, ~0.11 petabytes).
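The petabyte figures above can be checked with a few lines of arithmetic. The sketch below converts a UBER value into the expected amount of data read per uncorrectable bit error. The helper names are made up for illustration. The article's figures match binary petabytes (2^50 bytes); in decimal units, 10^16 bits is 1.25 PB.

```python
def bits_to_pib(bits: float) -> float:
    """Convert a bit count to binary petabytes (PiB, 2**50 bytes)."""
    return bits / 8 / 2**50


def bits_read_per_error(uber: float) -> float:
    """Expected number of bits read per uncorrectable bit error."""
    return 1.0 / uber


# UBER limits from the JESD218A table above.
for label, uber in (("client", 1e-15), ("enterprise", 1e-16)):
    bits = bits_read_per_error(uber)
    print(f"{label}: 1 error per {bits:.0e} bits (~{bits_to_pib(bits):.2f} PiB)")
```

An enterprise drive therefore tolerates ten times as much data read per unrecoverable error as a client drive under these limits.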
Kingston Enterprise SSDs also include additional technologies that allow corrupted data blocks to be recovered using parity data stored on other NAND dies. This is similar to a RAID array, in which a failed block can be rebuilt from the parity data stored in other blocks.
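The article does not describe how Kingston's parity scheme works internally. As a generic illustration of the RAID-like idea, the sketch below assumes simple single-parity XOR (as in RAID 5) across hypothetical per-die blocks and rebuilds one lost block from the survivors.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Hypothetical stripe: one data block per NAND die, plus one parity block.
die_blocks = [b"ORDER-01", b"ORDER-02", b"ORDER-03"]
parity = xor_blocks(die_blocks)

# Simulate losing the block on die 1, then rebuild it from the survivors.
survivors = [blk for i, blk in enumerate(die_blocks) if i != 1]
rebuilt = xor_blocks(survivors + [parity])

print(rebuilt == die_blocks[1])
```

Single parity of this kind can rebuild only one missing block per stripe; real controllers may use different or stronger schemes.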
To complement the redundant data recovery technologies in Kingston Enterprise SSDs, periodic checkpointing, Cyclic Redundancy Check (CRC), and ECC error correction are also implemented in an end-to-end internal protection scheme to ensure the integrity of data on its way from the host, through the flash, and back to the host. End-to-end data protection means that data received from the host is checked for integrity when it is stored in the SSD's internal cache and again when it is written to or read from the NAND storage area.
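The end-to-end idea can be sketched as a checksum that is computed once when host data arrives and then re-verified at every hand-off. This is a minimal illustration that assumes CRC-32 from Python's `zlib` module; the stage names and helper functions are hypothetical and do not represent the drive's actual firmware.

```python
import zlib


def tag(payload: bytes) -> tuple[bytes, int]:
    """Attach a CRC-32 computed when the host data first arrives."""
    return payload, zlib.crc32(payload)


def verify(payload: bytes, crc: int, stage: str) -> bytes:
    """Re-check the CRC at a hand-off point; raise if the data changed."""
    if zlib.crc32(payload) != crc:
        raise ValueError(f"integrity check failed at stage: {stage}")
    return payload


data, crc = tag(b"host write")
cached = verify(data, crc, "cache")          # data placed in the drive cache
stored = verify(cached, crc, "nand-write")   # data written to NAND
returned = verify(stored, crc, "nand-read")  # data read back for the host

# A single flipped bit is caught at the next checkpoint.
corrupted = bytes([stored[0] ^ 0x01]) + stored[1:]
try:
    verify(corrupted, crc, "nand-read")
    detected = False
except ValueError:
    detected = True

print(detected)
```

A CRC only detects corruption. Repairing the data is left to the ECC and parity mechanisms described above.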
In addition to the improved ECC protection against bit errors, enterprise-class SSDs also include power-loss detection circuitry that manages the power storage capacitors on the SSD. This hardware powerfail support monitors the incoming power to the SSD and, during an unexpected power loss, temporarily powers the SSD circuits from tantalum capacitors so that pending internal or external writes can complete before the SSD shuts down. Powerfail protection circuitry is typically required for applications where lost data cannot be recovered.
Powerfail protection can also be implemented in the SSD firmware by frequently flushing data held in the SSD controller's cache areas (e.g., the Flash Translation Layer table) to the NAND memory. While this does not guarantee that no data is lost during a power outage, it minimizes the impact of an unsafe power loss. Firmware powerfail protection also makes it unlikely that the SSD becomes inoperable after an unsafe shutdown.
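The trade-off in firmware-based protection can be illustrated with a toy model: the more often the volatile mapping table is flushed, the fewer updates are at risk when power disappears. The class below is a simplified sketch, not real firmware; `MappingCache` and `flush_every` are hypothetical names, and flushing after a fixed number of updates is an assumed policy.

```python
class MappingCache:
    """Toy model of a volatile FTL mapping table that is flushed periodically."""

    def __init__(self, flush_every: int):
        self.flush_every = flush_every  # hypothetical tuning knob
        self.volatile = {}   # logical -> physical mappings held in volatile cache
        self.persisted = {}  # copy of the table that has reached NAND
        self.pending = 0     # updates made since the last flush

    def update(self, logical: int, physical: int) -> None:
        self.volatile[logical] = physical
        self.pending += 1
        if self.pending >= self.flush_every:
            self.flush()

    def flush(self) -> None:
        self.persisted = dict(self.volatile)
        self.pending = 0

    def power_loss(self) -> int:
        """Drop volatile state; return how many updates never reached NAND."""
        lost = self.pending
        self.volatile = dict(self.persisted)
        self.pending = 0
        return lost


cache = MappingCache(flush_every=4)
for lba in range(10):
    cache.update(lba, 1000 + lba)

lost = cache.power_loss()
print(f"updates lost: {lost}, mappings recovered: {len(cache.volatile)}")
```

In this model, at most `flush_every - 1` updates can be lost, and the table that survives is always a consistent earlier snapshot, which mirrors the article's point that the drive stays operable even though some recent data may be gone.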
In many situations, using software-defined storage or server clustering can reduce the need for hardware-based powerfail support, because all data is replicated to a separate, independent storage device on one or more other servers. Web-scale data centers often forgo powerfail support and use software-defined storage on RAID servers to store redundant copies of the same data.
- Kingston Technology
- Uncorrectable Bit Error Rate (UBER), JEDEC Dictionary
- JEDEC Committee, JESD218A: Solid State Drive (SSD) Requirements and Endurance Test Method
Text Copyright: Kingston Technology
In the second and final part of this article, which we will publish next week, we show the differences in endurance between the two SSD classes and give a short summary of the findings.