How to Sustain Performance for NVMe Drives Under Thermal Stress Conditions
NVMe drives are major disruptors in flash storage technology, offering unprecedented speed and performance in either the ultra-slim M.2 or the U.2 form factor. Breaking past the 6 Gb/s cap on Serial ATA (SATA) transfer rates, NVMe drives use the PCI Express (PCIe) interface, which connects directly to the CPU, resulting in 4-6X the speed of SATA in random workloads.
This leap in speed reduces latency, enables faster access, and delivers more input/output operations per second (IOPS) than interfaces originally designed for mechanical storage devices.
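The interface ceilings behind this comparison can be checked with simple arithmetic. The sketch below is a back-of-the-envelope calculation, not a benchmark: it assumes a PCIe Gen3 x4 link (the article does not state which PCIe generation is meant) and accounts only for line encoding, ignoring protocol overhead.

```python
# Back-of-the-envelope interface bandwidth after line-encoding overhead.
# Assumption: the NVMe drive uses a PCIe Gen3 x4 link.

def usable_mb_per_s(gigatransfers_per_s, lanes, payload_bits, encoded_bits):
    """Return usable bandwidth in MB/s (1 MB = 10**6 bytes)."""
    bits_per_s = gigatransfers_per_s * 1e9 * lanes * payload_bits / encoded_bits
    return bits_per_s / 8 / 1e6


# SATA III: 6 Gb/s line rate, one lane, 8b/10b encoding.
sata3 = usable_mb_per_s(6.0, 1, 8, 10)
# PCIe Gen3: 8 GT/s per lane, four lanes, 128b/130b encoding.
pcie3_x4 = usable_mb_per_s(8.0, 4, 128, 130)

print(f"SATA III     : {sata3:7.0f} MB/s")
print(f"PCIe Gen3 x4 : {pcie3_x4:7.0f} MB/s")
print(f"Link ratio   : {pcie3_x4 / sata3:.1f}x")
```

SATA III works out to about 600 MB/s and PCIe Gen3 x4 to a little under 4,000 MB/s. This is a link-level ceiling only; the speedup actually observed depends on the workload, the controller, and the NAND flash.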
With the increase in speed came overheating issues, which are exacerbated because NVMe drives are typically installed in compact embedded systems that are often fanless or have minimal airflow for heat dissipation. Overheating has adverse effects on the NVMe drive's data integrity, endurance, and retention capabilities. Sustained heat weakens the tunnel oxide of the flash cells, allowing electrons to leak out, so the drive degrades quickly. This, in turn, results in higher bit error rates and more uncorrectable errors.
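To give a feel for how strongly temperature can affect retention, the sketch below uses the Arrhenius model, a general-purpose way of estimating how much faster a thermally activated process runs at a higher temperature. It is an illustration only: the 1.1 eV activation energy is an assumed value, not a figure from ATP, and real NAND flash behavior depends on the cell technology and wear level.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def arrhenius_acceleration(use_c, stress_c, activation_energy_ev=1.1):
    """Acceleration factor of a thermally activated process at stress_c vs. use_c.

    activation_energy_ev is an assumption; 1.1 eV is used here for illustration.
    """
    use_k = use_c + 273.15
    stress_k = stress_c + 273.15
    exponent = activation_energy_ev / BOLTZMANN_EV_PER_K * (1.0 / use_k - 1.0 / stress_k)
    return math.exp(exponent)


# Compare a few hypothetical operating temperatures against a 40 C baseline.
for hot_c in (55, 70, 85):
    factor = arrhenius_acceleration(40, hot_c)
    print(f"{hot_c} C vs. 40 C: charge loss roughly {factor:.0f}x faster")
```

Under these assumptions, a few tens of degrees of extra heat translates into charge loss that is faster by one or more orders of magnitude, which is why sustained high temperature matters for retention.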
This article explores the thermal management challenges for NVMe drives and presents ATP ELECTRONICS's Customizable Thermal Management Solution based on different application needs, system mechanical design, and other important considerations.
Applications Requiring Thermal Management
Due to its fast transfer rates, NVMe storage is gaining adoption in applications where microseconds count, such as those involving real-time customer interactions, time-critical data analytics, and more. In many of these scenarios, the devices are installed in enclosures with little or no airflow and are constantly subjected to intense workloads under harsh conditions. Multiple die stacking per integrated circuit (IC) and dense component placement in the limited printed circuit board (PCB) space, especially for double-sided designs, also contribute to the overheating issue.
Thermal management is therefore critical to sustain performance stability during operation at high temperatures.
Applications requiring thermal management
The following table shows possible scenarios with thermal and airflow conditions that need to be addressed.
*LFM: Linear Feet per Minute
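Because airflow is quoted in LFM, it can help to convert it to SI units when comparing against fan or chamber specifications. The snippet below is a small helper for that conversion; the sample LFM values are hypothetical and are not taken from the table.

```python
FEET_TO_METRES = 0.3048  # exact, by definition of the international foot


def lfm_to_m_per_s(lfm):
    """Convert air velocity from linear feet per minute to metres per second."""
    return lfm * FEET_TO_METRES / 60.0


def lfm_to_cfm(lfm, duct_area_sq_ft):
    """Volumetric flow (cubic feet per minute) through a given cross-section."""
    return lfm * duct_area_sq_ft


# Hypothetical airflow levels, from a fanless enclosure to forced air.
for lfm in (0, 100, 200, 400):
    print(f"{lfm:4d} LFM = {lfm_to_m_per_s(lfm):.3f} m/s")
```

As a rule of thumb from this conversion, 100 LFM is about 0.5 m/s.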
ATP ELECTRONICS's Customizable Thermal Management Solution
ATP ELECTRONICS recognizes that thermal challenges are unique to each use case and scenario; hence, a "one-size-fits-all" approach may not be the most suitable. To meet a customer's specific thermal requirements, ATP offers a holistic and customizable solution that combines firmware and hardware technologies.
The process hinges on extensive collaboration with customers and is summarized in these four steps:
1. ASSESSMENT
Joint Validation for Thermal Management
ATP ELECTRONICS works with system developers to overcome the challenges unique to each specific case. By understanding the performance criteria, user application, and system specifications (including, but not limited to, temperature, workload, airflow, and mechanical design), ATP ELECTRONICS can customize an NVMe solution for the customer.
An important part of assessing heat dissipation is taking a close look at the mechanical design within the system. How much space is available for heatsink solutions? How can we make sure that no mechanical interference happens among all the components of the system printed circuit board?
The system's mechanical design may not have considered a heatsink solution in the beginning. This is why it is important to examine the available space around the NVMe SSD as well as possible mechanical interferences that may happen.
2. SIMULATION
Influence of Air Inlet/Outlet and SSD Location
Since airflow may vary depending on the fan and drive location, simulation tests are also performed using a proprietary ATP-built mini chamber to recreate, as closely as possible, the thermal environment described in the customer's profile. Airflow capability, SSD location, and the performance required of the SSD given its distance from the air inlet are among the factors considered. Necessary adjustments are then made to arrive at the solution that best meets the requirements.
The proprietary ATP-built mini chamber (Generation 2) is used to simulate and adjust thermal environments based on the customer's profile.
A pure hardware simulation test based on full-speed operation, which is the worst-case scenario, is conducted using the Cadence Simulation system. This gives hardware engineers insight into the heat distribution in each PCB layer, as well as the potential risk of heat accumulating in particular areas. Adjustments can then be made to the circuit layout, wire thickness, the quantity and position of through-holes, and other design parameters.
An example of a heat distribution simulation result for a PCB's top layer
3. CUSTOMIZATION
Thermal Management Consideration: Which Heatsink Fits the Mechanical Design?
ATP ELECTRONICS's customized thermal management solution consists of both firmware and hardware components:
Adaptive Thermal Control through the ATP ELECTRONICS Dynamic Thermal Throttling Mechanism
This mechanism balances performance against temperature instead of cutting performance dramatically. Temperature sensors continuously monitor the device temperature. When the temperature rises, the firmware (FW) reduces performance gradually rather than abruptly, bringing the temperature back under control without a sudden drop in throughput.
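The difference between an abrupt and a gradual throttling policy can be illustrated with a toy simulation. The sketch below is not ATP's firmware algorithm, which the article does not disclose. It assumes a simple first-order thermal model, and every constant, as well as the `step_policy` and `gradual_policy` functions, is hypothetical and chosen only for illustration.

```python
# Toy comparison of step vs. gradual thermal throttling.
# Every constant here is a hypothetical value chosen for illustration.

AMBIENT_C = 60.0          # air temperature inside the enclosure
R_THETA_C_PER_W = 6.0     # thermal resistance, drive to ambient
HEAT_CAP_J_PER_C = 10.0   # lumped heat capacity of the drive
IDLE_W, FULL_W = 1.0, 6.0 # power draw at zero and at full performance
LIMIT_C = 85.0            # temperature the drive should stay under
MIN_PERF = 0.2            # deepest throttle level (fraction of full speed)


def step_policy(temp_c, perf):
    """Abrupt throttling: drop hard at the limit, resume 5 C lower."""
    if temp_c >= LIMIT_C:
        return MIN_PERF
    if temp_c <= LIMIT_C - 5.0:
        return 1.0
    return perf  # inside the hysteresis band: keep the current level


def gradual_policy(temp_c, perf, start_c=75.0):
    """Gradual throttling: scale performance down linearly from start_c to LIMIT_C."""
    frac = (temp_c - start_c) / (LIMIT_C - start_c)
    frac = min(max(frac, 0.0), 1.0)
    return 1.0 - (1.0 - MIN_PERF) * frac


def simulate(policy, seconds=1200, dt=1.0):
    """Return a list of (temperature, performance) samples, one per time step."""
    temp_c, perf, trace = AMBIENT_C, 1.0, []
    for _ in range(int(seconds / dt)):
        perf = policy(temp_c, perf)
        power_w = IDLE_W + perf * (FULL_W - IDLE_W)
        # First-order model: heat generated minus heat lost to ambient.
        heat_flow_w = power_w - (temp_c - AMBIENT_C) / R_THETA_C_PER_W
        temp_c += heat_flow_w / HEAT_CAP_J_PER_C * dt
        trace.append((temp_c, perf))
    return trace


for name, policy in (("step", step_policy), ("gradual", gradual_policy)):
    trace = simulate(policy)
    temps = [t for t, _ in trace]
    perfs = [p for _, p in trace]
    print(f"{name:8s} peak {max(temps):5.1f} C, "
          f"lowest perf {min(perfs):.2f}, final perf {perfs[-1]:.2f}")
```

In this model the step policy swings repeatedly between full speed and the deepest throttle level, while the gradual policy settles at an intermediate performance level and a steady temperature below the limit. The absolute numbers are meaningless outside the toy model; only the qualitative contrast is the point.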
H/W Heatsink, Thermal Pad Solutions
For NVMe M.2 2280 modules, a variety of HW heatsink options (materials, dimensions, types) are available to match the mechanical constraints of each system design. For high-density NVMe U.2 SSDs, a thermal pad covering the controller and NAND flash area dissipates heat through the U.2 aluminum housing.
HW thermal management options for NVMe M.2 2280 modules and U.2 SSD
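Choosing among heatsink options comes down to two checks: does the part fit the available clearance, and does it keep the drive under its temperature limit? The sketch below uses the standard steady-state relation (temperature = ambient + power × thermal resistance) to screen candidates. The option names, heights, thermal resistances, and operating conditions are all made-up values for illustration, not ATP product data.

```python
# Hypothetical screening of heatsink options against thermal and mechanical limits.

def steady_state_temp_c(ambient_c, power_w, r_theta_c_per_w):
    """Steady-state temperature for a given thermal resistance to ambient."""
    return ambient_c + power_w * r_theta_c_per_w


def max_r_theta_c_per_w(ambient_c, power_w, limit_c):
    """Largest thermal resistance that still keeps the drive at or below limit_c."""
    return (limit_c - ambient_c) / power_w


# (name, height above the PCB in mm, thermal resistance in C/W): made-up values.
OPTIONS = [
    ("no heatsink", 0.0, 12.0),
    ("low-profile plate", 3.0, 8.0),
    ("finned, 6 mm", 6.0, 5.5),
    ("finned, 10 mm", 10.0, 4.0),
]


def suitable_options(ambient_c, power_w, limit_c, clearance_mm):
    """Names of options that fit the clearance and meet the thermal budget."""
    budget = max_r_theta_c_per_w(ambient_c, power_w, limit_c)
    return [name for name, height_mm, r_theta in OPTIONS
            if height_mm <= clearance_mm and r_theta <= budget]


# Example: 55 C ambient, 5 W drive, 85 C limit, 8 mm of clearance.
print("Thermal budget:", max_r_theta_c_per_w(55.0, 5.0, 85.0), "C/W")
print("Suitable:", suitable_options(55.0, 5.0, 85.0, clearance_mm=8.0))
```

With these example numbers the tallest heatsink is ruled out by clearance and the two smallest options by temperature, which mirrors why the mechanical design has to be assessed together with the thermal requirement.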
Garbage Collection F/W Tuning
A periodic background refresh offsets the significant performance drop caused by the long garbage collection process.
4. OPTIMIZATION
Thermal Management Consideration: Which Heatsink Fits the Mechanical Design?
An optimized solution combines both HW and FW to meet the customer's needs. As the graph below shows, performance can drop sharply when standard thermal throttling is used. ATP NVMe SSDs with the customized thermal management solution, on the other hand, deliver higher sustained write performance.
The comparison graph shows that NVMe SSDs with ATP ELECTRONICS Thermal Management Solutions, which combine hardware and firmware, deliver better sustained write performance and avoid the drastic performance drops seen in SSDs that use standard heatsinks and a standard thermal throttling mechanism.
Conclusion
Customization through ATP ELECTRONICS's Joint Validation Service offers effective hardware and firmware thermal management solutions to overcome NVMe heating challenges and to deliver better sustained performance. By working closely together, ATP and its customers can arrive at the most optimized solution to meet thermal criteria and performance requirements.
ATP ELECTRONICS's customizable Thermal Management Solutions use both hardware (heatsinks) and advanced firmware (Dynamic Thermal Throttling mechanism) to make sure that NVMe SSDs remain cool even when installed in spaces with insufficient airflow and under varied thermal conditions.
With their blistering-fast performance, NVMe SSDs race, not only against time but also against speed.