Maximizing Storage Efficiency while Minimizing Cost with Flexible Data Placement Technology

Jonmichael Hands, Sr. Director, Product Planning, Fadu

Introduction

The exponential growth of data in the cloud-native era has intensified demand for high-performance, cost-effective storage solutions. Flexible Data Placement (FDP) technology is emerging as a game-changer for enterprises, offering a highly efficient way to optimize Solid State Drive (SSD) performance, endurance, and total cost of ownership (TCO). This article examines how FDP addresses the troublesome Write Amplification Factor (WAF) issue while revolutionizing data management for cloud-native and multi-tenant environments.

The Write Amplification Factor (WAF) Challenge

The WAF is a critical storage metric that reflects the efficiency of SSDs. It measures the ratio of actual NAND flash ‘writes’ to the amount of data a host system writes. A high WAF means your SSDs are performing more write operations than necessary, leading to:

  • Performance degradation: Excess writes slow down an SSD.
  • Reduced endurance: NAND flash has a limited number of writes (TBW, or terabytes written) before wearing out, and the WAF can sharply reduce that number
  • Increased power consumption: More writes require more energy.

How WAF Works

NAND flash memory is organized in blocks (tens to hundreds of megabytes) but is programmed in pages (usually 16 kilobytes). Any data on a block must be erased entirely before a block can be written. This process necessitates a mechanism known as garbage collection (GC). GC identifies blocks with invalidated data, erases them, and then moves valid data to new blocks. This movement of data generates additional writes, contributing to the WAF.

Overprovisioning (OP), where extra storage space is allocated for garbage collection, is a common technique to mitigate WAF’s impact. However, OP can be a costly compromise as it reduces the usable capacity of an SSD. OP’s effect on improving SSD random write performance, latency in mixed workloads, and endurance all come directly from the degree to which the WAF can be reduced.

The equation for WAF is

WAF = NAND writes/host writes

Why is write amplification highly undesirable? 

WAF reduces application performance, decreases SSD endurance and increases power consumption proportionate to the WAF. Here’s an example:

WAFWrite Performance (MB/s)Endurance (TBW)IOPS & BW/W
1Random write = sequential writeTBW = NAND PE Cycles * CapacityBaseline
2Random write = seq write / 2TBW = NAND PE Cycles * Capacity / 2Baseline/2
5Random write = seq write / 5TBW = NAND PE Cycles * Capacity / 5Baseline/5

These attributes, as well as workload performance, power and endurance, are the main drivers of efficiency in rack-level TCO.

Flexible Data Placement (FDP) to the Rescue 

FDP offers a more intelligent and cost-efficient approach to managing data placement within SSDs. It works by grouping similar data together, optimizing garbage collection efficiency, and reducing the WAF. This results in:

  • Enhanced performance: Lower WAF directly translates to faster write speeds.
  • Increased endurance: Less wear and tear on the NAND flash prolongs an SSD’s lifespan.
  • Reduced overprovisioning: FDP allows for greater utilization of an SSD’s raw capacity.
  • Improved Quality of Service (QoS): FDP’s isolation capabilities ensure consistent performance even in multi-tenant environments.

Key Advantages for Enterprises

  • Seamless Integration: FDP is fully backward compatible, allowing it to work with existing applications and infrastructure. Software can be enabled to support FDP, tagging directly or indirectly through multiple namespaces.
  • Scalability: FDP enables enterprises to scale SSD capacity in line with growing compute workloads.
  • Lower TCO: The combined benefits of improved performance, endurance, and capacity utilization significantly reduce the total cost of ownership for storage infrastructure.

Figure 1. FDP tags data that the SSD controller can intelligently place on NAND at high-performance

Compute SSD Trends

The trend towards more cores per CPU, driven by the desire to lower TCO per core, is leading to server consolidation and higher VM densities per SSD. FDP is essential for managing these increasingly complex environments, preventing performance bottlenecks and ensuring a balanced user experience. With more workloads running on a single SSD, the ‘noisy neighbor’ problem arises where one workload’s ‘bursty’ performance can impact the quality of service on another VM. FDP can go a long way in alleviating this issue because FDP separates the written data into different segments of flash, that are known as RUHs (reclaim unit handles).

Source: ark.intel.com, Forward Insights Q2’24

FDP Use Cases

FDP has a wide range of applications, including:

  • Cloud-Native Environments: Multi-tenant, containerized workloads benefit greatly from FDP’s ability to optimize garbage collection and enhance QoS.
  • Caching Applications: FDP can dramatically improve the efficiency and performance of caching solutions like CacheLib, one of the most popular hyperscale applications
  • Mixed Workloads: FDP can improve QoS and performance in mixed workloads by separating them.
  • Write-Intensive Applications: FDP’s ability to reduce the WAF is especially beneficial for write-heavy workloads with multiple write streams, extending an SSD’s lifespan and ensuring more consistent performance.

Customer Use Cases

Caching

FDP was developed by Google, Meta, and SSD vendors, specifically for multi-tenant, multiple namespaces, multiple applications, VMs, containers, microservices, log-structured DB, and caching. Here is an example of CacheLib, one of the most widely deployed caching applications in hyperscale.

A WAF = 1 in CacheLib vs the typical WAF = 3 means that hyperscalers can use the native SSD size for caching instead of having to spend up to 100% more (50% reduction inside) for overprovisioning. This can be done without sacrificing any application performance. The higher performance means better response times in applications. Further, dialing in the amount of each DRAM and NVMe SSD cache per application is critical to balancing an improved user experience with reduced TCO.

Log Structured DB

Flexible Data Placement (FDP) offers significant benefits for log-structured databases (LSDBs) due to their unique write patterns and performance requirements. The OCP NVMe SSD spec has some requirements for workloads that simulate RockDB and other production database workloads at Meta. In fact, Meta wants high TPS with low response time, which from the SSD side means you have to perform random reads at good quality of service and low latency, while background sequential write and TRIM is occuring. 

  1. Optimized Write Amplification: FDP minimizes WAF in LSDBs, leading to faster write speeds, improved endurance, and lower storage costs.
  2. Consistent Performance: Performance consistency and QoS improve by enabling FDP.
  3. Lower OP, resource utilization: FDP enables LSDBs to utilize more of an SSD’s raw capacity, increasing storage efficiency.
  4. Scalability: Can place multiple instances of database on a single SSD (e.g. using a 4TB NVMe with four 1TB namespaces).

In essence, FDP optimizes LSDB performance, reliability, and scalability, while minimizing storage costs, making it a highly valued technology for demanding environments.

Mixed Workloads

Most real-world compute SSD applications are a mix of sequential writes and some random reads. In the test case below, we see how enabling FDP for a mixed 50% read and 50% write workload can increase overall SSD performance by 85%!

Source: Measured by Fadu. 13th Gen Intel(R) Core(TM) i5-13600K; ASRock Z790; Ubuntu 22.04.2 LTS; Kernel 6.2.10,fio-3.35

Write Workloads

 Here is another benchmark designed to show the benefit of FDP in write-intensive applications. The 3.84TB drive features eight different applications that write a variety of block sizes for an SSD. Without FDP, this multi-sequential workload appears random to the drive, triggering a WAF of almost three. This WAF impacts performance and endurance proportionally! With FDP, we can reduce the WAF to one and increase performance to 5GB/s with much higher consistency (lower variation).

To recap, we see a combined performance increase of more than 3x with endurance of the same amount, sharply increasing SSD longevity.

Multiple Namespaces

Even if your applications aren’t designed to directly utilize FDP, you can still harness its advantages by leveraging namespaces. By associating each namespace with a dedicated FDP Reclaim Unit Handle (RUH), you effectively isolate data writes to specific areas of the NAND flash memory. This approach offers several important benefits, particularly in cloud-native environments:

Enhanced Isolation and Performance:

  • Reduced “Noisy Neighbor” Effect: In multi-tenant environments, a heavy workload in one application can impact the performance of others workloads sharing the same SSD. Namespace isolation with FDP minimizes this interference, ensuring consistent performance for each application.
  • Optimized Garbage Collection: FDP optimizes garbage collection within each RUH independently. This means that the GC process for one namespace won’t disrupt the performance of another, resulting in smoother, more predictable performance overall.
  • Simplified Management: Namespaces provide a logical way to group and manage data for different applications or tenants. This granular control simplifies administration and troubleshooting.
  • Performance sharing: While some VMs are idle, others can still get the maximum SSD performance

Example: FDP with Proxmox Virtualization

In a Proxmox environment, you can create multiple virtual machines (VMs) and assign each VM to a dedicated namespace. By associating each namespace with an FDP RUH, you enable FDP optimizations for each VM without requiring any modifications to the VM software.

Implementation Steps:

  1. Delete Existing Namespaces: Remove any existing namespaces to start with a clean configuration.
for i in {1..8}; do nvme delete-ns -n $i /dev/nvme0; done
  1. Enable FDP: Send the appropriate set features command to activate FDP on the SSD.
nvme set-feature /dev/nvme0 -f 0x1d --cdw12=1 -s
  1. Review Configuration: Obtain and verify the SSD’s configuration details to ensure FDP is active. You can see here there are 8 RUH available
sudo nvme fdp config /dev/nvme0 -e 1
FDP Attributes: 0x80
Vendor Specific Size: 0
Number of Reclaim Groups: 1
Number of Reclaim Unit Handles: 8
Number of Namespaces Supported: 16
Reclaim Unit Nominal Size: 17716740096
Estimated Reclaim Unit Time Limit: 604800
Reclaim Unit Handle List:
  [0]: Initially Isolated
  [1]: Initially Isolated
  [2]: Initially Isolated
  [3]: Initially Isolated
  [4]: Initially Isolated
  [5]: Initially Isolated
  [6]: Initially Isolated
  [7]: Initially Isolated
  1. Create Namespaces with RUH Association: Create new namespaces and explicitly tag each one with a unique RUH.
for i in {0..7}; do nvme create-ns -S 470G -C 470G -f 0 -n 1 -p $i /dev/nvme0; done

for i in {1..8}; do nvme attach-ns /dev/nvme0 -n $i -c 1; done
  1. Format and Attach Namespaces: Use LVM (Logical Volume Manager) or a similar tool to format the namespaces and attach them to individual VMs in Proxmox. You can do this through the Proxmox GUI directly or with the CLI.
  2. Optional QoS Control: Apply IOPS or bandwidth limits to specific VMs through virtio block device drivers for additional fine-grained control.

Conclusion

Flexible Data Placement is a transformative technology for enterprise storage. By tackling the WAF challenge head-on, FDP delivers enhanced performance, longevity, and cost savings. As data continues to grow exponentially, FDP will play a pivotal role in ensuring SSDs can meet the evolving demands of the digital age.

To learn more: