Jonmichael Hands, Sr. Director, Product Planning, Fadu

Introduction

The exponential growth of data in the cloud-native era has intensified the demand for high-performance, cost-effective storage solutions. Flexible Data Placement (FDP) technology is emerging as a game-changer for enterprises, offering a way to optimize Solid State Drive (SSD) performance, endurance, and total cost of ownership (TCO). This article delves into how FDP addresses the notorious Write Amplification Factor (WAF) issue, revolutionizing data management for cloud-native and multi-tenant environments.

The Write Amplification Factor (WAF) Challenge

The WAF is a critical metric that reflects the efficiency of SSDs. It measures the ratio of actual NAND flash writes to the amount of data written by the host system. A high WAF means the SSD is performing more write operations than necessary, leading to:

- Performance degradation: Excess writes slow down the SSD.

- Reduced endurance: NAND flash has a limited number of write/erase cycles before wearing out.

- Increased power consumption: More writes require more energy.

How WAF Works

NAND flash memory is organized in blocks (tens to hundreds of megabytes) but is programmed in pages (usually 16 kilobytes). Before a block can be written to, it must be erased entirely. This process necessitates a mechanism called garbage collection (GC). GC identifies blocks with invalidated data, erases them, and then moves valid data to new blocks. This movement of data generates additional writes, contributing to the WAF.

Overprovisioning (OP), where extra storage space is allocated for garbage collection, is a common technique to mitigate WAF’s impact. However, OP is a compromise as it reduces the usable capacity of the SSD. OP’s effect on improving SSD random write performance, latency in mixed workloads, and endurance all come directly from the reduction in WAF. The equation for WAF = NAND writes / host writes.

Why is write amplification highly undesirable?

WAF reduces application performance, decreases SSD endurance and increases power proportionate to the WAF. Here’s an example:

WAF	Write Performance (MB/s)	Endurance (TBW)	IOPS & BW / W
1	Random write = sequential write	TBW = NAND PE Cycles * Capacity	Baseline
2	Random write = seq write / 2	TBW = NAND PE Cycles * Capacity / 2	Baseline / 2
5	Random write = seq write / 5	TBW = NAND PE Cycles * Capacity / 5	Baseline / 5

These attributes, as well as workload performance, power and endurance, are the main drivers of rack-level TCO.

Flexible Data Placement (FDP) to the Rescue

FDP offers a more intelligent approach to managing data placement within SSDs. It works by grouping similar data together, optimizing garbage collection efficiency, and reducing the WAF. This results in:

- Enhanced performance: Lower WAF directly translates to faster write speeds.

- Increased endurance: Less wear and tear on the NAND flash prolongs the SSD’s lifespan.

- Reduced overprovisioning: FDP allows for greater utilization of the SSD’s raw capacity.

- Improved Quality of Service (QoS): FDP’s isolation capabilities ensure consistent performance even in multi-tenant environments.

Key Advantages for Enterprises

- Seamless Integration: FDP is fully backward compatible, allowing it to work with existing applications and infrastructure. Software can be enabled to support FDP tagging directly or indirectly through multiple namespaces.

- Scalability: FDP enables enterprises to scale SSD capacity in line with growing compute workloads.

- Lower TCO: The combined benefits of improved performance, endurance, and capacity utilization significantly reduce the total cost of ownership for storage infrastructure.

Figure 1. FDP tags data that the SSD controller can intelligently place on NAND at high-performance

Compute SSD Trends

The trend towards more cores per CPU, driven by the desire to lower TCO per core, is leading to server consolidation and higher VM densities per SSD. FDP is essential for managing these increasingly complex environments, preventing performance bottlenecks and ensuring a balanced user experience. With more workloads running on a single SSD, the ‘noisy neighbor’ problem starts happening where one workload’s bursty performance can impact the quality of service on another VM. FDP can help alleviate this because FDP separates the written data into different segments of flash, which are known as RUHs (reclaim unit handles).

FDP Use Cases

FDP has a wide range of applications, including:

- Cloud-Native Environments: Multi-tenant, containerized workloads benefit greatly from FDP’s ability to optimize garbage collection and enhance QoS.

- Caching Applications: FDP can dramatically improve the efficiency and performance of caching solutions like CacheLib, one of the most popular hyperscale applications

- Mixed Workloads: FDP can improve QoS and performance in mixed workloads by separating them

- Write-Intensive Applications: FDP’s ability to reduce WAF is especially beneficial for write-heavy workloads with multiple write streams, extending SSD lifespan and ensuring consistent performance.

Customer Use Cases

Caching

FDP was developed by Google, Meta, and SSD vendors, specifically for multi-tenant, multiple namespaces, multiple applications, VMs, containers, microservices, log-structured DB, and caching. Here is an example of CacheLib, one of the most widely deployed caching applications in hyperscale.

A WAF = 1 in CacheLib vs the typical WAF = 3 means that hyperscalers can use the native SSD size for caching instead of having to spend up to 100% more (50% reduction inside) for overprovisioning without sacrificing any application performance. The higher performance means better response times in applications. Further, dialing in the amount of DRAM and NVMe SSD cache for each application is critical to balancing an improved user experience with reduced TCO.

Log Structured DB

Flexible Data Placement (FDP) offers significant benefits for log-structured databases (LSDBs) due to their unique write patterns and performance requirements. The OCP NVMe SSD spec has some requirements for workloads that simulate RockDB and other production database workloads at Meta. They want high TPS with low response time, which from the SSD side means you have to perform random read at good quality of service and low latency while background sequential write and TRIM is going on.

- Optimized Write Amplification: FDP minimizes WAF in LSDBs, leading to faster write speeds, improved endurance, and lower storage costs.

- Consistent Performance: Performance consistency and QoS improves by enabling FDP

- Lower OP, resource utilization: FDP enables LSDBs to utilize more of the SSD’s raw capacity, increasing storage efficiency.

- Scalability: Put multiple instances of database on a single SSD (e.g. using a 4TB NVMe with four 1TB namespaces)

In essence, FDP optimizes LSDB performance, reliability, and scalability while minimizing storage costs, making it a valuable technology for demanding environments.

Mixed Workloads

Most real-world compute SSD applications are a mix of sequential writes and some random reads. In the test case below, we see how enabling FDP for a mixed 50% read and 50% write workload can increase overall SSD performance by 85%!

Write Workloads

Here is another benchmark designed to show the benefit of FDP in write-intensive applications. The 3.84TB drive features eight different applications that write a variety of block sizes for the SSD. Without FDP, this multi-sequential workload looks random to the drive, triggering a WAF of almost three. This WAF impacts performance and endurance proportionally! With FDP, we can reduce the WAF to one and increase performance to 5GB/s with much higher consistency (lower variation). To recap, we see a combined performance increase by more than 3x and endurance of the same amount, sharply increasing SSD longevity.

Multiple Namespaces

Even if your applications aren’t designed to directly utilize FDP, you can still harness its advantages by leveraging namespaces. By associating each namespace with a dedicated FDP Reclaim Unit Handle (RUH), you effectively isolate data writes to specific areas of the NAND flash memory. This approach offers several benefits, particularly in cloud-native environments:

Enhanced Isolation and Performance:

- Reduced “Noisy Neighbor” Effect: In multi-tenant environments, one application’s heavy workload can impact the performance of others sharing the same SSD. Namespace isolation with FDP minimizes this interference, ensuring consistent performance for each application.

- Optimized Garbage Collection: FDP optimizes garbage collection within each RUH independently. This means that the GC process for one namespace won’t disrupt the performance of another, resulting in smoother, more predictable performance overall.

- Simplified Management: Namespaces provide a logical way to group and manage data for different applications or tenants. This granular control simplifies administration and troubleshooting.

- Performance sharing: While some VMs are idle, others can still get the maximum SSD performance

Example: FDP with Proxmox Virtualization

In a Proxmox environment, you can create multiple virtual machines (VMs) and assign each VM to a dedicated namespace. By associating each namespace with an FDP RUH, you enable FDP optimizations for each VM without requiring any modifications to the VM software.

Implementation Steps:

Enable FDP: Send the appropriate set features command to activate FDP on the SSD.
Delete Existing Namespaces: Remove any existing namespaces to start with a clean configuration.
Review Configuration: Obtain and verify the SSD’s configuration details to ensure FDP is active. You can see here there are 8 RUH available
Create Namespaces with RUH Association: Create new namespaces and explicitly tag each one with a unique RUH.
Format and Attach Namespaces: Use LVM (Logical Volume Manager) or a similar tool to format the namespaces and attach them to individual VMs in Proxmox. You can do this through the Proxmox GUI directly or with the CLI.
Optional QoS Control: Apply IOPS or bandwidth limits to specific VMs through virtio block device drivers for additional fine-grained control.

Conclusion

Flexible Data Placement is a transformative technology for enterprise storage. By tackling the WAF challenge head-on, FDP delivers enhanced performance, longevity, and cost savings. As data continues to grow exponentially, FDP will play a pivotal role in ensuring SSDs can meet the evolving demands of the digital age.