IBM FlashSystem Data Reduction Pools: Hardware-Assisted vs. Software-Based vs. FCM

Richie Palma, Tech Solutions Consultant

In my previous article, “Understanding IBM External Storage Data Reduction Technology,” we looked at the options available for shrinking your data on disk.

We talked about the three main functions available in FlashSystem storage, which include:

  • Thin provisioning – allows you to over-provision storage for each LPAR.
  • Compression – data is compressed as it is written to disk and decompressed as it is read back, which saves capacity on the system.
  • Deduplication – removes duplicate data sitting on disk while tracking references to it, which reduces the space used on physical storage.
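
To make the deduplication idea a bit more concrete, here is a minimal Python sketch. It is only an illustration of the general technique (store each unique block once, keep a reference for every logical block), not how FlashSystem implements it. The fixed 8-byte block size, the SHA-256 fingerprint, and the function names dedupe and rebuild are all assumptions I picked to keep the example small.

```python
import hashlib

BLOCK_SIZE = 8  # bytes; tiny on purpose so the example is easy to follow


def dedupe(data: bytes, block_size: int = BLOCK_SIZE):
    """Split data into fixed-size blocks and store each unique block once."""
    store = {}       # fingerprint -> block contents (the "physical" copy)
    references = []  # one fingerprint per logical block, in write order
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        fingerprint = hashlib.sha256(block).hexdigest()
        store.setdefault(fingerprint, block)  # keep only the first copy
        references.append(fingerprint)
    return store, references


def rebuild(store, references) -> bytes:
    """Follow the references to reassemble the original data."""
    return b"".join(store[fp] for fp in references)


# Five logical blocks, but only two distinct patterns (A and B).
data = b"AAAAAAAA" * 3 + b"BBBBBBBB" + b"AAAAAAAA"
store, references = dedupe(data)
print(len(references), "logical blocks,", len(store), "stored blocks")
```

The duplicates are "removed but tracked": the references list still remembers where every block belongs, so rebuild returns the original data byte for byte.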

Data Reduction Pools (DRP)

Now we will dive a little deeper into the engine that drives these storage- and money-saving functions. The first thing to understand about Data Reduction Pools (DRP) is that any DRP taxes the system’s compute resources, so it has to be sized accordingly to make sense and deliver value (cost vs. benefit). In my mind, there are two ways to handle this fact.

  1. You accept that your compute resources will be affected by the demands of data reduction and factor the impact into sizing.
  2. You build other compute on top of or alongside the core SAN compute to offload the work.

In the image below, you will see the specifications of the SAN models that make up the FlashSystem family. They are arranged from left to right by size and, you guessed it, cost. What we are going to focus on today is the data reduction row, which is highlighted in yellow.

[Image: IBM FlashSystem family specification chart]

There are three main types of Data Reduction supported across the FlashSystem family.

  • Software DRP (Data Reduction Pool) compression & deduplication
  • Hardware-assisted DRP (Data Reduction Pool) compression & deduplication
  • Flash Core Module (FCM) compression

IBM FlashSystem 5010

As you can see, the FlashSystem 5010 does not support any Data Reduction and is sized for smaller workloads. It does not have enough compute performance to spare any of it for DRP.

IBM FlashSystem 5030

The IBM FlashSystem 5030 supports only software compression through Data Reduction Pools (DRP).  This means DRP is handled through software and uses the compute resources in the actual SAN to support all dedupe and compression.  In the 5030’s case, you actually have to configure the 64 GB cache upgrade to support it.  This is something you really want to understand before you choose that FS5030.  With that being said, this 5030 model is a great solution for many IBM i shops and packs a fair amount of performance at the storage level when compared to its predecessors in IBM’s external storage family.

IBM FlashSystem 5100

The next level up in our awesome FlashSystem family is the FS5100. This is a badass SAN that is very approachable for mid-sized IBM i shops from a price/value standpoint, especially when it comes to compression and deduplication. DRP compression is supported on the IBM FlashSystem 5100, and its node canisters have a hardware-assisted compression accelerator installed that increases the throughput of I/O transfers between nodes and compressed volumes.

The Storwize family DRP compression is based on the Lempel-Ziv lossless data compression algorithm that operates in real-time. When a host sends a write request, the request is acknowledged by the write cache of the system and then staged to the DRP.  As part of its staging, the write request passes through the compression engine and is stored compressed on disk. Writes are acknowledged immediately after they are received by the write cache with compression occurring as part of the staging to storage. This process occurs transparently to host systems, making them unaware of the compression.
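
The write path described above can be sketched in a few lines of Python. Treat this as a toy model under stated assumptions, not IBM’s implementation: the ToyPool class and its method names are hypothetical, and Python’s zlib module (DEFLATE, which combines LZ77 with Huffman coding) stands in for the array’s Lempel-Ziv engine. The point is the ordering: the write is acknowledged from cache first, and compression happens later as the data is staged to disk.

```python
import zlib
from collections import deque


class ToyPool:
    """Hypothetical model of a cached, compressed write path."""

    def __init__(self):
        self.write_cache = deque()  # writes acknowledged but not yet staged
        self.disk = {}              # volume address -> compressed bytes

    def write(self, address: int, data: bytes) -> str:
        """Host write: land in cache and acknowledge right away."""
        self.write_cache.append((address, data))
        return "ack"

    def destage(self) -> None:
        """Background staging: compress each cached write, then store it."""
        while self.write_cache:
            address, data = self.write_cache.popleft()
            self.disk[address] = zlib.compress(data)

    def read(self, address: int) -> bytes:
        """Host read: decompress transparently; newest cached copy wins."""
        for cached_address, data in reversed(self.write_cache):
            if cached_address == address:
                return data
        return zlib.decompress(self.disk[address])


pool = ToyPool()
payload = b"IBM i journal entry " * 200  # repetitive data compresses well
status = pool.write(0, payload)          # host sees the ack here
pool.destage()                           # compression happens afterwards
stored = len(pool.disk[0])
print(status, len(payload), "bytes written,", stored, "bytes on disk")
```

The host only ever calls write and read and always gets its original bytes back, which is what "transparent" means here. The compression ratio you see depends entirely on how repetitive the payload is.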

Before we dive into the next piece of compression, let’s do a quick review.

  • The FS5030 supports software-based DRP, so DRP takes some of the compute resources away from the SAN’s I/O and allocates them to compression and deduplication.

This fits into the first option for data reduction I mentioned at the beginning of this article: “You accept that your compute resources will be affected by the demands of data reduction and factor the impact into sizing.”

  • The FS5100 uses hardware-assisted data reduction and has a built-in compression accelerator that handles much of the compression work.

This fits into the second option for data reduction I mentioned at the beginning of this article: “You build other compute on top of or alongside the core SAN compute to offload the work.”

Now let’s dive into one of the most exciting pieces of tech IBM delivers in their FlashSystem family. 

The FS5100 is your entry into support for FCMs (Flash Core Modules), which pack IBM MicroLatency technology, advanced flash management, and reliability into a 2.5-inch SFF drive. Each SFF module uses the Non-Volatile Memory Express (NVMe) protocol, connects via PCIe Gen3, and has its own high-speed NAND memory to provide high throughput, high IOPS, and low latency. Each module also has built-in, performance-neutral hardware compression and encryption, so each FCM can support in-line compression and encryption with no performance impact to the SAN itself.

In addition, they serve up benefits from a resiliency standpoint with on-module “Variable Stripe RAID” (VSR). VSR provides data protection at the page, block, or chip level. It eliminates the need to replace a whole flash module when a single chip or plane fails, which extends the life and endurance of flash modules and reduces maintenance events throughout the life of the system.

Basically, each of these modules has onboard components to offload functions typically handled at the SAN controller itself onto the individual units.  Pretty cool, right?

These FCMs also pack big capacity into each little module, with drive size options of 4.8 TB, 9.6 TB, 19.2 TB, and 38.4 TB. Factor in the 2:1 compression ratio IBM supports at its bare minimum, and you are talking about an effective capacity of roughly 77 TB on the largest module (38.4 TB x 2 = 76.8 TB). Each SFF control enclosure supports up to 24 SFF drive slots, so we are talking massive capacity in a 2U control enclosure.
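
If you want to run that capacity math for every module size, here is a quick back-of-the-napkin sketch in Python. It uses only the numbers quoted in this article (the four FCM sizes, the 2:1 ratio, and 24 slots). The function name effective_tb is my own, and the calculation deliberately ignores RAID overhead, spares, and any per-module limits on effective capacity, so treat the results as rough upper bounds rather than sizing guidance.

```python
FCM_SIZES_TB = [4.8, 9.6, 19.2, 38.4]  # module sizes listed in the article
COMPRESSION_RATIO = 2.0                # the 2:1 figure quoted above
DRIVE_SLOTS = 24                       # SFF slots per control enclosure


def effective_tb(raw_tb: float, ratio: float = COMPRESSION_RATIO) -> float:
    """Effective capacity = raw capacity x assumed compression ratio."""
    return raw_tb * ratio


for size in FCM_SIZES_TB:
    per_module = effective_tb(size)
    per_enclosure = per_module * DRIVE_SLOTS  # no RAID or spares deducted
    print(f"{size:>5} TB FCM -> {per_module:6.1f} TB effective per module, "
          f"{per_enclosure:7.1f} TB across {DRIVE_SLOTS} slots")
```

For the largest module this works out to 76.8 TB effective, which is where the "77 TB per module" figure comes from.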

Wow, we have covered a lot of ground already and haven’t even made it out of the small to mid-sized offerings in the FlashSystem family. The nice thing about the FlashSystem family is its focus on enterprise functionality for all. The FS5100 is the threshold where the hardware carries enough compute power and added acceleration for compression and dedupe, so it is physically capable of delivering the full software-defined storage functionality.

Why is this important?  Well, it makes the rest of this article easy for me.

Let’s look at the rest of the portfolio.

FlashSystem 7200: It’s very similar to the 5100 from a functionality standpoint but has a bigger engine. It can support SAS-based SSDs and HDDs in expansion units.

FlashSystem 9200: Similar to the 5100 and 7200 from a functionality standpoint but has a much bigger engine. With the 9200, we transition into an all-NVMe configuration at the control enclosure level, with only SAS-based SSDs in the expansion enclosures.

At the end of the day, what we have to remember about SAN storage systems is that they are purpose-built servers with really one job to do: move data on and off storage as fast as possible. The more CPU, cache, and high-performance disk they have, the more they can get done. Data reduction, whether compression or deduplication, uses compute resources. The way I look at it, you only have two ways of handling that additional compute demand.

  1. You accept that your compute resources will be affected by the demands of data reduction and factor the impact into sizing.
  2. You build other compute on top of or alongside the core SAN compute to offload the work.

I hope you enjoyed this article, and please do not hesitate to reach out if you have any questions.
