

You can't have your cake and eat it too

Latest update time:2021-08-31 09:42

Storage device capacity grows by the day. It feels like it took only a few years to go from MB to GB to TB, and capacity is still growing.

But I’m guessing none of this will come as a surprise to you: We’re recording, storing and managing more and more data, whether it’s personal data, work data or corporate data.

So how do you strike a balance between storing massive amounts of data and accessing that data efficiently?

For work-related data, I have options available to me—even for very large data sets.

Setup

When scaling a database, whether it is on-premises or in the cloud, performance is a top priority. Without high performance, a large-scale database is nothing more than an active/semi-active archive.

If the entire data set is small enough to fit in memory (DRAM), the performance and capacity of the storage system matter less. However, as data grows rapidly, the share of the data set that can affordably be held in memory keeps shrinking. Coupled with the growing demand for faster and more detailed analysis, we have reached a data-driven crossroads: we need high performance, high capacity and affordability.

Enterprise SATA SSDs can help. Building with these SSDs allows us to future-proof our Apache Cassandra® deployment, keeping up as our active datasets grow while scaling out in terms of storage capacity. Cassandra's massive scalability, combined with multi-terabyte, high-IOPS enterprise SATA SSDs, allows us to build a high-volume NoSQL platform with massive capacity, agility, and power.

Note: Given the wide range of Cassandra deployments, Micron tested multiple workloads.

Application 1

Enterprise-class SSDs meet growing demands

When building Cassandra nodes using old-style hard disk storage, you scale out by adding more nodes to the cluster, and you scale up by upgrading to larger hard disks. Sometimes you need to do both.

Adding more old-style nodes worked (up to a point) but quickly became impractical. We gained capacity and a small performance boost, but as we added more nodes, the cluster became larger and more complex, taking up more rack space and support resources.

Upgrading to larger regular hard disks helps (to a certain extent) because you gain more capacity per node and per cluster, but the performance gains offered by such upgrades are limited.

Both approaches cost a lot for the performance they deliver, and neither scales effectively with growth.

High-capacity, fast SSDs like the Micron® 5200 Series are changing the design game. With capacities in the terabytes (TB), throughput in the megabytes per second (MB/s), and tens of thousands of IOPS on a single SSD, high-capacity, ultra-fast SSDs open up new design opportunities and performance thresholds.

Application 2

SSD Clusters: Real Results from a Large Dataset

When planning the next generation of high-volume, high-demand Cassandra clusters, SSDs can provide amazing capacity and very attractive results. Figures 1a-1c summarize the storage configurations tested by Micron.

The tests used the Yahoo! Cloud Serving Benchmark (YCSB) workloads A–D and F to compare three 4-node Cassandra test cluster configurations:

  • SSD Configuration 1: 1 Micron 5200 ECO SSD per node (3.8TB)

  • SSD Configuration 2: 2 Micron 5200 ECO SSDs per node (3.8TB each)

  • Old configuration: 4 15,000 RPM regular hard disks per node (300GB each)

Note: Given the wide range of Cassandra deployments, Micron tested multiple threads.

With the same number of nodes, the 1-SSD test cluster (one SSD per node) offers roughly 3 times the capacity of the old configuration, and the 2-SSD test cluster roughly 6 times. In addition, measurements showed that every workload on each SSD test cluster saw a significant performance improvement, ranging from 1.7 times to 10.7 times, while latency was lower and more consistent.
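The capacity comparison can be checked with a little arithmetic. The sketch below assumes the listed drive counts are per node and that sizes are in decimal units (1TB = 1000GB); the dictionary keys and the `capacity_ratio` helper are illustrative names, not part of Micron's test tooling.

```python
# Raw per-node capacity of each tested configuration, in GB.
# Drive counts and sizes come from the configuration list in this article;
# treating them as per-node values in decimal units is an assumption.
CONFIGS_GB = {
    "old": 4 * 300,      # 4 x 300GB 15,000 RPM hard disks
    "ssd_1": 1 * 3800,   # 1 x 3.8TB Micron 5200 ECO
    "ssd_2": 2 * 3800,   # 2 x 3.8TB Micron 5200 ECO
}


def capacity_ratio(config_gb: float, baseline_gb: float) -> float:
    """Return how many times larger config_gb is than baseline_gb."""
    return config_gb / baseline_gb


baseline = CONFIGS_GB["old"]
ratios = {name: capacity_ratio(gb, baseline) for name, gb in CONFIGS_GB.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x the old configuration's raw capacity")
```

This covers raw capacity only. Usable capacity in a real cluster also depends on the replication factor and on headroom reserved for compaction, neither of which is stated here.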

Application 3

SSD clusters provide more consistent responses

Read Response Consistency: Many Cassandra deployments rely heavily on fast and consistent responses, so Micron compared the 99th percentile read response times for each test cluster and workload. The 99th percentile read latency for each configuration is shown below.
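For readers unfamiliar with the metric, here is a minimal sketch of how a 99th percentile latency can be computed from raw samples. It uses the nearest-rank method, which is one common definition and not necessarily the one the benchmark tooling used. The sample latencies are hypothetical, not Micron's measurements.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical read latencies in milliseconds: mostly fast, with a slow tail.
latencies_ms = [2.0] * 97 + [3.0, 35.0, 40.0]

p99 = percentile(latencies_ms, 99)
mean = sum(latencies_ms) / len(latencies_ms)
print(f"mean = {mean:.2f} ms, p99 = {p99:.1f} ms")
```

In this made-up sample the mean stays under 3 ms while the 99th percentile is 35 ms, which is why tail percentiles are a better gauge of response consistency than averages.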

Workload A: A workload that frequently performs update operations, where 50% of the total I/O is data write operations. At the application level, this workload is similar to recording the latest session operations.

Workload B: A read-dominated workload (95% read operations). At the application level, this workload is similar to adding metadata to existing content (for example, tagging an image or article).

Workload C: A read-only workload. At the application level, this type of workload is similar to reading a user profile or static data when the profile is built elsewhere.

Workload D: A read-latest workload (the latest records are the most frequently accessed). At the application level, this type of workload is similar to reading user status updates.
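The workload descriptions can be expressed as operation mixes. The sketch below is illustrative only. The A, B and C proportions follow the descriptions here. The 95/5 read/insert split for D comes from YCSB's standard core workload definitions rather than from this article, and the sketch does not model D's "latest" request distribution. `WORKLOAD_MIX` and `sample_operations` are hypothetical names, not YCSB APIs.

```python
import random

# Fraction of each operation type per workload.
# D's split is an assumption taken from YCSB's core workload definitions.
WORKLOAD_MIX = {
    "A": {"read": 0.50, "update": 0.50},
    "B": {"read": 0.95, "update": 0.05},
    "C": {"read": 1.00},
    "D": {"read": 0.95, "insert": 0.05},
}


def sample_operations(workload, count, seed=0):
    """Draw `count` operation names at random according to the workload mix."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    mix = WORKLOAD_MIX[workload]
    operations = list(mix)
    weights = [mix[op] for op in operations]
    return rng.choices(operations, weights=weights, k=count)


for name in WORKLOAD_MIX:
    ops = sample_operations(name, 10_000)
    read_share = ops.count("read") / len(ops)
    print(f"Workload {name}: {read_share:.1%} reads")
```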

Conclusion

High-capacity, high-performance SSDs can help Cassandra achieve amazing results. Whether you are scaling an on-premises or cloud-based Cassandra deployment for higher performance or faster and more consistent read responses, SSDs are the ideal choice.

We can achieve high performance when the data set can fit in memory, but the huge growth of data means that memory can hold less and less data economically.

We are at a crossroads: business demands drive us to seek higher performance, while data growth drives us to seek affordable capacity. Taken together, the answer is clear: Enterprise SSDs deliver outstanding results, helping to meet both performance needs and data growth needs - you can have your cake and eat it too.

Deploying SSDs in your data center is a high-value option for reducing your total cost of ownership (TCO). To see how this configuration compares to other configurations, use Micron's Move2SSD TCO Tool to estimate the cost savings you can achieve by deploying SSDs compared to your existing architecture.

