Achieve up to 27% better price-performance for Spark workloads with AWS Graviton2 on Amazon EMR Serverless

Amazon EMR Serverless is a serverless option in Amazon EMR that makes it simple to run applications using open-source analytics frameworks such as Apache Spark and Hive without configuring, managing, or scaling clusters.

At AWS re:Invent 2022, we announced support for running serverless Spark and Hive workloads with AWS Graviton2 (Arm64) on Amazon EMR Serverless. AWS Graviton2 processors are custom-built by AWS using 64-bit Arm Neoverse cores, delivering a significant leap in price-performance for your cloud workloads.

This post discusses the performance improvements observed while running Apache Spark jobs using AWS Graviton2 on EMR Serverless. We found that Graviton2 on EMR Serverless achieved a 10% performance improvement for Spark workloads based on runtime. AWS Graviton2 is offered at a 20% lower cost than the x86 architecture option (see the Amazon EMR pricing page for details), resulting in 27% better overall price-performance for workloads.

Spark performance test results

The following charts compare the benchmark runtime with and without Graviton2 for an EMR Serverless Spark application (note that the charts are not drawn to scale). We observed up to a 10% improvement in total runtime and an 8% improvement in geometric mean for the queries compared to x86.

The following table summarizes our results.

Metric	Graviton2	x86	% Gain
Total Execution Time (in seconds)	2,670	2,959	10%
Geometric Mean (in seconds)	22.06	24.07	8%
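
The gain column can be reproduced from the two runtime columns. The following is a minimal sketch, assuming the gain is the runtime reduction relative to the x86 baseline; the function name `percent_gain` is ours and only for illustration.

```python
def percent_gain(x86_seconds: float, graviton2_seconds: float) -> float:
    """Runtime reduction relative to the x86 baseline, in percent.

    Assumption: the post's gain column is (x86 - Graviton2) / x86.
    """
    return (x86_seconds - graviton2_seconds) / x86_seconds * 100


# Values taken from the results table above.
total_gain = percent_gain(2959, 2670)
geomean_gain = percent_gain(24.07, 22.06)

print(f"Total execution time gain: {total_gain:.2f}%")
print(f"Geometric mean gain: {geomean_gain:.2f}%")
```

Rounded to whole percentages, these values match the 10% and 8% reported in the table.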

Testing configuration

To evaluate the performance improvements, we use benchmark tests derived from TPC-DS 3 TB scale performance benchmarks. The benchmark consists of 104 queries, and each query is submitted sequentially to an EMR Serverless application. EMR Serverless has automatic and fine-grained scaling enabled by default. Spark provides Dynamic Resource Allocation (DRA) to dynamically adjust the application resources based on the workload, and EMR Serverless uses the signals from DRA to elastically scale workers as needed. For our tests, we chose a predefined pre-initialized capacity that allows the application to scale to default limits. Each application has 1 driver and 100 workers configured as pre-initialized capacity, allowing it to scale to a maximum of 8000 vCPU/60000 GB capacity. When launching the applications, we use x86_64 (the default) to get baseline numbers and Arm64 for AWS Graviton2, and the applications had VPC networking enabled.

The following table summarizes the Spark application configuration.

Number of Drivers	Driver Size	Number of Executors	Executor Size	Ephemeral Storage	Amazon EMR release label
1	4 vCPUs, 16 GB Memory	100	4 vCPUs, 16 GB Memory	200 GB	6.9
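
As a rough illustration of this configuration, the following sketch builds the request you might pass to the boto3 `emr-serverless` client's `create_application` call. The application names, the exact release label string `emr-6.9.0`, and the helper function names are assumptions for illustration, and the VPC network configuration is omitted because it depends on your own subnets and security groups.

```python
def build_application_request(name: str, architecture: str) -> dict:
    """Build create_application arguments for the benchmark setup.

    Assumptions: release label "emr-6.9.0" (the post only says 6.9) and
    hypothetical application names. Network configuration is left out.
    """
    worker = {"cpu": "4vCPU", "memory": "16GB", "disk": "200GB"}
    return {
        "name": name,
        "releaseLabel": "emr-6.9.0",
        "type": "SPARK",
        "architecture": architecture,  # "X86_64" or "ARM64"
        "initialCapacity": {
            "DRIVER": {"workerCount": 1, "workerConfiguration": dict(worker)},
            "EXECUTOR": {"workerCount": 100, "workerConfiguration": dict(worker)},
        },
        "maximumCapacity": {"cpu": "8000vCPU", "memory": "60000GB"},
    }


def create_application(request: dict) -> str:
    """Create the application and return its ID (needs boto3 and AWS credentials)."""
    import boto3  # imported here so the request can be built without boto3

    client = boto3.client("emr-serverless")
    response = client.create_application(**request)
    return response["applicationId"]


x86_request = build_application_request("tpcds-x86", "X86_64")
arm_request = build_application_request("tpcds-graviton2", "ARM64")
```

The two requests differ only in name and architecture, which keeps the comparison between x86 and Graviton2 like for like.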

Performance test results and cost comparison

Let’s do a cost comparison of the benchmark tests. Because we used 1 driver [4 vCPUs, 16 GB memory] and 100 executors [4 vCPUs, 16 GB memory] for each run, the total capacity used is 4*101=404 vCPUs, 16*101=1616 GB memory, 200*100=20000 GB storage. The following table summarizes the cost.

Test	Total time (Seconds)	vCPUs	Memory (GB)	Ephemeral Storage (GB)	Cost
x86_64	2,958.82	404	1616	18000	$26.73
Graviton2	2,670.38	404	1616	18000	$19.59
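
The capacity columns in this table follow from the worker configuration. Here is a small sketch of that arithmetic; the constant names are ours, and the storage figure follows the post in counting the 100 executors and subtracting the 20 GB of ephemeral storage that each worker gets at no charge.

```python
# Worker configuration from the testing configuration table.
VCPUS_PER_WORKER = 4
MEMORY_GB_PER_WORKER = 16
STORAGE_GB_PER_WORKER = 200
FREE_STORAGE_GB_PER_WORKER = 20  # included by default, not billed

DRIVERS = 1
EXECUTORS = 100

workers = DRIVERS + EXECUTORS
total_vcpus = VCPUS_PER_WORKER * workers
total_memory_gb = MEMORY_GB_PER_WORKER * workers

# The post computes storage over the 100 executors only.
configured_storage_gb = STORAGE_GB_PER_WORKER * EXECUTORS
billable_storage_gb = configured_storage_gb - FREE_STORAGE_GB_PER_WORKER * EXECUTORS

print(total_vcpus, total_memory_gb, configured_storage_gb, billable_storage_gb)
```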

The calculations are as follows:

Total vCPU cost = (number of vCPUs * per vCPU rate * job runtime in hours)
Total GB cost = (total GB of memory configured * per GB-hour rate * job runtime in hours)
Storage = 20 GB of ephemeral storage is available for all workers by default; you pay only for any additional storage that you configure per worker

Cost breakdown

Let’s look at the cost breakdown for x86:

Job runtime – 49.3 minutes = 0.82 hours
Total vCPU cost – 404 vCPUs x 0.82 hours job runtime x 0.052624 USD per vCPU = 17.4333 USD
Total GB cost – 1,616 memory-GBs x 0.82 hours job runtime x 0.0057785 USD per memory GB = 7.6572 USD
Storage cost – 18,000 storage-GBs x 0.82 hours job runtime x 0.000111 USD per storage GB = 1.6386 USD
Additional storage – 20,000 GB – (20 GB free tier x 100 workers) = 18,000 additional storage GB
EMR Serverless total cost (x86): 17.4333 USD + 7.6572 USD + 1.6386 USD = 26.7291 USD

Let’s compare to the cost breakdown for Graviton2:

Job runtime – 44.5 minutes = 0.74 hours
Total vCPU cost – 404 vCPUs x 0.74 hours job runtime x 0.042094 USD per vCPU = 12.5844 USD
Total GB cost – 1,616 memory-GBs x 0.74 hours job runtime x 0.004628 USD per memory GB = 5.5343 USD
Storage cost – 18,000 storage-GBs x 0.74 hours job runtime x 0.000111 USD per storage GB = 1.4785 USD
Additional storage – 20,000 GB – (20 GB free tier x 100 workers) = 18,000 additional storage GB
EMR Serverless total cost (Graviton2): 12.5844 USD + 5.5343 USD + 1.4785 USD = 19.5972 USD

The tests indicate that for the benchmark run, AWS Graviton2 led to an overall cost savings of 27%.
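
To check the breakdown end to end, the following sketch recomputes both totals and the savings from the rates and rounded runtimes quoted above. The function and variable names are ours; the rates are the ones used in this post and may differ by Region or change over time, so treat them as inputs and not as current pricing.

```python
def emr_serverless_cost(vcpus, memory_gb, storage_gb, hours,
                        vcpu_rate, memory_rate, storage_rate):
    """Sum of the vCPU, memory, and additional storage charges for one run.

    Rates are in USD per unit per hour, as quoted in the cost breakdown.
    """
    vcpu_cost = vcpus * hours * vcpu_rate
    memory_cost = memory_gb * hours * memory_rate
    storage_cost = storage_gb * hours * storage_rate
    return vcpu_cost + memory_cost + storage_cost


# Capacity shared by both runs (storage is the billable 18,000 GB).
VCPUS, MEMORY_GB, STORAGE_GB = 404, 1616, 18000

x86_cost = emr_serverless_cost(VCPUS, MEMORY_GB, STORAGE_GB, hours=0.82,
                               vcpu_rate=0.052624, memory_rate=0.0057785,
                               storage_rate=0.000111)
graviton2_cost = emr_serverless_cost(VCPUS, MEMORY_GB, STORAGE_GB, hours=0.74,
                                     vcpu_rate=0.042094, memory_rate=0.004628,
                                     storage_rate=0.000111)

savings_pct = (x86_cost - graviton2_cost) / x86_cost * 100

print(f"x86: {x86_cost:.4f} USD")
print(f"Graviton2: {graviton2_cost:.4f} USD")
print(f"Savings: {savings_pct:.1f}%")
```

Small differences in the last decimal place compared with the figures above come from rounding each component before summing.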

Individual query improvements and observations

The following chart shows the relative speedup of individual queries with Graviton2 compared to x86.

We see some regression in a few shorter queries, which had little impact on the overall benchmark runtime. We observed better performance gains for long-running queries, for example:

q67 averaged 86 seconds for x86 and 74 seconds for Graviton2, a 24% runtime performance gain
q23a and q23b gained 14% and 16%, respectively
q32 regressed by 7%; the difference between average runtimes is under 1 second (11.09 seconds for Graviton2 vs. 10.39 seconds for x86)

To quantify performance, we use benchmark SQL derived from TPC-DS 3 TB scale performance benchmarks.

If you’re evaluating migrating your workloads to the Graviton2 architecture on EMR Serverless, we recommend testing the Spark workloads based on your real-world use cases. The outcome might vary based on the pre-initialized capacity and number of workers chosen. If you want to run workloads across multiple processor architectures (for example, to test the performance on x86 and Arm vCPUs), follow the walkthrough in the GitHub repo to get started with some concrete ideas.

Conclusion

As demonstrated in this post, Graviton2 on EMR Serverless applications consistently yielded better performance for Spark workloads. Graviton2 is available in all Regions where EMR Serverless is available. To see a list of Regions where EMR Serverless is available, see the EMR Serverless FAQs. To learn more, visit the Amazon EMR Serverless User Guide and sample code with Apache Spark and Apache Hive.

If you’re wondering how much performance gain you can achieve with your use case, try out the steps outlined in this post and replace the benchmark queries with your own.

To launch your first Spark or Hive application using a Graviton2-based architecture on EMR Serverless, see Getting started with Amazon EMR Serverless.

About the authors

Karthik Prabhakar is a Senior Big Data Solutions Architect for Amazon EMR at AWS. He is an experienced analytics engineer working with AWS customers to provide best practices and technical advice in order to assist their success in their data journey.

Nithish Kumar Murcherla is a Senior Systems Development Engineer on the Amazon EMR Serverless team. He is passionate about distributed computing, containers, and everything and anything about data.