AutoExecutor: Predictive Parallelism for Spark SQL Queries

Rathijit Sen; Abhishek Roy; Alekh Jindal; Rui Fang; Jeff Zheng; Xiaolei Liu; Ruiping Li

VLDB | August 2021

Best Demonstration Award

Right-sizing resources for query execution is important for cost-efficient performance, but estimating upfront, before a query runs, how resource allocations will affect its performance is difficult. We demonstrate AutoExecutor, a predictive system that uses machine learning models to predict the run times of Spark SQL queries on Azure Synapse as a function of the number of allocated executors, which caps the maximum allowed parallelism.
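
To make the idea of "run time as a function of executor count" concrete, here is a minimal sketch. It is not AutoExecutor's actual model, which the abstract does not describe. It assumes a simple Amdahl's-law-style curve t(n) = s + p / n, fitted by least squares to a few hypothetical (executors, seconds) measurements. All function names, the sample data, and the 10% slowdown tolerance are illustrative assumptions.

```python
# Illustrative only: an assumed Amdahl's-law-style model, not the paper's method.

def fit_amdahl(samples):
    """Least-squares fit of t(n) = s + p / n from (executors, seconds) pairs.

    With x = 1 / n the model is linear in x, so ordinary simple linear
    regression gives the serial part s (intercept) and parallel part p (slope).
    """
    xs = [1.0 / n for n, _ in samples]
    ts = [t for _, t in samples]
    m = len(samples)
    mean_x = sum(xs) / m
    mean_t = sum(ts) / m
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxt = sum((x - mean_x) * (t - mean_t) for x, t in zip(xs, ts))
    p = sxt / sxx
    s = mean_t - p * mean_x
    return s, p


def predict(s, p, n):
    """Predicted run time in seconds with n executors."""
    return s + p / n


def pick_executors(s, p, max_executors, slowdown=1.10):
    """Smallest executor count whose predicted run time is within
    `slowdown` times the predicted run time at max_executors."""
    best = predict(s, p, max_executors)
    for n in range(1, max_executors + 1):
        if predict(s, p, n) <= slowdown * best:
            return n
    return max_executors


# Hypothetical measurements: (number of executors, run time in seconds).
samples = [(1, 130.0), (2, 70.0), (4, 40.0), (8, 25.0)]
s, p = fit_amdahl(samples)
chosen = pick_executors(s, p, max_executors=32)
print(f"serial={s:.1f}s parallel={p:.1f}s chosen executors={chosen}")
```

The point of the sketch is the trade-off it exposes: once a run-time curve is available before execution, an executor count can be chosen that gives up a small, explicit amount of predicted performance in exchange for a smaller allocation.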