Our client is seeking an experienced Apache Spark Optimization Expert to join its data engineering team, working in the Finance InfoTech area of a leading reinsurance company.
Their Azure-based data solution processes billions of records within challenging deadlines. Your primary focus will be on optimizing and tuning Spark workloads to enhance performance, stability, and efficiency. The platform is based on Azure Databricks, with heavy usage of Delta Lake tables, and also integrates with various relational databases.
Key Responsibilities
- Analyse, troubleshoot, and optimise the performance of Spark workloads
- Ensure the ingestion from relational databases (usually via JDBC connections) provides optimal and reliable throughput
- Perform benchmarks, interpret Spark execution plans, and prepare examples (e.g. in notebooks) to unlock better performance and efficiency
- Analyze the Python-based codebases for optimization potential
- Help teams with application development, including complicated cashflow preparation logic
- Monitor and fine-tune Spark and Databricks clusters for cost-effectiveness and operational excellence
Top 3 essential skills/experience-based requirements
- Significant hands-on experience with Apache Spark
- Experience with the Databricks platform, preferably in Azure
- Excellent communication skills in English (written and verbal)
Desired Skills and Qualifications
- Significant hands-on experience with Apache Spark, including detailed and demonstrable expertise in performance optimisation and tuning.
- Deep understanding of Spark internals, execution plans, join types, table layouts (including liquid clustering), and Adaptive Query Execution (AQE).
- Extensive experience optimizing Spark MERGE statements, joins, aggregations, and transformations.
- Practical experience connecting Spark with relational databases using JDBC and optimizing their throughput
- Expert-level skillset with Python and PySpark
- Experience with the Databricks platform, preferably in Azure
- Strong analytical skills to quickly diagnose performance bottlenecks and implement effective solutions
- Excellent communication skills in English (written and verbal)