Pyspark with ADB

Capgemini

India
Posted on Feb 15, 2026

At Capgemini Invent, we believe difference drives change. As inventive transformation consultants, we blend our strategic, creative and scientific capabilities, collaborating closely with clients to deliver cutting-edge solutions. Join us to drive transformation tailored to our clients' challenges of today and tomorrow. Informed and validated by science and data. Superpowered by creativity and design. All underpinned by technology created with purpose.

Your Role

We are looking for a skilled PySpark Developer with experience in Azure Databricks (ADB) and Azure Data Factory (ADF) to join our team. The ideal candidate will play a crucial role in designing, developing, and implementing data solutions using PySpark for large-scale data processing and analytics.

  • The candidate should be able to design, develop, and deploy PySpark applications on Azure Databricks, implement ETL/ELT pipelines using Azure Data Factory, and collaborate with data teams to process structured and unstructured datasets into meaningful insights.
  • They must optimize PySpark jobs and data pipelines for performance and reliability while ensuring data quality, integrity, and adherence to regulatory and industry standards throughout all stages of data processing.
  • The role requires conducting financial risk assessments, identifying vulnerabilities in data workflows, and implementing strategies to mitigate financial risks associated with data transformation and aggregation.
  • The candidate should be capable of troubleshooting and debugging pipeline issues, applying best practices for data security, compliance, and privacy within the Azure environment, and documenting technical specifications, data flows, and solution architecture.

Works in the area of Software Engineering, which encompasses the development, maintenance and optimization of software solutions or applications.

  1. Applies scientific methods to analyse and solve software engineering problems.
  2. He or she is responsible for the development and application of software engineering practice and knowledge, in research, design, development and maintenance.
  3. His or her work requires the exercise of original thought and judgement and the ability to supervise the technical and administrative work of other software engineers.
  4. The software engineer builds skills and expertise of his or her software engineering discipline to reach standard software engineer skills expectations for the applicable role, as defined in Professional Communities.
  5. The software engineer collaborates and acts as team player with other software engineers and stakeholders.

Your Profile

  • The candidate should hold a bachelor’s degree in Computer Science, Engineering, or a related field (master’s preferred) and have proven experience as a PySpark Developer with strong knowledge of Apache Spark internals, Python, SQL, Azure Databricks, and Azure Data Factory.
  • They must be skilled in designing and optimizing ETL/ELT data pipelines, familiar with cloud platforms—preferably Microsoft Azure—and capable of collaborating effectively through strong communication and teamwork abilities.
  • Experience in Financial, Risk, Compliance, or Banking domains is a plus, along with the ability to identify, analyse, and mitigate financial risks within data processes.
  • The candidate should ensure all data processes comply with regulatory requirements and industry standards while demonstrating excellent problem‑solving skills and critical thinking.

Capgemini is a global business and technology transformation partner, helping organizations to accelerate their dual transition to a digital and sustainable world, while creating tangible impact for enterprises and society. It is a responsible and diverse group of 340,000 team members in more than 50 countries. With a heritage of more than 55 years, Capgemini is trusted by its clients to unlock the value of technology to address the entire breadth of their business needs. It delivers end-to-end services and solutions leveraging strengths from strategy and design to engineering, all fueled by its market-leading capabilities in AI, generative AI, cloud and data, combined with its deep industry expertise and partner ecosystem.