Bilingual Data Engineer, PySpark and SQL
Capgemini
At Capgemini Engineering, the world leader in engineering services, we bring together a global team of engineers, scientists, and architects to help the world’s most innovative companies unleash their potential. From autonomous cars to life-saving robots, our digital and software technology experts think outside the box as they provide unique R&D and engineering services across all industries. Join us for a career full of opportunities. Where you can make a difference. Where no two days are the same.
Job Description
As a Spark and Python ETL Data Engineer, you will be instrumental in joining data from various sources to create a holistic view of the customer journey. This integrated data product will connect our back-end systems for equipment service activations to our front-end experiences on digital applications, helping minimize customer friction, reduce costs, and increase sales.
In this role, you will play a key part in:
- Developing robust ETL code that runs on cloud infrastructure and adheres to our CI/CD processes
- Creating and maintaining Athena (Iceberg) tables that serve as the foundation for data export services
- Building data consumption layers for Tableau reporting and other analytical tools
- Collaborating with team leads to design and implement data integration solutions
- Optimizing PySpark code for performance, scalability, and reliability
Your profile
- 5+ years of experience in data engineering with a strong focus on ETL development
- Proven expertise in PySpark and SQL for large-scale data processing
- Extensive experience with AWS services and related tooling, including EKS, EMR, Lambda, Airflow, Step Functions, S3, and Athena
- Proficiency with GitLab for CI/CD pipelines and version control
- Experience working with cloud-based data lakes and data warehousing concepts
- Strong problem-solving skills and ability to work in a collaborative team environment
What you'll love about working here
- Free access to learning platforms: Free access for all to world-class learning assets and curated programs from Harvard Business Review, Coursera, Pluralsight, Udemy, Microsoft, AWS, Google and many more.
- Dedicated innovation labs: We help the world's largest innovators engineer the products and services of tomorrow by leveraging our experts and labs, dedicated to topics such as 5G, 6G, AI, autonomous vehicles, and quantum.
- Empowering environment: Autonomy and goal setting are among our top-scoring areas, with ratings of 8.4+ in our monthly employee feedback Pulse.
Capgemini is a global business and technology transformation partner, helping organizations accelerate their dual transition to a digital and sustainable world while creating tangible impact for enterprises and society. It is a responsible and diverse group of 340,000 team members in more than 50 countries. With a strong heritage of more than 55 years, Capgemini is trusted by its clients to unlock the value of technology to address the entire breadth of their business needs. It delivers end-to-end services and solutions, leveraging strengths from strategy and design to engineering, all fueled by its market-leading capabilities in AI, generative AI, cloud, and data, combined with its deep industry expertise and partner ecosystem.