CONSULTANT
Pune, IN
The future is our choice
At Atos, as the global leader in secure and decarbonized digital, our purpose is to help design the future of the information space. Together we bring the diversity of our people’s skills and backgrounds to make the right choices with our clients, for our company and for our own futures.
Skill Set
- Experience in PySpark and Python.
- Experience in OLAP systems.
- Experience in SQL (should be able to write complex SQL queries).
- Experience in Orchestration (Apache Airflow is preferred).
- Experience in Hadoop (Spark and Hive, including optimization of Spark and Hive applications).
- Knowledge of Snowflake (good to have).
- Experience in Data Quality (good to have).
- Knowledge of file storage (S3 is good to have).
Role and Responsibilities
1. Data Pipeline Development
- Build and maintain scalable, reliable, and efficient ETL (Extract, Transform, Load) pipelines using Python and Airflow.
- Automate data ingestion and processing workflows from multiple sources.
2. Data Integration
- Integrate and transform data from disparate sources (e.g., APIs, third-party systems, legacy systems).
- Handle data standardization, validation, and quality assurance during integration.
3. Big Data Processing
- Use big data technologies such as Apache Spark and Snowflake for large-scale data processing.
- Write efficient and scalable Python scripts to process and validate data.
4. Data Governance and Quality
- Implement data validation, cleaning, and transformation processes to ensure data accuracy and reliability.
- Enforce compliance with data governance policies and standards (e.g., GDPR, HIPAA).
5. Collaboration
- Work closely with other teams to understand data requirements.
- Collaborate with software engineers to integrate data workflows into applications.
6. Monitoring and Optimization
- Monitor the performance of data pipelines and systems.
- Debug and optimize data workflows to improve efficiency and reliability.
7. Scripting and Automation
- Develop reusable and modular Python scripts for repeated tasks.
- Automate workflows for recurring data processing jobs.
8. Documentation and Best Practices
- Document pipeline architecture.
Here at Atos, diversity and inclusion are embedded in our DNA. Read more about our commitment to a fair work environment for all.
Atos is a recognized leader in its industry across Environment, Social and Governance (ESG) criteria. Find out more on our CSR commitment.
Choose your future. Choose Atos.