APPLICATION DEVELOPER
Publication Date: Jun 17, 2025
Ref. No: 530352
Location: Chennai, IN
The future is our choice
At Atos, as the global leader in secure and decarbonized digital, our purpose is to help design the future of the information space. Together we bring the diversity of our people’s skills and backgrounds to make the right choices with our clients, for our company and for our own futures.
Job Description:
We are looking for a skilled and motivated AI Quality Assurance (QA) Engineer to join our dynamic team, focusing on the validation, testing, and optimization of GenAI and RAG-based applications. This role will be pivotal in ensuring the quality, scalability, and performance of AI models and systems, specifically in the areas of Large Language Models (LLMs), retrieval pipelines, and API testing. It plays a critical part in driving the performance and trustworthiness of AI models, and it is an exciting opportunity for anyone looking to make a significant impact in the field of AI.
Key Responsibilities:
1. Test Planning & Strategy:
- Develop and execute comprehensive test plans and strategies tailored to GenAI and RAG-based applications.
- Design and implement robust test cases and automation frameworks to ensure the effectiveness of AI models.
2. LLM Output Validation:
- Evaluate LLM responses for relevance, factual accuracy, and consistency.
- Identify and mitigate potential biases in the output, ensuring fairness and reliability.
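For illustration, a minimal consistency probe might look like the sketch below. call_llm is a hypothetical stub standing in for the real model or API client, and surface similarity via difflib is a deliberately simple stand-in for semantic comparison.

# Minimal consistency probe: sample the model several times on the same
# prompt and measure how much the answers agree with one another.
# call_llm is a hypothetical stub -- replace with the real client call.
from difflib import SequenceMatcher

def call_llm(prompt: str) -> str:
    raise NotImplementedError("replace with your model/API call")

def consistency_score(prompt: str, n_samples: int = 5) -> float:
    answers = [call_llm(prompt) for _ in range(n_samples)]
    pairs = [(a, b) for i, a in enumerate(answers) for b in answers[i + 1:]]
    # Average pairwise surface similarity; low values flag unstable output.
    return sum(SequenceMatcher(None, a, b).ratio() for a, b in pairs) / len(pairs)

# Usage (once call_llm is wired up):
# assert consistency_score("What does the refund policy say?") > 0.8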
3. Retrieval Pipeline Testing:
- Test and validate retrieval quality from vector databases such as FAISS, Pinecone, Weaviate, and ChromaDB to ensure accurate, context-grounded responses.
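As a sketch of this kind of check, the snippet below builds a toy in-memory FAISS index from random vectors (standing in for real document embeddings) and asserts that a known vector retrieves itself as the top hit. numpy and faiss are assumed installed; real tests would index production embeddings and assert that known-relevant documents land in the top-k.

# Toy retrieval smoke test against an in-memory FAISS index.
import numpy as np
import faiss

dim, n_docs = 64, 1000
rng = np.random.default_rng(0)
doc_vectors = rng.random((n_docs, dim), dtype=np.float32)

index = faiss.IndexFlatL2(dim)
index.add(doc_vectors)

# Query with a vector we know is in the index: it must be its own top hit.
query = doc_vectors[42:43]
_, ids = index.search(query, 5)
assert ids[0][0] == 42, "known-relevant document missing from top-k"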
4. Functional & Regression Testing:
- Perform functional, integration, and regression testing to ensure AI-powered applications meet business requirements.
- Detect and resolve issues related to functionality and model performance.
5. Automated API Testing:
- Develop and implement automated API tests for validating AI models and vector search mechanisms.
- Ensure APIs work as expected under various conditions and use cases.
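A minimal automated API test of this shape is sketched below, assuming a hypothetical /v1/query endpoint on a local test deployment that returns an answer plus supporting sources. The URL and response schema are placeholders to adapt to the service under test.

# Sketch of a pytest-style API check for a RAG service.
import requests

BASE_URL = "http://localhost:8000"  # assumed local test deployment

def test_query_returns_grounded_answer():
    resp = requests.post(
        f"{BASE_URL}/v1/query",           # hypothetical endpoint
        json={"question": "What is RAG?"},
        timeout=30,
    )
    assert resp.status_code == 200
    body = resp.json()
    assert body["answer"].strip(), "empty answer"
    assert body["sources"], "answer returned without supporting sources"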
6. AI Performance Evaluation:
- Evaluate the performance of LLMs using NLP metrics (e.g., BLEU, ROUGE, METEOR, perplexity) to assess the quality and effectiveness of model responses.
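For example, sentence-level BLEU (via NLTK) and ROUGE-L (via the rouge-score package) can be computed as in the sketch below; corpus-level aggregation, METEOR, and perplexity are omitted for brevity.

# Score a model response against a reference answer.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
from rouge_score import rouge_scorer

reference = "Paris is the capital of France."
candidate = "The capital of France is Paris."

# Smoothing avoids zero scores on short sentences with missing n-grams.
bleu = sentence_bleu(
    [reference.split()], candidate.split(),
    smoothing_function=SmoothingFunction().method1,
)
rouge_l = rouge_scorer.RougeScorer(["rougeL"]).score(
    reference, candidate)["rougeL"].fmeasure
print(f"BLEU={bleu:.3f} ROUGE-L={rouge_l:.3f}")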
7. Scalability & Latency Testing:
- Test AI/ML models to evaluate performance, latency, and response times under different workload conditions.
- Ensure systems can handle increasing scale while maintaining efficient performance.
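A basic latency probe in this spirit is sketched below: it fires concurrent requests at a placeholder endpoint and reports p50/p95 response times. Dedicated load tools (e.g., Locust or k6) are the better fit for sustained load, but the idea is the same.

# Fire 100 concurrent requests and summarize latency percentiles.
import statistics
import time
from concurrent.futures import ThreadPoolExecutor

import requests

def timed_call(_):
    start = time.perf_counter()
    requests.post("http://localhost:8000/v1/query",  # placeholder endpoint
                  json={"question": "ping"}, timeout=60)
    return time.perf_counter() - start

with ThreadPoolExecutor(max_workers=16) as pool:
    latencies = sorted(pool.map(timed_call, range(100)))

p50 = statistics.median(latencies)
p95 = latencies[int(0.95 * len(latencies)) - 1]
print(f"p50={p50:.2f}s p95={p95:.2f}s")
assert p95 < 5.0, "p95 latency above the 5s budget (illustrative threshold)"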
8. Data Integrity & Security:
- Ensure all AI applications comply with relevant data privacy and security standards and regulations.
- Implement best practices to safeguard sensitive data and maintain data integrity throughout testing.
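One lightweight example of such a safeguard is a regex-based check that model output does not leak obvious PII, as sketched below; production systems would typically rely on a dedicated scanner rather than hand-rolled patterns.

# Naive guard against leaking obvious PII in model output.
import re

PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "phone": re.compile(r"\+?\d[\d\s().-]{8,}\d"),
}

def assert_no_pii(text: str) -> None:
    for label, pattern in PII_PATTERNS.items():
        assert not pattern.search(text), f"possible {label} leaked in output"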
9. Collaboration with AI Teams:
- Work closely with data scientists, machine learning engineers, and developers to share test results, identify areas for improvement, and refine AI models.
- Contribute to the continuous evolution of AI models based on test findings.
10. CI/CD & Model Deployment:
- Integrate AI testing processes into CI/CD pipelines for continuous testing and validation of AI models.
- Ensure seamless and efficient deployment of models through automated validation mechanisms.
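A simple quality gate of this kind, meant to run as a pipeline step, is sketched below: it compares current evaluation scores against stored baselines and fails the build on regression. The file paths, metric names, and tolerance are illustrative.

# CI quality gate: fail the build if any evaluation metric regresses.
import json

BASELINE = json.load(open("eval/baseline_scores.json"))  # assumed artifact
CURRENT = json.load(open("eval/current_scores.json"))    # assumed artifact

def test_no_metric_regression():
    for metric, baseline in BASELINE.items():
        assert CURRENT[metric] >= baseline - 0.02, (
            f"{metric} regressed: {CURRENT[metric]:.3f} < {baseline:.3f}"
        )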
11. Synthetic Data Generation:
- Implement techniques for generating synthetic data to enhance the robustness and performance of AI models.
- Ensure diverse, balanced, and realistic datasets for model training and evaluation.
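As a minimal illustration, the sketch below generates paraphrase-style synthetic questions from templates; LLM-based augmentation would follow the same pattern, with the templates replaced by a generation call.

# Template-based synthetic question generation for wider test coverage.
import random

TEMPLATES = [
    "What is {term}?",
    "Can you explain {term} in simple words?",
    "Give a short definition of {term}.",
]
TERMS = ["retrieval-augmented generation", "vector database", "perplexity"]

def synthetic_questions(n: int, seed: int = 0) -> list[str]:
    rng = random.Random(seed)  # seeded for reproducible test sets
    return [rng.choice(TEMPLATES).format(term=rng.choice(TERMS))
            for _ in range(n)]

print(synthetic_questions(5))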
12. Prompt Engineering Testing:
- Use tools such as LangTest, PromptFoo, or LlamaIndex evals to test and validate prompt effectiveness and ensure prompts generate the desired outputs.
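These tools are typically configuration-driven; the sketch below hand-rolls the same kind of assertion in Python to show the idea: each prompt variant is run against the model and the output is checked for required content. call_llm is again a hypothetical stub, not any of these tools' APIs.

# Hand-rolled version of the assertion pattern prompt-testing tools automate.
def call_llm(prompt: str) -> str:
    raise NotImplementedError("replace with your model/API call")

PROMPT_VARIANTS = [
    "Summarize the refund policy in one sentence.",
    "In one sentence, what does the refund policy say?",
]

def test_prompt_variants():
    for prompt in PROMPT_VARIANTS:
        answer = call_llm(prompt).lower()
        assert "refund" in answer, f"variant drifted off-topic: {prompt!r}"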
Required Skills:
- Proven experience in AI/ML testing, particularly with Generative AI (GenAI) and Retrieval-Augmented Generation (RAG) models.
- Familiarity with NLP evaluation metrics such as BLEU, ROUGE, METEOR, and perplexity.
- Experience with CI/CD pipelines, model deployment, and continuous validation practices.
- Proficiency in using synthetic data generation techniques to improve model robustness.
- Experience working with tools like LangTest, PromptFoo, or LlamaIndex for prompt engineering and validation.
- Familiarity with data privacy and security standards in AI applications.
- Strong collaboration skills and ability to work with cross-functional teams (data scientists, ML engineers, developers).
Preferred Skills:
- Experience with vector databases (FAISS, Pinecone, Weaviate, ChromaDB).
- Knowledge of the latest AI/ML trends and techniques for model validation and performance optimization.
- Familiarity with machine learning frameworks and tools.
#Atos