AppTestify - Machine Learning Engineer (10-11 yrs)
posted 3+ weeks ago
Flexible timing
About the Role :
We are seeking an accomplished, highly experienced ML Engineer with a strong focus on MLOps to join our innovative team. This advanced-level role is critical to bridging the gap between machine learning model development and production operations. You will be responsible for designing, building, and maintaining scalable, robust, and automated infrastructure for deploying, monitoring, and managing machine learning models throughout their lifecycle. Drawing on more than 10 years of experience, you will serve as a technical leader, driving best practices in MLOps and ensuring the reliability, efficiency, and performance of our ML systems.
Key Responsibilities :
MLOps Strategy & Implementation :
- Lead the design, development, and implementation of end-to-end MLOps pipelines for continuous integration, continuous delivery (CI/CD), and continuous training (CT) of machine learning models.
- Establish and enforce best practices for model versioning, lineage tracking, experiment management, and model serving.
- Develop and maintain infrastructure as code (IaC) to automate the provisioning and management of ML environments.
Scalable Code Development & Engineering :
- Write, test, and implement clean, efficient, and scalable Python code for machine learning pipelines, data processing, model training, and inference services.
- Ensure code quality through rigorous testing, adherence to coding standards, and the resolution of issues identified by linters and security scanners.
- Actively participate in and lead code reviews, providing constructive feedback and ensuring high-quality, maintainable codebases.
- Manage pull requests, merge conflicts, and branching strategies within a collaborative GitHub environment.
Model Deployment & Monitoring :
- Architect and implement robust solutions for deploying ML models to production environments, ensuring high availability, low latency, and fault tolerance.
- Develop comprehensive monitoring and alerting systems for model performance, data drift, concept drift, and infrastructure health.
- Implement strategies for A/B testing, canary deployments, and rollback mechanisms for ML models.
Collaboration & Leadership :
- Act as a technical leader within the team, mentoring junior engineers and promoting a culture of engineering excellence.
- Collaborate closely with Data Scientists to transition models from research to production, providing guidance on model operationalization and serving requirements.
- Work with DevOps and SRE teams to integrate MLOps pipelines into existing infrastructure and security frameworks.
- Lead scrum ceremonies, including sprint planning, daily stand-ups, and retrospectives, ensuring efficient team progress.
- Proactively identify and scope technical issues, define clear problem statements, and propose effective solutions.
Technology Stack Expertise :
- Leverage deep expertise in Python, cloud services (AWS ECS, S3), databases (Postgres), and specialized ML tools (MLflow Hub) to build and optimize ML systems.
- Integrate and operationalize Large Language Models (LLMs), including fine-tuning, prompt engineering, and deployment strategies.
Required Technical Skills & Experience :
- 10+ years of proven experience in Machine Learning Engineering, with a significant focus on MLOps.
- Expert-level proficiency in Python for building scalable ML applications and data pipelines.
- Strong experience with relational databases, particularly Postgres, including schema design, query optimization, and data management.
- Extensive experience with AWS services, including Amazon S3 for data storage, Amazon ECS (or Kubernetes/EKS) for container orchestration, and other relevant services for ML workloads (e.g., SageMaker, Lambda, EC2).
- Demonstrable experience with Large Language Models (LLMs), including their deployment, serving, and integration into applications.
- In-depth knowledge and hands-on experience with MLOps platforms and tools, specifically MLflow Hub for experiment tracking, model registry, and model deployment.
- Proficiency with Git and GitHub for version control, collaborative development, and CI/CD workflows.
- Solid understanding of machine learning algorithms, model training, evaluation, and inference.
- Experience with containerization technologies (Docker) and orchestration (Kubernetes/ECS).
- Familiarity with CI/CD tools and practices (e.g., Jenkins, GitLab CI, GitHub Actions).
Qualifications :
- Bachelor's or Master's degree in Computer Science, Machine Learning, Data Science, or a related quantitative field.
- Exceptional problem-solving abilities and a strong analytical mindset.
- Excellent written and verbal communication skills, with the ability to articulate complex technical concepts to diverse audiences.
- Proven ability to work independently and as part of a high-performing, cross-functional team.
- A strong passion for machine learning and staying abreast of the latest industry trends and technologies.
Functional Areas: Other