Software Development Engineer III - MLOps (5-10 yrs)
Talentoj
posted 6 days ago
Job Description:
As a Software Development Engineer III (SDE III) on the MLOps team, you will develop the blueprint for highly scalable, performant ML model serving. You will design and implement robust ML infrastructure for model deployment, serving, and optimization. You will also build efficient CI/CD pipelines for ML models and use advanced compilers or hardware optimization to maximize inference performance while keeping costs down. We welcome you to challenge us to improve our software development processes and tools.
Responsibilities:
- Build and optimize model serving infrastructure with a focus on inference latency and cost optimization.
- Architect efficient inference pipelines that balance latency, throughput, and cost across various acceleration options.
- Develop monitoring and observability solutions for ML systems.
- Collaborate with ML Engineers to establish best practices for optimized model deployment.
- Implement cost-efficient, enterprise-scale solutions.
- Collaborate in a cross-functional, distributed team for continuous system improvement.
- Work with MLEs, QA Engineers, and DevOps Engineers.
- Evaluate and implement new technologies and tools.
- Contribute to architectural decisions for distributed ML systems.
Requirements:
- 5+ years of experience in software engineering with Python.
- Experience with ML frameworks, particularly PyTorch.
- Experience optimizing ML models with hardware acceleration (AWS Neuron, ONNX, TensorRT).
- Experience with AWS ML services and hardware-accelerated instances (SageMaker, Inferentia, Trainium).
- Proven experience building and operating AWS serverless architectures.
- Deep understanding of event-driven processing patterns, SQS/SNS, and serverless caching solutions.
- Experience with containerization using Docker and orchestration tools.
- Strong knowledge of RESTful API design and implementation.
- Proficiency in writing high-quality, secure code and familiarity with static code analysis tools.
- Excellent analytical, conceptual, and communication skills in spoken and written English.
- Experience applying Computer Science fundamentals in algorithm design, problem solving, and complexity analysis.
- Experience with model compilation and quantization, performance profiling, and benchmarking ML inference systems.
- Experience working in regulated industries with strict compliance requirements for cloud-native solutions.
Functional Areas: Software/Testing/Networking