Middleware - Staff Software Engineer - Artificial Intelligence/Machine Learning (5-8 yrs)
posted 3+ weeks ago
About Middleware :
Middleware is a full-stack cloud observability platform that helps organizations gain insights and visibility into complex systems and applications.
Our observability platform is designed to provide AI-based real-time monitoring and alerting capabilities as well as advanced analytics and reporting tools.
We are passionate about delivering a world-class, cost-effective product and exceptional customer service. We seek talented individuals who share our values and vision to join our team.
At Middleware, we believe observability should be intelligent, affordable, and seamless. We combine state-of-the-art AI capabilities with a relentless focus on customer satisfaction, developer experience, and technical innovation.
About the Role :
We are looking for a Staff Software Engineer, AI/ML to join our growing and passionate AI engineering team. As a core contributor to our AI-based observability platform, you will be responsible for architecting, developing, and deploying intelligent systems that analyze, diagnose, and respond to issues across distributed applications in real time.
You will work at the intersection of machine learning, cloud infrastructure, and software engineering to develop autonomous agents, predictive models, and decision-making frameworks that drive Middleware's AI observability engine. This is a unique opportunity to shape the future of observability with cutting-edge AI technologies.
Key Responsibilities :
- AI Model Development & Deployment : Design, implement, and deploy machine learning and deep learning models (e.g., time-series forecasting, anomaly detection, root cause analysis, and classification) to enhance observability features and incident intelligence.
- AI Infrastructure & APIs : Build scalable and reusable ML model-serving pipelines and expose them as APIs and SDKs that integrate seamlessly with the rest of the Middleware platform.
- Multi-Agent Systems & Autonomous Agents : Design and develop AI agents capable of autonomous reasoning, API orchestration, and decision-making in real-time environments. These agents should proactively identify and resolve issues across complex cloud infrastructures.
- Reinforcement Learning & Agent Optimization : Leverage reinforcement learning techniques such as RLHF (Reinforcement Learning from Human Feedback), PPO (Proximal Policy Optimization), and A3C (Asynchronous Advantage Actor-Critic) to train agents for adaptive, context-aware actions.
- Long-Term Memory & Context Management : Develop and optimize vector-based memory systems for agents to support long-term contextual understanding, reasoning, and dialogue.
- AI Safety & Alignment : Ensure that autonomous agent behaviors are safe, robust, and aligned with user intentions and platform goals. Develop evaluation frameworks to detect and mitigate hallucinations or unintended actions.
- Cross-Functional Collaboration : Collaborate with data engineers, backend developers, product managers, and customer-facing teams to understand product requirements and deliver AI features that solve real-world observability challenges.
- System Performance & Reliability : Write high-quality, scalable, and secure code. Optimize models and systems for low latency, high throughput, and fault tolerance in production environments.
- Code Reviews & Mentorship : Participate in technical design discussions, code reviews, and mentoring of junior engineers. Help shape the AI engineering culture at Middleware.
Qualifications :
- Bachelor's or Master's degree in Computer Science, Artificial Intelligence, Machine Learning, or a related field (PhD is a plus).
- 7+ years of experience in software engineering with a focus on AI/ML systems.
- Strong programming skills in Python (TensorFlow, PyTorch, JAX) and experience with model deployment frameworks (e.g., ONNX, TorchServe, TensorFlow Serving).
- Proficiency in building and deploying ML models in production environments (Docker, Kubernetes, cloud platforms like AWS/GCP).
- Deep understanding of RL algorithms and experience applying them to real-world problems.
- Experience with agentic architectures, LangChain, ReAct, AutoGPT, or similar frameworks.
- Strong understanding of distributed systems, APIs, and observability tools.
- Familiarity with vector databases (e.g., Pinecone, Weaviate, FAISS) and memory retrieval systems.
- Passion for AI safety, explainability, and ethical AI practices.
Functional Areas: Software/Testing/Networking