Core Technologies & Solutions

4.2

based on 8 Reviews

Site Reliability Engineer (7-20 yrs)

Core Minds Tech Solutions

7-20 years

posted 2 weeks ago

Job Role Insights

Flexible timing

Key skills for the job

DevOps Python Java Kubernetes Site Reliability Engineering IT Infrastructure

Job Description

- Engage with our product teams to understand requirements, design, and implement resilient and scalable infrastructure solutions

- Operate, monitor, and triage all aspects of our production and non-production environments

- Collaborate with other engineers on code, infrastructure, design reviews, and process enhancements.

- Evaluate and integrate new technologies to improve system reliability, security, and performance

- Develop and implement automation to provision, configure, deploy, and monitor Apple services

- Participate in an on-call rotation providing hands-on technical expertise during service-impacting events

- Design, build, and maintain highly available and scalable infrastructure

- Implement and improve monitoring, alerting, and incident response systems

- Automate operations tasks and develop efficient workflows

- Conduct system performance analysis and optimization

- Collaborate with development teams to ensure smooth deployment and release processes

- Implement and maintain security best practices and compliance standards

- Troubleshoot and resolve system and application issues

- Participate in capacity planning and scaling efforts

- Stay up-to-date with the latest trends, technologies, and advancements in SRE practices

- Contribute to capacity planning, scale testing, and disaster recovery exercises.

- Approach operational problems with a software engineering mindset

- BS degree in computer science or equivalent field with 5+ years of experience

- 5+ years in an Infrastructure Ops, Site Reliability Engineering, or DevOps-focused role.

- Knowledge of Linux operating system principles, networking fundamentals, and systems management.

- Demonstrable fluency in at least one of the following languages : Java, Python, or Go

- Experience managing and scaling distributed systems in a public, private, or hybrid cloud environment

- Develop and implement automation tools and apply best practices for system reliability.

- You will be responsible for the availability and scalability of our services, and will manage disaster recovery and other operational tasks.

- Collaborate with the development team to improve logging, metrics, and traces in the application codebase for better observability.

- Collaborate with data science teams and other business units to design, build and maintain the infrastructure that runs machine learning and generative AI workloads.

- Influence architectural decisions with a focus on security, scalability, and performance.

- Find and fix problems in production, and work to prevent them from recurring

Preferred Qualifications :

- Familiarity with micro-services architecture and container orchestration with Kubernetes.

- Awareness of key security principles, including encryption and keys (types and exchange protocols).

- Understanding of SRE principles, including monitoring, alerting, error budgets, fault analysis, and automation.

- Strong sense of ownership, with a desire to communicate and collaborate with other engineers and teams.

- Ability to identify and communicate technical and architectural problems, while working with partners and their team to iteratively find solutions.