Site Reliability Engineering & DevOps Interview Course

Nail Site Reliability Engineering interviews at FAANG and Tier-1 Tech Companies

Course designed and taught by instructors from FAANG & Tier-1 Tech Companies

Manoj Krishnan

Software Engineer

Adrián Fernández

Engineering Manager

Qiuping Xu

Principal Data Scientist

Site Reliability Engineering and DevOps Course Curriculum

This is what you'll learn in our site reliability engineering career path!

  • 15 Mock Interviews
  • 6-Month Support Period

Site Reliability Engineering & DevOps course and curriculum

Data Structures and Algorithms
5 weeks
5 live classes
1

Sorting

  • Introduction to Sorting
  • Basics of Asymptotic Analysis and Worst Case & Average Case Analysis
  • Different Sorting Algorithms and their comparison
  • Algorithm paradigms like Divide & Conquer, Decrease & Conquer, Transform & Conquer
  • Presorting
  • Extensions of Merge Sort, Quick Sort, Heap Sort
  • Common sorting-related coding interview problems
2

Recursion

  • Recursion as a Lazy Manager's Strategy
  • Recursive Mathematical Functions
  • Combinatorial Enumeration
  • Backtracking
  • Exhaustive Enumeration & General Template
  • Common recursion- and backtracking-related coding interview problems
3

Trees

  • Dictionaries & Sets, Hash Tables 
  • Modeling data as Binary Trees and Binary Search Trees, and performing different operations on them
  • Tree Traversals and Constructions 
  • BFS Coding Patterns
  • DFS Coding Patterns
  • Tree Construction from its traversals 
  • Common trees-related coding interview problems
4

Graphs

  • Overview of Graphs
  • Problem definition of the Seven Bridges of Königsberg and its connection with Graph theory
  • What is a graph, and when do you model a problem as a Graph?
  • How to store a Graph in memory (Adjacency Lists, Adjacency Matrices, Adjacency Maps)
  • Graph traversal: BFS and DFS, BFS Tree, DFS stack-based implementation
  • A general template to solve any problem modeled as a Graph
  • Graphs in Interviews
  • Common graphs-related coding interview problems
5

Dynamic Programming

  • Dynamic Programming Introduction
  • Modeling problems as recursive mathematical functions
  • Detecting overlapping subproblems
  • Top-down Memoization
  • Bottom-up Tabulation
  • Optimizing Bottom-up Tabulation
  • Common DP-related coding interview problems

System Design
3 weeks
3 live classes
1

Online Processing Systems

Common scalability concepts such as databases, caches, and message queues, plus common design problems
2

Batch Processing Systems

Batch Processing Concepts in-depth and Common Design Problems for FAANG+ interviews
3

Stream Processing Systems

  • Case Studies: APM, Social Connections, Netflix, Google Maps, Trending Topics, YouTube
Design real-time data-intensive applications like Google Maps, Netflix, etc.
1

Online Processing Systems

  • The client-server model of online processing
  • Top-down steps for system design interview
  • Depth and breadth analysis
  • Cryptographic hash function
  • Network Protocols, Web Server, Hash Index
  • Scaling
  • Performance Metrics of a Scalable System
  • SLOs and SLAs
  • Proxy: Reverse and Forward
  • Load balancing
  • CAP Theorem
  • Content Distribution Networks
  • Cache
  • Sharding
  • Consistent Hashing
  • Storage
  • Case Studies: URL Shortener, Instagram, Uber, Twitter, Messaging/Chat Services
2

Batch Processing Systems

  • Inverted Index
  • External Sort-Merge
  • K-way External Sort-Merge
  • Distributed File System
  • MapReduce Framework
  • Distributed Sorting
  • Case Studies: Search Engine, Graph Processor, Typeahead Suggestions, Recommendation Systems
3

Stream Processing Systems

  • Case Studies: APM, Social Connections, Netflix, Google Maps, Trending Topics, YouTube

Site Reliability Engineering/DevOps
6 weeks
6 live classes
1

Linux and Networking

  • Memory management in Linux: Deep dive into physical and virtual memory. How does the kernel interact with memory? What happens on a page fault? How do you deal with dirty pages?
  • Handling memory issues:
      • Getting alerted on DIMM chip failures
      • Keeping track of used memory
      • Preparing for OOM events
      • Getting alerted on memory issues
  • Discussion on critical interview questions:
      • What is thrashing?
      • What kind of memory pages will thrash depending on whether you have swap enabled or not?
      • How do you tell if a host is compute-bound or I/O-bound?
  • Deep dive into CPU and processes: Metrics to track CPU performance. Why is disk I/O important?
  • Crack bash scripting questions: Learn pro tips and trick questions
  • Get efficient with the command line: Pro tips on pipes, tmux, nc, and file redirection
2

Containers and Orchestration

  • Comprehensive coverage of Docker and Kubernetes architecture: Learn how to perform a live upgrade of an application with zero downtime
  • Deep dive into k8s: Horizontal Scaling, Load Balancing, Crash Protection, Tiered Networking, Resource Control and Optimization, and Security
  • How to approach common interview questions such as:
      • Using Docker volumes to persist data
      • How to evaluate a system's tolerance for failures/outages?
      • What are the different techniques to scale a relational database?
  • Application deployment: Local vs. managed k8s
  • Kubernetes patterns for designing web applications: Sidecar pattern, Ambassador pattern, etc.
  • Important questions and pro tips on troubleshooting Kubernetes
  • How to set customer expectations: Deep dive into Service-Level Objectives and Service-Level Indicators
3

Deployment & Configuration Management

  • A top-down view of modern software releases: In-depth understanding of how CI/CD (Continuous Integration and Continuous Deployment) works. How does automation help achieve CI/CD?
  • Deep dive into Jenkins: Installation and configuration, Jenkins Plugins, Blue Ocean & Jenkinsfile, and managing and scaling Jenkins
  • Comprehensive coverage of critical interview questions:
      • Jenkins user authentication and security measures
      • What happens when the underlying node of a particular job is offline?
      • Best practices and pro tips in Jenkins node allocation
      • How to design a system responsible for continuous integration and deployment?
  • Comprehensive coverage of configuration management: Compare the different tools available in the market, their advantages and features
  • Infrastructure as code: Why, when, how?
4

Non-Abstract Large System Design

  • How to design large-scale distributed systems like Google AdWords: Deep dive into the architecture, building blocks of scalable systems, scalability, and reliability
  • Interesting follow-up questions on the fundamentals of modern software systems: Servers, agents, load balancers, storage, indexers, consensus, pipelines, queues, sharding, replication, caching, batching, and scatter-gather
  • Deep-dive discussion of SRE-specific interview questions:
      • How do SLOs (service-level objectives) impact designs?
      • How to do capacity estimates?
      • How to design for fault tolerance?
5

Monitoring & Troubleshooting

  • Monitoring and alerting: Key metrics and the four golden signals (latency, traffic, errors, and saturation)
  • Derive the SLO of a system from its SLIs and learn how to implement a proactive SLO for alerting on an application
  • Deep dive into Prometheus, an open-source monitoring tool
  • Questions on logging and log management:
      • How to manage logs for various use cases? How to budget for long-term log storage?
      • Design a logging framework for an organization: Depth of logging, retention, access and audit controls, and encryption
  • Incident management: Lifecycle of an incident, KPIs like MTTD, MTTI, and MTTR, and pro tips for the incident management process
  • Testing for failure: Understand the importance of smoke tests, stress tests, performance tests, etc.
  • Various troubleshooting scenarios and strategies: Leverage utilities like top, vmstat, iostat, mpstat, netstat, ping, sar, tcpdump, traceroute, dig, nslookup, etc.
6

Cloud Computing & AWS Services

  • AWS Compute Services (EC2, EKS, Lambda)
  • AWS Storage and Database Services (S3, RDS, Aurora, DynamoDB, and ElastiCache)
  • AWS Management and Governance services (CloudWatch, CloudFormation)
  • Networking Architecture

Career Coaching
3 weeks
3 live classes
1

Interview Preparation

Interview Questions
Placement assistance
Behavioral Coaching
2

Resume & LinkedIn Masterclass

3

Salary Negotiation Masterclass

Support Period
6 months
1

15 mock interviews

2

Take classes you missed, or retake classes and tests

3

1:1 technical/career coaching

4

Interview strategy and salary negotiation support


Best Suited for

  • Current SREs/DevOps Engineers

Site Reliability Engineering & DevOps Interview Process at Tier-1 Companies

We prepare you for all stages of a typical Site Reliability Engineering & DevOps interview process at FAANG and Tier-1 companies.

Initial Technical Screening

  • 1 DSA coding question (easy/medium LeetCode questions)
  • Questions from the systems domain like Linux, networking, etc.

Behavioral Round

  • Questions related to your job experience
  • Discussions on past projects
  • Open-ended questions to gauge if you're a "good fit"

Onsite: 4-6 Rounds

  • 1-2 rounds of DSA-based coding questions. Usually, the difficulty level of these questions is easy to medium
  • 2 rounds of questions on SRE fundamentals. Some companies conduct a separate troubleshooting round where candidates are asked to fix a broken system
  • System Design round (usually for senior roles) to test your knowledge of designing scalable systems

Top companies love hiring our candidates!

10K+

Experienced engineers enrolled

7

Years of successful training in Silicon Valley

18

Highest number of offers received by an alum

5

Avg years of experience of our alumni

What our students say

Vineet Joglekar
Software Development Manager
Offers from Google

"IK helps you build a problem-solving mindset, offers very rich foundational material, introduces typical interview problems, offers technical and behavioral coaching sessions and mock interviews from industry experts to succeed in tech interviews."

Swapnil Tailor
Offers from Google

Interview Kickstart is like a fitness coach which guides to achieve your dream job. It can help you identify your weak points and also suggest steps to improve them.

Rupesh Dabbir
Offers from Google

Interview Kickstart (IK) provides you a solid platform to not only strengthen your algorithm and interview game, I've had the pleasure of meeting some of the best/brightest minds in the industry (Faculty and students included). It was a humble experience, to say the least.

Frequently Asked Questions