Title: SRE and DevOps Lead

The SRE and DevOps Lead will be responsible for the reliability, availability, and performance of web and mobile applications and their infrastructure. This role involves designing, implementing, and managing CI/CD pipelines, monitoring systems, and automated processes to ensure continuous delivery and operational efficiency. The ideal candidate will collaborate closely with development and security teams to drive best practices in system architecture, deployment, and incident management, ensuring seamless integration and high uptime across all environments.

Key Responsibilities:

Infrastructure and Application Reliability

  • Design and implement robust monitoring, alerting, and logging systems to ensure proactive identification of issues.
  • Establish and manage SLOs (Service Level Objectives) and SLAs (Service Level Agreements) for critical services.
  • Develop automated recovery processes and ensure rapid incident response to minimize downtime.

CI/CD Pipeline Management

  • Lead the design and implementation of CI/CD pipelines that support continuous integration and delivery across multiple environments.
  • Collaborate with development teams to optimize build, test, and deployment processes, ensuring fast and reliable releases.
  • Automate repetitive tasks and workflows to increase efficiency and reduce human error.

Security and Compliance

  • Implement security best practices across infrastructure and applications, including automated compliance checks.
  • Collaborate with security teams to ensure systems meet industry standards and organizational policies.
  • Regularly audit and update systems to protect against vulnerabilities and threats.
  • Advocate for design, code reuse, performance, quality, and security best practices.

Professional Development

  • Stay updated on the latest trends, tools, and frameworks in SRE and DevOps to recommend improvements and optimizations.
  • Actively seek opportunities for skill enhancement and professional growth.
  • Share knowledge with the team and foster a culture of continuous learning.

Job Qualifications:

Education:

  • BS/MS degree in Computer Science, Engineering, or a related field.

Experience:

  • At least 7 years of experience in DevOps, SRE, or related roles.
  • Proven track record of managing and deploying large-scale cloud infrastructure on AWS, Azure, or Google Cloud.
  • Experience in designing and implementing CI/CD pipelines across multiple environments (experience with GitHub Actions is an advantage).
  • Proficiency in deploying and managing containerized applications using Docker and Amazon ECS (experience with Kubernetes is an advantage).
  • Hands-on experience in managing infrastructure as code (IaC) using tools like Terraform or CloudFormation.
  • Expertise in monitoring, logging, and alerting systems, with experience in tools like Prometheus, Grafana, or New Relic (experience with New Relic is a huge plus).
  • Strong familiarity with version control systems, particularly Git, and collaboration platforms like GitHub or GitLab.
  • Proficiency in scripting languages such as Python, Shell, or Ruby for automation and process improvement.
  • In-depth knowledge of network protocols, security best practices, and system architecture.
  • Experience with load balancing, caching, and database management in a cloud environment.

Certification:

  • Relevant certifications, such as AWS Certified DevOps Engineer or equivalent, are an advantage.

Location:

  • Must be willing to work in Bangkok, Thailand (hybrid work setup).