Lead Site Reliability Engineer
1 day ago
**Role Summary**
We are seeking a highly skilled and motivated Lead Site Reliability Engineer (SRE) with strong AWS expertise to lead our Service Operations team. You will be responsible for driving SRE practices and ensuring the scalability, reliability, and performance of mission-critical systems for our digital banking clients. This role requires balancing technical depth with leadership capability: setting direction, mentoring engineers, and ensuring service reliability at scale across multiple teams and clients.
**Key Responsibilities**
- Leadership & Mentorship: Lead a team of SREs, providing technical guidance and coaching while fostering a culture of reliability and continuous improvement.
- SRE Practices: Define and mature SRE practices, including SLIs/SLOs, error budgets, and incident response processes across production systems.
- Architecture & Automation: Own the design and evolution of automated cloud operations, driving adoption of Infrastructure-as-Code (Terraform, CloudFormation) and CI/CD pipelines.
- Incident Management: Lead major incident responses, ensuring rapid resolution, root cause analysis, and implementation of preventive measures.
- Collaboration: Work closely with Development, DevOps, and Cloud Engineering teams to ensure reliability and resilience are built into every stage of delivery.
- Operational Excellence: Establish and track key reliability metrics (availability, latency, error rates) and drive initiatives to continuously improve them.
- Innovation & Tooling: Evaluate and implement AWS-native and third-party tools to improve monitoring, alerting, and automation.
- Stakeholder Engagement: Act as the primary point of contact for service reliability topics with clients, ensuring transparency and alignment on reliability goals.
- Governance: Ensure compliance with industry standards and internal policies around security, audit, and operational risk.
**Required Education & Experience**
- Experience: 7-10 years in SRE/DevOps/Cloud Engineering, with at least 2-3 years in a lead or managerial capacity.
- Cloud Expertise: Deep hands-on experience with AWS services (EC2, ECS/EKS, S3, RDS, IAM, VPC, CloudWatch).
- Infrastructure as Code: Strong experience with Terraform, CloudFormation, and automated deployment pipelines (Harness, GitLab, Jenkins).
- Containerization & Orchestration: Expertise in Kubernetes and container-based workloads in production.
- Monitoring & Observability: Proficiency with monitoring, logging, and alerting tools (CloudWatch, Prometheus, Grafana, ELK).
- Incident Leadership: Proven ability to lead high-pressure incident response and post-mortem processes.
- Problem-Solving & Risk Management: Strong analytical skills with the ability to anticipate, assess, and mitigate technical risks.
- Collaboration & Communication: Excellent stakeholder management skills; fluent English required, with good communication in Vietnamese for local collaboration.
**Nice-to-Have Skills**
- Certifications such as AWS Certified DevOps Engineer - Professional or AWS Solutions Architect - Professional.
- Experience in financial services or other highly regulated industries.
- Knowledge of advanced security practices and compliance frameworks (PCI-DSS, ISO 27001, SOC2).
- Multi-region/multi-AZ architecture design for high availability and disaster recovery.
**What We Offer You**
- Competitive salary and benefits package.
- 13th-month salary guarantee.
- Performance bonus.
- Professional English courses.
- Premium health insurance.
- Extensive annual leave and flexible working arrangements.
- Opportunity to shape the SRE function and drive reliability practices for leading digital banking clients.
**About Us**
We are committed to our investors and stand for solid, long-term growth performance. Founded in Germany in 1987 and present in American territory since 2008, GFT has expanded globally to over 10,000 experts in more than 15 markets to ensure proximity to clients. With new opportunities from Asia to Brazil, the international growth story continues. We are committed to growing tech talent worldwide, because our team's strong consulting and development skills across legacy and pioneering technologies, like GreenCoding, underpin our success. We maintain a family atmosphere in an inclusive work environment.
**Why Choose GFT?**
- Competitive Compensation
- Benefits package including comprehensive medical, dental, vision and others
- Company Culture based on our Core Values
- Professional Development Training with Individual Development Plans to map out your career growth
- Opportunity to work in a global environment with diverse teams built with colleagues from around the world
- Opportunity to work with technology industry leaders in the financial services industry
- Opportunity to work for big name clients in capital markets, banking and other industries