Sr. Site Reliability Engineer
2 weeks ago
HRS as a Company
HRS, a pioneer in business travel, aims to elevate every stay through innovative technology. With over 50 years of experience, their digital platform, driven by ProcureTech, TravelTech, and FinTech, transforms how companies and travelers Stay, Work, and Pay.
ProcureTech digitally revolutionizes lodging procurement, connecting corporations and suppliers in a cutting-edge ecosystem. This enables seamless efficiency and automation, surpassing travelers' expectations.
TravelTech redefines the online lodging experience, offering personalized content from selection to check-in, ensuring an unparalleled journey for corporate travelers.
In FinTech, HRS introduces advancements like mobile banking and digital payments, turning corporate back offices into touchless lodging enablers, eliminating legacy cost barriers. The innovative 2-click book-to-pay feature streamlines interactions for travelers and hoteliers.
Combining these technology propositions, HRS unlocks exponential catalyst effects. Their data-driven focus delivers value-added services and high-return network effects, creating substantial customer value.
HRS has grown exponentially since 1972 and now serves over 35% of the global Fortune 500 as well as leading hotel chains.
Join HRS to shape the future of business travel, empowered by a culture of growth and setting new industry standards worldwide.
*BUSINESS UNIT*
The Site Reliability Engineering (SRE) department at HRS is fundamental to ensuring the reliability, scalability, and performance of our Lodging-as-a-Service (LaaS) platform. Our team collaborates across engineering, operations, and development teams to implement reliability standards, maintain infrastructure architecture, and achieve operational excellence while adhering to our service level objectives (SLOs) and reducing toil.
As an SRE at HRS, a key part of your role will be incident handling. You'll be at the forefront of identifying, responding to, and resolving production issues, ensuring minimal impact on our services. You'll participate in on-call rotations, requiring quick thinking and decisive action during critical incidents. Your ability to remain calm under pressure and make data-driven decisions will be crucial in maintaining our platform's reliability.
You will contribute to the reliability roadmap, support platform observability, and drive automation initiatives to enhance system resilience. Monitoring critical metrics such as error budgets, mean time to recovery (MTTR), and service level indicators (SLIs) will be part of your daily responsibilities to ensure optimal platform performance and availability. This role requires strong technical expertise in cloud infrastructure, distributed systems, and automation, combined with excellent problem-solving and incident management skills.
The department operates according to HRS' leadership principles, prioritizing system reliability and customer experience above all. We embrace a culture of blameless post-mortems, continuous improvement, and proactive problem-solving. As an SRE, you'll actively participate in incident reviews, contributing insights to prevent future occurrences and improve our overall system reliability.
SREs at HRS are innovation contributors, exploring new technologies and methodologies to improve system reliability and operational efficiency. You will work with infrastructure as code, maintain robust monitoring and alerting systems, and develop automation solutions to reduce manual intervention and improve incident response times. Our team takes full ownership of production systems, from capacity planning to disaster recovery, ensuring resilient and scalable infrastructure.
In this role, you will collaborate with team leads and other SREs to implement best practices, refine incident response procedures, and contribute to the overall reliability and performance of our LaaS platform. Your expertise in incident handling, system optimization, and proactive problem-solving will be crucial in maintaining and improving the high standards of our SRE department at HRS.
*POSITION*
We are seeking a competent Sr. Site Reliability Engineer with solid experience to join our team. The ideal candidate will focus on ensuring the reliability and scalability of services, working collaboratively with cross-functional teams to enhance our platform and improve processes.
*CHALLENGE*
- Service Reliability: Maintain service availability and system performance, and manage capacity-related matters. Help design and implement SLOs and SLIs.
- System Improvement: Develop and implement solutions to improve system reliability and scalability.
- Incident Response: Participate in on-call rotations and assist in incident management and resolution. Contribute to post-incident reviews (blameless post-mortems).
- Collaboration: Work closely with development teams to troubleshoot issues and enhance system performance.
- Automation: Contribute to the automation of processes to improve efficiency and scalability.
- Monitoring & Observability: Implement and maintain monitoring solutions using tools like New Relic, Kibana, Prometheus, Grafana, and ElasticSearch.
*FOR THIS EXCITING MISSION YOU ARE EQUIPPED WITH*
- Experience: 6-8 years in site reliability engineering or related areas.
- Education: Bachelor's degree in Computer Science, Engineering, or related field.
Technical Skills:
- Proficiency in programming (.NET/C#, TypeScript, Bash) to automate your day-to-day work.
- Experience with cloud services and cloud engineering practices (ideally Azure).
- Knowledge of monitoring tools (New Relic, Kibana, Prometheus, Grafana, ElasticSearch).
- Strong understanding of software development methodologies.
- Experience with infrastructure as code tools (e.g., Terraform, CloudFormation).
- Familiarity with containerization and orchestration (e.g., Docker, Kubernetes).
- Knowledge of networking and distributed systems.
- Problem-Solving: Strong analytical skills and the ability to perform root cause analysis.
- Automation: Experience with scripting and automation to enhance operational efficiency.
- Teamwork: Ability to work effectively within a team and collaborate with cross-functional teams.
- Security: Good understanding of DevSecOps principles and the ability to integrate security controls throughout the delivery pipeline.
Soft Skills
- Attention to Detail: High level of accuracy and thoroughness.
- Communication Skills: Clear and concise communication abilities.
- Learning Mindset: Eagerness to learn and apply new technologies.
- Proactive Approach: Initiative to identify issues before they become problems.
*LOCATION, MOBILITY, INCENTIVE*
- Competitive Pay: Attractive salary package with a 13th-month bonus and Reward bonus
- Hybrid Working Model: Embrace flexibility of working from both home and office that supports better work-life balance and boosts productivity
- Recognition That Matters: We celebrate your milestones with meaningful rewards and long-term commitment opportunities (after 2 years, 5 years, 10 years, etc. of service)
- Comprehensive Benefits: Private health insurance for you and a family member, plus full statutory coverage (social, health, unemployment)
- Time to Recharge: 18 days of paid leave per year, including sick days
- Continuous Growth: Dedicated learning budget, access to Udemy, English classes, mentoring, and global workshop/business forum opportunities
- Support for Your Day-to-Day: Meal and parking allowances to keep life simple
- Culture That Empowers: Company trips, year-end celebrations, structured quarterly team bonding, and culture activities for cross-team collaboration, because strong teams drive meaningful outcomes
- Refer and Earn: Generous referral bonus for every successful hire you bring in
- Location: Anna Building, HCMC
*PERSPECTIVE*
Access to a globally united and mutually responsible "Tribe of Intrapreneurs" that is passionately dedicated to renewing the travel industry and, in doing so, reinventing how businesses stay, work and pay.
Our entrepreneurial environment of full ownership and execution focus offers you the playground to contribute to a greater mission while growing personally and professionally throughout this unique journey. You will continuously learn from a radical culture of retrospectives and continuous improvement and actively contribute to making business life better, smarter and more sustainable.