Site Reliability Engineer
2 days ago
**Top 3 reasons to join us**:
- Work with young, talented, and passionate teams
- A comfortable, dynamic working environment
- Vietnamese-made products conquering the global market
**Job description**:
- Manage and improve system reliability through SLO, SLI, and SLA practices.
- Design and implement observability systems (metrics, logs, tracing, alerting) using tools like Prometheus, Grafana, ELK, etc.
- Build and automate CI/CD pipelines and Infrastructure as Code (IaC) using tools such as Terraform, Ansible, Pulumi, Helm.
- Collaborate in the analysis, design, and deployment of systems and processes to ensure reliability, observability, and scalability.
- Optimize system cost, performance (latency, throughput), and security.
- Operate and optimize Kubernetes clusters (EKS); strong knowledge of Docker, Kubernetes, Helm is required.
- Develop internal tools to automate workflows and support other teams.
- Participate in incident response, root cause analysis, postmortem reviews, and improve incident handling processes.
- Support and coordinate with NOC (Network Operations Center) teams.
- Be part of the on-call rotation when needed.
**Your skills and experience**:
- 2-5 years of experience in SRE / DevOps / Platform Engineering.
- Hands-on experience with monitoring and alerting systems (Prometheus, Grafana, ELK, Loki, etc.).
- Proficient in CI/CD tools (GitLab CI, Jenkins) and familiar with Git workflows.
- Experience in deploying and managing Kubernetes (EKS is a plus).
- Understanding of gRPC and the ability to optimize nginx connections and network stacks.
- Strong Linux background with deep knowledge of kernel, network stack, file system, and processes.
- Systems-thinking mindset, a focus on automation, and the ability to mentor teammates.
- Proactive, responsible, and able to work under pressure during incident response.
**Nice to have**:
- Experience with AWS (EKS, EC2, RDS, CloudWatch).
- Strong understanding of networking concepts (TCP/IP, DNS, Load Balancing, CDN).
- Experience with high availability and distributed systems.
- Previously built a complete observability stack.
- Experience in building or optimizing Golang SDKs or internal frameworks.
- Knowledge of cloud-native networking (CNI, overlay, BGP, eBPF-based load balancing).
**Why you'll love working here**:
**Income**
- Market-competitive net salary
- 13th-month salary bonus
- Variable bonus averaging 1 to 2 months' salary per year
- Performance-based salary review every 6 months
- Unused leave days paid out at year end
**Working environment**
- A youthful, comfortable, and egalitarian workplace
- Work alongside a young, talented, passionate, and driven team
- Free lunch prepared by an in-house chef
- A well-equipped, creative office with dedicated function rooms
- Flexible working hours and a casual dress code
- Annual company trip and quarterly team building
**Training and development**
- Training in new technologies (Machine Learning, Artificial Intelligence, NoSQL, System Design).
- Exposure to hard e-commerce problems and the chance to solve them.
-
Reliability Engineer Intern
2 weeks ago
Hà Nội, Vietnam | Amazon Corporate Services Vietnam Company Limited - K62 | Full time
- Bachelor's degree or above in electrical engineering, material engineering, mechanical engineering or related fields. - Strong organizational and problem-solving skills - Clear oral and written communication skills (Vietnamese and English, Chinese is preferred) - Demonstrated critical thinking capability - Self-motivated and proactive Amazon develops...
-
Reliability Engineer
1 week ago
Hà Nội, Vietnam | Amazon | Full time
Description. The role, what you'll do: - Perform system reliability testing, packaging reliability testing, and accessory reliability testing; review testing reports and highlight the reliability results to the cross-functional team (Product Design, Hardware team, Packaging team, Design & Development, Product, Operations). - Develop system, packaging and...
-
Reliability Engineer
2 weeks ago
Hà Nội, Vietnam | Amazon Corporate Services Vietnam Company Limited | Full time
- Technical Degree (BSEE, BSME, BSCS, Physics, Industrial Engineering, other) - 8+ years of combined experience in Packaging, Accessories and Product Reliability Engineering and Testing for New Product Introductions and Sustaining. - 5-10 years of combined experience in consumer electronics manufacturing; experience with Sensors, RF and/or Wi-Fi based products...
-
Senior Officer, Site Reliability Engineering
2 weeks ago
Hà Nội, Vietnam | Techcombank | Full time | 9 Jul 2025
**Senior Officer, Site Reliability Engineering (40001670)**: - Category: Technology Division - Job Type: - Facility: Technology **Job Purpose**: **Key Accountabilities (1)**: - Participate in monitoring and handling system alerts/incidents/problems: - Ensure projects/specialized operations departments provide adequate warning/incident...
-
Senior Officer, Site Reliability Engineering
1 week ago
Hà Nội, Vietnam | Techcombank | Full time | 31 Oct 2025
**Senior Officer, Site Reliability Engineering (40001670)**: - Category: Technology Division - Job Type: - Facility: Technology **Job Purpose**: **Key Accountabilities (1)**: - Participate in monitoring and handling system alerts/incidents/problems: - Ensure projects/specialized operations departments provide adequate warning/incident...
-
Site Reliability Engineer
2 weeks ago
Hà Nội, Vietnam | DEUNA | Full time
About DEUNA 🧡 DEUNA is a rapidly growing startup revolutionizing global commerce with ATHIA, our AI-powered orchestration and payments platform that helps large enterprises boost approval rates, reduce costs, and unlock new revenue. Built by the team behind DEUNA, the fastest-growing Commerce OS in Latin America, ATHIA combines payment...
-
Hà Nội, Vietnam | Amazon.com | Full time
Description: Amazon develops innovative consumer-centric product solutions. As a reliability program engineer you will be part of an exciting team developing, testing, and delivering new products. Your primary responsibility will be the development and implementation of methodologies/techniques to enhance product reliability. You will work closely internal...
-
Hà Nội, Vietnam | Amazon | Full time
Description: Amazon develops innovative consumer-centric product solutions. As a reliability program engineer you will be part of an exciting team developing, testing, and delivering new products. Your primary responsibility will be the development and implementation of methodologies/techniques to enhance product reliability. You will work closely internal...
-
Reliability Program Manager
6 days ago
Hà Nội, Vietnam | Amazon Corporate Services Vietnam Company Limited - K62 | Full time
- Bachelor's degree or above in electrical engineering, material engineering, mechanical engineering or related fields. - 5+ years experience as an engineering lead or engineering project manager. - Good understanding of the principles and basic structures of measuring instruments such as HALT, chambers, oscilloscopes, multimeters etc. - Strong technical...
-
Site Operation Engineer
2 weeks ago
Hà Nội, Vietnam | Công Ty TNHH SX TM DV XNK Pex | Full time
**Job description**: (Salary: 17-23 million VND) Department: Engineering / Construction Management. Work Location: Hà Nội, Việt Nam. Project Location: Australia. As a Site Operation Engineer, you will be the key coordinator for construction activities from the office, ensuring that the progress, quality, and cost of on-site execution...