We are seeking a highly skilled and experienced Senior Linux Infrastructure Engineer with a focus on Root Cause Analysis (RCA) to join our team. The ideal candidate will possess an extensive technical background, superior problem-solving skills, and a passion for ensuring the robustness and resilience of our Linux server infrastructure. You must be comfortable working in a fast-paced, dynamic, and flexible environment and able to operate effectively in a global 24x7 setting.
What you'll do
- Perform thorough Root Cause Analysis (RCA) to identify, analyze, and resolve complex issues within Linux server infrastructure.
- Monitor, troubleshoot, and optimize the performance of Linux-based systems.
- Collaborate with cross-functional teams to gather data, replicate issues, and implement solutions.
- Create comprehensive RCA reports, system documentation, and knowledge base articles.
- Implement automation through scripting and configuration management tools to streamline diagnostic processes.
- Maintain security, compliance, and OS hardening across the infrastructure.
- Stay current with industry trends, technologies, and best practices to continuously improve systems and processes.
- Provide mentorship and detailed documentation to assist junior colleagues in implementing technical plans and adhering to best practices.
What you bring
- 10+ years of related professional experience with a focus on system diagnostics and Root Cause Analysis (RCA).
Technical Skills:
- Linux Systems: In-depth knowledge of Linux system internals, kernel architecture, process and memory management, filesystems, and system calls.
- Monitoring Tools: Proficiency with tools such as top, htop, vmstat, iostat, sar, ps, netstat, ss, etc.
- Logs and Tracing: Experience with journalctl, rsyslog, syslog-ng, dmesg, strace, lsof, etc.
- Networking: Advanced understanding of TCP/IP, network interfaces, routing, DNS, DHCP, firewalls, and diagnostic tools like ping, traceroute, tcpdump, wireshark, iftop, netcat, nmap, etc.
- Performance Analysis: Proficiency with tools like perf, systemd-analyze, iotop, blktrace, ioping, and benchmarks.
- Security Incident Management: Knowledge of security principles, OS hardening, compliance, and tools for vulnerability scanning and intrusion detection.
- Scripting and Automation: Strong knowledge of Shell scripting, Python, Perl, or other scripting languages, and Infrastructure-as-Code tools like Ansible, Puppet, Chef, or Terraform.
- Cloud Infrastructure: Experience with AWS, Azure, GCP, including services such as EC2, S3, IAM, VPC, security groups, and load balancers.
- Virtualization Technologies: Familiarity with Docker, Kubernetes, VMware, KVM, and other virtualization or containerization technologies.
Soft Skills:
- Analytical and Problem-Solving: Strong ability to analyze issues, identify root causes, and implement effective solutions systematically.
- Documentation: Ability to create clear and detailed RCA reports and technical documentation.
- Communication: Excellent communication and networking skills, with the ability to articulate findings and solutions to technical and non-technical stakeholders.
- Incident Management: Experience with ITIL or similar frameworks for incident management.
- Continuous Learning: Proactive in acquiring new knowledge and staying updated with the latest trends and technologies.
Language Skills:
- Fluency in English, with excellent communication skills tailored towards explaining complex RCA findings.