Company Overview:
Float is the #1 resource management software, trusted by more than 4,500 customers worldwide. Since 2012, we’ve grown every year, independently and profitably, with a commitment to positive impact as a certified B Corporation. Our 50+ team members work 100% remotely across multiple countries, empowering people to live their “Best Work Life.”
Job Summary:
We are seeking a motivated Site Reliability Engineer (SRE) to join our fast-scaling global team. In this high-impact role you will shape infrastructure, automate smarter systems, and drive reliability at scale. Whether you’re an early-career engineer looking to grow quickly or an experienced SRE ready for demanding challenges, this position offers autonomy, collaboration, and career growth in a truly remote-first environment.
Key Responsibilities:
- Maintain, upgrade, and optimise Kubernetes infrastructure for smooth and secure operations.
- Improve system hygiene by eliminating noisy alerts and boosting monitoring accuracy.
- Collaborate with engineers to integrate services and support service migrations.
- Optimise Kubernetes cluster usage and scale node specifications.
- Lead service mesh exploration and implement secure ingress layers.
- Create incident response playbooks to ensure faster recovery during production issues.
- Support the next-gen data layer (CDC) for high-performance engineering teams.
- Coach teams on reliability goals, SLAs, and best practices for production quality.
Required Qualifications & Skills:
- Proficiency with Bash scripting and at least one programming language (PHP, Python, or NodeJS).
- Strong production experience managing and optimising Kubernetes clusters.
- Solid knowledge of Terraform (Infrastructure as Code).
- Familiarity with Google Cloud Platform (GCP) or willingness to upskill fast.
- Strong written communication skills for documentation and team collaboration.
- Remote-first mindset: comfortable with asynchronous communication via Slack, Loom, and similar tools.
Preferred Qualifications:
- Experience in scaling infrastructure for SaaS platforms.
- Prior experience in a global, remote-first team environment.
Preferred Experience:
2–5+ years in site reliability, DevOps, or infrastructure engineering.
Benefits & Perks:
- Competitive salary of USD $133,000 per year (Level 2 role).
- 100% remote global team – work from anywhere.
- Flexible schedules with deep focus time and minimal meetings.
- Professional development opportunities, paid take-home assignment, and transparent salary framework.
- Supportive, diverse, and collaborative culture with a strong focus on work-life balance.
Location:
Remote (Global Team – Work from Anywhere)
How to Apply:
Submit your updated CV and cover letter via our careers portal. If you’re excited about scaling infrastructure and want to thrive in a remote-first, high-growth environment, we’d love to hear from you.