Site Reliability Engineer - SRE (L1)

100% Remote Full-time Open now

Accepting candidates in Brazil ONLY.

Professional Role Overview

We are seeking a Site Reliability Engineer (L1) to ensure the continuous availability and performance of our mission-critical production services. This role is designed for a professional with the technical rigor required to manage complex distributed systems under a 100% on-call mandate within South American time zones. You will be responsible for the stewardship of high-stakes data environments, specifically those involving message queuing, relational and non-relational databases, and enterprise data warehouses. The primary objective is to maintain strict service-level objectives (SLOs) through proactive monitoring, rapid incident response, and automated intervention.

Key Responsibilities

• Production Stewardship: Serve as the first responder for production anomalies, managing the end-to-end incident lifecycle from initial detection to post-incident resolution.
• Data Infrastructure Management: Ensure the reliability and scalability of high-throughput data platforms, including message brokers, relational databases (PostgreSQL or similar), non-relational databases (MongoDB or similar), and data warehouse environments.
• Operational Excellence: Execute 100% on-call rotations, providing consistent coverage and rapid response to critical system alerts.
• Automation & Toil Reduction: Develop and maintain scripts (Python, Go, or Bash) to automate routine operational tasks, enhancing system resilience and reducing manual overhead.
• Observability & Telemetry: Configure and optimize monitoring suites (e.g., Prometheus, Grafana, Datadog) to ensure comprehensive visibility into application and system health.

Must Have:

• Prior SRE/On-call Experience: A mandatory background in SRE or production support roles, with a demonstrated ability to manage high-pressure on-call rotations and run production services.
• Data Systems Proficiency:
  • Message Queuing: Experience managing brokers (e.g., Kafka) and topics, and troubleshooting throughput issues.
  • Relational & Non-Relational Databases: Proficiency in managing database health, query optimization, and high-availability configurations.
  • Data Warehouse: Experience in managing large-scale data warehouse performance and resource allocation.
• Systems Engineering: Strong competency in Linux internals and networking protocols.
• Regional Alignment: Must be based in and able to operate effectively within South American time zones to facilitate synchronized operations.

Preferred Skills:

• Analytical Rigor: The ability to diagnose root causes in complex, interconnected systems rather than applying superficial fixes.
• Communication: Exceptional technical documentation skills and the ability to provide concise, professional updates during active incidents.
• Dedication: A steadfast commitment to system uptime and a proactive approach to identifying potential points of failure before they impact the user experience.

Education:

• Bachelor’s degree in Technology, Computing, or a related field

Job Types: Full-time, Contract

Pay: $35,000.00 - $48,000.00 per year

Benefits:

• Dental insurance
• Flexible schedule
• Health insurance
• Paid time off
• Vision insurance

Application Question(s):

• Do you have previous on-call experience?
• Are you located in South America?

Work Location: Remote

Apply to this job
