[Remote] DevOps Engineer - Atlanta, GA, Birmingham, AL, Louisville, KY, Richmond, VA, Charlotte, NC

100% Remote Full-time Open now

Note: This is a remote job open to candidates in the USA. Dice is seeking an experienced Site Reliability Engineer (SRE) / DevOps Engineer with expertise in incident management and cloud-native platforms. The role involves ensuring the reliability and performance of distributed systems, managing incident response, and implementing automation and governance strategies.

Responsibilities

  • Manage and improve platform reliability, availability, and performance across production environments
  • Lead and participate in incident management, root cause analysis, remediation planning, and post-incident reviews
  • Drive change control processes and ensure operational governance standards are followed
  • Monitor and manage error budgets while implementing reliability improvements
  • Design, build, and maintain scalable cloud infrastructure and automation frameworks
  • Deploy and manage containerized applications using Kubernetes and Docker
  • Develop and maintain CI/CD pipelines to support efficient software delivery
  • Implement Infrastructure as Code (IaC) solutions for automated provisioning and configuration management
  • Establish observability strategies using monitoring, logging, and alerting platforms
  • Collaborate with development, infrastructure, security, and business teams to ensure platform stability
  • Troubleshoot complex production issues across cloud, networking, infrastructure, and application layers
  • Continuously improve operational processes, automation, and system resilience

Skills

  • 7+ years of experience in Site Reliability Engineering (SRE), DevOps, Cloud Infrastructure, or Production Operations
  • Strong experience managing workloads in cloud environments: Microsoft Azure, Amazon Web Services (AWS), Google Cloud Platform (GCP)
  • Hands-on experience with: Kubernetes, Docker, CI/CD Pipelines, Infrastructure as Code (IaC)
  • Strong scripting and automation expertise using: Python, Bash, PowerShell, Go (Golang)
  • Experience with observability and monitoring platforms: Datadog, Grafana, Prometheus, Splunk
  • Strong understanding of: Networking concepts, Linux Administration, Windows Administration, Distributed Systems, Cloud-Native Architectures
  • Experience with: Incident Response, Production Troubleshooting, Operational Governance
  • Experience implementing reliability engineering best practices and SRE methodologies
  • Experience supporting large-scale enterprise production environments
  • Familiarity with high-availability and disaster recovery architectures
  • Experience automating operational workflows and infrastructure management
  • Knowledge of security best practices within cloud environments
  • Experience working in Agile and DevOps-driven organizations

Company Overview

  • Dice is a job-search platform for technology professionals and a sub-organization of DHI Group. Founded in 1990, it is headquartered in Santa Clara, California, USA, and has 201-500 employees. Its website is http://www.dice.com.