[Remote] Data Engineer

100% Remote | Full-time | Open now

Note: This is a remote position open to candidates in the USA. Pavago is seeking a Data Engineer to design, build, and maintain scalable data infrastructure and reliable data pipelines for analytics and operational decision-making. The role involves ensuring seamless data flow from source systems into warehouses while maintaining high standards for quality and governance.

Responsibilities

  • Build, maintain, and optimize ETL/ELT pipelines using Python, SQL, or Scala
  • Orchestrate workflows using Airflow, Prefect, Dagster, or similar orchestration tools
  • Ingest structured and unstructured data from APIs, SaaS platforms, databases, files, and streaming systems
  • Develop scalable connectors and automated ingestion workflows
  • Manage and optimize cloud data warehouses such as Snowflake, BigQuery, or Redshift
  • Design scalable schemas using star and snowflake modeling techniques
  • Implement partitioning, clustering, indexing, and performance optimization strategies
  • Build clean, analytics-ready datasets for business intelligence and reporting use cases
  • Implement validation checks, anomaly detection, logging, and monitoring to ensure data integrity
  • Enforce naming conventions, lineage tracking, and documentation standards using tools such as dbt or Great Expectations
  • Maintain audit-ready data processes and ensure compliance with GDPR, HIPAA, or industry-specific requirements
  • Monitor pipeline health and proactively resolve failures or inconsistencies
  • Build and manage real-time data pipelines using Kafka, Kinesis, Pub/Sub, or similar platforms
  • Support low-latency ingestion and event-driven architectures for time-sensitive applications
  • Monitor streaming infrastructure and optimize throughput and reliability
  • Partner closely with analysts, data scientists, and business stakeholders to deliver reliable datasets
  • Support dashboard and reporting initiatives across Tableau, Looker, or Power BI
  • Translate business requirements into scalable data solutions and models
  • Maintain clear technical documentation for pipelines, schemas, and workflows
  • Containerize data services using Docker and manage deployments through Kubernetes when applicable
  • Automate deployments using CI/CD pipelines such as GitHub Actions, Jenkins, or GitLab CI
  • Manage cloud infrastructure using Terraform, CloudFormation, or similar Infrastructure-as-Code tools
  • Continuously optimize performance, scalability, reliability, and cloud costs

Skills

  • 3+ years of experience in Data Engineering, Back-End Engineering, or Data Infrastructure roles
  • Strong proficiency in Python and SQL
  • Experience with at least one modern data warehouse (Snowflake, Redshift, BigQuery)
  • Hands-on experience with orchestration tools such as Airflow or Prefect
  • Strong understanding of ETL/ELT pipelines, data modeling, and data transformation workflows
  • Familiarity with cloud platforms such as AWS, GCP, or Azure
  • Experience with dbt for data modeling and transformation management
  • Streaming and event-driven data pipeline experience (Kafka, Kinesis, Pub/Sub)
  • Experience with cloud-native data services such as AWS Glue, GCP Dataflow, or Azure Data Factory
  • Familiarity with Docker, Kubernetes, Terraform, or CI/CD workflows
  • Background in regulated industries such as healthcare, fintech, or enterprise SaaS
  • Experience optimizing warehouse costs and query performance at scale

Company Overview

  • Pavago - Thinking Globally to Grow Locally 🌍 Welcome to Pavago, where the world is your talent pool. Founded in 2022, the company is headquartered in Meridian, Idaho, US, and has 11-50 employees. Website: https://pavago.co.