Data Engineer - Databricks (Mid Level) - US Citizens Only

100% Remote Full-time Open now

Project requirements mandate that this role is open only to US citizens. An IRS MBI clearance is a plus; an active Secret or Top Secret clearance is also a plus. All candidates must go through the clearance process before starting on the project (no exceptions to this requirement).

Job Description

  • Infobahn Solutions is hiring Databricks Data Engineering professionals in the Washington DC Metro Area for a US Government Federal Project with the Department of the Treasury.
  • The Data Engineers will be part of a Data Migration & Conversion Team on a large data lake being implemented on AWS GovCloud.
  • Data will be migrated from on-premises mainframe/legacy database systems using Informatica PowerCenter to the AWS Landing Zone on S3.
  • Further conversion will be done using Databricks (PySpark) in AWS.
  • The Data Engineer should have prior data migration experience and understand the intricacies of developing data integration routines that move data from multiple source systems to a new target system with a different data model.
  • The Data Engineer should have experience in converting Oracle PL/SQL and/or Greenplum code to Databricks.
  • Must have: experience with data migration and conversion using Databricks.
  • Experience using Databricks on AWS and managing a Databricks production system is critical and a must-have for the project.

What you’ll be doing:

  • Databricks Environment Setup: Configure and maintain Databricks clusters, ensuring optimal performance and scalability for big data processing and analytics.
  • ETL (Extract, Transform, Load): Design and implement ETL processes using Databricks notebooks or jobs to process and transform raw data into a usable format for analysis.
  • Data Lake Integration: Work with data lakes and data storage systems to efficiently manage and access large datasets within the Databricks environment.
  • Data Processing and Analysis: Develop and optimize Spark jobs for data processing, analysis, and machine learning tasks using Databricks notebooks.
  • Collaboration: Collaborate with data scientists, data engineers, and other stakeholders to understand business requirements and implement solutions.
  • Performance Tuning: Identify and address performance bottlenecks in Databricks jobs and clusters to optimize data processing speed and resource utilization.
  • Security and Compliance: Implement and enforce security measures to protect sensitive data within the Databricks environment, ensuring compliance with relevant regulations.
  • Documentation: Maintain documentation for Databricks workflows, configurations, and best practices to facilitate knowledge sharing and team collaboration.

Skills:

  • Apache Spark: Strong expertise in Apache Spark, which is the underlying distributed computing engine in Databricks.
  • Databricks Platform: In-depth knowledge of the Databricks platform, including its features, architecture, and administration.
  • Programming Languages: Proficiency in languages such as Python or Scala for developing Spark applications within Databricks.
  • SQL: Strong SQL skills for data manipulation, querying, and analysis within Databricks notebooks.
  • ETL Tools: Experience with ETL tools and frameworks for efficient data processing and transformation.
  • Data Lake and Storage: Familiarity with data lakes and storage systems, such as Delta Lake, AWS S3, or Azure Data Lake Storage.
  • Collaboration and Communication: Effective communication and collaboration skills to work with cross-functional teams and stakeholders.
  • Problem Solving: Strong problem-solving skills to troubleshoot issues and optimize Databricks workflows.
  • Version Control: Experience with version control systems (e.g., Git) for managing and tracking changes to Databricks notebooks and code.

Role Requirements:

  • Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field
  • 7-8+ years of development experience with ETL tools (4+ years of Databricks is a must-have)
  • 5+ years of experience as a Databricks Engineer or similar role.
  • Strong expertise in Apache Spark and hands-on experience with Databricks.
  • More than 7 years of experience performing data reconciliation, data validation, and ETL testing; deploying ETL packages; automating ETL jobs; and developing reconciliation reports.
  • Working knowledge of message-oriented middleware/streaming data technologies such as Kafka, Confluent
  • Proficiency in programming languages such as Python or Scala for developing Spark applications.
  • Solid understanding of ETL processes and data modeling concepts.
  • Experience with data lakes and storage systems, such as Delta Lake, AWS S3, or Azure Data Lake Storage.
  • Strong SQL skills for data manipulation and analysis.
  • Good experience with shell scripting and AutoSys
  • Strong Data Modeling Skills
  • Strong analytical skills applied to business software solutions maintenance and/or development
  • Must be able to work with a team to write code, review code, and work on system operations.
  • Past project experience with Data Conversion and Data Migration
  • Communicate analysis, results and ideas to key decision makers including business and technical stakeholders.
  • Experience in developing and deploying data ingestion, processing, and distribution systems with AWS technologies
  • Experience with using AWS datastores, including RDS Postgres, S3, or DynamoDB
  • DevOps experience using Git to develop and deploy code to production
  • Proficient in using AWS Cloud Services for Data Engineering tasks
  • Proficient in programming in Python/shell or other scripting languages for the purpose of data movement
  • Eligible for a US Government issued IRS MBI (candidates with active IRS MBIs will be preferred)
  • Databricks industry certifications - Associate / Professional Level

Preferred Qualifications

  • Cloud Data Migration and Conversion projects
  • Experience on AWS

Job Types: Full-time, Contract

Pay: $90,000.00 - $130,000.00 per year

Benefits:

  • Dental insurance
  • Flexible schedule
  • Health insurance
  • Life insurance
  • Paid time off
  • Vision insurance

Education:

  • Bachelor's (Preferred)

License/Certification:

  • Databricks Certified Data Engineer Professional (Required)

Security clearance:

  • Secret (Preferred)

Work Location: Remote
