The Senior Big Data Engineer manages the uninterrupted flow of information by designing and maintaining data pipelines that deliver data across our organization. S/he builds the automated data pipelines that ingest and prepare data to meet the reporting and analytics needs of the organization. This includes building and maintaining the data structures and architectures for data ingestion, processing, and deployment for large-scale, data-intensive applications. This individual must ensure that optimal ETL (Extract, Transform, and Load) solutions are developed by applying best practices to data modeling, code development, and automation.
Essential Job Functions
As part of an agile team, design, develop, and maintain an optimal data pipeline architecture using both structured data sources and big data for both on-premises and cloud-based environments.
Develop and automate ETL code using scripting languages, ETL tools, and job scheduling software to support all reporting and analytical data needs.
Design and build dimensional data models to support the data warehouse initiatives.
Assemble large, complex data sets that meet the analytical needs of the data scientist teams.
Assess new data sources to better understand availability and quality of data.
Identify, design, and implement internal process improvements: automating manual processes, optimizing data pipeline performance, re-designing infrastructure for greater scalability and access to information.
Participate in requirements gathering sessions to distill technical requirements from business requests.
Collaborate with business partners to productionize, optimize, and scale enterprise analytics.
Collaborate with data architects and modelers on data store designs and best practices.
Provide off-hours support for all developed data pipelines in an on-call rotation.
Bachelor's degree in Computer Science, Engineering, Information Science, Math or related discipline
Data engineering, data management or cloud certification is a plus
At least six (6) to eight (8) years of experience in a data engineering role or related specialty with demonstrated ability in data modeling
At least two (2) years of data engineering experience on Microsoft Azure, Amazon Web Services (AWS), or Snowflake
Experience using Extract, Transform, and Load (ETL) tools with Informatica (IICS) to build automated data pipelines
Experience with object-oriented/object function scripting languages: Python, Java, C++
Thorough understanding of relational, columnar and NoSQL database architectures and industry best practices for development
Understanding of dimensional data modeling for designing and building data warehouses
Excellent advanced SQL coding and performance tuning skills
Experience with big data tools: Hadoop, Spark, Kafka, etc.
Experience with parsing data formats such as XML/JSON and leveraging external APIs
Understanding of agile development methodologies
Ability to work in a team-oriented, collaborative environment; good interpersonal skills
Strong analytical and problem-solving skills; ability to weigh various suggested technical solutions against the original business needs and choose the most cost-effective solution
Keen attention to detail and ability to assess the impact of design changes prior to implementation
Self-driven and highly motivated, with the ability to learn quickly
Ability to effectively prioritize and execute tasks in a high-pressure environment
Strong customer service orientation
Ability to present and explain technical information to diverse types of audiences in a way that establishes rapport and gains understanding
Work experience with geospatial data and spatial analytics is preferred