Senior Data Engineer
Earnest Analytics is a VC-backed data innovation startup driven to change the way professionals understand consumer and business behavior. Working with world-class data partners, we transform raw data into a source for business and investment professionals to ask better questions so they can make better decisions. We believe that, in the right hands, data has the power to change the way we work.
Earnest Analytics is seeking a Senior Data Engineer to join our Datasets Team. The Datasets Team is responsible for the ingestion, transformation, and productization of all our datasets. As a Senior Data Engineer on our Datasets Team, you will be instrumental in creating the next generation of Earnest’s products and play a leading role in building our internal and client-facing data pipelines, infrastructure, and tooling. This is a chance to work on a cross-functional team with modern managed cloud services, functional programming, and lots of data. The work we do will directly drive the next level of growth for Earnest.
- We work in a DevOps environment where teams own their code and infrastructure in Production.
- Some of the technologies we use across Earnest include but are not limited to Google Cloud Platform, Scala, Python, DBT, BigQuery, Docker, Airflow, and Kubernetes.
- We prioritize choosing the right tools for our business requirements while keeping a strong emphasis on maintainability and readability.
- We also want to hear your ideas on the latest and greatest technologies we could use!
- Collaborate with product owners and data analysts in the development and delivery of new product features across a multitude of datasets
- Build and maintain integrated data pipelines, systems, and internal tooling in functional Scala, Python, SQL, and Go to power the company’s products
- Define ETL/ELT logic for processing terabytes of raw data, including writing BigQuery SQL (via DBT), building Dataflow (Apache Beam) jobs, and orchestrating Airflow tasks.
- Ensure high data quality and pipeline stability by maintaining a high-quality code base with high test coverage
- Work with the engineering organization to build Earnest’s data platform, in particular interfacing with our data science group
- Assist analysts with troubleshooting data issues and leverage technology to increase their productivity
- Maintain a fleet of web scrapers which power Earnest Web products
- 5+ years of experience processing large amounts of structured and semi-structured data
- Programming experience in Scala/Java, Python, SQL
- 2+ years writing and maintaining ETL at a terabyte-level scale
- 1+ years of experience working with Hadoop applications (Spark/Scalding) or Dataflow (Apache Beam)
- Experience with version control systems (Git) and CI/CD practices
- Substantial SQL and data modeling experience, particularly focussed on efficient transformations
- Industrious and conscientious with the ability to work both independently and in a collaborative environment
- Effective interpersonal, written and verbal communication with engineers and non-engineers
- Knowledge of cloud computing, including Google Cloud Platform (GCP), especially BigQuery and Dataflow
- Knowledge of IaC / Terraform
- Code-based data transformation orchestration scheduling with Apache Airflow or similar
- Scala experience, either with microservices or distributed big data transformation tools like Spark/Scalding
- Experience with Docker containerization and Kubernetes
- Experience with cross-timezone code reviews and CI/CD toolchains
- Experience with infrastructure as code and tools like Terraform
- Knowledge of statistics and analytics
- Data warehouse modeling experience
- Experience with or willingness to learn functional programming paradigms
- Experience with unit testing, property checking, and type-driven development
- Experience automating data quality checks through Data Build Tool (DBT), Great Expectations or other company tools
Benefits & Perks
- A strong tech community with training and support to develop your skills
- Ability to make an immediate impact on our products
- Input into the architectural design and technologies used in our platform
- Distributed environment, flexible working arrangements, competitive salary, generous annual leave, health insurance for you and your family, and a Health & Fitness Reimbursement Program