
Build and Maintain Data Pipelines

  • Assist in developing data pipelines that follow a producer/consumer pattern to move data into and out of the data lake and warehouse, using technologies including Apache NiFi, Apache Kafka, Matillion, and Python.

Structure Data and Queries for Performance

  • You will work with senior members of the Data Engineering team, partnering with data analysts, data scientists, and business intelligence developers to model and sanitize data for algorithm development, reporting, performance, and cost management.

Build and Maintain the Data Dictionary

  • You will learn the data dictionary and help document, extend, and maintain it while growing your domain expertise. You will partner with Program and Product Managers and Engineering teams to understand and document data for reporting and compliance needs.

Qualifications:

  • Pursuing a Bachelor’s or Master’s degree in Engineering, Computer Science, Data Science, or a related field
  • Familiarity with basic tools and concepts in the data warehouse space and with writing and analyzing SQL statements
  • Familiar with software development phases including design, implementation, testing, and maintenance
  • Experience with Python or other modern scripting languages (preferred)
  • Experience leveraging APIs, databases, and streams as data producers (preferred)
  • A healthy hunger for learning and growing

More about our Internship Program:

  • Duration: 12 weeks during the Summer of 2022 (can be flexible if needed)
  • Location: This opportunity can be remote or based near one of the areas where some of our key leaders reside: Seattle, Denver, Dallas, Oklahoma City, or our headquarters in Ada, OK

Physical Demands:

  • Prolonged periods of sitting at a desk
  • Extensive periods of working on a computer