The Cleveland Guardians are seeking a data engineer to join the organization’s Baseball Systems team. As a baseball team, we generate and store data from a plethora of data sources (e.g. FieldVision), and use this data to answer a variety of questions that include “what trades should we make,” “who should we select with our next pick in the draft,” and “how can we show players their data from yesterday’s game?” To answer those questions, this position will closely collaborate not only with its Baseball Systems teammates, but also with the organization’s R&D department and all types of end-users – Front Office, Coaches, Scouts, and Players – to support their needs.

The ideal candidate will possess a solid foundation in data or computer science, along with the ability to effectively work in a collaborative, cross-functional environment. The position offers the opportunity to craft innovative solutions to challenging problems, grow from both an engineering and leadership standpoint, and work with teammates side by side in pursuit of the organization’s ultimate mission – winning the World Series. We are open to a remote role for the right candidate, but relocation to Cleveland, OH is preferred.

We know that people from historically marginalized groups and those who have not yet had direct experience in the sports industry are less likely to apply for a job unless they meet every requirement. With that in mind, we encourage anyone who meets some of the qualifications below to apply or reach out for more information.

Essential Duties & Responsibilities

  • Build robust data systems that improve the backbone of our data-first applications
  • Transform and load data from both internal and external sources into our central data warehouse
  • Collaborate with the R&D team to help resolve challenges with new or existing statistical/machine learning models, and move those models into production
  • Work together with software engineers and data scientists to determine technical requirements, and then turn those requirements into accessible and secure data endpoints (e.g. direct SQL, BI tools, REST)
  • In collaboration with our Infrastructure teammates, troubleshoot and improve performance and query costs in both cloud and on-premises environments
  • Be an active participant in identifying, evolving, and evangelizing data engineering best practices, constantly challenging the status quo and improving our data engineering standards

Qualifications

  • Demonstrated experience or degree in a field such as computer science or other STEM program
  • Passion for data quality and building optimized and intuitive data sets
  • Proficiency in at least one programming language (e.g. Python, C#, Java)
  • Comfortable with complex SQL

Preferred Experience

We are looking for a variety of skill sets. If you have demonstrated experience with any of the following, you may be who we are looking for to join our team.

  • Demonstrated ability to engineer efficient, adaptable, and scalable data pipelines/jobs to process structured and unstructured data, using languages and technologies such as Python, C#, Airflow, SSIS, dbt, etc.
  • Experience installing, maintaining, tuning, and developing in both relational databases and data warehousing systems like SQL Server, PostgreSQL, BigQuery, Snowflake, or Redshift
  • Knowledge of data modeling techniques such as normalization, denormalization, snowflake and star schemas, etc.
  • Understanding of the data lifecycle and concepts such as lineage, governance, retention, testing, etc.
  • Experience extracting reporting and analytics requirements from end users and turning those into intuitive and scalable endpoints via REST, direct SQL, BI, etc.
  • Experience building and maintaining data using distributed data systems concepts such as replication, change data capture, log shipping, etc.
  • Conceptual knowledge of cloud computing (preferably GCP), including concepts such as SaaS, PaaS, FaaS, IaaS, serverless, etc.
  • Familiarity with data science languages and statistical concepts such as R, Python, linear regression, etc.
  • Familiarity with DevOps and software development concepts such as CI/CD, containerization, IaC, shell scripting, versioning/branching, OOP, Agile, Kanban, etc.

Standard Requirements

  • Represents the Cleveland Guardians in a positive fashion to all business partners and the general public.
  • Ability to develop and maintain successful working relationships with members of the Front Office.
  • Ability to act according to the organizational values and service excellence at all times.
  • Ability to work with diverse populations and a demonstrated commitment to social justice.
  • Ability to work in a diverse and changing environment.

About Us

In Baseball Operations and Data Engineering, our shared goal is to identify and develop diverse players and front office teammates who contribute to our mission. By working together effectively and collaboratively, we create a family atmosphere that supports learning as we strive for excellence in everything we do. We believe that we will achieve our goals by making evidence-based decisions and creating environments that support our people and empower them to learn.