IEOR - Designing a More Efficient World

Data Engineer at Maestro Technologies, Inc.

Employer: Maestro Technologies, Inc.

Expires: 06/30/2020

What You’ll Do:

- Build & deploy large-scale ETL and stream processing pipelines in our serverless microservice infrastructure, built on industry-standard technology (Kubernetes & Kafka)
- Manage workflows in support of both product and our AI/Data Science pipeline; you’ll be introduced to our unique ingest and processing pipelines turning proprietary data assets into groundbreaking solutions
- Build stream ingestion processes to efficiently send, process, analyze & publish data
- Perform analyses of large structured and unstructured data to solve multiple complex business problems
- Investigate and prototype different task dependency frameworks to understand the most appropriate design for a given use case
- Work hand-in-hand with the data science team to understand the user and content trends that influence product changes and customer acquisition strategies

Who You Are:

- An engineer interested in working in both streaming and batch processing environments (Spark, Kafka Streaming, Kinesis)
- A tech enthusiast excited to work with cloud-based technologies (GCP, Azure, AWS)
- A doer who loves to produce meaningful analytic insights for an innovative, data-intensive product
- Always curious about analytics frameworks, and well-versed in the advantages and limitations of various big data architectures and technologies
- A believer in transparency & communication
- Skilled at coding for analytics and data engineering/manipulation (Scala, Java, and Python)
- Experienced with SQL & NoSQL database systems, S3, and distributed big data technologies including Hadoop and Spark; knowledge/awareness of orchestration systems like Airflow, NiFi, or Pentaho is a plus but not required

The Tools We Use:

Tools can be learned, so please don’t shy away from applying if you’re a strong engineer! To give you a flavor of our current tools:

- Languages: Scala/Java/Python
- Streaming: Spark Streaming, Kafka
- Cloud Technologies: GCP (BigQuery, Dataproc, Dataflow, Compute Engine), Azure (HDInsight, Data Factory)