Employer: Cerebras Systems
Cerebras is developing a radically new chip and system to dramatically accelerate deep learning applications. Our system runs training and inference workloads orders of magnitude faster than contemporary machines, fundamentally changing the way ML researchers work and pursue AI innovation.

We are innovating at every level of the stack – from chip, to microcode, to power delivery and cooling, to new algorithms and network architectures at the cutting edge of ML research. Our fully-integrated system delivers unprecedented performance because it is built from the ground up for the deep learning workload.

Cerebras is building a team of exceptional people to work together on big problems. Join us!

The Team

As a Kernel Software Engineer on our team, you will work with leaders from industry and academia at the intersection of hardware and software, to develop state-of-the-art solutions for emerging problems in AI compute.

Our team of kernel developers is responsible for the design, implementation, and performance tuning of deep learning operations on highly parallel custom processors. We are developing parallel and distributed algorithms to maximize hardware utilization and accelerate the training of deep neural networks to unprecedented speeds.

The Role

We’re looking for an engineer to design and implement optimized kernels for primitive operations used by state-of-the-art neural network architectures. You should apply if you are an engineer familiar with parallel and distributed architectures who can map various workloads to our high-performance hardware. The role involves a mix of algorithm design, kernel implementation, and performance tuning. In particular, we are looking for candidates comfortable with performance analysis of parallel algorithms and low-level software optimization.
You will also be responsible for understanding the latest deep learning algorithms in order to design kernel implementations.

Skills & Qualifications

- Bachelor’s / Master’s degree or foreign equivalent in Computer Science, Engineering, or related field
- Familiarity with parallel algorithms and distributed memory systems
- Ability to read and write code using C and Python
- Experience with assembly-level programming and optimization
- Understanding of hardware architecture concepts; you should be comfortable learning the details of a new hardware architecture
- Deep learning algorithm experience is a plus

Location

Our cozy and well-appointed headquarters are in the heart of Silicon Valley near downtown Los Altos, California.