Neural Processor Sr. Compiler Engineer
Are you passionate about exploring deep learning algorithms? Are you motivated to push the limits of deep neural network capabilities for vision and speech-related products? If so, Samsung’s NPL architecture group is looking to add you to our team.
Samsung’s Neural Processor Lab (NPL) is working on the next-generation architecture for the Neural Processor Unit (NPU), to be used in handsets, Advanced Driver Assistance Systems (ADAS), and next-generation Samsung mobile devices.
- Architect and develop the compiler for Samsung's proprietary Neural Processor architecture, enabling inference of deep learning networks on this architecture with an emphasis on performance and power.
- Manage performance trade-offs; understand the balance between performance, memory, and power in compiler-generated code
- Develop a deep learning compiler stack that interfaces with frameworks such as TensorFlow, PyTorch, and Keras, and converts DNNs and Transformer networks into internal representations suitable for optimization.
- Develop new optimization techniques and algorithms to efficiently map DNNs onto Samsung NPU processors
- Devise multiprocessor/multicore partitioning and scheduling strategies
- Develop complex programs to validate the functionality and performance of the CNN application programming kit
- MS with 6+ years of experience, or PhD
- Solid programming skills in C/C++, Python, or equivalent
- Solid scientific computing skills
- Solid Linux and embedded Linux programming skills
- Familiarity with machine learning, deep learning, TensorFlow, Caffe, and PyTorch
- Solid knowledge of, and ability to write, parsers, translators, and code generators
- Solid knowledge of optimizing and accelerating code
- Excellent communication and presentation skills
- Available to travel internationally 5% or more
- Strong ability to debug software, prototypes, algorithms, and experiments
- Must be a quick learner with excellent problem-solving skills
- Must be motivated and able to work without supervision
- Familiarity with computer vision algorithms such as object detection, tracking, and recognition.
- Prior work with CNNs and familiarity with deep learning frameworks (TensorFlow, PyTorch, etc.) is a strong plus.
- Familiarity with the state-of-the-art deep learning compilation approaches is a huge plus: XLA, TVM, Glow, ONNX, Tensor Comprehensions, etc.
- Knowledge of neural net exchange formats (NNVM, NNEF) is a bonus
- Experience with application-focused hardware acceleration technologies, such as GPU acceleration with CUDA or OpenCL, or FPGA acceleration with OpenCL or CAPI
- Strong knowledge of resource management, scheduling, code generation, and compute graph optimization
- Proficiency in hardware definition/architecture collaboration and hardware/software integration
- Computer science fundamentals in object-oriented design, data structures and algorithm design, complexity analysis, scalability, and availability
Samsung Semiconductor, Inc. (SSI), an equal opportunity employer, is a world leader in Memory, System LSI, and LCD technologies. Headquartered in San Jose, California, SSI is a wholly-owned U.S. subsidiary of Samsung Electronics Co., Ltd., the second largest semiconductor manufacturer in the world and the industry's volume and technology leader in DRAM, NAND Flash, SSDs, mobile DRAM, and graphics memory. It is one of the largest providers of system logic, imaging, and LED lighting solutions, and it provides advanced process design and manufacturing for fabless companies. Samsung Semiconductor, Inc. also has a research and innovation center with numerous labs providing product design and research in logic, memory, image sensors, displays, and mobile technologies. In addition, the company supports Samsung Display Company, the largest producer of LCD and OLED displays.