As a Senior Software Engineer in the AI Processor Software & Hardware Co-design Lab, you will be responsible for designing and implementing both compile-time and run-time optimizations to enable real-time AI applications on Huawei AI processors. You will collaborate closely with cross-functional teams to integrate and deploy AI solutions on the Ascend platform, leveraging your expertise to shape the performance, functionality, and efficiency of our AI models and systems.
Required:
Extensive experience in optimizing AI chip architectures and AI systems, familiarity with mainstream heterogeneous computing software and hardware architectures in the industry, and end-to-end capabilities spanning applications, basic software, and chips.
Hands-on experience with one of the following technologies: Numerical Computation, Compilation, Algorithm & Chip Co-design, Runtime, Shared Memory.
Knowledge of AI industry application scenarios, familiarity with mainstream models and algorithm development trends, and the ability to derive requirements for the chip layer from them.
Experience in analyzing workload sensitivity to micro-architecture features, evaluating performance trade-offs, and recommending improvements to both micro-architecture and application software for optimal efficiency.
Familiarity with the performance impact of different compute, memory, and communication configurations, as well as hardware and software implementation choices, on AI acceleration.
Experience with GPU compute APIs such as CUDA or OpenCL, and the ability to utilize GPU/NPU-optimized libraries to enhance performance.
Experience in the development of deep learning frameworks, compilers, or system software.
Strong background in compilers and optimization techniques; experience with LLVM/MLIR is a plus, but not required.
Experience in software development using C/C++ and Python.
Desired:
Relevant experience in several of the following sub-fields: AI application algorithms, frameworks, runtimes, modelling and simulation, and compilers.
In-depth understanding of the innovative methods, platforms, and tools of leading AI vendors, and experience in turning application and academic research results into commercial products.
Experience with GPU acceleration using AMD or NVIDIA GPUs.
Experience in developing inference backends and compilers for GPU or NPU.
Experience with AI/ML inference frameworks such as ONNX Runtime, IREE, or TVM.
Experience with deploying AI models in production environments.