[Remote] System Modeling (Performance Models)
Note: This is a remote position open to candidates in the USA. Unconventional AI is focused on redefining computing to address the energy limitations of AI on a global scale. The company is seeking a Member of Technical Staff for System Modeling (Performance Models) to develop physics-based system models and simulation frameworks for machine learning workloads in support of innovative AI acceleration architectures.
Responsibilities
- Build extensible, composable, high-fidelity power, performance, and area estimation tools for novel AI acceleration system architectures to enable rapid design space exploration
- Define and create comparative analyses across candidate architectures and existing state-of-the-art implementations
- Work with other teams to understand their modeling and simulation needs, supporting both high-level system design and lower-level hardware verification
Skills
- MS/PhD in a quantitative field (AI/ML, Computer Science, Physics, Electrical Engineering, Applied Math), or BS with substantial, clear evidence of equivalent research/engineering depth
- Experience with tools and development for power profiling, modeling and simulation for AI workloads
- Deep understanding of spatial architectures and data orchestration mechanisms
- Deep understanding of different dataflow strategies and their tradeoffs, e.g. Weight-Stationary (WS), Output-Stationary (OS), Input-Stationary (IS) and Row-Stationary (RS)
- Familiarity with open-source (OSS) tools for hardware accelerator design: Timeloop, Accelergy, NeuroSim, CiMLoop, CACTI, etc.
- Familiarity with existing systolic-array accelerator architectures for AI/ML workloads
- Solid understanding of modern AI/ML architectures and training/inference workflows
- Strong experience implementing and debugging ML models in PyTorch (preferred) or similar, with practical experience profiling, optimizing, and stabilizing non-trivial large-scale ML systems
- Basic familiarity with analog dynamic systems, including transient responses, feedback/stability, and nonidealities such as nonlinearity, quantization, and random noise
- Strong Python engineering skills: modular design, testing, packaging, CI
- Experience with PyTorch internals: autograd, custom modules, low-level ops; familiarity with torch.compile or similar graph capture/compile flows
- Experience with CUDA, Triton, or other GPU programming approaches (writing custom kernels, understanding memory hierarchy, basic performance tuning)
- Comfort with at least some of: JAX, NumPy, TensorFlow, Modal, HPC patterns (MPI, NCCL, distributed training), SciPy
- Demonstrated ability to reason across multiple layers of the stack: algorithm, software, runtime, hardware
- Ability to connect model architecture choices to system performance implications: memory bandwidth, communication patterns, latency, energy, and numerical issues
- Experience applying at least some efficiency techniques (quantization, sparsity, pruning, distillation, kernel fusion, etc.)
- Prior experience building or extending a serious simulation or modeling framework (could be ML systems, physics, circuits, or other technical domains)
- Comfort with approximations and tradeoffs: you know when to use a simple model and when you need something closer to the physics
Benefits
- Best-in-class health benefits
- 401k matching
- Truly unlimited PTO
- Complimentary meals when working from our Palo Alto office
Company Overview