Design a GPU Scheduling Platform
Asked by: OpenAI · Anthropic
"Design a GPU scheduling platform" is asked at OpenAI and Anthropic and gets at the resource problem every AI lab lives with: GPUs are scarce and astronomically expensive, and dozens of teams are fighting over them. The scheduler is the system that decides who gets which accelerators, and when. It's a combined constrained-optimization and distributed-systems problem that fits the standard interview framework, so we'll use it.
The crux (spend ~60% of your time here). This is a bin-packing problem under three constraints that fight each other — utilization, locality (topology), and gang atomicity — wrapped in fair-share + preemption. The depth is not in the API or the queue; it's in placement (NVLink/fabric-aware), gang admission without deadlock, and preemption that cooperates with checkpointing. Frame the whole problem as "maximize GPU utilization subject to fairness, locality, and all-or-nothing gangs," name the tension between those, and go deep there.
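To make the interaction between locality and gang atomicity concrete, here is a minimal sketch of all-or-nothing placement. Everything in it is an assumption for illustration: the `Node` type, the `domain` label (standing in for an NVLink or fabric island), and the `place_gang` function are hypothetical names, and the best-fit heuristic is just one reasonable choice, not a prescribed algorithm. It omits fair-share and preemption entirely.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Node:
    name: str
    domain: str      # hypothetical label for an NVLink/fabric island
    free_gpus: int


def _pack(nodes: List[Node], workers: int, gpus_per_worker: int) -> Optional[Dict[str, int]]:
    """Plan workers onto nodes without mutating them; None if they don't all fit."""
    plan: Dict[str, int] = {}
    remaining = workers
    # Best-fit: fill the tightest nodes first so large holes stay available.
    for node in sorted(nodes, key=lambda n: n.free_gpus):
        fit = min(remaining, node.free_gpus // gpus_per_worker)
        if fit > 0:
            plan[node.name] = fit
            remaining -= fit
        if remaining == 0:
            return plan
    return None


def place_gang(nodes: List[Node], workers: int, gpus_per_worker: int) -> Optional[Dict[str, int]]:
    """Admit the whole gang or nothing. Returns {node name: worker count} or None."""
    domains: Dict[str, List[Node]] = {}
    for n in nodes:
        domains.setdefault(n.domain, []).append(n)

    # Locality: prefer a single domain, choosing the one with the least spare capacity.
    candidates = []
    for name, members in domains.items():
        plan = _pack(members, workers, gpus_per_worker)
        if plan is not None:
            candidates.append((sum(m.free_gpus for m in members), name, plan))

    if candidates:
        plan = min(candidates, key=lambda c: (c[0], c[1]))[2]
    else:
        # Fallback: span domains (slower interconnect) rather than reject outright.
        plan = _pack(nodes, workers, gpus_per_worker)

    if plan is None:
        return None  # Gang atomicity: no partial reservation is ever committed.

    # Commit only after the full plan exists, so a failed gang holds no GPUs.
    for n in nodes:
        n.free_gpus -= plan.get(n.name, 0) * gpus_per_worker
    return plan
```

Because resources are deducted only after a complete plan is found, two half-placed gangs can never hold GPUs while waiting on each other, which is the deadlock the all-or-nothing rule is meant to prevent. A real scheduler would perform the plan-and-commit step under a lock or as a transaction against shared cluster state.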