Problem Solving, DS Technical Skills
Design an inference batching system for a single GPU that can handle up to 100 inputs per batch while users wait synchronously, maximizing hardware utilization under strict compute constraints.
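One common way to approach this is dynamic batching: requests go into a queue, and a single worker thread gathers them until the batch reaches the 100-input cap or a short wait deadline expires, then runs one GPU call and hands each result back to its waiting caller. The sketch below is a minimal illustration under stated assumptions, not a reference answer. The names `Batcher`, `collect_batch`, and `model_fn`, and the 10 ms wait budget, are hypothetical and do not come from the question; `model_fn` stands in for whatever batched forward pass the real system would run.

```python
import queue
import threading
import time
from concurrent.futures import Future

MAX_BATCH = 100     # from the question: up to 100 inputs per batch
MAX_WAIT_S = 0.01   # assumed latency budget for filling a batch (hypothetical)


def collect_batch(q, max_batch=MAX_BATCH, max_wait_s=MAX_WAIT_S):
    """Block for one request, then gather more until full or the deadline passes."""
    batch = [q.get()]  # wait as long as needed for the first request
    deadline = time.monotonic() + max_wait_s
    while len(batch) < max_batch:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        try:
            batch.append(q.get(timeout=remaining))
        except queue.Empty:
            break
    return batch


class Batcher:
    """Groups synchronous requests into batches for a single model worker."""

    def __init__(self, model_fn, max_batch=MAX_BATCH, max_wait_s=MAX_WAIT_S):
        self._model_fn = model_fn  # assumed: takes a list of inputs, returns a list of outputs
        self._max_batch = max_batch
        self._max_wait_s = max_wait_s
        self._q = queue.Queue()
        threading.Thread(target=self._loop, daemon=True).start()

    def submit(self, x):
        """Enqueue one input and return a Future for its result."""
        fut = Future()
        self._q.put((x, fut))
        return fut

    def infer(self, x):
        """Synchronous call: the user blocks until their result is ready."""
        return self.submit(x).result()

    def _loop(self):
        # Single consumer, so only one batch occupies the GPU at a time.
        while True:
            batch = collect_batch(self._q, self._max_batch, self._max_wait_s)
            inputs = [x for x, _ in batch]
            futures = [f for _, f in batch]
            try:
                outputs = self._model_fn(inputs)
                for f, y in zip(futures, outputs):
                    f.set_result(y)
            except Exception as exc:
                # Propagate the failure to every caller in the batch.
                for f in futures:
                    f.set_exception(exc)
```

The wait deadline is the main tuning knob in this sketch: a longer wait tends to fill batches and raise utilization, while a shorter one reduces the latency each synchronous caller sees under light load.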
This interview question has been asked of Data Scientists interviewing at Google, DataRobot, Zomato, and other companies.