ML System Bottleneck Analyzer

Model Configuration

Devices

Resource Utilization

System Analysis (Token rates are approximations)

Real-world results are below for reference

Model | Quantization | Framework | Hardware | Batch Size | Sequence Length | Token Rate (Batch) | Token Rate (Single) | Source