KV Cache Calculator

Calculate memory savings for different quantization methods

Inputs

Model: 8B
Quantization Method: K4V3 (4-bit keys, 3-bit values)
Context Length: 32K tokens (selectable range: 4K–128K)
Batch Size: 1
Results

KV Cache Memory: 3.50 GB
FP16 Baseline: 16.00 GB
Memory Savings: 78.1%
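The savings figure follows directly from the ratio of quantized to FP16 bit widths. A minimal check, assuming the quantized cache scales linearly with the average bits per stored value (the variable names here are illustrative, not from the app):

```python
# Quantized KV cache size scales with avg bits per value relative to FP16 (16 bits).
fp16_gb = 16.00
avg_bits = 3.5                      # K4V3: keys at 4 bits, values at 3 bits
quant_gb = fp16_gb * avg_bits / 16  # 3.5 GB
savings = 1 - avg_bits / 16         # 0.78125, i.e. 78.1%
print(quant_gb, savings)
```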

Model Specs

Parameters: 8B
Layers: 32
Attention Heads: 32
Head Dim: 128
Avg Bits/Value: 3.5
Config: K4V3
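The specs above are enough to reproduce both numbers. A sketch of the standard KV cache sizing formula, assuming multi-head attention (no grouped-query sharing) and that `kv_cache_bytes` is a hypothetical helper, not the app's actual code:

```python
def kv_cache_bytes(layers, heads, head_dim, tokens, batch, bits_per_value):
    """Total KV cache size in bytes for a given context and precision."""
    # K and V each store layers * heads * head_dim values per token.
    values_per_token = 2 * layers * heads * head_dim
    return values_per_token * tokens * batch * bits_per_value / 8

GB = 1024 ** 3
fp16 = kv_cache_bytes(32, 32, 128, 32 * 1024, 1, 16)
k4v3 = kv_cache_bytes(32, 32, 128, 32 * 1024, 1, (4 + 3) / 2)  # K4V3 -> avg 3.5 bits

print(f"FP16 baseline: {fp16 / GB:.2f} GB")  # 16.00 GB
print(f"K4V3 cache:    {k4v3 / GB:.2f} GB")  # 3.50 GB
print(f"Savings:       {100 * (1 - k4v3 / fp16)}%")
```

Note that the calculator averages the K and V bit widths; with grouped-query attention the `heads` term would be the number of KV heads, which can shrink the cache further.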