WEBLLMFIT(1)
NAME
webllmfit - estimate which WebLLM models can run on this machine
SYNOPSIS
webllmfit
webllmfit --all
DESCRIPTION
webllmfit is inspired by llmfit.org, but it works from the real WebLLM catalogue. It probes this machine's WebGPU capabilities (adapter, shader-f16 support, maximum buffer and storage-buffer sizes) and approximate system memory, then checks each model's own vram_required_MB and required_features against those results to label the model FIT, TIGHT or NO.
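The probing step can be pictured with the sketch below. It is not webllmfit's own code: summarizeAdapter, AdapterLike and GpuProbe are hypothetical names introduced here, and only the fields read from the adapter (features.has, limits.maxBufferSize, limits.maxStorageBufferBindingSize) come from the WebGPU API.

```typescript
// Structural stand-in for the parts of a WebGPU GPUAdapter this sketch reads.
interface AdapterLike {
  features: { has(name: string): boolean };
  limits: { maxBufferSize: number; maxStorageBufferBindingSize: number };
}

// Hypothetical result shape; the real tool may record more or different fields.
interface GpuProbe {
  hasShaderF16: boolean;
  maxBufferMB: number;
  maxStorageBufferMB: number;
}

// Reduce an adapter to the capabilities named in the DESCRIPTION.
function summarizeAdapter(adapter: AdapterLike): GpuProbe {
  const toMB = (bytes: number): number => Math.floor(bytes / (1024 * 1024));
  return {
    hasShaderF16: adapter.features.has("shader-f16"),
    maxBufferMB: toMB(adapter.limits.maxBufferSize),
    maxStorageBufferMB: toMB(adapter.limits.maxStorageBufferBindingSize),
  };
}
```

In a browser the adapter would come from await navigator.gpu.requestAdapter(), which can resolve to null when WebGPU is unavailable; that case has to be handled before calling the helper.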
For each model it keeps the build that best matches the GPU (q4f16 when shader-f16 is available, otherwise q4f32) and de-duplicates the rest. The list is sorted so the largest model you can run comes first.
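A minimal sketch of that selection step follows, under stated assumptions. The field names model_id, vram_required_MB and required_features follow the WebLLM catalogue; familyOf, pickBuilds and the rule for grouping builds by the id prefix before the quantization tag are hypothetical, and the sketch sorts by memory requirement only, without taking the verdict into account.

```typescript
interface ModelRecord {
  model_id: string;
  vram_required_MB: number;
  required_features?: string[];
}

// ASSUMPTION: builds of one model share the id prefix before "-q<digit>...".
function familyOf(modelId: string): string {
  return modelId.replace(/-q\d.*$/, "");
}

function needsF16(m: ModelRecord): boolean {
  return (m.required_features ?? []).indexOf("shader-f16") >= 0;
}

// Keep one build per model family, preferring the build whose precision
// matches the GPU: f16 builds when shader-f16 is available, f32 otherwise.
function pickBuilds(models: ModelRecord[], hasShaderF16: boolean): ModelRecord[] {
  const preferred = (m: ModelRecord): boolean => needsF16(m) === hasShaderF16;
  const best = new Map<string, ModelRecord>();
  for (const m of models) {
    const key = familyOf(m.model_id);
    const current = best.get(key);
    if (!current || (preferred(m) && !preferred(current))) {
      best.set(key, m);
    }
  }
  // Largest memory requirement first.
  return Array.from(best.values()).sort(
    (a, b) => b.vram_required_MB - a.vram_required_MB
  );
}
```

A family with only one build is kept even when that build does not match the GPU, so it can still be reported (for example as NO) rather than silently dropped.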
Verdicts:
FIT     comfortably within the estimated memory budget
TIGHT   fits, but close to the budget; may be slow or fail under load
NO      too large, or needs a GPU feature you don't have
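The three verdicts could be assigned along the lines of the sketch below. The 0.8 cut-off between FIT and TIGHT is an assumption made for illustration; the thresholds webllmfit actually uses are not stated here, and classify is a hypothetical name.

```typescript
type Verdict = "FIT" | "TIGHT" | "NO";

// ASSUMPTION: a model using more than 80% of the budget counts as TIGHT.
const TIGHT_FRACTION = 0.8;

function classify(
  vramRequiredMB: number,
  requiredFeatures: string[],
  gpuFeatures: { has(name: string): boolean },
  budgetMB: number
): Verdict {
  // A missing GPU feature rules the model out regardless of memory.
  for (const feature of requiredFeatures) {
    if (!gpuFeatures.has(feature)) return "NO";
  }
  if (vramRequiredMB > budgetMB) return "NO";
  return vramRequiredMB > budgetMB * TIGHT_FRACTION ? "TIGHT" : "FIT";
}
```

Checking features before memory means a model is reported as NO for a missing feature even when it would fit in the budget, which matches the second half of the NO description.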
The memory budget is an ESTIMATE: browsers don't expose total VRAM, so it is derived from navigator.deviceMemory (capped, approximate) with headroom. Real-world fit also depends on free VRAM and other GPU load.
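As a rough illustration of how such a budget might be derived, the sketch below turns a navigator.deviceMemory reading (in GiB) into a budget in MB. The headroom fraction, the fallback value and the name estimateBudgetMB are all assumptions, not the tool's actual constants.

```typescript
// ASSUMPTIONS: both constants are illustrative, not webllmfit's real values.
const FALLBACK_DEVICE_MEMORY_GB = 4; // used when the browser reports nothing
const HEADROOM_FRACTION = 0.75;      // share of reported memory treated as usable

// deviceMemoryGB is the coarse value a browser may expose as
// navigator.deviceMemory; it is undefined where the API is not implemented.
function estimateBudgetMB(deviceMemoryGB: number | undefined): number {
  const gb =
    deviceMemoryGB !== undefined && deviceMemoryGB > 0
      ? deviceMemoryGB
      : FALLBACK_DEVICE_MEMORY_GB;
  return Math.floor(gb * 1024 * HEADROOM_FRACTION);
}
```

With these assumed constants, a reading of 8 gives a budget of 6144 MB and a missing reading gives 3072 MB. Because browsers round and cap the reported value, a machine with far more memory than the cap gets the same budget as one sitting exactly at it.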
OPTIONS
--all, -a also list the models that do NOT fit
EXAMPLES
webllmfit
webllmfit --all