
WEBLLMFIT(1)

NAME

webllmfit — estimate which WebLLM models can run on this machine

SYNOPSIS

webllmfit
webllmfit --all

DESCRIPTION

Inspired by llmfit.org, but wired to the real WebLLM catalogue. It probes this machine's WebGPU capabilities (adapter, shader-f16 support, max buffer / storage-buffer sizes) and approximate system memory, then cross-references every model's own vram_required_MB and required_features to label each one FIT, TIGHT or NO.
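The probing step described above can be sketched roughly as follows. This is not webllmfit's actual source; it is a minimal illustration that assumes the standard WebGPU calls (navigator.gpu.requestAdapter(), adapter.features, adapter.limits). The names GpuProbe, summarizeAdapter and probeGpu are hypothetical, and the small structural interfaces exist only so the sketch compiles without the WebGPU type package.

```typescript
// Minimal structural types standing in for GPUAdapter / navigator.gpu.
interface AdapterLike {
  features: { has(name: string): boolean };
  limits: { maxBufferSize: number; maxStorageBufferBindingSize: number };
}
interface GpuLike {
  requestAdapter(): Promise<AdapterLike | null>;
}

// Hypothetical result shape; the real tool may record more or different fields.
interface GpuProbe {
  hasWebGPU: boolean;
  shaderF16: boolean;
  maxBufferSize: number;
  maxStorageBufferBindingSize: number;
}

// Reduce an adapter (or the lack of one) to the few values the text mentions.
function summarizeAdapter(adapter: AdapterLike | null): GpuProbe {
  if (!adapter) {
    return { hasWebGPU: false, shaderF16: false, maxBufferSize: 0, maxStorageBufferBindingSize: 0 };
  }
  return {
    hasWebGPU: true,
    shaderF16: adapter.features.has("shader-f16"),
    maxBufferSize: adapter.limits.maxBufferSize,
    maxStorageBufferBindingSize: adapter.limits.maxStorageBufferBindingSize,
  };
}

// In a browser, pass navigator.gpu; it is undefined where WebGPU is unsupported.
async function probeGpu(gpu: GpuLike | undefined): Promise<GpuProbe> {
  if (!gpu) return summarizeAdapter(null);
  return summarizeAdapter(await gpu.requestAdapter());
}
```

Taking the GPU object as a parameter, rather than reading navigator.gpu directly, keeps the sketch usable outside a browser with a stub adapter.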

For each model it keeps only the build that best matches the GPU (q4f16 when shader-f16 is available, otherwise q4f32) and drops the other builds of the same model. The list is sorted so the largest model you can run comes first.
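One way to picture the build selection is the sketch below. It rests on assumptions that the text does not confirm: that builds of the same model differ only by a quantization tag such as "-q4f16_1" inside model_id, and that vram_required_MB is an acceptable stand-in for "largest". The names ModelRecord, familyOf and pickBuilds are hypothetical.

```typescript
// Only the catalogue fields this sketch needs.
interface ModelRecord {
  model_id: string;
  vram_required_MB: number;
  required_features?: string[];
}

// Assumption: the quantization tag looks like "-q4f16_1" or "-q4f32_1".
const QUANT_TAG = /-q\d+f(16|32)(_\d+)?/;

// Strip the tag so that all builds of one model share the same key.
function familyOf(modelId: string): string {
  return modelId.replace(QUANT_TAG, "");
}

function pickBuilds(models: ModelRecord[], shaderF16: boolean): ModelRecord[] {
  const preferred = shaderF16 ? "q4f16" : "q4f32";
  const best = new Map<string, ModelRecord>();
  for (const m of models) {
    const key = familyOf(m.model_id);
    const current = best.get(key);
    const isPreferred = m.model_id.includes(preferred);
    // Keep the first build seen, but let a preferred build replace a non-preferred one.
    if (!current || (isPreferred && !current.model_id.includes(preferred))) {
      best.set(key, m);
    }
  }
  // Largest memory requirement first (a proxy for model size).
  return Array.from(best.values()).sort((a, b) => b.vram_required_MB - a.vram_required_MB);
}
```

This sketch orders by memory requirement only; moving models that cannot run below the runnable ones would need the verdict for each entry as well.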

Verdicts:

FIT     comfortably within the estimated memory budget
TIGHT   fits, but close to the budget; may be slow or fail under load
NO      too large, or needs a GPU feature you don't have
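The three verdicts could be computed along these lines. The cut-off between FIT and TIGHT is not documented here, so the 80% ratio below is purely an assumption for illustration, as are the names Verdict, COMFORT_RATIO and verdictFor.

```typescript
type Verdict = "FIT" | "TIGHT" | "NO";

// Assumption: "comfortably" means using at most 80% of the budget.
const COMFORT_RATIO = 0.8;

function verdictFor(
  vramRequiredMB: number,
  requiredFeatures: string[],
  budgetMB: number,
  hasFeature: (name: string) => boolean,
): Verdict {
  // A missing GPU feature rules the model out regardless of memory.
  if (!requiredFeatures.every((f) => hasFeature(f))) return "NO";
  // Over budget: too large.
  if (vramRequiredMB > budgetMB) return "NO";
  // Within budget: comfortable or close to the limit.
  return vramRequiredMB <= budgetMB * COMFORT_RATIO ? "FIT" : "TIGHT";
}
```

With a 4000 MB budget, for example, a model needing 1000 MB would be FIT, one needing 3500 MB would be TIGHT, and one needing 5000 MB would be NO under these assumed thresholds.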

The memory budget is an ESTIMATE: browsers don't expose total VRAM, so it is derived from navigator.deviceMemory (capped, approximate) with headroom. Real-world fit also depends on free VRAM and other GPU load.
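As a rough illustration of how such a budget might be derived: navigator.deviceMemory reports a coarse figure in GiB (browsers round it and cap it, commonly at 8) and is not available in every browser. The fallback value and the usable fraction below are assumptions chosen for the example, not the tool's real constants, and estimateBudgetMB is a hypothetical name.

```typescript
// Assumption: treat the machine as having 4 GiB when deviceMemory is unavailable.
const FALLBACK_GIB = 4;
// Assumption: leave half of system memory as headroom for the OS, browser and other GPU load.
const USABLE_FRACTION = 0.5;

// Convert the reported device memory (GiB) into an estimated budget in MB.
function estimateBudgetMB(deviceMemoryGiB: number | undefined): number {
  const gib = deviceMemoryGiB ?? FALLBACK_GIB;
  return Math.floor(gib * 1024 * USABLE_FRACTION);
}
```

In a browser the argument would be the value of navigator.deviceMemory. Because that value is capped, a machine with far more memory gets the same estimate as one at the cap, which is one reason the resulting figure is only an estimate.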

OPTIONS

--all, -a    also list the models that do NOT fit

EXAMPLES

webllmfit
webllmfit --all

SEE ALSO

webllm, hashcat