Description
Local LLM inference engines can be benchmarked from the terminal to measure their speed and compare results.
This package is useful for developers and power users who tune llama.cpp or similar model runtimes. It measures performance; it does not provide a model by itself.
Benchmarks can place a heavy load on the CPU, GPU, and memory and draw significant power. Run them only when thermal and battery conditions are acceptable.