Description
Local LLM inference runs language models on your own machine instead of relying on a hosted chat service.
This package is useful for developers and advanced users who want llama.cpp tools for running compatible language models. It provides inference software; model files must be obtained separately.
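Running models locally does not remove every risk: prompts can still contain sensitive data, and inference can use large amounts of CPU, memory, disk, or GPU resources. Review where model files come from, and do not expose inference servers to the network unintentionally.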
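
As an illustration of the advice about unintended exposure, the sketch below checks whether a listen address is reachable only from the local machine. The helper name `is_loopback_only` is hypothetical and is not part of llama.cpp or this package; it relies only on Python's standard `ipaddress` module and assumes you already know which address your server is configured to bind to.

```python
import ipaddress


def is_loopback_only(host: str) -> bool:
    """Return True if `host` refers only to the local machine.

    Hypothetical helper for illustration; not part of llama.cpp.
    """
    if host == "localhost":
        return True
    try:
        # Loopback covers 127.0.0.0/8 and ::1. The wildcard addresses
        # 0.0.0.0 and :: are NOT loopback: they listen on every interface.
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (e.g. a hostname): treat it as possibly exposed.
        return False


if __name__ == "__main__":
    for candidate in ("127.0.0.1", "::1", "0.0.0.0", "192.168.1.10"):
        status = "local only" if is_loopback_only(candidate) else "may be reachable from the network"
        print(f"{candidate}: {status}")
```

A check like this covers only the bind address. Firewalls, containers, and reverse proxies can also change what is reachable, so treat it as a first sanity check and not as a guarantee.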