Description
This software downloads, runs, and serves large language models (LLMs) on the local machine. It lets developers and advanced users experiment with AI assistants, coding helpers, and text generation without sending every prompt to a hosted provider.
The software can download model files, use significant CPU or GPU resources, and expose a local API. Before using it, review model licenses, disk usage, network access, and how any prompts containing private data are handled.
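As a rough illustration of the disk-usage and network-access checks mentioned above, the following sketch uses only the Python standard library. The function names (`free_bytes`, `fits_on_disk`, `is_loopback_only`) and the 10% headroom are hypothetical choices made for this example; they are not part of this software or its API.

```python
import ipaddress
import shutil


def free_bytes(path="."):
    """Return the free space, in bytes, on the filesystem holding `path`."""
    return shutil.disk_usage(path).free


def fits_on_disk(model_bytes, path=".", headroom=0.10):
    """Return True if a model of `model_bytes` fits with some spare room.

    `headroom` is an assumed safety margin, expressed as a fraction of
    the model size (0.10 means 10% extra).
    """
    return free_bytes(path) >= model_bytes * (1 + headroom)


def is_loopback_only(bind_host):
    """Return True if `bind_host` is a loopback address.

    A server bound to a loopback address (for example 127.0.0.1 or ::1)
    accepts connections only from the same machine. An address such as
    0.0.0.0 listens on all interfaces and may be reachable by others.
    """
    if bind_host == "localhost":
        return True
    try:
        return ipaddress.ip_address(bind_host).is_loopback
    except ValueError:
        # Not an IP literal: treat unknown hostnames as potentially exposed.
        return False


if __name__ == "__main__":
    four_gib = 4 * 1024**3  # assumed model size, for illustration only
    print("Room for a 4 GiB model:", fits_on_disk(four_gib))
    print("127.0.0.1 is local-only:", is_loopback_only("127.0.0.1"))
    print("0.0.0.0 is local-only:", is_loopback_only("0.0.0.0"))
```

A check like this could run before downloading a model or starting the local API. The actual model size and bind address would have to come from the software's own configuration, which this sketch does not read.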