PACKAGE CARD · AUR

llama.cpp-cuda-git

Port of Facebook's LLaMA model in C/C++ (with NVIDIA CUDA optimizations)

llm-inference-tool
CLI
Dev
Launchable
Runs in terminal

official+codex · reviewed · Jun 2, 2026 · description in en

Description

Builds llama.cpp from its current Git source with CUDA acceleration, for local LLM inference on NVIDIA GPUs.

This -git package tracks upstream development and suits testers or developers who need the latest llama.cpp CUDA changes before a stable release. It is a GPU-optimized inference build, not a model bundle: model files must be obtained separately.

Development GPU builds can be unstable or driver-sensitive. Test with noncritical workloads first.
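
Because these builds are driver-sensitive, it can help to note the GPU and driver version before testing. The following is a minimal preflight sketch, assuming the NVIDIA driver utilities (nvidia-smi) are installed. The helper name gpu_info is hypothetical and is not part of llama.cpp or this package.

```shell
#!/bin/sh
# Preflight sketch: report the GPU and driver before trying a development build.
# gpu_info is a hypothetical helper name used only for this illustration.
gpu_info() {
  if command -v nvidia-smi >/dev/null 2>&1; then
    # One line per GPU: device name and installed driver version.
    nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
  else
    echo "nvidia-smi not found"
    return 1
  fi
}

gpu_info || echo "No NVIDIA driver tools detected; the CUDA build may not run here."
```

Keeping the reported driver version alongside any bug report makes driver-related failures easier to reproduce.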

How to run

llama-cli

Commands: llama-cli
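
A hedged example of a first, low-stakes run. It assumes a GGUF model file already exists at the path in MODEL (a hypothetical location, since the package ships no models) and uses the -m, -ngl, -n and -p options as found in llama.cpp builds at the time of writing. Options change between development snapshots, so check llama-cli --help on the installed build. The build_cmd helper is hypothetical and only prints the command for a dry run.

```shell
#!/bin/sh
# Hypothetical model path; replace with a GGUF file you have downloaded.
MODEL="${MODEL:-$HOME/models/model.gguf}"
# Number of layers to offload to the GPU; 99 is an arbitrary "as many as fit" value.
NGL="${NGL:-99}"

# build_cmd (hypothetical helper) prints the command that would be run.
build_cmd() {
  printf "llama-cli -m %s -ngl %s -n 64 -p 'Hello'\n" "$MODEL" "$NGL"
}

if command -v llama-cli >/dev/null 2>&1 && [ -f "$MODEL" ]; then
  # Short generation (64 tokens) as a noncritical smoke test.
  llama-cli -m "$MODEL" -ngl "$NGL" -n 64 -p "Hello"
else
  # Binary or model missing: show the command instead of failing.
  echo "dry run: $(build_cmd)"
fi
```

Lowering NGL is a simple way to test whether a failure is related to GPU memory or to the build itself.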

Permissions

Permissions not analysed for this source yet.