CARD · AUR

llama.cpp-cuda

Port of Facebook's LLaMA model in C/C++ (with NVIDIA CUDA optimizations)

llm-inference-tool
CLI
Dev
Launchable
Runs in terminal

official+codex · reviewed · Jun 2, 2026 · description in en

Description

NVIDIA GPUs can accelerate local LLM inference through the CUDA build of llama.cpp.

This package is aimed at users with compatible NVIDIA hardware who want faster model execution. It provides the optimized inference tools only; model files are not included and must be obtained separately.

CUDA inference can use a large amount of GPU memory and may fail if the installed NVIDIA driver does not match the CUDA runtime. Check hardware and driver support before installing.
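
One possible way to check hardware support is to ask the NVIDIA driver what it sees. The sketch below is only an illustration and is not part of the package: it assumes the nvidia-smi utility from the NVIDIA driver is on PATH, and the helper names parse_gpu_rows and query_gpus are hypothetical. It reports each GPU's name, total memory, and driver version, and returns an empty list when no working driver is found.

```python
import shutil
import subprocess


def parse_gpu_rows(csv_text):
    """Parse 'name, memory.total, driver_version' CSV rows into dicts."""
    gpus = []
    for line in csv_text.strip().splitlines():
        parts = [part.strip() for part in line.split(",")]
        if len(parts) != 3:
            # Skip rows that do not have exactly the three requested fields.
            continue
        name, memory, driver = parts
        gpus.append({"name": name, "memory": memory, "driver": driver})
    return gpus


def query_gpus():
    """Return the GPUs reported by nvidia-smi, or [] if it is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=name,memory.total,driver_version",
            "--format=csv,noheader",
        ],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        # A non-zero exit usually means the driver is missing or not loaded.
        return []
    return parse_gpu_rows(result.stdout)


if __name__ == "__main__":
    gpus = query_gpus()
    if not gpus:
        print("No NVIDIA GPU or working driver detected.")
    for gpu in gpus:
        print(f"{gpu['name']}: {gpu['memory']} (driver {gpu['driver']})")
```

An empty result suggests the CUDA build is unlikely to work on this machine. A non-empty result does not guarantee compatibility, since the driver must also support the CUDA version the package was built with.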

How to run

llama-cli

Commands: llama-cli
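
Because the package ships no model files, llama-cli needs a path to a model that was downloaded separately. The sketch below shows one way such an invocation might be assembled. It is an illustration under assumptions: the helper build_llama_cli_command, the path models/example.gguf, and the default values are hypothetical, while -m (model file), -p (prompt), -ngl (number of layers offloaded to the GPU), and -n (number of tokens to generate) are llama-cli options.

```python
import shlex
import shutil


def build_llama_cli_command(model_path, prompt, gpu_layers=99, max_tokens=128):
    """Build an argument list for llama-cli without running it."""
    return [
        "llama-cli",
        "-m", str(model_path),     # model file, obtained separately
        "-p", prompt,              # prompt text
        "-ngl", str(gpu_layers),   # layers to offload to the GPU
        "-n", str(max_tokens),     # tokens to generate
    ]


if __name__ == "__main__":
    # Hypothetical model path; replace it with a real model file.
    command = build_llama_cli_command("models/example.gguf", "Hello")
    print(shlex.join(command))
    if shutil.which("llama-cli") is None:
        print("llama-cli was not found on PATH.")
```

The list can be passed to subprocess.run once a real model path is supplied. Lowering the -ngl value keeps more layers on the CPU, which may help when GPU memory is limited.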

Permissions

Permissions not analysed for this source yet.