PACKAGE CARD · AUR

llama.cpp-cuda-git

Port of Facebook's LLaMA model in C/C++ (with NVIDIA CUDA optimizations)

  • llm-inference-tool
  • CLI
  • Dev
  • Launchable
  • Runs in terminal
official+codex · reviewed · Jun 2, 2026 · description in en

Description

Builds llama.cpp from its current Git sources with CUDA acceleration enabled, for local LLM inference on NVIDIA GPUs.

This Git package is useful for testers or developers who need the latest llama.cpp CUDA changes before a stable release. It is a GPU-optimized inference build, not a model bundle.

Development GPU builds can be unstable or driver-sensitive. Test with noncritical workloads first.
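Before trying a workload, it can help to confirm that the expected tools are actually on PATH. The following is a minimal sketch, not part of the package: it assumes the command is installed as llama-cli (as listed on this card) and that the NVIDIA driver utilities provide nvidia-smi. The helper name missing_tools is hypothetical.

```python
import shutil

# Tools this sketch looks for. Both names are assumptions about a typical
# setup: llama-cli from this package, nvidia-smi from the NVIDIA driver tools.
REQUIRED_TOOLS = ["llama-cli", "nvidia-smi"]


def missing_tools(tools, which=shutil.which):
    """Return the entries of `tools` that cannot be found on PATH.

    `which` is injectable so the check can be exercised without a GPU host.
    """
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("llama-cli and nvidia-smi are both on PATH.")
```

Finding both tools only shows that they are installed. It does not prove that the driver and the CUDA build are compatible, so a small test run is still worthwhile.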

How to run

llama-cli

Commands: llama-cli
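Because the package ships no model, llama-cli needs a model file supplied by the user. The sketch below assembles one possible invocation as an argument list. It assumes the -m (model path), -p (prompt), -n (tokens to generate) and -ngl (layers offloaded to the GPU) options found in recent llama.cpp builds; a development build may differ, so check llama-cli --help. The model path, default values and helper name are hypothetical placeholders.

```python
import shlex


def build_llama_cli_command(model_path, prompt, max_tokens=64, gpu_layers=99):
    """Build an argument list for llama-cli without executing it.

    The flags are assumptions based on recent llama.cpp builds:
      -m    path to a GGUF model file (not bundled with this package)
      -p    prompt text
      -n    number of tokens to generate
      -ngl  number of model layers to offload to the GPU
    """
    return [
        "llama-cli",
        "-m", model_path,
        "-p", prompt,
        "-n", str(max_tokens),
        "-ngl", str(gpu_layers),
    ]


if __name__ == "__main__":
    # Hypothetical model path; replace with a GGUF file you have downloaded.
    cmd = build_llama_cli_command("models/example.gguf", "Hello")
    # Print a shell-quoted command line for review instead of running it.
    print(shlex.join(cmd))
```

Printing the command rather than running it lets it be reviewed first. To execute it, pass the same list to subprocess.run.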

Permissions

Permissions not analysed for this source yet.