PACKAGE CARD · AUR

ollama-cuda12-bin

Create, run and share large language models (LLMs) with CUDA 12

llm-runtime
CLI
SERVICE
AI
Launchable
Background service

official+codex · reviewed · Jun 2, 2026 · description in en

Description

Runs local language models with NVIDIA CUDA 12 acceleration, so that compatible GPUs can be used for faster AI inference and model experimentation.

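This package is an Ollama binary variant for CUDA 12 systems. Before using it, check driver compatibility and available GPU memory, review the licenses of the models you download, and consider the privacy of your prompts, especially if the API is exposed beyond the local machine.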
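
The GPU memory and network exposure checks mentioned above could be sketched roughly as follows. This is an illustrative sketch, not part of the package: the helper names are hypothetical, it assumes `nvidia-smi` is on PATH, and it assumes the server reads its bind address from the `OLLAMA_HOST` environment variable with a loopback default of `127.0.0.1:11434`.

```python
import ipaddress
import os
import shutil
import subprocess


def parse_gpu_memory_mib(smi_output: str) -> list:
    """Parse one integer (MiB) per numeric line of nvidia-smi CSV output."""
    values = []
    for line in smi_output.splitlines():
        line = line.strip()
        if line.isdigit():  # skip blank lines and placeholders such as [N/A]
            values.append(int(line))
    return values


def query_gpu_memory_mib() -> list:
    """Return total memory per GPU in MiB, or [] if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=memory.total",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=False, timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        return []
    if result.returncode != 0:
        return []
    return parse_gpu_memory_mib(result.stdout)


def is_loopback_only(host_setting: str) -> bool:
    """Return True if a host[:port] setting refers to a loopback address.

    Anything that cannot be parsed is conservatively treated as exposed.
    """
    host = host_setting.strip()
    if "://" in host:                  # drop a scheme such as http://
        host = host.split("://", 1)[1]
    host = host.split("/", 1)[0]       # drop any path component
    if host.startswith("["):           # bracketed IPv6, e.g. [::1]:11434
        host = host[1:].split("]", 1)[0]
    elif host.count(":") == 1:         # plain host:port
        host = host.rsplit(":", 1)[0]
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False


if __name__ == "__main__":
    # Assumed default bind address; adjust if your setup differs.
    setting = os.environ.get("OLLAMA_HOST", "127.0.0.1:11434")
    print("GPU memory (MiB):", query_gpu_memory_mib())
    print("Loopback only:", is_loopback_only(setting))
```

A result of `False` from `is_loopback_only` suggests the API may be reachable from other machines, which is the case where prompt privacy deserves the most attention.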

How to run

ollama

Commands: ollama
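
As a quick way to confirm that the `ollama` command is installed and callable, a small wrapper along these lines may help. It is a minimal sketch: the function name `ollama_version` is hypothetical, and it relies only on the CLI's `--version` flag.

```python
import shutil
import subprocess
from typing import Optional


def ollama_version(binary: str = "ollama") -> Optional[str]:
    """Return the output of `<binary> --version`, or None if it cannot be run."""
    path = shutil.which(binary)
    if path is None:
        return None  # command is not on PATH
    try:
        result = subprocess.run(
            [path, "--version"],
            capture_output=True, text=True, check=False, timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if result.returncode != 0:
        return None
    return result.stdout.strip() or None


if __name__ == "__main__":
    print(ollama_version() or "ollama not found on PATH")
```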

Permissions

Permissions not analysed for this source yet.