PACKAGE CARD · AUR

apache-spark

A unified analytics engine for large-scale data processing

  • framework
  • COMMAND-LINE
  • DATA-PROCESSING
  • NETWORK
  • Launchable
  • Runs in terminal
official+codex · reviewed · May 29, 2026 · description in en

Description

Apache Spark is a unified analytics engine for processing large datasets. Data engineers and researchers use it to run batch jobs, interactive queries, streaming tasks, and machine-learning pipelines on a single machine or across a cluster.

Spark jobs can consume substantial CPU, memory, disk, and network resources and may process sensitive datasets. Review configuration, cluster access, credentials, and data handling before running workloads.

How to run

spark-shell

Commands: spark-shell
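
Because Spark jobs can consume substantial resources, it may help to start spark-shell with explicit limits rather than the defaults. The following is a minimal sketch, not part of the package: the helper name build_spark_shell_command and the default values (local[2], 2g) are assumptions chosen for illustration. It only assembles and prints the command line; it does not launch Spark. The flags --master, --driver-memory, and --conf are standard spark-shell options.

```python
import shlex


def build_spark_shell_command(master="local[2]", driver_memory="2g", extra_conf=None):
    """Assemble a spark-shell invocation with explicit resource limits.

    Hypothetical helper for illustration; the defaults are assumptions,
    not recommendations from the package.
    """
    cmd = ["spark-shell", "--master", master, "--driver-memory", driver_memory]
    # Each extra setting becomes one "--conf key=value" pair, in sorted order
    # so the resulting command is deterministic.
    for key, value in sorted((extra_conf or {}).items()):
        cmd += ["--conf", f"{key}={value}"]
    return cmd


# Example: two local worker threads, a 2 GB driver, and the web UI disabled.
command = build_spark_shell_command(extra_conf={"spark.ui.enabled": "false"})

# shlex.join quotes arguments such as local[2] so the line can be pasted
# into a POSIX shell safely.
print(shlex.join(command))
```

Printing the command instead of executing it leaves room to review the master URL, memory settings, and any credentials-related configuration before a workload actually starts.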

Permissions

Permissions not analysed for this source yet.