PACKAGE CARD · MANJARO

tesseract-data-chi_sim

Tesseract OCR data (chi_sim)

  • Data
  • OCR
  • LOCALIZATION
  • Dependency only
official+codex · reviewed · May 29, 2026 · description in en

Description

Enables Tesseract to recognize Simplified Chinese text in scanned pages, screenshots, and document images. It is useful for extracting searchable text from modern mainland Chinese printed material.
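As a rough sketch of how this language data might be selected in practice, assuming the tesseract command-line program is installed and on PATH: the -l chi_sim option asks Tesseract to use the Simplified Chinese model. The helper names and the file name scan.png are hypothetical and used only for illustration.

```python
import shutil
import subprocess


def build_ocr_command(image_path, lang="chi_sim"):
    """Return the tesseract argument list that prints recognized text to stdout."""
    # Passing "stdout" as the output base makes tesseract write the text
    # to standard output instead of creating a .txt file.
    return ["tesseract", image_path, "stdout", "-l", lang]


def ocr_image(image_path, lang="chi_sim"):
    """Run tesseract on one image and return the recognized text.

    Assumes the tesseract binary and the requested language data are installed.
    """
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    result = subprocess.run(
        build_ocr_command(image_path, lang),
        capture_output=True,
        text=True,
        encoding="utf-8",
        check=True,  # raise CalledProcessError if tesseract exits with an error
    )
    return result.stdout


if __name__ == "__main__":
    # Hypothetical input file; only the command is printed here.
    print(build_ocr_command("scan.png"))
```

Running tesseract --list-langs is one way to confirm that chi_sim is visible to the installed Tesseract before calling ocr_image.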

Chinese OCR is sensitive to image resolution, font, spacing, and layout. Review output carefully, especially for names, numbers, and formal documents.
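Because resolution matters, low-resolution scans are often enlarged before recognition. The sketch below computes a resize factor under the assumption that roughly 300 DPI is a reasonable target, a commonly cited guideline rather than a value taken from this package. The function name and default are hypothetical.

```python
def upscale_factor(current_dpi, target_dpi=300):
    """Return the factor by which to enlarge an image to reach target_dpi.

    The 300 DPI default is an assumed guideline, not a package requirement.
    Images already at or above the target are left unchanged (factor 1.0).
    """
    if current_dpi <= 0:
        raise ValueError("current_dpi must be positive")
    return max(1.0, target_dpi / current_dpi)


if __name__ == "__main__":
    print(upscale_factor(150))  # a 150 DPI scan would be doubled in size
    print(upscale_factor(600))  # a 600 DPI scan is left as is
```

The resulting factor could be passed to any image library's resize routine before the image is handed to Tesseract. Enlarging does not add detail, so rescanning at a higher resolution is preferable when the original page is available.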

Permissions

Permissions not analysed for this source yet.