view mupdf-source/thirdparty/tesseract/doc/merge_unicharsets.1.asc @ 46:7ee69f120f19 default tip

>>>>> tag v1.26.5+1 for changeset b74429b0f5c4
author Franz Glasner <fzglas.hg@dom66.de>
date Sat, 11 Oct 2025 17:17:30 +0200
parents b50eed0cc0ef

MERGE_UNICHARSETS(1)
====================
:doctype: manpage

NAME
----
merge_unicharsets - Simple tool to merge two or more unicharsets.

SYNOPSIS
--------
*merge_unicharsets* 'unicharset-in-1' ... 'unicharset-in-n' 'unicharset-out'

DESCRIPTION
-----------
merge_unicharsets(1) merges two or more unicharset files into a single
unicharset. It can be used to create a combined unicharset for a script-level
engine, such as Latin or Devanagari.

IN/OUT ARGUMENTS
----------------
'unicharset-in-1'::
	(Input) The name of the first unicharset file to be merged.

'unicharset-in-n'::
	(Input) The name of the nth unicharset file to be merged.

'unicharset-out'::
	(Output) The name of the merged unicharset file.
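
The argument order above (all inputs first, the output file last) can be sketched as a small Python wrapper. This is only an illustration, not part of Tesseract: the helper names `build_merge_command` and `run_merge` and the file names `eng.unicharset`, `fra.unicharset` and `latin.unicharset` are hypothetical, and the sketch assumes `merge_unicharsets` is installed on `PATH`.

```python
import shutil
import subprocess
from typing import List, Sequence


def build_merge_command(inputs: Sequence[str], output: str) -> List[str]:
    """Build the argv list: program name, every input file, then the output file."""
    # The man page describes merging "two or more" unicharsets,
    # so fewer than two inputs is rejected here.
    if len(inputs) < 2:
        raise ValueError("expected at least two input unicharset files")
    return ["merge_unicharsets", *inputs, output]


def run_merge(inputs: Sequence[str], output: str) -> int:
    """Run merge_unicharsets if it is available and return its exit code."""
    cmd = build_merge_command(inputs, output)
    if shutil.which(cmd[0]) is None:
        # Assumption: the tool is normally on PATH; if not, only show the command.
        print("merge_unicharsets not found; would run:", " ".join(cmd))
        return 127
    return subprocess.run(cmd, check=False).returncode


if __name__ == "__main__":
    # Hypothetical file names, used only to show the argument order.
    print(" ".join(build_merge_command(
        ["eng.unicharset", "fra.unicharset"], "latin.unicharset")))
```

Run from a shell, the equivalent command is `merge_unicharsets eng.unicharset fra.unicharset latin.unicharset`, again with hypothetical file names.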

HISTORY
-------
merge_unicharsets(1) was first made available for Tesseract 4.00.00alpha.

RESOURCES
---------
Main web site: <https://github.com/tesseract-ocr> +
Information on training tesseract LSTM: <https://tesseract-ocr.github.io/tessdoc/TrainingTesseract-4.00.html>

SEE ALSO
--------
tesseract(1)

COPYING
-------
Copyright \(C) 2012 Google, Inc.
Licensed under the Apache License, Version 2.0

AUTHOR
------
The Tesseract OCR engine was written by Ray Smith and his research groups
at Hewlett Packard (1985-1995) and Google (2006-2018).