comparison mupdf-source/thirdparty/tesseract/ChangeLog @ 2:b50eed0cc0ef upstream
ADD: MuPDF v1.26.7: the MuPDF source as downloaded by a default build of PyMuPDF 1.26.4.
The directory name has changed: no version number in the expanded directory now.
| author | Franz Glasner <fzglas.hg@dom66.de> |
|---|---|
| date | Mon, 15 Sep 2025 11:43:07 +0200 |
| parents | |
| children | |
| 1:1d09e1dec1d9 | 2:b50eed0cc0ef |
|---|---|
| 1 2024-11-10 - V5.5.0 | |
| 2 * Set hOCR capabilities ocrp_dir and ocrp_lang unconditionally. | |
| 3 * Calculate row bounding box in single-word mode per (issue #4304). | |
| 4 * Reduce clock syscalls (#4303). | |
| 5 * Several small performance and other code fixes. | |
| 6 * Modernized code. | |
| 7 * Print time for tessedit_timing_debug in milliseconds. | |
| 8 * Print time for ErrorCounter::ComputeErrorRate in milliseconds. | |
| 9 * cmake: Correctly set the soversion based on SemVer properties. | |
| 10 * Do not export PDBs for static libraries (issue #4279). | |
| 11 * Several other small fixes and improvements for builds and CI. | |
| 12 * Modernize code for renderers and remove filename conversion for Windows (#4330). | |
| 13 * Add build rule for Windows installer. | |
| 14 * Support symbolic values for --oem and --psm options. | |
| 15 * Remove Tensorflow support. | |
| 16 * Add RISC-V V support (#4346). | |
| 17 * Remove broken GitHub action msys2-4.1.1. | |
| 18 | |
| 19 2024-06-11 - V5.4.1 | |
| 20 * Avoid FP overflow in NormEvidenceOf (fixes issue #4257) (#4259) | |
| 21 * Small build fixes and code improvements (#4262, #4263, #4266, #4267) | |
| 22 | |
| 23 2024-06-06 - V5.4.0 | |
| 24 * Small build fixes and code improvements | |
| 25 (#4241, #4243, #4244, #4245, #4246, #4248, #4249, #4250, #4253) | |
| 26 | |
| 27 2024-05-19 - V5.4.0-rc2 | |
| 28 * Fix setup of datadir on installations with Conda (issue #4230) (#4240) | |
| 29 * Fix FP exception in Wordrec::angle_change (issue #4242) (#4243) | |
| 30 | |
| 31 2024-05-12 - V5.4.0-rc1 | |
| 32 * Build fixes, code refactoring and other smaller changes. | |
| 33 * Fix grey result of indexed PNG in pdfrenderer. | |
| 34 * Rename frk -> deu_latf (ISO 639-3, ISO 15924). | |
| 35 * Remove broken Dockerfile. | |
| 36 * Fixes for several issues reported by Coverity Scan. | |
| 37 * Remove unsupported OpenCL code and related API functions (#4220). | |
| 38 * Facilitate vectorization for generic build (#4223). | |
| 39 * Add PAGE XML renderer / export (#4214). | |
| 40 * Support training without lstmf files. | |
| 41 * Improve CCUtil::main_setup (fixes issue #4230 related to Conda). | |
| 42 * Allow for text angle/gradient to be retrieved (#4070). | |
| 43 | |
| 44 2024-01-18 - V5.3.4 | |
| 45 * Fixes for scrollview | |
| 46 * Fixes for autoconf, clang and sw builds | |
| 47 * Improve OCR for an image URL | |
| 48 * Fail on curl download errors | |
| 49 * New parameter curl_cookiefile | |
| 50 * Set User-Agent: header field in HTTP request for curl downloads | |
| 51 * Output directory list from "combine_tessdata -d" to stdout | |
| 52 * Other small improvements for code and documentation. | |
| 53 | |
| 54 2023-10-05 - V5.3.3 | |
| 55 * Small code fixes and improvements to fix Coverity Scan issues. | |
| 56 * Disable -mfpu=neon for aarch64. | |
| 57 * Fix build without git clone in cloned directory (required for FreeBSD). | |
| 58 * Other build fixes for autotools, cmake and sw. | |
| 59 * Fix regression in layout detection which was introduced in release 5.0.0. | |
| 60 * Fix regression which prevented loading of submodels, introduced in release 5.0.0-rc2. | |
| 61 * Other small improvements for code and documentation. | |
| 62 | |
| 63 2023-07-11 - V5.3.2 | |
| 64 * Updates for snap package building. | |
| 65 * Support for Sgaw and W Pwo Karen languages in the Myanmar validator (#4065). | |
| 66 * Improve format of logging from lstmtraining. | |
| 67 * Use fewer digits in filenames of checkpoints written by lstmtraining. | |
| 68 * Replace deprecated sprintf. | |
| 69 * Remove unused code in function fix_rep_char. | |
| 70 * Avoid 32 bit overflow in multiplication (fixes 3 CodeQL CI alerts). | |
| 71 * Avoid conversions from std::string to char* to std::string. | |
| 72 * Abort with error message if OSD is requested with LSTM-only model. | |
| 73 * cmake: allow disabling tiff (-DDISABLE_TIFF=ON). | |
| 74 * cmake: provide info about disabled LibArchive and CURL. | |
| 75 * cmake: check if leptonica was built with tiff support. | |
| 76 * Remove old broken GitHub action vcpkg-4.1.1 (fixes issue #4078). | |
| 77 * Create config.yml. | |
| 78 * Fix typos. | |
| 79 | |
| 80 2023-04-01 - V5.3.1 | |
| 81 * Bug fixes for some special scenarios: | |
| 82 * Fix issue #4010. | |
| 83 * textord: Catch empty rows in block iterator (fixes #4039). | |
| 84 * Fix FP division by zero (issue #3995). | |
| 85 * Improve documentation and log messages. | |
| 86 * Build fixes and improvements (mainly for cmake). | |
| 87 | |
| 88 2022-12-22 - V5.3.0 | |
| 89 * Minor updates for documentation and cmake builds. | |
| 90 | |
| 91 2022-12-13 - V5.3.0-rc1 | |
| 92 * Fix the training tools for the legacy OCR engine (fix issue #3925). | |
| 93 * PDF renderer: Ignore non-text blocks (fix issue #3957). | |
| 94 * Remove colormap before thresholding (fix issue #3940). | |
| 95 * Fix a number of performance issues reported by Coverity Scan. | |
| 96 * Training tools: Replace call of exit function by return statement in main function. | |
| 97 * Fix double free in function vigorous_noise_removal (fix issue #3876). | |
| 98 * Create to_win if needed in Textord::make_spline_rows (fix issue #3875). | |
| 99 * Bug fixes for ScrollView viewer: | |
| 100 * Fix memory issues in ScrollView::MessageReceiver. | |
| 101 * Catch potential nullptr in SVNetwork::SVNetwork. | |
| 102 * Move svpaint.cpp from src/viewer to src/. | |
| 103 * Add rule for svpaint executable in Autotools. | |
| 104 * Bug fixes and improvements for build tools: | |
| 105 * Fix AMD64 detection with autobuild on FreeBSD (fix issue #3964). | |
| 106 * Fix tesseract.pc generated from CMake to match Autotools. | |
| 107 * Detect availability of AVX512-VNNI. | |
| 108 * configure.ac: fix build on aarch64_be. | |
| 109 * Drop CI for old versions of macOS and Ubuntu. | |
| 110 | |
| 111 2022-07-06 - V5.2.0 | |
| 112 * Improvements and fixes for continuous integration, | |
| 113 autoconf and cmake builds. | |
| 114 * Set /Os for some 32 bit MS compilers (fixes #3769). | |
| 115 * Improve comments and other documentation. | |
| 116 * Add initial support for Intel AVX512F. | |
| 117 * Fix for very large PDF files on 32 bit hosts (fixes #3805). | |
| 118 * Fix NEON detection on FreeBSD. | |
| 119 * Fix regression with UZN files (fixes #3837). | |
| 120 * Fix calling delete[] for memory allocated by malloc in C API. | |
| 121 * Add an API function to init tesseract with traineddata from memory | |
| 122 (fixes #3691). | |
| 123 * Replace direct access to Leptonica internal data structures by | |
| 124 function calls and support latest releases of Leptonica. | |
| 125 * Replace std::regex by std::string functions (fixes issue #3830). | |
| 126 * Use compiled-in TESSDATA_PREFIX also on Windows (fixes #3767). | |
| 127 * Add new parameter 'invert_threshold', change the default threshold | |
| 128 from 0.5 to 0.7 and mark parameter 'tessedit_do_invert' as deprecated. | |
| 129 | |
| 130 2022-03-01 - V5.1.0 | |
| 131 * Handle image and line regions in output formats ALTO, hOCR and text. | |
| 132 * New parameter curl_timeout for curl_easy_setopt. | |
| 133 * Build fixes and improvements. | |
| 134 * Catch nullptr in PageIterator::Orientation to improve robustness. | |
| 135 * Remove unused code. | |
| 136 | |
| 137 2022-01-06 - V5.0.1 | |
| 138 * Add SPDX-License-Identifier to public include files. | |
| 139 * Support redirections when running OCR on a URL. | |
| 140 * Lots of fixes and improvements for cmake builds. | |
| 141 Distributions should use the autoconf build. | |
| 142 * Fix broken msys2 build with gcc 11. | |
| 143 * Fix parameter certainty_scale (was duplicated). | |
| 144 * Fix some compiler warnings and clean code. | |
| 145 * Correctly detect amd64 and i386 on FreeBSD. | |
| 146 * Add libarchive and libcurl in continuous integration actions. | |
| 147 * Update submodule googletest to release v1.11.0. | |
| 148 | |
| 149 2021-11-22 - V5.0.0 | |
| 150 * Faster training and recognition by default (float instead of | |
| 151 double calculations) | |
| 152 * More options for binarization | |
| 153 * Improved support for ARM NEON | |
| 154 * Modernized code | |
| 155 * Removed proprietary data types like GenericVector and STRING | |
| 156 from public API | |
| 157 * pdf.ttf no longer needed, now integrated into the code | |
| 158 * Faster flat build with automake | |
| 159 * New options for combine_tessdata to show details of traineddata files | |
| 160 * Improved training messages | |
| 161 * Improved unit tests and fuzzing tests | |
| 162 * Lots of bug fixes | |
| 163 | |
| 164 2021-11-15 - V4.1.3 | |
| 165 * Fix build regression for autoconf build | |
| 166 | |
| 167 2021-11-14 - V4.1.2 | |
| 168 * Add RowAttributes getter to PageIterator | |
| 169 * Allow line images with larger width for training | |
| 170 * Fix memory leaks | |
| 171 * Improve build process | |
| 172 * Don't output empty ALTO sourceImageInformation (issue #2700) | |
| 173 * Extend URI support for Tesseract with libcurl | |
| 174 * Abort LSTM training with integer model (fixes issue #1573) | |
| 175 * Update documentation | |
| 176 * Make automake builds less noisy by default | |
| 177 * Don't use -march=native in automake builds | |
| 178 | |
| 179 2019-12-26 - V4.1.1 | |
| 180 * Implemented sw build (cppan is deprecated) | |
| 181 * Improved cmake build | |
| 182 * Code cleanup and optimization | |
| 183 * A lot of bug fixes... | |
| 184 | |
| 185 2019-07-07 - V4.1.0 | |
| 186 * Added new renderers Alto, LSTMBox, WordStrBox. | |
| 187 * Added character boxes in hOCR output. | |
| 188 * Added python training scripts (experimental) as an alternative to the shell scripts. | |
| 189 * Better support AVX / AVX2 / SSE. | |
| 190 * Disable OpenMP support by default (see e.g. #1171, #1081). | |
| 191 * Fix for bounding box problem. | |
| 192 * Implemented support for whitelist/blacklist in LSTM engine. | |
| 193 * Improved cmake configuration. | |
| 194 * Code modernization and improvements. | |
| 195 * A lot of bug fixes... | |
| 196 | |
| 197 2018-10-29 - V4.0.0 | |
| 198 * Added new neural network system based on LSTMs, with major accuracy gains. | |
| 199 * Improvements to PDF rendering. | |
| 200 * Fixes to trainingdata rendering. | |
| 201 * Added LSTM models+lang models to 101 languages. (tessdata repository) | |
| 202 * Improved multi-page TIFF handling. | |
| 203 * Fixed damage to binary images when processing PDFs. | |
| 204 * Fixes to training process to allow incremental training from a recognition model. | |
| 205 * Made LSTM the default engine, pushed cube out. | |
| 206 * Deleted cube code. | |
| 207 * Changed OEModes --oem 0 for legacy tesseract engine, --oem 1 for LSTM, --oem 2 for both, --oem 3 for default. | |
| 208 * Avoid use of Leptonica debug parameters or functions. | |
| 209 * Fixed multi-language mode. | |
| 210 * Removed support for VS2010. | |
| 211 * Added Support for VS2015 and VS2017 with CPPAN. | |
| 212 * Implemented invisible text only for PDF. | |
| 213 * Added AVX / SSE support for Windows. | |
| 214 * Enabled OpenMP support. | |
| 215 * Parameter unlv_tilde_crunching changed to false. | |
| 216 * Miscellaneous Fixes. | |
| 217 * Detailed Changelog can be found at https://tesseract-ocr.github.io/tessdoc/4.0x-Changelog.html and https://tesseract-ocr.github.io/tessdoc/ReleaseNotes.html#tesseract-release-notes-oct-29-2018---v400 | |
| 218 | |
| 219 2017-02-16 - V3.05.00 | |
| 220 * Made some fine tuning to the hOCR output. | |
| 221 * Added TSV as another optional output format. | |
| 222 * Fixed ABI break introduced in 3.04.00 with the AnalyseLayout() method. | |
| 223 * text2image tool - Enable all OpenType ligatures available in a font. This feature requires Pango 1.38 or newer. | |
| 224 * Training tools - Replaced asserts with tprintf() and exit(1). | |
| 225 * Fixed Cygwin compatibility. | |
| 226 * Improved multipage tiff processing. | |
| 227 * Improved the embedded pdf font (pdf.ttf). | |
| 228 * Enable selection of OCR engine mode from command line. | |
| 229 * Changed tesseract command line parameter '-psm' to '--psm'. | |
| 230 * Write output of tesseract --help, --version and --list-langs to stdout instead of stderr. | |
| 231 * Added new C API for orientation and script detection, removed the old one. | |
| 232 * Increased minimum autoconf version to 2.59. | |
| 233 * Removed dead code. | |
| 234 * Require Leptonica 1.74 or higher. | |
| 235 * Fixed many compiler warnings. | |
| 236 * Fixed memory and resource leaks. | |
| 237 * Fixed some issues with the 'Cube' OCR engine. | |
| 238 * Fixed some openCL issues. | |
| 239 * Added option to build Tesseract with CMake build system. | |
| 240 * Implemented CPPAN support for easy Windows building. | |
| 241 | |
| 242 2016-02-17 - V3.04.01 | |
| 243 * Added OSD renderer for psm 0. Works for single page and multi-page images. | |
| 244 * Improve tesstrain.sh script. | |
| 245 * Simplify build and run of ScrollView. | |
| 246 * Improved PDF output for OS X Preview utility. | |
| 247 * INCOMPATIBLE fix to hOCR line height information - commit 134ebc3. | |
| 248 * Added option to build Tesseract without Cube OCR engine (-DNO_CUBE_BUILD). | |
| 249 * Enable OpenMP support. | |
| 250 * Many bug fixes. | |
| 251 | |
| 252 2015-07-11 - V3.04.00 | |
| 253 * Tesseract development is now done with Git and hosted at github.com (Previously we used Subversion as a VCS and code.google.com for hosting). | |
| 254 * Tesseract now requires leptonica 1.71 or a higher version. | |
| 255 * Removed official support for VS 2008. | |
| 256 * Added support for 39 additional scripts/languages, including: amh, asm, aze_cyrl, bod, bos, ceb, cym, dzo, fas, gle, guj, hat, iku, jav, kat, kat_old, kaz, khm, kir, kur, lao, lat, mar, mya, nep, ori, pan, pus, san, sin, srp_latn, syr, tgk, tir, uig, urd, uzb, uzb_cyrl, yid | |
| 257 * Major updates to training system as a result of extensive testing on 100 languages. | |
| 258 * New training data for over 100 languages | |
| 259 * Improved performance with PIC compilation option. | |
| 260 * Significant change to invisible font system in pdf output to improve correctness and compatibility with external programs, particularly ghostscript. | |
| 261 * Improved font identification. | |
| 262 * Major change to improve layout analysis for heavily diacritic languages: Thai, Vietnamese, Kannada, Telugu etc. | |
| 263 * Fixed problems with shifted baselines so recognition can recover from layout analysis errors. | |
| 264 * Major refactor to improve speed on difficult images, especially when running a heap checker. | |
| 265 * Moved params from global in page layout to tesseractclass. | |
| 266 * Improved single column layout analysis. | |
| 267 * Allow ocr output to multiple formats using tesseract command line executable. | |
| 268 * Fixed issues with mixed eng+ara scripts. | |
| 269 * Improved script consistency in numbers. | |
| 270 * Major refactor of control.cpp to enable line recognition. | |
| 271 * Added tesstrain.sh - a master training script. | |
| 272 * Added ability to text2image training tool to just list available fonts. | |
| 273 * Added ability to text2image to underline words. | |
| 274 * Improved efficiency of image processing for PDF output. | |
| 275 * Added parameter description for each parameter listed with 'print-parameters' command line option. | |
| 276 * Added font info to hOCR output. | |
| 277 * Enabled streaming input and output of multi-page documents. | |
| 278 * Many bug fixes. | |
| 279 | |
| 280 2014-02-04 - V3.03(rc1) | |
| 281 * Added new training tool text2image to generate box/tif file pairs from | |
| 282 text and truetype fonts. | |
| 283 * Added support for PDF output with searchable text. | |
| 284 * Removed entire IMAGE class and all code in image directory. | |
| 285 * Tesseract executable: support for output to stdout; limited support for one | |
| 286 page images from stdin (especially on Windows) | |
| 287 * Added Renderer to API to allow document-level processing and output | |
| 288 of document formats, like hOCR, PDF. | |
| 289 * Major refactor of word-level recognition, beam search, eliminating dead code. | |
| 290 * Refactored classifier to make it easier to add new ones. | |
| 291 * Generalized feature extractor to allow feature extraction from greyscale. | |
| 292 * Improved sub/superscript treatment. | |
| 293 * Improved baseline fit. | |
| 294 * Added set_unicharset_properties to training tools. | |
| 295 * Many bug fixes. | |
| 296 * More training source data included. | |
| 297 | |
| 298 2012-02-01 - V3.02 | |
| 299 * Moved ResultIterator/PageIterator to ccmain. | |
| 300 * Added Right-to-left/Bidi capability in the output iterators for Hebrew/Arabic. | |
| 301 * Added paragraph detection in layout analysis/post OCR. | |
| 302 * Fixed inconsistent xheight during training and over-chopping. | |
| 303 * Added simultaneous multi-language capability. | |
| 304 * Refactored top-level word recognition module. | |
| 305 * Added experimental equation detector. | |
| 306 * Improved handling of resolution from input images. | |
| 307 * Blamer module added for error analysis. | |
| 308 * Cleaned up externally used namespace by removing includes from baseapi.h. | |
| 309 * Removed dead memory management code. | |
| 310 * Tidied up constraints on control parameters. | |
| 311 * Added support for ShapeTable in classifier and training. | |
| 312 * Refactored class pruner. | |
| 313 * Fixed training leaks and randomness. | |
| 314 * Major improvements to layout analysis for better image detection, diacritic detection, better textline finding, better tabstop finding. | |
| 315 * Improved line detection and removal. | |
| 316 * Added fixed pitch chopper for CJK. | |
| 317 * Added UNICHARSET to WERD_CHOICE to make multi-language handling easier. | |
| 318 * Fixed problems with internally scaled images. | |
| 319 * Added page and bbox to string in tr files to identify source of training data better. | |
| 320 * Fixes to Hindi Shiroreka splitter. | |
| 321 * Added word bigram correction. | |
| 322 * Reduced stack memory consumption and eliminated some ugly typedefs. | |
| 323 * Added new uniform classifier API. | |
| 324 * Added new training error counter. | |
| 325 * Fixed endian bug in dawg reader. | |
| 326 * Many other fixes, including the way in which the chopper finds chops and messes with the outline while it does so. | |
| 327 | |
| 328 2010-11-29 - V3.01 | |
| 329 * Removed old/dead serialise/deserialize methods on *LISTIZED classes. | |
| 330 * Total rewrite of DENORM to better encapsulate operation and make | |
| 331 for potential to extract features from images. | |
| 332 * Thread-safety! Moved all critical global and static variables to members of the appropriate class. Tesseract is now thread-safe (multiple instances can be used in parallel in multiple threads), with the minor exception that some control parameters are still global and affect all threads. | |
| 333 * Added Cube, a new recognizer for Arabic. Cube can also be used in combination with normal Tesseract for other languages with an improvement in accuracy at the cost of (much) lower speed. *There is no training module for Cube yet.* | |
| 334 * `OcrEngineMode` in `Init` replaces `AccuracyVSpeed` to control cube. | |
| 335 * Greatly improved segmentation search with consequent accuracy and speed improvements, especially for Chinese. | |
| 336 * Added `PageIterator` and `ResultIterator` as cleaner ways to get the full results out of Tesseract, that are not currently provided by any of the `TessBaseAPI::Get*` methods. All other methods, such as the `ETEXT_STRUCT` in particular are deprecated and will be deleted in the future. | |
| 337 * ApplyBoxes totally rewritten to make training easier. It can now cope with touching/overlapping training characters, and a new boxfile format allows word boxes instead of character boxes, BUT to use that you have to have already bootstrapped the language with character boxes. "Cyclic dependency" on traineddata. | |
| 338 * Auto orientation and script detection added to page layout analysis. | |
| 339 * Deleted *lots* of dead code. | |
| 340 * Fixxht module replaced with scalable data-driven module. | |
| 341 * Output font characteristics accuracy improved. | |
| 342 * Removed the double conversion at each classification. | |
| 343 * Upgraded oldest structs to be classes and deprecated PBLOB. | |
| 344 * Removed non-deterministic baseline fit. | |
| 345 * Added fixed length dawgs for Chinese. | |
| 346 * Handling of vertical text improved. | |
| 347 * Handling of leader dots improved. | |
| 348 * Table detection greatly improved. | |
| 349 * Fixed a couple of memory leaks. | |
| 350 * Fixed font labels on output text. (Not perfect, but a lot better than before.) | |
| 351 * Cleanup and more bug fixes | |
| 352 * Special treatments for Hindi. | |
| 353 * Support for build in VS2010 with Microsoft Windows SDK for Windows 7 (thanks to Michael Lutz) | |
| 354 | |
| 355 2010-09-21 - V3.00 | |
| 356 * Preparations for thread safety: | |
| 357 * Changed TessBaseAPI methods to be non-static | |
| 358 * Created a class hierarchy for the directories to hold instance data, | |
| 359 and began moving code into the classes. | |
| 360 * Moved thresholding code to a separate class. | |
| 361 * Added major new page layout analysis module. | |
| 362 * Added HOCR output (issues 221, 263: thanks to amkryukov). | |
| 363 * Added Leptonica as main image I/O and handling. Currently optional, | |
| 364 but in future releases linking with Leptonica will be mandatory. | |
| 365 * Ambiguity table rewritten to allow definite replacements in place | |
| 366 of fix_quotes. | |
| 367 * Added TessdataManager to combine data files into a single file. | |
| 368 * Some dead code deleted. | |
| 369 * VC++6 no longer supported. It can't cope with the use of templates. | |
| 370 * Many more languages added. | |
| 371 * Doxygenation of most of the function header comments. | |
| 372 * Added man pages. | |
| 373 * Added bash completion script (issue 247: thanks to neskiem) | |
| 374 * Fix integer overflow in thresholding (issue 366: thanks to Cyanide.Drake) | |
| 375 * Add Danish Fraktur support (issues 300, 360: thanks to | |
| 376 dsl602230@vip.cybercity.dk) | |
| 377 * Fix file pointer leak (issue 359, thanks to yukihiro.nakadaira) | |
| 378 * Fix an error using user-words (Issue 345: thanks to max.markin) | |
| 379 * Fix a memory leak in tablefind.cpp (Issue 342, thanks to zdravco) | |
| 380 * Fix a segfault due to double fclose (Issue 320, thanks to souther) | |
| 381 * Fix an automake error (Issue 318, thanks to ichanjz) | |
| 382 * Fix a Win32 crash on fileFormatIsTiff() (Issues 304, 316, 317, 330, 347, | |
| 383 349, 352: thanks to nguyenq87, max.markin, zdenop) | |
| 384 * Fixed a number of errors in newer (stricter) versions of VC++ (Issues | |
| 385 301, among others) | |
| 386 | |
| 387 2009-06-30 - V2.04 | |
| 388 * Integrated bug fixes and patches and misc changes for portability. | |
| 389 * Integrated a patch to remove some of the "access" macros. | |
| 390 * Removed dependence on lua from the viewer, speeding it up | |
| 391 dramatically. | |
| 392 * Fixed the viewer so it compiles and runs properly! | |
| 393 * Specifically fixing issues: 1, 63, 67, 71, 76, 81, 82, 106, 111, | |
| 394 112, 128, 129, 130, 133, 135, 142, 143, 145, 147, 153, 154, 160, | |
| 395 165, 170, 175, 177, 187, 192, 195, 199, 201, 205, 209, 108, 169 | |
| 396 | |
| 397 2008-04-22 - V2.03 | |
| 398 * Fixed crash introduced in 2.02. | |
| 399 * Fixed lack of tessembedded.cpp in distribution. | |
| 400 * Added test for leptonica header files and conditional test for lib. | |
| 401 | |
| 402 2008-04-21 - V2.02 (again) | |
| 403 * Fixed namespace collisions with jpeg library (INT32). | |
| 404 * Portability fixes for Windows for new code. | |
| 405 * Updates to autoconf system for new code. | |
| 406 | |
| 407 2008-01-23 - V2.02 | |
| 408 * Improvements to clustering, training and classifier. | |
| 409 * Major internationalization improvements for large-character-set | |
| 410 languages, e.g. Kannada. | |
| 411 * Removed some compiler warnings. | |
| 412 * Added multipage tiff support for training and running. | |
| 413 * Updated graphics output to talk to new java-based viewer. | |
| 414 * Added ability to save n-best lists. | |
| 415 * Added leptonica support for more file types. | |
| 416 * Improved Init/End to make them safe. | |
| 417 * Reduced memory use of dictionaries. | |
| 418 * Added some new APIs to TessBaseAPI. | |
| 419 | |
| 420 2007-08-27 - V2.01 | |
| 421 * Fixed UTF8 input problems with box file reader. | |
| 422 * Fixed various infinite loops and crashes in dawg code. | |
| 423 * Removed include of config_auto.h from host.h. | |
| 424 * Added automatic wctype encoding to unicharset_extractor. | |
| 425 * Fixed dawg table too full error. | |
| 426 * Removed svn files from tarball. | |
| 427 * Added new functions to tessdll. | |
| 428 * Increased maximum utf8 string in a classification result to 8. | |
| 429 | |
| 430 2007-07-02 - V2.00 | |
| 431 * Converted internal character handling to UTF8. | |
| 432 * Trained with 6 languages. | |
| 433 * Added unicharset_extractor, wordlist2dawg. | |
| 434 * Added boxfile creation mode. | |
| 435 * Added UNLV regression test capability. | |
| 436 * Fixed problems with copyright and registered symbols. | |
| 437 * Fixed extern "C" declarations problem. | |
| 438 | |
| 439 2007-05-15 - V1.04 | |
| 440 * Added dll exports for Windows. | |
| 441 * Fixed name collisions with stl etc. | |
| 442 * Made some preliminary changes ready for unicodeization. | |
| 443 * Several bug fixes discovered during unicodeization. | |
| 444 | |
| 445 2007-02-02 - V1.03 | |
| 446 * Added mftraining and cntraining. | |
| 447 * Added baseapi with adaptive thresholding for grey and color. | |
| 448 * Fixed many memory leaks. | |
| 449 * Fixed several bugs including lack of use of adaptive classifier. | |
| 450 * Added ifdefs to eliminate graphics code and add embedded platform support. | |
| 451 * Incorporated several patches, including 64-bit builds, Mac builds. | |
| 452 * Minor accuracy improvements. | |
| 453 | |
| 454 2006-10-04 - V1.02 | |
| 455 * Removed dependency on Aspirin. | |
| 456 * Fixed a few missing Apache license headers. | |
| 457 * Removed $log. | |
| 458 | |
| 459 2006-09-07 - V1.01 | |
| 460 * Added mfcpch.cpp and getopt.cpp for VC++. | |
| 461 * Fixed problem with greyscale images and no libtiff. | |
| 462 * Stopped debug window from being used for the usage output. | |
| 463 * Fixed load of inttemp for big-endian architectures. | |
| 464 * Fixed some Mac compilation issues. | |
| 465 | |
| 466 2006-06-16 - V1.0 of open source Tesseract checked-in. |
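
The V4.0.0 entry above lists the numeric `--oem` values, and the V3.05.00 entry notes the rename of `-psm` to `--psm`. The following is a minimal sketch of how a caller might assemble a matching command line. The helper name `build_tesseract_args`, the `OEM_MODES` table, and the file names are hypothetical and are not part of Tesseract or MuPDF. The sketch only builds the argument list and does not run the program.

```python
# OEM values as listed in the V4.0.0 changelog entry.
OEM_MODES = {
    0: "legacy Tesseract engine",
    1: "LSTM engine",
    2: "legacy and LSTM combined",
    3: "default",
}


def build_tesseract_args(image, output_base, oem=3, psm=None):
    """Return an argument list for the tesseract CLI (hypothetical helper)."""
    if oem not in OEM_MODES:
        raise ValueError(f"unknown --oem value: {oem!r}")
    args = ["tesseract", image, output_base, "--oem", str(oem)]
    if psm is not None:
        # Since V3.05.00 the option is spelled --psm rather than -psm.
        args += ["--psm", str(psm)]
    return args


# Example file names are placeholders.
print(build_tesseract_args("page.png", "out", oem=1, psm=6))
```

The resulting list could be passed to `subprocess.run` on a system where the `tesseract` executable is installed. Validating `--oem` up front keeps an unsupported value from reaching the command line.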
