Assertion failure with new image and eng+chi_tra fast #4362

Open
@marcreichman-pfi

Description

Current Behavior

On recent main (9f17a3fd), I receive a SIGABRT in a Release build (SIGILL in a Debug build) with the eng and chi_tra languages. Both are the official fast models.

(gdb) set args ~/dev/testimages/ACCDEE72E33B2C425E597A4411009466.jpg - --tessdata-dir <snip>/tessdata/ -l eng+chi_tra
(gdb) r
Starting program: /root/dev/tesseract/build-debug/bin/tesseract ~/dev/testimages/ACCDEE72E33B2C425E597A4411009466.jpg - --tessdata-dir <snip>/tessdata/ -l eng+chi_tra
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Estimating resolution as 261
Detected 12 diacritics
[New Thread 0x7ffff73c6640 (LWP 5374)]
[New Thread 0x7ffff6bc5640 (LWP 5375)]
[New Thread 0x7ffff63c4640 (LWP 5376)]
!w_it.cycled_list():Error:Assert failed:in file /root/dev/tesseract/src/ccstruct/pageres.cpp, line 1502

Thread 1 "tesseract" received signal SIGILL, Illegal instruction.
tesseract::ERRCODE::error (this=this@entry=0x5555558a1340 <tesseract::ASSERT_FAILED>, caller=caller@entry=0x5555557f9123 "!w_it.cycled_list()", action=action@entry=tesseract::ABORT, format=format@entry=0x5555557f8900 "in file %s, line %d") at /root/dev/tesseract/src/ccutil/errcode.cpp:78
78            __builtin_trap();
(gdb) bt
#0  tesseract::ERRCODE::error (this=this@entry=0x5555558a1340 <tesseract::ASSERT_FAILED>, caller=caller@entry=0x5555557f9123 "!w_it.cycled_list()", action=action@entry=tesseract::ABORT,
    format=format@entry=0x5555557f8900 "in file %s, line %d") at /root/dev/tesseract/src/ccutil/errcode.cpp:78
#1  0x000055555558485c in tesseract::PAGE_RES_IT::DeleteCurrentWord (this=this@entry=0x7fffffffdc00) at /root/dev/tesseract/src/ccstruct/pageres.cpp:1502
#2  0x000055555561a972 in tesseract::Tesseract::recog_all_words (this=0x7ffff73c7010, page_res=0x5555558e18e0, monitor=monitor@entry=0x0, target_word_box=target_word_box@entry=0x0,
    word_config=word_config@entry=0x0, dopasses=dopasses@entry=0) at /root/dev/tesseract/src/ccmain/control.cpp:446
#3  0x00005555555d5553 in tesseract::TessBaseAPI::Recognize (this=this@entry=0x7fffffffe2d0, monitor=monitor@entry=0x0) at /root/dev/tesseract/src/api/baseapi.cpp:833
#4  0x00005555555d57e3 in tesseract::TessBaseAPI::ProcessPage (this=this@entry=0x7fffffffe2d0, pix=0x5555558e2230, page_index=page_index@entry=0,
    filename=filename@entry=0x7fffffffe774 "/root/dev/testimages/ACCDEE72E33B2C425E597A4411009466.jpg", retry_config=retry_config@entry=0x0, timeout_millisec=timeout_millisec@entry=0,
    renderer=0x5555558d2740) at /root/dev/tesseract/src/api/baseapi.cpp:1218
#5  0x00005555555d68e4 in tesseract::TessBaseAPI::ProcessPagesInternal (this=this@entry=0x7fffffffe2d0,
    filename=0x7fffffffe774 "/root/dev/testimages/ACCDEE72E33B2C425E597A4411009466.jpg", retry_config=retry_config@entry=0x0, timeout_millisec=timeout_millisec@entry=0,
    renderer=0x5555558d2740) at /root/dev/tesseract/src/api/baseapi.cpp:1181
#6  0x00005555555d69ea in tesseract::TessBaseAPI::ProcessPages (this=this@entry=0x7fffffffe2d0, filename=<optimized out>, retry_config=retry_config@entry=0x0,
    timeout_millisec=timeout_millisec@entry=0, renderer=<optimized out>) at /root/dev/tesseract/src/api/baseapi.cpp:998
#7  0x000055555556d6c3 in main (argc=<optimized out>, argv=<optimized out>) at /usr/include/c++/11/bits/unique_ptr.h:173

Expected Behavior

No SIGABRT (or SIGILL); recognition should complete without an assertion failure.

Suggested Fix

No response

tesseract -v

tesseract 5.5.0-26-g9f17a
 leptonica-1.82.0
  libgif 5.1.9 : libjpeg 8d (libjpeg-turbo 2.1.1) : libpng 1.6.37 : libtiff 4.3.0 : zlib 1.2.11 : libwebp 1.2.2 : libopenjp2 2.4.0
 Found AVX
 Found SSE4.1
 Found OpenMP 201511

Operating System

Ubuntu 22.04 Jammy

Other Operating System

WSL

uname -a

Linux hostname 5.10.16.3-microsoft-standard-WSL2 #1 SMP Fri Apr 2 22:23:49 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

Compiler

GCC 11.4

CPU

Intel(R) Core(TM) i7-3720QM CPU @ 2.60GHz

Virtualization / Containers

No response

Other Information

I'm sure this is related to the series of issues tied to the random generator (#4361, #4146, #4148, #4270). This one is also reproducible in 5.5.0, unlike #4361, which worked in 5.5.0.
