Introduce POSIX data types #177
Updated patch: removed part for format specifiers, don't remove NULL definition (both issues should be done in separate patches). |
Maybe it would be even better to go through the code and use, for example, |
@LinusU, yes, that would be the next step as soon as this pull request was accepted. |
👍 |
Ping? Are there any more thoughts on my proposal? |
...waiting for review @theraysmith |
@theraysmith, ping. Do you support the idea of replacing Tesseract data types by POSIX data types (so I can prepare a follow-up pull request)? Many other free software projects with similar compiler / host conditions have shown that using POSIX data types works. Especially for library interfaces, but also for the rest of the code, it would be good to get rid of project specific data types which don't provide any additional value compared with the standard. |
Ping? I suggest applying this patch now, waiting one more month, and then replacing all Tesseract integer types by the POSIX types. |
@theraysmith, @zdenop, do you have any comments on my last proposal? Can we proceed like that? |
Ping? |
I still think it would be a good idea to replace all proprietary data types by the POSIX ones. Is it really necessary to wait for @theraysmith (who is obviously very busy)? Lots of other free (and also commercial) software projects work pretty well with the POSIX data types, using similar environments as Tesseract (Linux and other Unixes, Windows with Cygwin / MinGW-w64 / MS and other compilers). So can we do the first step and apply this PR, which has been waiting for more than 7 months now? |
Another month passed. I'd still like to see Tesseract switching to POSIX data types. |
@zdenop said re: next release in #165 (comment)
Will dropping support for compilers affect Tesseract switching to POSIX data types? |
I don't think that POSIX data types are affected by the compiler decision. They have existed for many years now, so any supported old or new compiler will work with POSIX data types. |
POSIX provides portable data types for signed and unsigned integer values of different size. This patch maps those POSIX data types to the Tesseract specific types. In a next step, the Tesseract data types can be eliminated by replacing them with the POSIX data types. Signed-off-by: Stefan Weil <sw@weilnetz.de>
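The mapping described in this commit message could look roughly like the sketch below. The project-specific names (`inT8`, `uinT32`, and so on) and the helper `WidenProduct` are assumptions for illustration and are not quoted from the patch; only the fixed-width types from `<stdint.h>` are standard.

```cpp
#include <stdint.h>
#include <type_traits>

// Assumed project-specific type names, mapped onto the standard
// fixed-width integer types instead of hand-picked built-in types.
typedef int8_t   inT8;
typedef uint8_t  uinT8;
typedef int16_t  inT16;
typedef uint16_t uinT16;
typedef int32_t  inT32;
typedef uint32_t uinT32;
typedef int64_t  inT64;
typedef uint64_t uinT64;

// After the mapping, old and new names denote exactly the same types,
// so code can be migrated file by file without changing behavior.
static_assert(std::is_same<inT32, int32_t>::value, "inT32 is int32_t");
static_assert(std::is_same<uinT64, uint64_t>::value, "uinT64 is uint64_t");
static_assert(sizeof(inT16) == 2, "inT16 has 16 bits");

// Hypothetical helper: multiplies two 32-bit values in 64-bit arithmetic.
// Old-style and POSIX-style names mix freely in one signature.
inline int64_t WidenProduct(inT32 a, int32_t b) {
  return static_cast<inT64>(a) * static_cast<int64_t>(b);
}
```

Because the typedefs are pure aliases, the follow-up replacement of the old names is a mechanical rename.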
@theraysmith, may I kindly ask you to give your consent? |
Now we're synced, yes I think that would be a positive change, but it may be something to keep for 4.00 forward, so as to maintain maximum support for old compilers in the last 3.xx version. |
Yes, just hit the "Merge pull request" button if you want to include it in the current (4.00) code |
Woohoo! 🎉 Now if we could only replace all occurrences of |
Yes, replacing those Tesseract data types by the POSIX data types will be the next step. @zdenop, that means changes for nearly all source files which will give conflicts with pending pull requests. Should I nevertheless send one large PR which does the replacement, or would it be better to do it in smaller steps (starting for example with all files in |
@theraysmith: Is there a plan to commit new code that would interfere with these changes? |
@theraysmith, training/stringrenderer.cpp already uses nullptr. Do you care about comments after modified code? Replacing data types or NULL is easy, but the replacements are a little bit longer, and moving the comments back to the right column means a lot of manual work without a code formatter. |
Training is a different kettle of fish to the recognition engine, as the
latter is currently (3.0x) *much* more portable than the training tools.
I want to see if anyone squeals when they start porting the 4.00
recognition engine to other platforms, although there are likely to be
howls of protests over the other missing ingredients first, like big-endian
support and SIMD extensions.
Although now I mention that, anyone who has a machine old enough to squeal
over C++11 probably doesn't have the horsepower to run 4.00 at a reasonable
speed.
On Thu, Nov 24, 2016 at 10:38 AM, Stefan Weil wrote:
I already put a few experimental uses of nullptr in 4.00 to see if anyone
squeals.
@theraysmith,
training/stringrenderer.cpp already uses nullptr for more than two years
now. AFAIK nobody complained, so that seems to work. Replacing NULL by
nullptr would be good, but also touches many files, so this could be done
in the same action as the switch to POSIX data types.
--
Ray.
|
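One reason the NULL to nullptr replacement discussed above is usually considered worthwhile is overload resolution. The sketch below uses hypothetical functions (`Describe`, `DescribeNullptr`) that are not taken from Tesseract.

```cpp
#include <cstddef>

// Two hypothetical overloads: one for integers, one for C strings.
inline int Describe(int) { return 0; }
inline int Describe(const char*) { return 1; }

// nullptr has its own type (std::nullptr_t), which converts to any
// pointer type but never to int, so the pointer overload is selected.
inline int DescribeNullptr() { return Describe(nullptr); }

// Describe(NULL) is deliberately not called here: depending on how the
// implementation defines NULL, it may pick the int overload or be
// rejected as ambiguous.
```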
FYI: This change breaks several tests in the Google repository because the Google int64 is long long and the POSIX int64_t is just long, and the compiler says they are not compatible. |
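The incompatibility reported here comes from the fact that `long` and `long long` are distinct types in C++ even when both are 64 bits wide. A minimal sketch, where `legacy_int64`, `Classify`, and `SameAsPosixInt64` are hypothetical names standing in for a code base that defines its own 64-bit type:

```cpp
#include <stdint.h>
#include <type_traits>

// The language treats these as different types regardless of their size.
static_assert(!std::is_same<long, long long>::value,
              "long and long long are distinct types");

// Hypothetical stand-in for a project-specific 64-bit type.
typedef long long legacy_int64;

// Overloads can tell the two types apart, which is also why a pointer to
// one cannot be passed where a pointer to the other is expected.
inline int Classify(long) { return 1; }
inline int Classify(long long) { return 2; }

// Whether int64_t matches legacy_int64 depends on the platform: on
// typical 64-bit Linux int64_t is long, on others it is long long.
inline bool SameAsPosixInt64() {
  return std::is_same<legacy_int64, int64_t>::value;
}
```

Value conversions between the two types still work implicitly; the breakage shows up with pointers, references, templates, and overloads that depend on the exact type.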
Yes, the change to POSIX requires some work (not only in your tests), but it will help in the long run – for example as soon as there is a 128 bit architecture with long being int128_t. And it helps a lot as soon as you want to use Tesseract code in other software. |
There's also |
This fixes a warning from the Intel compiler:
src/textord/cjkpitch.cpp(79): warning tesseract-ocr#177: function "<unnamed>::SimpleStats::maximum" was declared but never referenced
Signed-off-by: Stefan Weil <sw@weilnetz.de>

This fixes warnings from the Intel compiler:
src/textord/cjkpitch.cpp(319): warning tesseract-ocr#177: function "<unnamed>::FPRow::good_gaps" was declared but never referenced
src/textord/cjkpitch.cpp(383): warning tesseract-ocr#177: function "<unnamed>::FPRow::is_bad" was declared but never referenced
src/textord/cjkpitch.cpp(387): warning tesseract-ocr#177: function "<unnamed>::FPRow::is_unknown" was declared but never referenced
Signed-off-by: Stefan Weil <sw@weilnetz.de>
POSIX provides portable data types for signed and unsigned integer values
of different size.
This patch maps those POSIX data types to the Tesseract specific types.
In a next step, the Tesseract data types can be eliminated by replacing
them with the POSIX data types.
Use also standard definitions for the printf format specifiers.
MS Visual Studio does not support that standard (at least not in older
versions), so local definitions are needed there.
NULL is standard, so a local definition should not be needed.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
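The standard printf format specifiers mentioned above are the `PRI*` macros from `<inttypes.h>`. The following is a minimal sketch of how they pair with the fixed-width types; `FormatCount` is a hypothetical helper, and the `__STDC_FORMAT_MACROS` define is only there on the assumption that an older C++ toolchain might require it.

```cpp
// Some older C++ toolchains only expose the PRI* macros when this is set.
#ifndef __STDC_FORMAT_MACROS
#define __STDC_FORMAT_MACROS
#endif
#include <inttypes.h>
#include <stdio.h>
#include <string>

// Hypothetical helper: formats a 64-bit value without guessing whether
// the platform needs "%ld", "%lld", or a compiler-specific specifier.
inline std::string FormatCount(int64_t value) {
  char buffer[32];  // enough for any 64-bit decimal value plus sign
  snprintf(buffer, sizeof(buffer), "%" PRId64, value);
  return std::string(buffer);
}
```

On compilers that lack `<inttypes.h>`, the local definitions described in the commit message would have to supply the same macro names.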