TFloat (FAST_FLOAT) work done & slightly different idea used to make code easily switchable between double & float #3490
Closed
# Conflicts: # src/training/combine_tessdata.cpp
# Conflicts: # src/ccutil/errcode.h # src/ccutil/serialis.cpp # src/ccutil/tprintf.h # src/viewer/scrollview.h
# Conflicts: # Makefile.am # src/ccutil/helpers.h # src/ccutil/scanutils.h # src/ccutil/tprintf.h # unittest/Makefile.am
# Conflicts: # dll/i686-w64-mingw32/iconv.dll # dll/i686-w64-mingw32/icudt64.dll # dll/i686-w64-mingw32/icuin64.dll # dll/i686-w64-mingw32/icuuc64.dll # dll/i686-w64-mingw32/libarchive-13.dll # dll/i686-w64-mingw32/libbz2-1.dll # dll/i686-w64-mingw32/libcairo-2.dll # dll/i686-w64-mingw32/libcurl-4.dll # dll/i686-w64-mingw32/libeay32.dll # dll/i686-w64-mingw32/libexpat-1.dll # dll/i686-w64-mingw32/libffi-6.dll # dll/i686-w64-mingw32/libfontconfig-1.dll # dll/i686-w64-mingw32/libfreetype-6.dll # dll/i686-w64-mingw32/libgcc_s_sjlj-1.dll # dll/i686-w64-mingw32/libgif-7.dll # dll/i686-w64-mingw32/libglib-2.0-0.dll # dll/i686-w64-mingw32/libgobject-2.0-0.dll # dll/i686-w64-mingw32/libgomp-1.dll # dll/i686-w64-mingw32/libharfbuzz-0.dll # dll/i686-w64-mingw32/libintl-8.dll # dll/i686-w64-mingw32/libjbig-2.dll # dll/i686-w64-mingw32/libjpeg-8.dll # dll/i686-w64-mingw32/liblept-5.dll # dll/i686-w64-mingw32/liblz4-1.dll # dll/i686-w64-mingw32/liblzma-5.dll # dll/i686-w64-mingw32/liblzo2-2.dll # dll/i686-w64-mingw32/libnettle-6.dll # dll/i686-w64-mingw32/libnghttp2-14.dll # dll/i686-w64-mingw32/libopenjp2.dll # dll/i686-w64-mingw32/libpango-1.0-0.dll # dll/i686-w64-mingw32/libpangocairo-1.0-0.dll # dll/i686-w64-mingw32/libpangoft2-1.0-0.dll # dll/i686-w64-mingw32/libpangowin32-1.0-0.dll # dll/i686-w64-mingw32/libpcre-1.dll # dll/i686-w64-mingw32/libpixman-1-0.dll # dll/i686-w64-mingw32/libpng16-16.dll # dll/i686-w64-mingw32/libssh2-1.dll # dll/i686-w64-mingw32/libstdc++-6.dll # dll/i686-w64-mingw32/libtiff-5.dll # dll/i686-w64-mingw32/libwebp-7.dll # dll/i686-w64-mingw32/libwinpthread-1.dll # dll/i686-w64-mingw32/libxml2-2.dll # dll/i686-w64-mingw32/libzstd-1.dll # dll/i686-w64-mingw32/ssleay32.dll # dll/i686-w64-mingw32/zlib1.dll # dll/x86_64-w64-mingw32/iconv.dll # dll/x86_64-w64-mingw32/icudt64.dll # dll/x86_64-w64-mingw32/icuin64.dll # dll/x86_64-w64-mingw32/icuuc64.dll # dll/x86_64-w64-mingw32/libarchive-13.dll # dll/x86_64-w64-mingw32/libbz2-1.dll # 
dll/x86_64-w64-mingw32/libcairo-2.dll # dll/x86_64-w64-mingw32/libcurl-4.dll # dll/x86_64-w64-mingw32/libeay32.dll # dll/x86_64-w64-mingw32/libexpat-1.dll # dll/x86_64-w64-mingw32/libffi-6.dll # dll/x86_64-w64-mingw32/libfontconfig-1.dll # dll/x86_64-w64-mingw32/libfreetype-6.dll # dll/x86_64-w64-mingw32/libgcc_s_seh-1.dll # dll/x86_64-w64-mingw32/libgif-7.dll # dll/x86_64-w64-mingw32/libglib-2.0-0.dll # dll/x86_64-w64-mingw32/libgobject-2.0-0.dll # dll/x86_64-w64-mingw32/libgomp-1.dll # dll/x86_64-w64-mingw32/libharfbuzz-0.dll # dll/x86_64-w64-mingw32/libintl-8.dll # dll/x86_64-w64-mingw32/libjbig-2.dll # dll/x86_64-w64-mingw32/libjpeg-8.dll # dll/x86_64-w64-mingw32/liblept-5.dll # dll/x86_64-w64-mingw32/liblz4-1.dll # dll/x86_64-w64-mingw32/liblzma-5.dll # dll/x86_64-w64-mingw32/liblzo2-2.dll # dll/x86_64-w64-mingw32/libnettle-6.dll # dll/x86_64-w64-mingw32/libnghttp2-14.dll # dll/x86_64-w64-mingw32/libopenjp2.dll # dll/x86_64-w64-mingw32/libpango-1.0-0.dll # dll/x86_64-w64-mingw32/libpangocairo-1.0-0.dll # dll/x86_64-w64-mingw32/libpangoft2-1.0-0.dll # dll/x86_64-w64-mingw32/libpangowin32-1.0-0.dll # dll/x86_64-w64-mingw32/libpcre-1.dll # dll/x86_64-w64-mingw32/libpixman-1-0.dll # dll/x86_64-w64-mingw32/libpng16-16.dll # dll/x86_64-w64-mingw32/libssh2-1.dll # dll/x86_64-w64-mingw32/libstdc++-6.dll # dll/x86_64-w64-mingw32/libtiff-5.dll # dll/x86_64-w64-mingw32/libwebp-7.dll # dll/x86_64-w64-mingw32/libwinpthread-1.dll # dll/x86_64-w64-mingw32/libxml2-2.dll # dll/x86_64-w64-mingw32/libzstd-1.dll # dll/x86_64-w64-mingw32/ssleay32.dll # dll/x86_64-w64-mingw32/zlib1.dll # src/ccutil/errcode.h # src/ccutil/tprintf.h # src/viewer/scrollview.h
# Conflicts: # configure.ac
# Conflicts: # Makefile.am # unittest/Makefile.am
# Conflicts: # src/api/pdfrenderer.cpp
Signed-off-by: Stefan Weil <sw@weilnetz.de>
…6d scale01234567 = _mm256_loadu_ps(scales)`, i.e. loading float vectors into double vector types. Extract from tesseract-ocr#3490.
…ome NOT NICE: code repetition at another level. TODO: Better idea? --> Maybe namespaces and double kernel projects or compile via #define+#include-all-source-files hack collective source code pages? (Latter approach may become a problem when debugging, or will the compiler suite cope well? Will know only once done & tested.) At least this is about the point where the function template solution stops to be useful. The run-time switching desire between float and double is doable, but not by using #ifdef/#else throughout, nor templating all the way up the TFloat usage calltree.
Revert previous commit: "HMMM. This is where the float/double co-existence stuff starts to become NOT NICE: code repetition at another level." This reverts commit 8d40552.
# Conflicts: # src/arch/dotproductsse.cpp
# Conflicts: # src/arch/intsimdmatrixavx2.cpp
# Conflicts: # src/arch/dotproduct.cpp # src/arch/dotproductsse.cpp # src/arch/intsimdmatrixavx2.cpp
…: added that one as another enabling condition since benchmarks have shown MSVC2019's `/openmp:experimental` to deliver. :-) (See tesseract-ocr#3486 benchmark reports on @stweil's DotProductNative() implementation)
GerHobbelt
added a commit
to GerHobbelt/tesseract
that referenced
this pull request
Jul 13, 2021
…g function templates for TFloat float & double implementations to co-exist in the run-time without cluttering the code with #if/#else and no run-time switches (yet).

## Observations thus far

- DRY? Check!
- The whole function template idea (let the C++ compiler do the heavy lifting) stops somewhere. Regrettably, that happens at the weightmatrix.cpp code, which calls the CPU+configuration-selected SIMD implementation via function pointer: `intSimdMatrix->matrixDotVectorFunction`. This would require code duplication of some kind (e.g. an FP32 callback pointer co-existing with an FP64 callback pointer in the struct, with the code picking the right one depending on the current TFloat size) and is thus deemed unsatisfactory (my opinion).
- So far, and very probably independent of any solutions for the co-existence issue at higher levels in the code, this template approach works out well, with the compiler smartly picking the one matching the current float/double choice.
- While we have double the number of specialized SIMD implementations (obviously), these do not need #if/#else checks, as we can let the C++ compiler do its prototype matching job --> cleaner code.
- The template functions also help clean up the serialization/de-serialization code, as the `<T, ST>` dual-type approach there allows one to specify the run-time type (TFloat) and the file-storage type at the same time. Also note how this cleans up the 'Old' scales deserialization code, as the old file storage is simply 'float' instead of 'double'.
- The added cost there is a double copy of file data when T==ST, but that turned out negligible in the preliminary tests: that bit of code didn't even reach the Top20 CPU Guzzlers Chart, so that extra copy can wait for smarter C++ template writers to take care of when microtuning is called for.
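To make the "let the compiler pick the matching implementation" observation concrete, here is a minimal sketch. It is not the code from this commit: `DotProductSketch`, `WeightedSum`, `DotFn` and `SelectDotProduct` are hypothetical names, and plain loops stand in for the SIMD kernels. It assumes only that `TFloat` is an alias selected by the `FAST_FLOAT` define, as described in this pull request.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: TFloat is float when FAST_FLOAT is defined, double otherwise.
#ifdef FAST_FLOAT
using TFloat = float;
#else
using TFloat = double;
#endif

// Two implementations share one name; overload resolution selects by type.
inline float DotProductSketch(const float *u, const float *v, std::size_t n) {
  assert(u != nullptr && v != nullptr);
  float total = 0.0f;
  for (std::size_t i = 0; i < n; ++i) total += u[i] * v[i];
  return total;
}

inline double DotProductSketch(const double *u, const double *v, std::size_t n) {
  assert(u != nullptr && v != nullptr);
  double total = 0.0;
  for (std::size_t i = 0; i < n; ++i) total += u[i] * v[i];
  return total;
}

// Written once in terms of TFloat: no #if/#else is needed at the call site.
inline TFloat WeightedSum(const TFloat *w, const TFloat *x, std::size_t n) {
  return DotProductSketch(w, x, n);
}

// A function pointer type names exactly one signature, so a struct holding
// a single pointer like this can serve only one of the two precisions.
using DotFn = TFloat (*)(const TFloat *, const TFloat *, std::size_t);

inline DotFn SelectDotProduct() {
  DotFn fn = &DotProductSketch;  // overload chosen by the pointer's type
  return fn;
}
```

Both overloads are compiled into the same binary and callers never mention the precision. The `DotFn` alias shows where this stops helping: a pointer fixed to one signature cannot serve float and double at run time without a second pointer, which is the duplication the observation above calls unsatisfactory.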
stweil
pushed a commit
to stweil/tesseract
that referenced
this pull request
Jul 14, 2021
GerHobbelt
added a commit
to GerHobbelt/tesseract
that referenced
this pull request
Jul 15, 2021
stweil
pushed a commit
to stweil/tesseract
that referenced
this pull request
Jul 15, 2021
GerHobbelt
added a commit
to GerHobbelt/tesseract
that referenced
this pull request
Jul 15, 2021
stweil
pushed a commit
to stweil/tesseract
that referenced
this pull request
Jul 19, 2021
stweil
pushed a commit
to stweil/tesseract
that referenced
this pull request
Jul 20, 2021
stweil
pushed a commit
to stweil/tesseract
that referenced
this pull request
Jul 20, 2021
stweil
pushed a commit
to stweil/tesseract
that referenced
this pull request
Jul 20, 2021
stweil
pushed a commit
to stweil/tesseract
that referenced
this pull request
Jul 21, 2021
stweil
pushed a commit
to stweil/tesseract
that referenced
this pull request
Jul 21, 2021
stweil
pushed a commit
to stweil/tesseract
that referenced
this pull request
Jul 21, 2021
stweil
pushed a commit
to stweil/tesseract
that referenced
this pull request
Jul 21, 2021
@stweil : saw your work in the TFloat branch.
This pullreq is FYI only (it's an unadulterated copy of my tesseract fork's TFloat branch and thus has way too many commit diffs to be mergeable) -- I'll ready a decent pullreq tonight or tomorrow, but I wanted to send a heads up so you can decide whether you like this or not.
The key idea here with FAST_FLOAT is to use template<T, ST> functions for Serialize and DeSerialize, so we don't have code cluttered with lots of #ifdef/ifndef/... to make it happen.
Problem being the tesseract data files: those carry data in `double` format (or `float` for 'old' scales[]), while the run-time type is dictated by the new TFloat type you came up with: that one is either float or double, depending on a compile-time define (FAST_FLOAT).

The idea implemented here is to have the Serialize and DeSerialize functions (in TFile and elsewhere) do any necessary conversions between the run-time type and file-storage (persisted) type.
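As a rough illustration of that conversion idea, and not the actual serialis.h code, here is a minimal sketch of a `<T, ST>` pair of functions. `ByteBuffer`, `SerializeAs` and `DeSerializeAs` are hypothetical names; the buffer stands in for TFile, whose real interface is not reproduced here. The first template argument is the run-time type and the second is the type written to storage.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical in-memory stand-in for a data file.
struct ByteBuffer {
  std::vector<unsigned char> bytes;
  std::size_t read_pos = 0;
};

// Write `count` values held as run-time type T, stored as type ST.
template <typename T, typename ST>
bool SerializeAs(ByteBuffer &out, const T *data, std::size_t count) {
  assert(data != nullptr || count == 0);
  for (std::size_t i = 0; i < count; ++i) {
    const ST stored = static_cast<ST>(data[i]);  // convert to storage type
    unsigned char raw[sizeof(ST)];
    std::memcpy(raw, &stored, sizeof(ST));
    out.bytes.insert(out.bytes.end(), raw, raw + sizeof(ST));
  }
  return true;
}

// Read `count` values stored as type ST into run-time type T.
template <typename T, typename ST>
bool DeSerializeAs(ByteBuffer &in, T *data, std::size_t count) {
  assert(data != nullptr || count == 0);
  if (in.bytes.size() - in.read_pos < count * sizeof(ST)) return false;
  for (std::size_t i = 0; i < count; ++i) {
    ST stored;
    std::memcpy(&stored, in.bytes.data() + in.read_pos, sizeof(ST));
    in.read_pos += sizeof(ST);
    data[i] = static_cast<T>(stored);  // convert to run-time type
  }
  return true;
}
```

A call such as `SerializeAs<float, double>(buf, weights, n)` keeps the on-disk format at 8 bytes per value even in a build whose run-time type is `float`, so the storage format is visible at the call site. This sketch converts element by element and ignores byte order, which a real file format would have to handle.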
See for example this bit of code from serialis.h:
which now has the extra Serialize() members:
and
where `T` represents the run-time type (TFloat) and `ST` represents the storage type (`double` for new data files; `float` for old scales[] data).

Then the code can easily dictate what the output to disk is going to be and thus be cross-compatible with other builds, which have their FAST_FLOAT /un/defined, e.g. this snippet from `weightmatrix.cpp`:

Note the `<float>` and `<double>` usage there in the function calls: these now dictate the output format, and they do so at the place in the code where this is relevant. The code remains easy to read, as the file format can be read off the code lines without any trouble.

I hope you like it. The commit list attached to this pullreq is a mess: disregard anything but the last couple commits (a cleaned up pullreq will follow next day), where this work was done (including work on AVX/SSE code to cope with the new float vs double TFloat approach):
There's other work in there too, which will be filed in a separate pullreq as it's only sideways related: