Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Messy project management, fixes randomly reverted #122

Open
pps83 opened this issue Feb 15, 2024 · 10 comments
Open

Messy project management, fixes randomly reverted #122

pps83 opened this issue Feb 15, 2024 · 10 comments

Comments

@pps83
Copy link
Contributor

pps83 commented Feb 15, 2024

I'm wondering why you treat project history as some sort of dumpster where most of commits have no commit messages, and pretty much 95% commit history has no value. I don't even mention extremely messy code.

Just like that randomly you commit back old bugs. Do you review yourself what you commit?
For example, this fix:

9f58944#diff-e05c108e3bce4066346277d43bd50252d91769b4387f680dccc03afcc3f8e3e8R452

was randomly reverted. Here's current version:

https://github.com/powturbo/TurboPFor-Integer-Compression/blame/06d6aad98b4be5471289f35d5d04fac4469cf6df/lib/vsimple.c#L456

There was an extremely annoying bug in turbopfor where it would produce different results because some registers weren't cleared. I just double checked and sure enough, that change is randomly removed. I didn't test latest code if it's buggy, but the change clearly was reverted.

That's the commit where it was fixed: fac4c73 (note extremely useless commit message, wtf).

Here's current master and the fix is removed: https://github.com/powturbo/TurboPFor-Integer-Compression/blob/master/lib/vp4c.c#L417 Here's the relevant bugreport: #60

wtf, how do you manage to do that kind of stuff.

Sometimes I look at code like that, and I have no guesses how you end up with code like this:
https://github.com/powturbo/TurboPFor-Integer-Compression/blob/master/lib/vp4c.c#L430
like you have a program that randomly makes code unreadable?

@ordinaryorange
Copy link

I cant comment on the code quality or commit in question, but I have some tests which are passing on p4nzenc256v32 encoding and decoding. I can only fault #115

@pps83
Copy link
Contributor Author

pps83 commented Feb 23, 2024

I reported and fixed some bugs, luckily I added unit tests in my code to verify this mess won't happen again. Sure enough, with latest code the bugs are back. In short, p4nenc256v32 and p4nzenc32 do not produce bitexact results. Total wtf state of the project. Not sure what's the reason, I was expecting asan to catch the bug, but it didn't help

@pps83
Copy link
Contributor Author

pps83 commented Feb 23, 2024

what a trainwreck, @powturbo what's the reason almost all header guards were removed

@mcmayer
Copy link

mcmayer commented Feb 23, 2024

@pps83 I admire your patience. I wrote a Rust wrapper a while back, and I basically turned my back, because the thing is an unruly mess. Thanks for the effort!

@powturbo
Copy link
Owner

First of all IT'S MY CODE AND I CAN DO WHATEVER I THINK IT'S NECESSARY OR NOT.
There are almost 1000 functions included in the TurboPFor package and this is my way to manage the code
in my limited spare time.
You must revisit your disrespectfull attitude against people spending their spare time contributing to
the open source community. Hopefully it's not late for you.

Last, I've made a full refactoring of the library which is now more friendly to use.
All the old git history is also obsolete.

The only 2 files to use are "ic.h" under the directory "include" and the library "libic" (see icapp.c).
Other header files are not needed. icapp is the best working demonstration that TurboPFor is state of art compression library (see the benchmarks in the issues section).

I've made countless tests with icapp and actually I'm not aware of potential bugs (except the one with low priority reported in the issues).
If you find a bug you're welcome to report it and I'll look at it when time permits.

I'm using github only to upload my local repository.

@mcamayer : you've not written a wrapper to the TurboPFor library, but a fully different interface and nobody was interested on it.
There is now a true rust wrapper included in TurboPFor which makes other rust wrappers obsolete.
You're welcome to test it if you're interested.

@ordinaryorange
Copy link

@pps83 are you able to share your tests for those functions. I've got over 2000 test over 250+ functions and all are passing bar the one I previously mentioned. I'm interested to see what causes the bit differences you mention.

I personally think think this project is an excellent compilation of integer and floating point algorithms, I've not found anything close to the sheer extent of functions on offer, that in itself is an accomplishment. @powturbo methods and approach may be different, but I have to agree that the opening of this issue isn't exactly constructive, and if we are not at least supportive to what has been done to date then there is less chance of things improving, or worse still it just disappears all together, and that does not help anyone.

@pps83
Copy link
Contributor Author

pps83 commented Feb 26, 2024

@ordinaryorange I have my own wrappers, but you can see the idea from the code:

#undef NDEBUG
#include "gtest/gtest.h"
#include "util.h"
#include "coders.h"

uint32_t rand32();

static void testCoder(const coders::IntegerCodec& coder, std::span<uint32_t> in)
{
    const size_t SZ = in.size();
    std::vector<uint8_t> buf8(4 * (SZ + SZ / 4 + 64), 0xaa);
    std::vector<uint32_t> buf32a, buf32b;
    size_t a = coder.encode(in, buf32a);
    size_t b = coder.encode(&in[0], in.size(), buf32b);
    size_t c = coder.encode(&in[0], in.size(), &buf8[0], buf8.size());
    buf8.resize(c);

    const uint8_t* p0 = (const uint8_t*)buf8.data();
    const uint8_t* p1 = (const uint8_t*)buf32a.data();
    const uint8_t* p2 = (const uint8_t*)buf32b.data();
    for (unsigned i = 0; i < c; ++i)
    {
        uint8_t c0 = p0[i], c1 = p1[i], c2 = p2[i];
        if (c0 != c1 || c0 != c2)
        {
            printf("coder: %s. not equal at pos:%u 0x%02x != 0x%02x (total size:%zu)\n", std::string(coder.name()).c_str(), i, c0, c0 != c1 ? c1 : c2, c);
            assert(0);
        }
    }

    assert(a == b);
    assert(a == c);
    assert(buf32a == buf32b);
    assert(0 == memcmp(buf32a.data(), buf8.data(), c));

    //size_t rate = c * 10000 / (SZ * 4);
    //LOG("coder: %s (%zu.%02zu%%)", std::string(coder.name()).c_str(), rate / 100, rate % 100);

    std::vector<uint32_t> out32a_buf(SZ + 32), out32b_buf(SZ + 32);
    std::span<uint32_t> out32a(out32a_buf.data(), SZ);
    std::span<uint32_t> out32b(out32b_buf.data(), SZ);

    buf8.insert(buf8.end(), 128, 0); // +128 for padding
    a = coder.decode(&buf8[0], c, out32a);
    b = coder.decode(&buf8[0], c, out32b.data(), SZ);

    out32a_buf.resize(SZ);
    out32b_buf.resize(SZ);

    assert(a == b);
    assert(a == SZ);
    assert(out32a_buf == out32b_buf);
}

TEST(Compression, testIntegerCodec)
{
    //const unsigned sizes[] = { 1, 2, 3, 4, 5, 6, 7, 13, 47, 127, 331, 997, 1331, 2345, 13331 };
    const unsigned sizes[] = { 1, 3, 7, 13, 47, 127, 331, 1331 };
    for (const unsigned SZ : sizes)
    {
        for (int nn = 0; nn < 4; nn++)
        {
            const unsigned n = (nn & 1) ? 1000 : 100;
            const unsigned diff = 10;
            std::vector<uint32_t> in_buf(SZ + 32, n); // +32 for padding
            std::span<uint32_t> in(in_buf.data(), SZ);
            if (nn > 1)
            {
                for (auto& n1 : in)
                    n1 += (rand32() % (diff * 2)) - diff;
            }
            for (auto& coder : coders::allCodecs())
            {
                if (coder.name() != "Null" || coder.name() != "simdgroupsimple_ringbuf")
                    testCoder(coder, in);
            }
        }
    }
}

p4nenc256v32 and p4nzenc32 do not produce identical outputs from the same input. Looks like some uninitialized memory is used

@pps83
Copy link
Contributor Author

pps83 commented Feb 26, 2024

I personally think think this project is an excellent compilation of integer and floating point algorithms, I've not found anything close to the sheer extent of functions on offer, that in itself is an accomplishment.

I have the same opinion. But if you see an issue, there is no way to trace or figure it out from the commit history.
Most headers now don't have header guards, and the ic.h is simply concatenation of existing headers. Well, just like that some might want to do the same with src files (I happen to do that for my project), and it's impossible to build it now due to missing header guards

@pps83
Copy link
Contributor Author

pps83 commented Apr 15, 2024

@powturbo please try to compile your code as c++, you'll see how badly riddled the code with all kinds of issues. In some cases, output param is declared as const pointer, in some cases in/out names have to be reversed. In all the cases where you try to do 4x per loop manual unrolling - it's buggy and does no unrolling at all: if you looked at endless warnings, you'd see that the compiler is trying to tell you that "no value of type 'bool' promoted to type 'int' can equal the given constant".

@pps83
Copy link
Contributor Author

pps83 commented Jun 12, 2024

VLC code is broken in 32bit mode. Will corrupt data.

#define bitput_t T3(uint, __WORDSIZE, _t)
(and all the other places that reference __WORDSIZE must be changed to 64)

what a bugware.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants