Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[C++] Adds basic schema evolution tests #5611

Merged
merged 4 commits into from
Nov 14, 2019

Conversation

dbaileychess
Copy link
Collaborator

This change adds some basic schema evolution tests in C++. It verifies that the proper evolution of a schema is valid in a backwards- and forwards-compatible manner. It also checks that the Verifiers for the multiple versions of the schema can correctly read new and old data.

It adds two schema, that are properly evolved from each other. The second version of the schema is annotated with the changes from the first version. The generated C++ headers for those schema are also included.

I tried to test the most common evolution points:

  • Field deprecation
  • Field reordering using the id attribute
  • Field additions
  • Union additions
  • Enum additions
  • Field renaming

Tests: flattest runs successfully.

For a follow up to this PR, tests should be added to test improper evolution of schema.

I would also take suggestions on other tests that should be performed.

King,
Queen
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Enum(union) without at least one explicit value isn't a good example for a scheme with evolution.
The left-added value will break the schema.

enum Enum : byte {
    Person, // 0
    King, // 1, was 0
    Queen // 2, was 1
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is just looking at properly evolved schema changes. I suggest another PR for testing cases where the schema are not evolved properly (like your Person example above).

@aardappel
Copy link
Collaborator

Nice! This will be good to have.

Of course, the test coverage we get out of a correct schema evolution is minimal, since this already a fairly common path. Testing incorrectness is harder, and would almost be done better with some kind of fuzzing approach (see the C++ tests that randomly generate schemas at runtime)?

Also of note is --conform which could be used to test expected fail/pass of certain schemas.

Whenever possible, I'd prefer for such tests to work at runtime (do not generate code), just to keep the bulk of test files under control.

@dbaileychess
Copy link
Collaborator Author

Part of this is to test the runtime verifiers as well, as there was a bug in the verifier when a Union was evolved.

@aardappel
Copy link
Collaborator

Looks like some of the CI errors may be related to this PR

@dbaileychess
Copy link
Collaborator Author

For the Xcode issue in travis I see:

"ld: library not found for -lgrpc++_unsecure
clang: error: linker command failed with exit code 1 (use -v to see invocation)"

So it looks like a linking error that shouldn't be impacted by this PR.

I did see a previous error about the Enum value 3 not be valid in the verifier (though the current issue masks this), but that seems like a bug in the verifier to me. Especially since it works in all the other environments.

@aardappel
Copy link
Collaborator

I see:

tests\test.cpp(2328): error C3861: 'snprintf': identifier not found [C:\projects\flatbuffers\flattests.vcxproj]
tests\test.cpp(2331): error C3861: 'snprintf': identifier not found [C:\projects\flatbuffers\flattests.vcxproj]
tests\test.cpp(2347): error C2143: syntax error : missing ',' before ':' [C:\projects\flatbuffers\flattests.vcxproj]
tests\test.cpp(2347): error C2530: 'verifier' : references must be initialized [C:\projects\flatbuffers\flattests.vcxproj]
tests\test.cpp(2347): error C3531: 'verifier': a symbol whose type contains 'auto' must have an initializer [C:\projects\flatbuffers\flattests.vcxproj]
tests\test.cpp(2347): error C2143: syntax error : missing ';' before '{' [C:\projects\flatbuffers\flattests.vcxproj]
tests\test.cpp(2348): error C2100: illegal indirection [C:\projects\flatbuffers\flattests.vcxproj]
tests\test.cpp(2348): error C2664: 'Evolution::V1::VerifyRootBuffer' : cannot convert parameter 1 from 'int' to 'flatbuffers::Verifier &' [C:\projects\flatbuffers\flattests.vcxproj]
tests\test.cpp(2349): error C2100: illegal indirection [C:\projects\flatbuffers\flattests.vcxproj]
tests\test.cpp(2349): error C2664: 'Evolution::V2::VerifyRootBuffer' : cannot convert parameter 1 from 'int' to 'flatbuffers::Verifier &' [C:\projects\flatbuffers\flattests.vcxproj]

Instead of snprintf, maybe use NumToString and friends?

@dbaileychess
Copy link
Collaborator Author

That's odd, I don't see that. Maybe you have a different view into Travis than non owners? Or I might just be looking into the wrong place.

I'll explore using NumToString

@aardappel
Copy link
Collaborator

Do you not see AppVeyor test results? I thought that information is the same for everyone. Can you see:

https://ci.appveyor.com/project/google-admin/flatbuffers/builds/28618031
https://ci.appveyor.com/project/google-admin/flatbuffers/builds/28618031/job/4hsgnubr3atcgk26

Not sure what that "google-admin" is doing in there, maybe that is the source of the problem?

@dbaileychess
Copy link
Collaborator Author

I see those results. So it was user error, I was just looking at the wrong place.

@dbaileychess dbaileychess force-pushed the evolution-tests branch 4 times, most recently from 19b602f to 3863fd1 Compare November 8, 2019 15:47
@dbaileychess
Copy link
Collaborator Author

For the XCode build I see the following error:

1: Test command: /Users/travis/build/google/flatbuffers/flattests
1: Test timeout computed to be: 9.99988e+06
1: /Users/travis/build/google/flatbuffers/tests/test_assert.h:62:9: runtime error: load of value 3, which is not a valid value for type 'Evolution::V1::Enum'
1: SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /Users/travis/build/google/flatbuffers/tests/test_assert.h:62:9 in 
1/2 Test #1: flattests ........................***Exception: Child aborted  0.27 sec
/Users/travis/build/google/flatbuffers/tests/test_assert.h:62:9: runtime error: load of value 3, which is not a valid value for type 'Evolution::V1::Enum'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /Users/travis/build/google/flatbuffers/tests/test_assert.h:62:9 in 

Which is seemingly failing because the V1 of the schema doesn't contain an enum value of 3 (which was added in the V2 of the schema). But all the other builds are fine, so I am confused on why it would fail in this configuration.

@vglavnyy
Copy link
Contributor

vglavnyy commented Nov 8, 2019

XCode build uses Apple LLVM version 9.1.0 (clang-902.0.39.1) with ASAN+UBSAN.
Better to use static_cast to underlying type inside TEST_EQ.
Or use 3 as the first argument of TEST_EQ.

@dbaileychess
Copy link
Collaborator Author

Thanks for the pointer, I'll try it out.

Won't the static_cast of int8_t to Evolution::V1::Enum inside the generated code for a value not in Evolution::V1::Enum also throw an error if it is being sanitized?

@dbaileychess dbaileychess force-pushed the evolution-tests branch 3 times, most recently from e53a08b to bfd292e Compare November 8, 2019 18:49
@vglavnyy
Copy link
Contributor

vglavnyy commented Nov 8, 2019

I have installed clang9.
The problem isn't in TEST_EQ.

    auto v_d = root_v2_viewed_from_v1->d(); // ubsan catches here
    TEST_EQ(v_d, 3);

The v_d has type Evolution::V1::Enum and 3 isn't the value of this enum.
The UBSAN catches it.
This works:

    auto v_d = static_cast<int8_t>(root_v2_viewed_from_v1->d());
    TEST_EQ(v_d, 3);

@dbaileychess
Copy link
Collaborator Author

Thanks for looking into this.

But isn't this a problem for forward compatibility? Reading an evolved schema causes an exception even though it is valid?

@aardappel you might have some opinions.

@vglavnyy
Copy link
Contributor

vglavnyy commented Nov 9, 2019

But isn't this a problem for forward compatibility? Reading an evolved schema causes an exception even though it is valid?

There are two points of view on this issue:

  1. The test program is ill-formed because of write values out of enums range.
  2. Generated types (enum) can't be used in schemas with evolution.

The first is trivial and described by the C++ standard (C++17, 10.2.8):

For an enumeration whose underlying type is fixed, the values of the enumeration are the values of the underlying type. Otherwise, for an enumeration where emin is the smallest enumerator and emax is the largest, the values of the enumeration are the values in the range bmin to bmax, defined as follows: Let K be 1 for a two's complement representation and 0 for a one's complement or sign-magnitude representation. bmax is the smallest value greater than or equal to max(|emin| − K, |emax|) and equal to 2M − 1, where M is a non-negative integer. bmin is zero if emin is non-negative and −(bmax + K) otherwise.

The enum Evolution::V1::Enum declared without fixed underlying type and has range [0, 2^1-1]=[0,1]. The value 3 is out of this range.

By the standard:

A value of integral or enumeration type can be explicitly converted to a complete enumeration type. If the enumeration type has a fixed underlying type, the value is first converted to that type by integral conversion, if necessary, and then to the enumeration type. If the enumeration type does not have a fixed underlying type, the value is unchanged if the original value is within the range of the enumeration values (10.2), and otherwise, the behavior is undefined.

The UBSAN catches that UB in runtime.
The test.cpp has similar case:

flatbuffers/tests/test.cpp

Lines 1605 to 1613 in 33d5dd9

// Check that Color to string don't crash while decode a mixture of Colors.
// 1) Example::Color enum is enum with unfixed underlying type.
// 2) Valid enum range: [0; 2^(ceil(log2(Color_ANY))) - 1].
// Consequence: A value is out of this range will lead to UB (since C++17).
// For details see C++17 standard or explanation on the SO:
// stackoverflow.com/questions/18195312/what-happens-if-you-static-cast-invalid-value-to-enum-class
TEST_EQ_STR("", EnumNameColor(static_cast<Color>(0)));
TEST_EQ_STR("", EnumNameColor(static_cast<Color>(Color_ANY - 1)));
TEST_EQ_STR("", EnumNameColor(static_cast<Color>(Color_ANY + 1)));

@vglavnyy
Copy link
Contributor

vglavnyy commented Nov 9, 2019

The second is a consequence of first:
Never use enums without fixed underlying type (both in fbs-schema and C++ generated code).
Such enums always have a risk of breaking forward and backward compatibility and UB in C++ code.

Example of well-declared enum:

enum Enum : int32 { King =0, Queen, }

This enum has the fixed underlying type and at least one fixed value.
One can add values from left and right without compatibility issues.

There are two (thee) possible ways to map this fbs-declaration into C++ code:

  1. enum Enum { King=0, Queen} - bad (as we have).
  2. enum Enum : int32_t { King=0, Queen} - acceptable, doesn't have issues with compatibility or UB.
  3. enum class Enum : int32_t { King=0, Queen} - best (modern C++ way).

Now, the third is enabled if --scoped-enum is set.

From my point of view enums without a fixed underlying type must be deprecated both in schemas and C++.
Generated C++ code (by default) should use the third-form of declaration.
Only if --unscoped-enum is set then generate the second form.
This problem has a very long story, the latest #5285.
The MSVC2010 is a toxic legacy in this story.

@dbaileychess
Copy link
Collaborator Author

I don't believe the test is ill-formed, I never explicitly write a value that is not valid in its current context. Both the V1 and V2 buffers are valid according to their schema. The issue arises when reading those buffers with older versions of the schema. Reading a valid buffer with an older schema shouldn't cause any exceptions/UB to be occur, but it appears to happen is the case. But only in the case of using C++17 according to the information you provide. And that's why only the XCode configuration was catching it. So these test did expose a real compatibility error.

Perhaps the generated getters for enum values should not cast the underlying type to the enum type, but just return the underlying type as is. This will at least prevent flatbuffers from being the code that is causing the UB.

Also, the Evolution::V1::Enum is explicitly backed by a byte in the schema (I just didn't have --scoped-enums option on). And while the member King isn't explicitly set a value, it defaults to 0, so it is functionality equivalent to:

enum Enum : byte { King = 0, Queen = 1 }

Are you also suggesting that option --scoped-enums becomes the default value for C++? I think I am fine with that.

@vglavnyy
Copy link
Contributor

Are you also suggesting that option --scoped-enums becomes the default value for C++? I think I am fine with that.

Yes, generated enums must be declared as fixed type enums.

The cast (or initialization) from an integer to an enum without the fixed type is a problem in all versions of C++. C++11 standard (2013, N3690):

A value of integral or enumeration type can be explicitly converted to an enumeration type. The value is unchanged if the original value is within the range of the enumeration values (7.2). Otherwise, the resulting value is unspecified (and might not be in that range).

@aardappel
Copy link
Collaborator

We have to somehow be able to support values outside of enum range, so we have to work around any UB issues.

Our current minspec of VS2010 is arbitrary, as that seemed sensible at the time. FlatBuffers is widely enough used that it is valuable to keep supporting older C++. That said, we can start working towards flipping defaults for the most painful features.

For --scoped-enums, we could first introduce a flag --old-cpp-enums or something, and if anyone specifies neither of these flags, output a warning that they should specify one of these flags, thus encouraging people to use --scoped-enums. Then in the next release, we flip the default, hopefully breaking fewer people?

@dbaileychess
Copy link
Collaborator Author

Sound good to me. I'll probably add the --scoped-enums to generation step in tests/gernerate_code.sh for these schema.

@vglavnyy
Copy link
Contributor

We have to somehow be able to support values outside of enum range, so we have to work around any UB issues.

I suggest add a --legacy-cpp flag.
Without this flag flatc will generate C++ enums with a fixed underlying type:

enum Enum : int32_t { King=0, Queen}

Additionally, if this flag isn't active, flatc will generate a warning about non-fixed enum in a fbs-schema.
This flag will preserve MSVC 2010 and other legacy compilers.

@aardappel
Copy link
Collaborator

Maybe we want a more general --cpp-standard=N flag, that we can use for all sorts of features?

Problem with enums is that there are reasons why even under a newer compiler one might prefer the old style enums, if you are using them very much like integers or bit-flags (new ones require hacks like FLATBUFFERS_DEFINE_BITMASK_OPERATORS).

@vglavnyy
Copy link
Contributor

The --legacy-cpp preserves old-style enum. It only "add" specification of an underlying type:

enum name : type { enumerator = constexpr , enumerator = constexpr , ... }

Maybe we want a more general --cpp-standard=N flag

I'm working with this, a PR isn't ready yet.
However, this new flag doesn't resolve UB if an old compiler used.
The --legacy-cpp will resolve all UB issues with enums.
With this flag, our tests will pass without alarms from UBSAN.

@aardappel
Copy link
Collaborator

@vglavnyy ok sgtm, I await your new flags :)

@aardappel aardappel merged commit dda0950 into google:master Nov 14, 2019
@aardappel
Copy link
Collaborator

Thanks!

@dbaileychess dbaileychess deleted the evolution-tests branch November 15, 2019 03:32
@vglavnyy
Copy link
Contributor

This merge broke the tests under MSVC2019.

2>\flatbuffers\tests\test.cpp(2357): message : see reference to function template instantiation 'void TestEq<const Evolution::V2::TableC*,int>(T,U,const char *,const char *,int,const char *)' being compiled
2> with
2> [
2> T=const Evolution::V2::TableC *,
2> U=int
2> ]
2>\flatbuffers\tests\test_assert.h(62,1): warning C4311: '': pointer truncation from 'T' to 'U'
2> with
2> [
2> T=const Evolution::V2::TableC *
2> ]
2> and
2> [
2> U=int
2> ]

Problem is here:

TEST_EQ(root_v1_viewed_from_v2->j(), NULL);
TEST_EQ(root_v1_viewed_from_v2->e()->c(), NULL);
TEST_EQ(root_v1_viewed_from_v2->c_as_TableC(), NULL);

The TestEq can't accept pointers for comparison.
It is better to replace it with a checking like this.

TEST_ASSERT(nullptr == root_v1_viewed_from_v2->j());

@dbaileychess Could you fix this?
@aardappel What do think about gtest and gmock for testing?

Update:
I tried to replace by TEST_ASSERT and got memory leak:

ALL TESTS PASSED
Detected memory leaks!
Dumping objects ->
E:\github\flatbuffers\tests\test.cpp(2345) : {53967} normal block at 0x000002A0A6E802E0, 48 bytes long.
Data: < > > 10 3E ED A6 A0 02 00 00 10 01 00 00 00 00 00 00
E:\github\flatbuffers\tests\test.cpp(2345) : {52581} normal block at 0x000002A0A6E80270, 48 bytes long.
Data: < > E0 F5 E7 A6 A0 02 00 00 A0 00 00 00 00 00 00 00
Object dump complete.

line 2345:      verifiers[i] = new flatbuffers::Verifier(&binaries[i].front(), bufLen);

Why raw pointers are used instead of unique_ptr?

@vglavnyy
Copy link
Contributor

vglavnyy commented Nov 16, 2019

vglavnyy referenced this pull request Jun 16, 2020
* Add flatc '--cpp_std' switch and sandbox for C++17 code generator

- Added 'flac --cpp_std legacy' for compatibility with old compilers (VS2010);
- Added experimental switch 'flac --cpp_std c++17' for future development;
- Added C++17 sandbox test_cpp17.cpp;
- C++ code generator generates enums with explicit underlying type to avoid problems with the forward and backward schema compatibility;
- Adjusted CMakeLists.txt, CI and generate code scripts to support of introduced '--cpp_std';

* Fix --cpp_std values: c++0x, c++11, c++17

* Add 'cpp::CppStandard' enum

* Add testing engine into test_cpp17

* Rebase to upstream/master

* Set default '--cpp-std C++0x'

* Fix code generation (--cpp_std C++11) in CMakeLists.txt

- Fix dependency declaration of grpctest target

* Revert --cpp-std for the tests from explicit C++11 to flatc default value (C++0x)
LuckyRu pushed a commit to LuckyRu/flatbuffers that referenced this pull request Oct 2, 2020
* Added basic schema evolution tests

* Add BUILD targets for evolution tests. Added to test/generate_code scripts

* Use vector.front() instead of vector.data()

* Added --scoped-enums option for evolution test
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants