Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for reading optional table metadata #251

Merged
merged 3 commits into from
Feb 6, 2019

Conversation

ahmedahamid
Copy link
Contributor

@ahmedahamid ahmedahamid commented Dec 10, 2018

Add logic to read optional table metadata that was added to the binlog in MySQL 8.0.1.

@ahmedahamid
Copy link
Contributor Author

ahmedahamid commented Dec 10, 2018

Hello @shyiko ,

This PR adds support for new optional table metadata introduced in MySQL 8.0.1. Among other things, it adds column names to the binlog, subject to MySQL configuration.

Would you be open to working with me on this change? In particular, I have some questions about whether you'd like to include all newly included fields or just a subset thereof. The full list is documented over here.

@shyiko
Copy link
Owner

shyiko commented Dec 24, 2018

Hi Ahmed.
Thank you for the PR!

to include all newly included fields or just a subset thereof

all available

Let me know if I can help to move this forward.

@ahmedahamid ahmedahamid changed the title Add support for optional table metadata Add support for reading optional table metadata Feb 2, 2019
@ahmedahamid ahmedahamid force-pushed the master branch 2 times, most recently from 21fab96 to c5106da Compare February 2, 2019 20:48
@ahmedahamid
Copy link
Contributor Author

ahmedahamid commented Feb 2, 2019

Hi Stanely,
Thank you for your openness to contributions. This change is now ready for review.

A few things I need to mention:

  • I apologize for the force-pushes; I didn't want you to have to look at a long trail of WIP commits. Will only push incremental changes once you start reviewing the PR.
  • This change was tested against MySQL Ver 8.0.15 for Linux on x86_64 (MySQL Community Server - GPL), with binlog_row_metadata set to FULL and MINIMAL.
  • I used a dummy table with the following definition to test this change
CREATE TABLE `people` (
  `person_id` int(11) NOT NULL,
  `name` varchar(20) DEFAULT NULL,
  `age` smallint(5) unsigned DEFAULT NULL,
  `score` tinyint(3) unsigned DEFAULT NULL,
  `tag` varchar(20) DEFAULT NULL,
  `total` tinyint(3) unsigned DEFAULT NULL,
  PRIMARY KEY (`person_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci

One last thing I would like to mention is that my team is planning to utilize this library in Brooklin, a CDC system LinkedIn is open-sourcing in a few months.

@shyiko
Copy link
Owner

shyiko commented Feb 5, 2019

Thank you, Ahmed! This looks great!
My only concern is number of allocations (e.g. here and here) (latter can be avoiding by using ByteArrayInputStream::enterBlock(int)). Normally I wouldn't care but considering that deserializers are executed for millions and millions of events that can become expensive. Would you be willing to give it another look? (I'm willing to merge it as-is if you're not)
Thanks again!

@ahmedahamid
Copy link
Contributor Author

Thanks for the super prompt feedback, Stanley. My pleasure!

Please, feel free to give as much feedback as you want about this change. I am certainly willing to make more changes/improvements.

  1. I certainly agree regarding optimizing allocations. I made changes to reuse TableMapEventMetadataDeserializer, and used ByteArrayInputStream::enterBlock(int) as prescribed.

  2. The reason I initially chose not to use ByteArrayInputStream::enterBlock(int) is because I wanted to avoid interfering with the logic in EventDeserializer::deserializeEventData(), where it uses ByteArrayInputStream::enterBlock(int) to mark the event boundary before invoking the relevant deserializer, which is especially useful if a serializer throws. But now that I've used ByteArrayInputStream::enterBlock(int) in TableMapEventMetadataDeserializer::deserialize(), I believe EventDeserializer won't be able to skip a Table Map event in case TableMapEventMetadataDeserializer::deserialize() throws for any reason.

  3. I tried running ./mvnw -P coverage clean verify with and without my changes after installing Vagrant and VirtualBox, but got the error in this gist in both cases. Is this a known issue?

Thanks again.

@shyiko
Copy link
Owner

shyiko commented Feb 6, 2019

2. I believe EventDeserializer won't be able to skip a Table Map event in case TableMapEventMetadataDeserializer::deserialize() throws for any reason.

That's a very good point. One way to fix that (aside from restoring block size in try-finally) would be to pass new ByteArrayInputStream(inputStream.read(inputStream.available())) instead of inputStream in
https://github.com/shyiko/mysql-binlog-connector-java/pull/251/files#diff-0bfd61978b7d64eec747ebd4b56dee24R43.

3. Is this a known issue?

That's weird. I haven't seen that before. Here is my (current) setup:

$ system_profiler SPSoftwareDataType | grep "System Version"
      System Version: macOS 10.14 (18A391)

$ vagrant --version
Vagrant 2.0.2

$ VirtualBox --help | grep "VirtualBox Manager"
Oracle VM VirtualBox Manager 5.2.22

I'd try

cd supplement/vagrant/mysql-8.0.1-sandbox-prepackaged
vagrant up
# you should now be able to run tests from IDEA (no changes required)

@ahmedahamid
Copy link
Contributor Author

Thanks again, Stanley.

(aside from restoring block size in try-finally)

There could be cases where reading a field fails midway. I wasn't sure there was a reliable way of knowing the exact block size to restore in such cases.

pass new ByteArrayInputStream(inputStream.read(inputStream.available())) instead of inputStream

Done. Thanks for the suggestion.

I ran the integration tests successfully on a mac. I was running them on CentOS 7 earlier.

One last thing I wanted to bring up is that, with this change, there will be 3 different/redundant versions of string join, i.e. here, here, and here. Would you like me to address this in this change? If so, do you have any preference as to where such utility method should exist?

@shyiko
Copy link
Owner

shyiko commented Feb 6, 2019

Awesome! Thank you, Ahmed. This is one of those perfectly executed PRs! Merging in 🙇

@shyiko shyiko merged commit 91677e9 into shyiko:master Feb 6, 2019
@ahmedahamid
Copy link
Contributor Author

It was a pleasure working with you on this, Stanley. Thank you for the quick turnaround.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants