Description
Dolt can consume MySQL binary log events through a replication connection to a source MySQL server, but Dolt can't yet write its own binary log or send binary log events to attached replicas.
Customer Benefits
MySQL binary logs are used by customers for:
- Replication – Events from the binary log can be streamed to replicas to keep them in sync with the source. Dolt can consume these events, but cannot produce them yet. Dolt already provides replication support to other Dolt servers, so the big benefit here would be allowing Dolt to be the source server and replicate to a MySQL replica.
- Change Data Capture – Similar to the replication use case above, the MySQL binlog can also be used to integrate with popular change data capture tools, such as Debezium. In addition to replication use cases, this has several other uses, such as event-based data change auditing. Dolt is already adept at answering questions about exactly what data has changed, but doesn't currently provide a standard interface for easily plugging into tools like Debezium.
- Data Recovery – The binary log provides a sequenced log of the changes made to a MySQL database; in the event of data loss, this can be used to replay and restore data. Dolt provides this same benefit through versioning all data, as long as the data has been included in a Dolt commit. Dolt doesn't provide a separate log, but backups combined with Dolt's stored history mostly achieve the same effect, with the major difference being that the binlog has a granularity of SQL transactions, while Dolt's history has a granularity of Dolt commits.
The two biggest advantages this feature would bring to Dolt are: 1) the ability to easily integrate with popular Change Data Capture tools like Debezium through the standard binlog event interface, and 2) the ability to have a Dolt server replicate data to a MySQL replica. Customers have specifically asked about Debezium integration, so an MVP would focus on that use case.
Design Decisions and Scope
Branches
An initial implementation of writing a binary log should be limited to supporting a single branch. MySQL doesn't have the concept of multiple branches, so a binary log naturally covers all operations happening in the server. For a database with branches, the binary log needs to be branch-specific for it to be usable.
GMS or Dolt Integration
The initial implementation should be done in the Dolt layer, and if needed, we can go back and create new interfaces to enable it more generally at the GMS layer. It would be ideal to provide support for writing a binary log at the GMS layer, since it is a MySQL capability. We chose not to implement reading binary log events at the GMS layer, because it requires storage-engine-specific integration and we weren't ready to commit to building out new interfaces to enable that. Writing binary logs will likely have similar issues, and will also need to be aware of the branch being used, which is only a concept in Dolt.
In Scope
- GTID-based replication coordination – this is the recommended and most modern approach to coordinating access to the binlog event stream, and it spares clients and servers from tracking binlog file names and file positions to identify positions in the event stream.
- Row-based replication – MySQL provides several replication formats, but we will only target row-based replication, the default replication format.
- Asynchronous replication – MySQL replication is asynchronous by default. Other modes, such as semisynchronous replication, will not be in the initial scope.
- Flushing binlog to disk at each transaction – This is the safest setting since the binlog will always be kept immediately in sync with committed transactions, but trades off performance for this safety.
Out of Scope
- binlog event checksums
- binlog encryption
- binlog filtering
Implementation Components
MySQL Commands
Log Writer
Responsible for writing out the binary log files to disk. Integrates with the transaction commit logic in the SQL engine.
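The write path described above, combined with the in-scope choice to flush at each transaction, can be sketched roughly as follows. This is a minimal illustration, not Dolt's implementation: `logWriter`, `openLogWriter`, `writeTransaction`, and `runDemo` are hypothetical names, and the transaction payload is an opaque placeholder rather than real serialized binlog events. The only MySQL-specific detail used is the 4-byte magic number (`0xfe 'b' 'i' 'n'`) that starts every binary log file.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// binlogMagic is the 4-byte marker at the start of every MySQL binary log file.
var binlogMagic = []byte{0xfe, 'b', 'i', 'n'}

// logWriter is a hypothetical append-only writer for one binary log file.
type logWriter struct {
	f *os.File
}

// openLogWriter opens (or creates) the log file and writes the magic
// header if the file is new.
func openLogWriter(path string) (*logWriter, error) {
	f, err := os.OpenFile(path, os.O_CREATE|os.O_RDWR|os.O_APPEND, 0o640)
	if err != nil {
		return nil, err
	}
	info, err := f.Stat()
	if err != nil {
		f.Close()
		return nil, err
	}
	if info.Size() == 0 {
		if _, err := f.Write(binlogMagic); err != nil {
			f.Close()
			return nil, err
		}
		if err := f.Sync(); err != nil {
			f.Close()
			return nil, err
		}
	}
	return &logWriter{f: f}, nil
}

// writeTransaction appends the already-serialized events for one committed
// transaction and fsyncs before returning, trading throughput for durability.
func (w *logWriter) writeTransaction(events []byte) error {
	if _, err := w.f.Write(events); err != nil {
		return err
	}
	return w.f.Sync()
}

func (w *logWriter) close() error { return w.f.Close() }

// runDemo writes one placeholder transaction to a temp file and returns
// the resulting file contents.
func runDemo() ([]byte, error) {
	dir, err := os.MkdirTemp("", "binlog-sketch")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "binlog.000001")
	w, err := openLogWriter(path)
	if err != nil {
		return nil, err
	}
	// "tx1" stands in for real binlog event bytes (assumption for illustration).
	if err := w.writeTransaction([]byte("tx1")); err != nil {
		w.close()
		return nil, err
	}
	if err := w.close(); err != nil {
		return nil, err
	}
	return os.ReadFile(path)
}

func main() {
	data, err := runDemo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("wrote %d bytes, header % x\n", len(data), data[:4])
}
```

In a real integration, the call to `writeTransaction` would sit inside the engine's transaction commit path, so that a commit is only acknowledged once its events are durable on disk.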
Log Manager
Responsible for purging and periodically rotating binary logs.
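As a rough sketch of the rotation side of this component: MySQL names binary log files with a base name plus a zero-padded numeric suffix (for example `binlog.000001`), and rotation moves to the next number. The helpers below are hypothetical (`nextLogName`, `shouldRotate`), and the size-threshold policy is an assumption for illustration; the actual rotation and purge triggers for Dolt have not been decided in this proposal.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// nextLogName returns the file name that follows current in the rotation
// sequence, e.g. "binlog.000009" -> "binlog.000010".
func nextLogName(current string) (string, error) {
	i := strings.LastIndex(current, ".")
	if i < 0 {
		return "", fmt.Errorf("log name %q has no numeric suffix", current)
	}
	seq, err := strconv.Atoi(current[i+1:])
	if err != nil {
		return "", fmt.Errorf("log name %q has a non-numeric suffix: %w", current, err)
	}
	return fmt.Sprintf("%s.%06d", current[:i], seq+1), nil
}

// shouldRotate reports whether the current log file has reached the
// configured size limit (assumed policy: rotate at or above maxSize bytes).
func shouldRotate(currentSize, maxSize int64) bool {
	return currentSize >= maxSize
}

func main() {
	name := "binlog.000009"
	if shouldRotate(2048, 1024) {
		next, err := nextLogName(name)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		name = next
	}
	fmt.Println(name)
}
```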
Configuration
System variables for replication configuration, including those that enable or disable binary logging, such as log_bin.
GMS Interface
A new interface in go-mysql-server is needed so that GMS can dispatch replication command handling to Dolt's implementation, similar to BinlogReplicaController.
GTID Tracking
A replication source server is responsible for creating Global Transaction IDs for each SQL transaction, and updating system variables to show the current GTID.
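To make the GTID bookkeeping concrete, here is a minimal sketch. In MySQL, a GTID has the form `source_uuid:transaction_id`, and the executed set is reported as intervals such as `uuid:1-3` (exposed through the `gtid_executed` system variable). The `gtidGenerator` type and its methods are hypothetical names for illustration only, and the sketch assumes a single contiguous sequence with no gaps.

```go
package main

import (
	"fmt"
	"sync"
)

// gtidGenerator is a hypothetical helper that assigns one GTID per
// committed SQL transaction for a single source server.
type gtidGenerator struct {
	mu         sync.Mutex
	serverUUID string
	next       int64 // sequence number for the next transaction, starting at 1
}

func newGTIDGenerator(serverUUID string) *gtidGenerator {
	return &gtidGenerator{serverUUID: serverUUID, next: 1}
}

// Next returns the GTID for the transaction being committed and advances
// the sequence.
func (g *gtidGenerator) Next() string {
	g.mu.Lock()
	defer g.mu.Unlock()
	gtid := fmt.Sprintf("%s:%d", g.serverUUID, g.next)
	g.next++
	return gtid
}

// Executed returns the executed GTID set as a single interval, which is
// the kind of value a server would surface in gtid_executed.
func (g *gtidGenerator) Executed() string {
	g.mu.Lock()
	defer g.mu.Unlock()
	last := g.next - 1
	switch {
	case last == 0:
		return "" // no transactions committed yet
	case last == 1:
		return fmt.Sprintf("%s:1", g.serverUUID)
	default:
		return fmt.Sprintf("%s:1-%d", g.serverUUID, last)
	}
}

func main() {
	g := newGTIDGenerator("3E11FA47-71CA-11E1-9E33-C80AA9429562")
	for i := 0; i < 3; i++ {
		fmt.Println("committed", g.Next())
	}
	fmt.Println("executed set:", g.Executed())
}
```

A production version would also need to persist the sequence across restarts, so that GTIDs are never reused.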
Replication Protocol Messages
Dolt uses a fork of the Vitess library for reading and writing MySQL wire protocol messages. Our fork doesn't currently have all the support needed for building a binlog event source, but the main branch of the official Vitess repo seems to have what we need. This support will need to be ported over to our fork and tested. A quick prototype on the following branches shows that a Vitess-based replica can connect to a Vitess-based primary server and pull hardcoded binlog events:
Open Questions
- Debezium integration support – Debezium integration hasn't been tested yet, and may require additional scope. For example, some replication information is accessible from MySQL's performance_schema database, which Dolt doesn't currently support. We may need to implement portions of this database to unblock Debezium support. Debezium also has its own initial snapshot feature that relies on global read locks. There are alternatives to this approach, but they need to be investigated.
High-Level Estimate
Providing a binlog and binlog event streaming is roughly the same scope as consuming a binlog event stream, possibly slightly larger (e.g. integration testing with Debezium, managing log files on disk, creating binlog events). Implementing support for consuming a binlog event stream took ~3 months, so an initial estimate for a first version usable by customers in production is 3 to 4 months for one developer. This estimate includes time for the developer's other responsibilities, as well as some buffer for surprises. Once we get deeper into the work, we can provide a more accurate estimate.