Skip to content

Transaction: on retry, replays should compare checksums of prior numbered statements that succeeded #34

Closed
@odeke-em

Description

Coming here from a project that plans on adding Cloud Spanner as a backend for Django.

In AUTOCOMMIT=off mode, we need to hold a Transaction for perhaps an indefinitely long time.
Cloud Spanner will abort:
a) Transactions when not used for 10seconds or more -- we can periodically send a SELECT 1=1 to keep it active
b) Transactions even when refreshed, can and will abort. This is because Cloud Spanner has a high abort rate

Thus we need to retry Transactions!

Current retry

The current code for retrying in this repository is to just re-invoke the function that was passed into *.run_in_transaction afresh with a new Transaction per

while True:
if self._transaction is None:
txn = self.transaction()
else:
txn = self._transaction
if txn._transaction_id is None:
txn.begin()
try:
attempts += 1
return_value = func(txn, *args, **kw)
except Aborted as exc:
del self._transaction
_delay_until_retry(exc, deadline, attempts)
continue
except GoogleAPICallError:
del self._transaction
raise
except Exception:
txn.rollback()
raise
try:
txn.commit()
except Aborted as exc:
del self._transaction
_delay_until_retry(exc, deadline, attempts)
except GoogleAPICallError:
del self._transaction
raise
else:
return return_value

Recommended retry

However, the correct way to retry Transactions as @bvandiver explained to me

You are getting quite close to the implementation in the open source JDBC driver. Rather than re-inventing things, I would suggest following their implementation. Of note, your current replay mechanism can lead to wrong answers. Imagine the canonical "transfer balance" transaction which decrements the balance in acct A, then increases the balance in acct B. However, between abort and retry someone deletes acct A - resulting in money magically appearing in acct B and no error (the update silently fails to update any rows). The long and the short of it is that you need to hash the results of all queries + DML and confirm on your retry that they give the same answers. You need query too (think a query to check if there was sufficient balance in acct A).

a) For every result returned by an operation on a Transaction, compute the checksum and add it a FIFO stack
b) At the point that a prior Transaction fails, that's the bottom of our stack
c) When retrying the Transaction from the first statement, compare its checksum with the same ordinal number/index on the FIFO stack -- if any of them don't match, abort the Transaction as not retryable

This is what the Java spanner-jdbc implementation does

Suggestion

The implementation of this feature when attempted outside of this package involves a whole lot of hacking since we need to consume the raw data sent to StreamedResultSets which requires then proto marshalling and wrapping StreamedResult -- quite non-ideal and will actually involve patches to python-spanner.

@bvandiver and I chatted again about this today and I also briefly raised this issue to @skuruppu this afternoon too.

Metadata

Assignees

Labels

api: spannerIssues related to the googleapis/python-spanner API.priority: p2Moderately-important priority. Fix may not be included in next release.type: feature request‘Nice-to-have’ improvement, new feature or different behavior or design.

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions