feat: support aborted transactions internal retry #544
Conversation
@c24t, @olavloite, I've pushed a part of the transaction retry mechanism implementation, PTAL.

Pushed the second step of the implementation: added actual retrying of aborted transactions. It still requires adding more unit tests for …
```python
client_mock.assert_called_once_with(
    project=PROJECT,
    credentials=CREDENTIALS,
    client_info=CLIENT_INFO,
```
This test is actually broken by another PR, but nox doesn't see it and doesn't run it, because these tests were not moved to the unit directory.
```python
from google.cloud.spanner_dbapi import Connection, InterfaceError
from google.cloud.spanner_dbapi.checksum import ResultsChecksum
from google.cloud.spanner_dbapi.connection import AUTOCOMMIT_MODE_WARNING
from google.cloud.spanner_v1.database import Database
from google.cloud.spanner_v1.instance import Instance
```
Why are these tests still here? They were copied into the unit directory, so I suppose they should be erased from this directory?!
Good question. Looks like this (and test_connect) weren't moved in #532? The test files weren't exactly copied; #532 changed them and added some new tests. E.g. the version on master now doesn't include test_transaction_autocommit_warnings.
@mf2199 can you confirm that you meant to change/remove these tests before removing tests/spanner_dbapi in this PR?
@olavloite, am I understanding correctly that …
@IlyaFaer Yes, that is correct.
Besides cleaning up the tests, I wonder if this PR is catching Aborted errors in the right place. Since we're only calling execute_sql and not the streaming variant, I'd expect to see errors immediately instead of when we iterate over the results in the cursor. Am I missing something here?
```python
    return
except Aborted:
    self.connection.retry_transaction()
    return self.fetchone()
```
I think this will be a problem. Assuming that this is using the ExecuteStreamingSql RPC, each next() call could potentially mean that a new RPC is executed. So for the sake of simplicity, assume in the example below that each call to next() executes the ExecuteStreamingSql RPC.

Assume the following situation:

- The table Singers contains the following singers (last names): Allison, Morrison, Pieterson.
- The application executes the query `SELECT LastName FROM Singers ORDER BY LastName` in transaction 1.
- The client application calls fetchone(), which returns 'Allison'.
- Some other transaction executes `DELETE FROM Singers WHERE LastName='Pieterson'`.
- The first transaction is aborted by the backend. A retry is executed, and the retry logic checks that the checksum of the retried result set is equal to the original attempt, which it is, as the first record is still 'Allison'.
- The client application calls fetchone() again. This should return 'Morrison', but as it needs to call ExecuteStreamingSql, it will (probably) use the transaction id of the original transaction (unless that transaction id has somehow been replaced in the underlying iterator). If it does use the old transaction id, the RPC will fail with yet another Aborted error, and that will repeat itself until the transaction retry limit has been reached.
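To illustrate why the retry validation can pass in this scenario even though the stream is stale: the checksum is computed only over the rows the application has already consumed. A minimal sketch (hypothetical `RowChecksum` class, standing in for the client's ResultsChecksum helper, not the actual implementation):

```python
import hashlib
import pickle

class RowChecksum:
    """Order-sensitive checksum over the result rows consumed so far.

    Hypothetical stand-in for the client's ResultsChecksum helper.
    """
    def __init__(self):
        self._digest = hashlib.sha256()
        self.count = 0

    def consume(self, row):
        self._digest.update(pickle.dumps(row))
        self.count += 1

    def __eq__(self, other):
        return (self.count == other.count
                and self._digest.digest() == other._digest.digest())

# Original attempt consumed only the first row ('Allison').
original = RowChecksum()
original.consume(("Allison",))

# The retry replays the query and consumes the same number of rows.
retry = RowChecksum()
retry.consume(("Allison",))

# Checksums match, so the retry is considered successful, even though
# later rows (e.g. 'Pieterson') may have changed in the meantime.
assert original == retry
```

The comparison says nothing about rows that have not been fetched yet, which is exactly the gap the scenario above exploits.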
Seems to me we can just drop the _transaction property, so that Connection will initiate a new one on the next execute() call.
Sorry for reopening this, and this comment should not be considered blocking for merging this PR, but I think we need to look into this once more. Only dropping the _transaction property will in this case not be enough, for the following reason:

- When executeSql is called, a streaming iterator is returned to the application.
- That streaming iterator is linked with the transaction that was active at that moment, and a reference to that transaction is also held in the iterator.
- If a transaction is aborted and the client application has consumed only part of a streaming iterator, that iterator is no longer valid (at least, it will throw an exception if it needs to receive more data from the server).

The JDBC driver client solves the above problem by wrapping all streaming iterators before returning them to the client application. That makes it possible for the JDBC driver to replace the underlying streaming iterator with a new one when a transaction has been aborted and successfully retried.

We should add that to the Python DBApi as well, but we could do that in a separate PR to prevent this PR from becoming even bigger than it already is.
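The wrapping approach described above can be sketched roughly as follows (illustrative names only, not the JDBC driver's or the Python client's actual API): the application holds a reference to the wrapper, so the driver is free to swap the underlying stream after a successful retry.

```python
class ReplaceableIterator:
    """Wraps a streaming iterator so the underlying stream can be
    swapped out after a transaction retry. Hypothetical sketch of
    the wrapping technique, not the real driver code.
    """
    def __init__(self, stream):
        self._stream = iter(stream)

    def replace(self, new_stream):
        """Point the wrapper at a fresh stream from the retried transaction."""
        self._stream = iter(new_stream)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._stream)

# The application keeps a single reference to the wrapper...
rows = ReplaceableIterator([("Allison",)])
assert next(rows) == ("Allison",)

# ...so after a retry the driver can splice in the new stream,
# already advanced past the rows the application consumed.
rows.replace([("Morrison",), ("Pieterson",)])
assert next(rows) == ("Morrison",)
```

The key property is that the stale stream (still carrying the aborted transaction's id) is never handed to the application directly.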
@c24t, @olavloite, hm-m. I think we're protected from errors here, because our connection API doesn't actually give streaming result objects to the user.

Here is where we're getting a streaming iterator:
python-spanner-django/google/cloud/spanner_dbapi/cursor.py
Lines 167 to 170 in 196c449
```python
self._result_set = transaction.execute_sql(
    sql, params, param_types=get_param_types(params)
)
self._itr = PeekIterator(self._result_set)
```
So the iterator is held in the protected property _itr, and users will be streaming it with the Cursor.fetch*() methods, without actual access to the iterator itself:
python-spanner-django/google/cloud/spanner_dbapi/cursor.py
Lines 204 to 212 in 196c449
```python
def fetchone(self):
    """Fetch the next row of a query result set, returning a single
    sequence, or None when no more data is available."""
    self._raise_if_closed()
    try:
        return next(self)
    except StopIteration:
        return None
```
Where next(self) calls next(self._itr) here:
python-spanner-django/google/cloud/spanner_dbapi/cursor.py
Lines 293 to 296 in 196c449
```python
def __next__(self):
    if self._itr is None:
        raise ProgrammingError("no results to return")
    return next(self._itr)
```
Thus, if a transaction fails, the connection will drop the transaction, check out a new one, and re-run all the statements, each of which will replace _itr with a new streamed iterator. So all the iterators are processed internally and will be replaced on a retry, as I see it.
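A toy model of the behaviour described here (illustrative `MiniCursor` class, not the real Cursor): on retry the remembered statement is re-run and the cursor's iterator is replaced wholesale, so no stale stream survives.

```python
class MiniCursor:
    """Hypothetical simplified model of the cursor behaviour
    described above; names and structure are illustrative only.
    """
    def __init__(self, run_sql):
        self._run_sql = run_sql  # callable: sql -> iterable of rows
        self._itr = None
        self._sql = None

    def execute(self, sql):
        self._sql = sql
        self._itr = iter(self._run_sql(sql))

    def retry(self):
        # Drop the stale iterator and build a new one from a fresh run,
        # mirroring how a retried transaction replaces _itr.
        self._itr = iter(self._run_sql(self._sql))

    def fetchone(self):
        try:
            return next(self._itr)
        except StopIteration:
            return None

cur = MiniCursor(lambda sql: [("Allison",), ("Morrison",)])
cur.execute("SELECT LastName FROM Singers ORDER BY LastName")
old_itr = cur._itr
cur.retry()
assert cur._itr is not old_itr      # the stale iterator is gone
assert cur.fetchone() == ("Allison",)
```

Note that, as the earlier comment points out, a real retry also has to re-consume and checksum the rows the application already fetched; this sketch only shows the iterator replacement.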
Thanks for the updates. I think this is coming very close to what we need, but I have a couple of small questions.
No more substantial comments from me, but note that this whole PR will have to move to python-spanner now that googleapis/python-spanner#160 is in.
@olavloite, does this address all your comments?
Yes, my comments have been addressed. I do think that it's important that we get the (integration) tests running soon, as there could still be corner cases that we haven't thought of yet that only occur incidentally. I also have a separate concern regarding streaming iterators, but that should not block the merging of this PR.

This PR won't be merged here, see googleapis/python-spanner#168.

Merged into the original Spanner client repo.
🤖 I have created a release \*beep\* \*boop\*

---

## [2.1.0](https://www.github.com/googleapis/python-spanner/compare/v2.0.0...v2.1.0) (2020-11-24)

### Features

* **dbapi:** add aborted transactions retry support ([#168](https://www.github.com/googleapis/python-spanner/issues/168)) ([d59d502](https://www.github.com/googleapis/python-spanner/commit/d59d502590f618c8b13920ae05ab11add78315b5)), closes [#34](https://www.github.com/googleapis/python-spanner/issues/34) [googleapis/python-spanner-django#544](https://www.github.com/googleapis/python-spanner-django/issues/544)
* remove adding a dummy WHERE clause into UPDATE and DELETE statements ([#169](https://www.github.com/googleapis/python-spanner/issues/169)) ([7f4d478](https://www.github.com/googleapis/python-spanner/commit/7f4d478fd9812c965cdb185c52aa9a8c9e599bed))

### Bug Fixes

* Add sqlparse dependency ([#171](https://www.github.com/googleapis/python-spanner/issues/171)) ([e801a2e](https://www.github.com/googleapis/python-spanner/commit/e801a2e014fcff66a69cb9da83abedb218cda2ab))

### Reverts

* Revert "test: unskip list_backup_operations sample test (#170)" (#174) ([6053f4a](https://www.github.com/googleapis/python-spanner/commit/6053f4ab0fc647a9cfc181e16c246141483c2397)), closes [#170](https://www.github.com/googleapis/python-spanner/issues/170) [#174](https://www.github.com/googleapis/python-spanner/issues/174)

---
This PR was generated with [Release Please](https://github.com/googleapis/release-please).
Implement aborted transactions retry mechanism.

While executing SQL statements in !autocommit mode, the connection must remember every executed statement. In case the transaction is aborted, all of these statements should be re-executed. While doing this, the connection must also calculate a checksum of every statement's results, so that we can check whether the retried transaction got the same results the original one got. If the checksums are not equal, there is no way to continue the transaction, as the underlying data was changed during the retry.

Closes #539