Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix spurious aborts when retrying transactions #34

Closed
wants to merge 3 commits into from

Conversation

3noch
Copy link
Contributor

@3noch 3noch commented Oct 11, 2019

Transactions that enter an error state must be aborted manually by
issuing a "ROLLBACK". However, if the transaction error happened during
a "COMMIT" then the rollback happens automatically. Issuing a "ROLLBACK"
at this point causes PostgreSQL to issue a "WARNING: There is no
transaction in progress". This warning can have much worse causes (e.g.
you "COMMIT" but never began a transaction). This change makes the
transaction retrying logic never cause PostgreSQL to issue this warning
making it a more useful warning for detecting real bugs.

Transactions that enter an error state must be aborted manually by
issuing a "ROLLBACK". However, if the transaction error happened during
a "COMMIT" then the rollback happens automatically. Issuing a "ROLLBACK"
at this point causes PostgreSQL to issue a "WARNING: There is no
transaction in progress". This warning can have much worse causes (e.g.
you "COMMIT" but never began a transaction). This change makes the
transaction retrying logic never cause PostgreSQL to issue this warning
making it a more useful warning for detecting real bugs.
@3noch 3noch force-pushed the eac@fix-spurious-aborts branch from 04bb48f to 184c74f Compare October 11, 2019 21:51
@phadej
Copy link
Collaborator

phadej commented Oct 15, 2019

Is there some easy to reproduce error case for COMMIT, so we can write a test?

@3noch
Copy link
Contributor Author

3noch commented Oct 15, 2019

As for reproduction, we would need to produce a serialization error which can be pretty tricky to do. If you run two identical mutating transactions in SERIALIZABLE isolation in parallel at high frequency you might get one of these errors within a few seconds. But even then, you'd need to instrument this code to determine if you actually hit it.

@3noch
Copy link
Contributor Author

3noch commented Oct 15, 2019

Of course, I confirmed that this fixed the error in my specific use case. But that's not nearly as reassuring. Are there tests for this code at all?

@phadej
Copy link
Collaborator

phadej commented Oct 16, 2019

I think there aren't, but that's not to reason to add them.

I'd be happy with any withTransactionModeRetry tests, during the transaction generatingSqlError is easy, isn't it? I'm not asking you to write that though: If you could sketch how the test for SqlError in COMMIT would work (what kind of queries to run, etc.), I can try to make it into a test. I hope that on actually multi-core machine (which I luckily have) situation will happen multiple times.

About instrumentation: good point, yet I'll be fine if that test doesn't work before this patch, but works afterwards.

@3noch
Copy link
Contributor Author

3noch commented Oct 16, 2019

I'll be fine if that test doesn't work before this patch, but works afterwards.

The main thing this fixes is a warning printed to stdout by [presumably] libpq. To make a test fail prior to this change you'd have to monitor stdout and look for the warning!

@3noch
Copy link
Contributor Author

3noch commented Oct 16, 2019

One way to test this would be to use deferred constraints to get an exception at commit. shouldRetry could act as some sort of instrumentation if you're willing to use unsafePerformIO.

@3noch
Copy link
Contributor Author

3noch commented Oct 16, 2019

CREATE TABLE tmp (value INT NOT NULL UNIQUE DEFERRABLE);
BEGIN;
SET CONSTRAINTS ALL DEFERRED;
INSERT INTO tmp (value) VALUES (1);
INSERT INTO tmp (value) VALUES (1);
COMMIT;

This will throw an error on the COMMIT.

@phadej
Copy link
Collaborator

phadej commented Oct 16, 2019

@3noch thanks for an example!

@@ -157,20 +157,19 @@ withTransactionModeRetry :: TransactionMode -> (SqlError -> Bool) -> Connection
withTransactionModeRetry mode shouldRetry conn act =
mask $ \restore ->
retryLoop $ E.try $ do
a <- restore act
a <- restore act `E.onException` rollback_ conn
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, rollback_ is not exported and it's also suspicious. It catches exceptions thrown when issuing the rollback. Why is that a good idea?

@phadej
Copy link
Collaborator

phadej commented Nov 15, 2020

Merged as part of #54

@phadej phadej closed this Nov 15, 2020
@3noch
Copy link
Contributor Author

3noch commented Nov 16, 2020

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants