-
Notifications
You must be signed in to change notification settings - Fork 205
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixes participant to do retries on startup as waiting for DB #10759
Conversation
Is this related? #10758 |
indeed. see #10758 (comment) |
5a568be
to
fc6579d
Compare
@@ -31,6 +31,7 @@ private[platform] class FlywayMigrations( | |||
.configure() | |||
.locations((locations(enableAppendOnlySchema, dbType) ++ additionalMigrationPaths): _*) | |||
.dataSource(dataSource) | |||
.connectRetries(600) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
deferring waiting for connectivity to Flyway.
this would do exponential backoff waiting 1 2 4 8 16 ... seconds between retries
Flyway is called almost every case at startup: exception sandboxnext reset triggered restart
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Flyway says it always waits 1sec between retries. Waiting for 10min before declaring there is no connectivity to the database seems a bit long, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as discussed offline:
- Flyway does exponential backoff now (despite docs) flyway/flyway@37675a0
- lowered it to 10 ~ 1023 sec
@@ -257,7 +257,7 @@ private[backend] object PostgresStorageBackend | |||
val pgSimpleDataSource = new PGSimpleDataSource() | |||
pgSimpleDataSource.setUrl(jdbcUrl) | |||
|
|||
Using.resource(pgSimpleDataSource.getConnection())(checkCompatibility) | |||
Try(Using.resource(pgSimpleDataSource.getConnection())(checkCompatibility)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as a consequence this won't do much on the first datasource initialisation if noconnectivity, but it will later as ds-s are created for dbdispatchers (at that time we already waited once for connectivity)
@@ -257,7 +257,7 @@ private[backend] object PostgresStorageBackend | |||
val pgSimpleDataSource = new PGSimpleDataSource() | |||
pgSimpleDataSource.setUrl(jdbcUrl) | |||
|
|||
Using.resource(pgSimpleDataSource.getConnection())(checkCompatibility) | |||
Try(Using.resource(pgSimpleDataSource.getConnection())(checkCompatibility)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This silently drops any errors, wouldn't it be better to use com.daml.timer.RetryStrategy.exponentialBackoff
or similar to retry at least a few times?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it is a good idea to retry here, see explanation #10758 (comment)
And yes this will silently drop errors. But still it will run multiple times as we instantiate other datasources later, when connectivity should be already there. Not ideal I know.
The way I see it either we do something quickly if this is pushing (like this) and then we do the proper thing, or we just do the proper thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think silently dropping errors during data source creation is a good idea, but this is a quick fix to unblock others.
A proper fix should be added soon after #10705 lands.
…vity CHANGELOG_BEGIN CHANGELOG_END
fc6579d
to
a65772c
Compare
This PR has been created by a script, which is not very smart and does not have all the context. Please do double-check that the version prefix is correct before merging. @akrmn is in charge of this release. Commit log: ``` 35a853f Async logging for http-json-perf tests and publish logs as a pipeline artifact (#10803) a349217 get oracle perf test output (#10800) 5f448f2 Cleanup DataSource initialization (#10794) 0e250d0 Fix parallel ingestion initialization (#10797) b8bd5e6 interface PoC: protobuf definitions (#10796) a1da025 FreePort draw from outside ephemeral port range (#10774) 945a86c Fix a race in a test of the RecoveringIndexer (#10792) 4093bbd fix macOS Bazel cache (#10795) ed99fe5 ParticipantPruningIT added events for canton safe-pruning and multi-node synchronization (#10775) 5b28aef Deprecate kvutils rejections generated only by participant state v1 [KVL-1090] (#10791) 116d6a5 DPP-577 Use BIGINT instead of TIMESTAMP (#10761) cdf4bf1 Check compatibility of timestamps (#10793) 098a08c Add Moisés to release rotation (#10789) 7ae199f Follow-up to h2 storage backend user/password in url: Review feedback switch to regex and move to pure function (#10541) a5999ab remove dead code (#10765) 4b7391b LedgerApiTestTool prints skipped (excluded) tests (#10726) a0f0fb3 [DPP-418] Enable outstanding test case for private key decryption (#10778) d750666 Do not drop details when converting between gRPC `Status` classes [KVL-1084] (#10745) 5ec9a23 Fix link to the gRPC health checks in the driver for PostgreSQL docs (#10779) 8538fe9 HA protected ledger initialization for append only schema [DPP-551] (#10705) 23b0fe7 Fix exclusions for command service tests (#10777) c4e0a75 LF: drop ad-hoc ImmArray builders (#10763) 303ba90 participant-state: Re-enable integration test for command deduplication [KVL-1089] (#10751) 793253c bump cleanup threshold (#10771) 4d67684 Fix flaky DBLock tests (#10770) 7515871 Fix exclusions for command dedup in compat tests (#10767) 50fecfb Wrap missing label names in quotes (#10749) b1c6e87 fix claims check in auth middleware (#10768) 5595d55 [JSON-API] Use the token from incoming requests to update the package list (#10602) dbafea0 Add unit tests for DBLockStorageBackend [DPP-495] (#10746) e5a6d70 Added buffer size metrics for getTransactions/getTransactionTrees (#10744) f76c868 update NOTICES file (#10762) bb908d0 Create a link to party management service in main Ledger API page. (#10138) e45d852 Fixes participant to do retries on startup as waiting for DB connectivity (#10759) f68a129 Report some oracle_json_perf numbers on slack (#10754) 7270ee3 Handle dynamic port in auth middleware client (trigger service) (#10755) ff1308e [Docs] Add info on logs on Kubernetes & metrics in the ops section (#10525) d267390 Update buf image [kvl-1049] (#10752) d2180cf Update exclusion for command deduplication to include full version range (#10750) 41881ba Command dedup: migrate kvutils to use v2 services [KVL-1049] (#10679) 4525b8c [JSON-API] vanilla oracle_perf ci job (#10688) 16df8a5 [Short] Remove unused code (#10719) b28afcf [DPP-438] Update docs on metrics that no longer use <party_name> in their name (#10728) 0662025 Clarify what the `buf` image is and how it should be used (#10741) 9071a05 sandbox-classic: Reintroduce SqlLedger tests for the mutable index. (#10722) f576cdf fix flaky test in RecoveringIndexer (#9619) f6a75b4 ledger-api-common: Do not mock final classes. (#10733) e9c8af5 Bump ghc-lib to include dropped parsing code for generic templates (#10735) 862a290 rotate release duty after 1.17.0-snapshot.20210831.7702.0.f058c2f1 (#10731) 12cb92b update compat versions for 1.17.0-snapshot.20210831.7702.0.f058c2f1 (#10737) 73c0de8 [DPP-578] Temporarily disable flaky test - Mockito plugin issue (#10734) 963bcb1 release 1.17.0-snapshot.20210831.7702.0.f058c2f1 (#10730) c11323d LF: Refactor engine test reinterpret (#10724) a995fa3 [DPP-574] Update docs about TLS: encrypted private key (#10727) 66970b7 Not sharing the absolute Deadline object. (#10713) c6c304b Improve script error on invalid script identifier format (#10702) b748fd6 Support TLS in Daml Helper --json-api requests (#10709) ``` Changelog: ``` kvutils - protobuf rejections generated by the participant state v1 API are deprecated (Inconsistent, Disputed, ResourcesExhausted, PartyNotKnownOnLedger) [Docs] Fixed link to the gRPC health checks in the driver for PostgreSQL docs participant-state: Emit completions (CommandRejected) for duplicate commands when using pre-execution - Fix a bug in the auth middleware where insufficient credentials could still give access to list of running triggers. - [JSON API] The cli option `--access-token-file` is now deprecated. It is not needed anymore and you can safely remove it. Reason is that the operations which prior required a token at startup are now done on demand using the token of the incoming request. - [Docs] Information was added in the `Operating Daml` section on how to aggregate logs on Kubernetes in conjuction with Daml services & what options exists for exporting metrics from daml services (not Kubernetes specific) participant-state: - Migrate to use only v2 Read/Write Services. This includes the use of new models for rejections/updates/submission results. - Calculate deduplicate until for committer side deduplication based on the submitter given deduplication period - if using the deprecated deduplicateUntil then just set the given timestamp - if using duration then calculate the new deduplicateUntil by using the formula (submissionTime + deduplicationDuration + minSkew) - if using offset deduplication then calculate deduplicateUntil by using the formula (submissionTime + maxDeduplicationDuration + minSkew) - if the deduplication period is not set then we don't set deduplicateUntil - Emit completions with status `ALREADY_EXISTS` for duplicate commands - [Java Bindings] DamlLedgerClient.Builder allows to set a timeout for command using `withTimeout`. - [Daml Assistant] The `daml ledger` commands now accepts `--tls` in combination with `--json-api` to access a JSON API behind a TLS reverse proxy. ``` CHANGELOG_BEGIN CHANGELOG_END
This PR has been created by a script, which is not very smart and does not have all the context. Please do double-check that the version prefix is correct before merging. @akrmn is in charge of this release. Commit log: ``` 35a853f Async logging for http-json-perf tests and publish logs as a pipeline artifact (#10803) a349217 get oracle perf test output (#10800) 5f448f2 Cleanup DataSource initialization (#10794) 0e250d0 Fix parallel ingestion initialization (#10797) b8bd5e6 interface PoC: protobuf definitions (#10796) a1da025 FreePort draw from outside ephemeral port range (#10774) 945a86c Fix a race in a test of the RecoveringIndexer (#10792) 4093bbd fix macOS Bazel cache (#10795) ed99fe5 ParticipantPruningIT added events for canton safe-pruning and multi-node synchronization (#10775) 5b28aef Deprecate kvutils rejections generated only by participant state v1 [KVL-1090] (#10791) 116d6a5 DPP-577 Use BIGINT instead of TIMESTAMP (#10761) cdf4bf1 Check compatibility of timestamps (#10793) 098a08c Add Moisés to release rotation (#10789) 7ae199f Follow-up to h2 storage backend user/password in url: Review feedback switch to regex and move to pure function (#10541) a5999ab remove dead code (#10765) 4b7391b LedgerApiTestTool prints skipped (excluded) tests (#10726) a0f0fb3 [DPP-418] Enable outstanding test case for private key decryption (#10778) d750666 Do not drop details when converting between gRPC `Status` classes [KVL-1084] (#10745) 5ec9a23 Fix link to the gRPC health checks in the driver for PostgreSQL docs (#10779) 8538fe9 HA protected ledger initialization for append only schema [DPP-551] (#10705) 23b0fe7 Fix exclusions for command service tests (#10777) c4e0a75 LF: drop ad-hoc ImmArray builders (#10763) 303ba90 participant-state: Re-enable integration test for command deduplication [KVL-1089] (#10751) 793253c bump cleanup threshold (#10771) 4d67684 Fix flaky DBLock tests (#10770) 7515871 Fix exclusions for command dedup in compat tests (#10767) 50fecfb Wrap missing label names in quotes (#10749) b1c6e87 fix claims check in auth middleware (#10768) 5595d55 [JSON-API] Use the token from incoming requests to update the package list (#10602) dbafea0 Add unit tests for DBLockStorageBackend [DPP-495] (#10746) e5a6d70 Added buffer size metrics for getTransactions/getTransactionTrees (#10744) f76c868 update NOTICES file (#10762) bb908d0 Create a link to party management service in main Ledger API page. (#10138) e45d852 Fixes participant to do retries on startup as waiting for DB connectivity (#10759) f68a129 Report some oracle_json_perf numbers on slack (#10754) 7270ee3 Handle dynamic port in auth middleware client (trigger service) (#10755) ff1308e [Docs] Add info on logs on Kubernetes & metrics in the ops section (#10525) d267390 Update buf image [kvl-1049] (#10752) d2180cf Update exclusion for command deduplication to include full version range (#10750) 41881ba Command dedup: migrate kvutils to use v2 services [KVL-1049] (#10679) 4525b8c [JSON-API] vanilla oracle_perf ci job (#10688) 16df8a5 [Short] Remove unused code (#10719) b28afcf [DPP-438] Update docs on metrics that no longer use <party_name> in their name (#10728) 0662025 Clarify what the `buf` image is and how it should be used (#10741) 9071a05 sandbox-classic: Reintroduce SqlLedger tests for the mutable index. (#10722) f576cdf fix flaky test in RecoveringIndexer (#9619) f6a75b4 ledger-api-common: Do not mock final classes. (#10733) e9c8af5 Bump ghc-lib to include dropped parsing code for generic templates (#10735) 862a290 rotate release duty after 1.17.0-snapshot.20210831.7702.0.f058c2f1 (#10731) 12cb92b update compat versions for 1.17.0-snapshot.20210831.7702.0.f058c2f1 (#10737) 73c0de8 [DPP-578] Temporarily disable flaky test - Mockito plugin issue (#10734) 963bcb1 release 1.17.0-snapshot.20210831.7702.0.f058c2f1 (#10730) c11323d LF: Refactor engine test reinterpret (#10724) a995fa3 [DPP-574] Update docs about TLS: encrypted private key (#10727) 66970b7 Not sharing the absolute Deadline object. (#10713) c6c304b Improve script error on invalid script identifier format (#10702) b748fd6 Support TLS in Daml Helper --json-api requests (#10709) ``` Changelog: ``` kvutils - protobuf rejections generated by the participant state v1 API are deprecated (Inconsistent, Disputed, ResourcesExhausted, PartyNotKnownOnLedger) [Docs] Fixed link to the gRPC health checks in the driver for PostgreSQL docs participant-state: Emit completions (CommandRejected) for duplicate commands when using pre-execution - Fix a bug in the auth middleware where insufficient credentials could still give access to list of running triggers. - [JSON API] The cli option `--access-token-file` is now deprecated. It is not needed anymore and you can safely remove it. Reason is that the operations which prior required a token at startup are now done on demand using the token of the incoming request. - [Docs] Information was added in the `Operating Daml` section on how to aggregate logs on Kubernetes in conjuction with Daml services & what options exists for exporting metrics from daml services (not Kubernetes specific) participant-state: - Migrate to use only v2 Read/Write Services. This includes the use of new models for rejections/updates/submission results. - Calculate deduplicate until for committer side deduplication based on the submitter given deduplication period - if using the deprecated deduplicateUntil then just set the given timestamp - if using duration then calculate the new deduplicateUntil by using the formula (submissionTime + deduplicationDuration + minSkew) - if using offset deduplication then calculate deduplicateUntil by using the formula (submissionTime + maxDeduplicationDuration + minSkew) - if the deduplication period is not set then we don't set deduplicateUntil - Emit completions with status `ALREADY_EXISTS` for duplicate commands - [Java Bindings] DamlLedgerClient.Builder allows to set a timeout for command using `withTimeout`. - [Daml Assistant] The `daml ledger` commands now accepts `--tls` in combination with `--json-api` to access a JSON API behind a TLS reverse proxy. ``` CHANGELOG_BEGIN CHANGELOG_END Co-authored-by: Azure Pipelines DAML Build <support@digitalasset.com>
CHANGELOG_BEGIN
CHANGELOG_END
Pull Request Checklist
CHANGELOG_BEGIN
andCHANGELOG_END
tagsNOTE: CI is not automatically run on non-members pull-requests for security
reasons. The reviewer will have to comment with
/AzurePipelines run
totrigger the build.