Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kutils: Make the validation and the preloading of PackageCommitter parametric #7460

Merged
merged 7 commits into from
Oct 8, 2020

Conversation

remyhaemmerle-da
Copy link
Collaborator

@remyhaemmerle-da remyhaemmerle-da commented Sep 22, 2020

This PR creates 3 validation modes:

  • Strict : Specifies that the committer should validate the package
    before committing them to the ledger. When using this mode, the
    packages committed to the ledger can be fully trusted and do not
    have to be validated when loaded into the engine.

  • Lenient: Specifies that the committer should perform a fast
    validation of the packages before committing them to the ledger.
    This mode is useful for ledger integrations that cannot handle
    long-running submissions (> 10s). When using this mode, the
    packages committed to the ledger cannot be trusted and must be
    validated every time they are loaded into the engine.

  • No : Specifies that the committer should not perform any
    validation the packages before committing them to the ledger. This
    should be used only by non distributed ledgers, like DAML-on-SQL,
    where the validation done in the API server can be trusted.

This PR creates 3 preloading modes:

  • Synchronous : Specifies that the packages should be preloading
    into the engine before committed.

  • Asynchronous: Specify that the packages should be preloaded into
    the engine asynchronously with the rest of the commit process. This
    mode is useful for ledger integrations that cannot handle
    long-running submissions (> 10s). Failure of the preloading process
    will not affect the commit.

  • Preloading: Specify that the packages should not be preloaded into
    the engine.

CHANGELOG_BEGIN

  • [Integration Kit] In kvutils, add metric
    daml.kvutils.committer.package_upload.validate_timer to track
    package validation time.

CHANGELOG_END

Pull Request Checklist

  • Read and understand the contribution guidelines
  • Include appropriate tests
  • Set a descriptive title and thorough description
  • Add a reference to the issue this PR will solve, if appropriate
  • Include changelog additions in one or more commit message bodies between the CHANGELOG_BEGIN and CHANGELOG_END tags
  • Normal production system change, include purpose of change in description

NOTE: CI is not automatically run on non-members pull-requests for security
reasons. The reviewer will have to comment with /AzurePipelines run to
trigger the build.

@remyhaemmerle-da remyhaemmerle-da force-pushed the remy-engine-validate-2 branch 4 times, most recently from 91f3b9f to 8e20001 Compare September 23, 2020 07:06
Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the migration plan? Are we going to leave it in post-commit mode for v1.6 and then switch to pre-commit mode in 1.7? Make pre-commit opt-in forever? Something else?

Whatever the plan, please document it in the changelog.

Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the right direction. It would be good to have a test case for the different validation modes in kvutils. Since it isn't clear how we roll out the change I would at this point strive to get the functionality in (with enough test coverage!) and then work on getting consensus on how to roll this out.

@remyhaemmerle-da remyhaemmerle-da added the component/dlik DAML Ledger Integration Kit label Sep 29, 2020
@remyhaemmerle-da
Copy link
Collaborator Author

What's the migration plan? Are we going to leave it in post-commit mode for v1.6 and then switch to pre-commit mode in 1.7? Make pre-commit opt-in forever? Something else?

Whatever the plan, please document it in the changelog.

Currently there is no migration plan. We just add the possibility of pre-commit validation.
I will let the ledger team decide exactly how to roll out the change.
As far I understand what @dajmaki explained me, we cannot simply migrate, some ledger support long submission run but some other don't, so we will have to continue supporting the 3 different modes.

@remyhaemmerle-da remyhaemmerle-da force-pushed the remy-engine-validate-2 branch 2 times, most recently from 316bd88 to 77f0126 Compare September 29, 2020 14:17
"decode_entry" -> decodeEntry,
"validate_entry" -> validateEntry,
"filter_duplicates" -> filterDuplicates,
"preload" -> preload,
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need synchronous preloading ?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No I don't think we need that, unless it's more efficient to validate+preload when committing rather than validating and loading the package to engine separately.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I think the system is more predictable if we preload it synchronously when committing as that means as soon as the package upload completes we can start using the package without having a surprising delay on the first transaction. What do you think?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see a trade off:

  • If we preload we are indeed more predictable. It is also slightly more efficient as we decode only once. (I think our bigger dars need around 20s to decode).
  • If we don't, we do not overload the engine with useless packages. Potentially the code for scenario/trigger/script can end in the dar even it is useless for the Ledger engine.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the feature useful for testing/development. In production, dar upload should be so seldom that it should not really matter.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the preload should by synchronous for the pro-preload reasons you both mentioned.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will preload in the 3 cases.


override protected val committerName: String = "package_upload"

metrics.daml.kvutils.committer.packageUpload.loadedPackages(() =>
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this is inherited from earlier code or not, but this would only update the metric once at startup which isn't the idea. It should be done from within a step or probably preferably this is a metric that the engine should maintain and it shouldn't belong here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is from the original code.
Currently the Engine does not handle metric. If we want to do so, I will prefer wrapping it in a kv specific class.
For now, I moved this code at the end of the preload step.

override def context: Context = NoContext

override protected def prettyInternal: String =
s"the set of packages ${pkgIds.mkString("{", ",", "}")} is not self consistent, it missing the dependencies ${pkgIds
Copy link

@ghost ghost Sep 30, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
s"the set of packages ${pkgIds.mkString("{", ",", "}")} is not self consistent, it missing the dependencies ${pkgIds
s"The set of packages ${pkgIds.mkString("{", ",", "}")} is not self-consistent, it is missing the dependencies ${pkgIds

// This mode is useful for ledger integrations that cannot handle
// long-running submissions (> 10s).
// See PackageCommitter for more detail.
// FIXME: Currently `PostCommit` is the mode used by default
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should have an issue and a plan for addressing this. Probably not worth leaving a FIXME in the code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I opened KVL-578 to make the validation mode configurable.

* When using this committer, packages on the ledger cannot be trusted and should be
* validated each time uploaded to the engine.
*/
private[committer] class PackageUploader(
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for not bringing this up in previous review round, but the naming here is also confusing. Could you name these so that it's clear to the reader that these are different "PackageCommitter" variants? E.g. NonValidatingPackageCommitter, HashValidatingPackageCommitter and maybe FullValidatingPackageCommitter (or better names if you can think of them). This way it's clear at a glance what these are.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with Jussi.


override protected val steps: Iterable[(StepInfo, Step)] =
List(
"deduplicate_submission" -> deduplicateSubmission,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It makes me little bit nervous that the step lists are duplicated. If one needs to add a new step one needs to remember to add it to all three places. Since all of them are extra validation steps the order isn't as important, so could it be safer to have e.g. "extra_validation_steps" or some such that gets spliced into steps in PackageCommitter? What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I share @dajmaki's concern.

private val logger = LoggerFactory.getLogger(this.getClass)

def this(engine: Engine, metrics: Metrics) = this(engine, metrics, false)
def this(engine: Engine, metrics: Metrics) =
this(engine, metrics, false, CommitPackageValidationMode.PostCommit)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should avoid having multiple places with default values. I think this constructor should also have the packageValidation parameter. Any concerns with such an API change, @miklos-da?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dropped the default values.


override protected val steps: Iterable[(StepInfo, Step)] =
List(
"deduplicate_submission" -> deduplicateSubmission,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I share @dajmaki's concern.

* When using this committer, packages on the ledger cannot be trusted and should be
* validated each time uploaded to the engine.
*/
private[committer] class PackageUploader(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with Jussi.

"decode_entry" -> decodeEntry,
"validate_entry" -> validateEntry,
"filter_duplicates" -> filterDuplicates,
"preload" -> preload,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the preload should by synchronous for the pro-preload reasons you both mentioned.

@remyhaemmerle-da remyhaemmerle-da force-pushed the remy-engine-validate-2 branch 5 times, most recently from fb23c6b to 89ff6d7 Compare October 2, 2020 13:07
@remyhaemmerle-da remyhaemmerle-da force-pushed the remy-engine-validate-2 branch 3 times, most recently from 850d0e7 to d3b6c9c Compare October 5, 2020 08:26
@remyhaemmerle-da remyhaemmerle-da marked this pull request as ready for review October 5, 2020 08:28
@remyhaemmerle-da remyhaemmerle-da changed the title Add pre commit package validation in KV kutils: Make validation and preloading parametric in PackageCommitter Oct 5, 2020
@remyhaemmerle-da remyhaemmerle-da changed the title kutils: Make validation and preloading parametric in PackageCommitter kutils: Make the validation and the preloading of PackageCommitter parametric Oct 6, 2020
Copy link
Contributor

@gerolf-da gerolf-da left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @remyhaemmerle-da! The separation into preloading and validation is great!

@@ -712,6 +710,15 @@ object ParticipantStateIntegrationSpecBase {
private val darFile = new File(rlocation("ledger/test-common/model-tests.dar"))
private val archives = darReader.readArchiveFromFile(darFile).get.all

// 2 self consistent archives
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They don't have dependencies to other archives.

remyhaemmerle-da and others added 7 commits October 8, 2020 11:30
This PR creates 3 validation modes:

+ `Strict` : Specifies that the committer should validate the package
   before committing them to the ledger.  When using this mode, the
   packages committed to the ledger can be fully trusted and do not
   have to be validated when loaded into the engine.

+ `Lenient`: Specifies that the committer should perform a fast
   validation of the packages before committing them to the ledger.
   This mode is useful for ledger integrations that cannot handle
   long-running submissions (> 10s).  When using this mode, the
   packages committed to the ledger cannot be trusted and must be
   validated every time they are loaded into the engine.

+ `No` : Specifies that the committer should not perform any
   validation the packages before committing them to the ledger.  This
   should be used only by non distributed ledgers, like DAML-on-SQL,
   where the validation done in the API server is can be trusted.

This PR creates 3 preloading modes:

+ `Synchronous` : Specifies that the packages should be preloading
   into the engine before committed.

+ `Asynchronous`: Specify that the packages should be preloaded into
  the engine asynchronously with the rest of the commit process.  This
  mode is useful for ledger integrations that cannot handle
  long-running submissions (> 10s).  Failure of the preloading process
  will not affect the commit.

+ `Preloading`: Specify that the packages should not be preloaded into
  the engine.

CHANGELOG_BEGIN
- [Integration Kit] In kvutils, add metric
  `daml.kvutils.committer.package_upload.validate_timer` to track
  package validation time.
CHANGELOG_END
Co-authored-by: Miklos <57664299+miklos-da@users.noreply.github.com>
Co-authored-by: Gerolf Seitz <gerolf.seitz@daml.com>
Co-authored-by: Miklos <57664299+miklos-da@users.noreply.github.com>
@azure-pipelines
Copy link
Contributor

Command 'rerun' is not supported by Azure Pipelines.

Supported commands
  • help:
    • Get descriptions, examples and documentation about supported commands
    • Example: help "command_name"
  • list:
    • List all pipelines for this repository using a comment.
    • Example: "list"
  • run:
    • Run all pipelines or specific pipelines for this repository using a comment. Use this command by itself to trigger all related pipelines, or specify specific pipelines to run.
    • Example: "run" or "run pipeline_name, pipeline_name, pipeline_name"
  • where:
    • Report back the Azure DevOps orgs that are related to this repository and org
    • Example: "where"

See additional documentation.

@azure-pipelines
Copy link
Contributor

Azure Pipelines successfully started running 1 pipeline(s).

@remyhaemmerle-da remyhaemmerle-da merged commit be35f3a into master Oct 8, 2020
@remyhaemmerle-da remyhaemmerle-da deleted the remy-engine-validate-2 branch October 8, 2020 13:03
remyhaemmerle-da added a commit that referenced this pull request Oct 12, 2020
This PR fixes a bug in the PackageCommitter introduced recently.
Dependencies not decoded are assumed to be already preloaded and
should not bee looked for.

CHANGELOG_BEGIN
CHANGELOG_END
remyhaemmerle-da added a commit that referenced this pull request Oct 12, 2020
This PR fixes a bug in the PackageCommitter introduced recently.
Dependencies not decoded are assumed to be already preloaded and
should not bee looked for.

CHANGELOG_BEGIN
CHANGELOG_END
remyhaemmerle-da added a commit that referenced this pull request Oct 12, 2020
This PR fixes a bug in the PackageCommitter introduced recently.
Dependencies not decoded are assumed to be already preloaded and
should not bee looked for.

CHANGELOG_BEGIN
CHANGELOG_END
remyhaemmerle-da added a commit that referenced this pull request Oct 13, 2020
This PR fixes a bug in the PackageCommitter introduced recently.
Dependencies not decoded are assumed to be already preloaded and
should not bee looked for.

CHANGELOG_BEGIN
CHANGELOG_END
@stefanobaghino-da stefanobaghino-da added component/json-api HTTP JSON API team/ledger-clients Related to the Ledger Clients team's components. labels Aug 31, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component/dlik DAML Ledger Integration Kit component/json-api HTTP JSON API team/ledger-clients Related to the Ledger Clients team's components.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants