Identity and package management #2532

oggy- · 2019-08-14T10:05:41Z

Hey folks,

here's a write-up of the identity and package management services. This also includes the ledger topology write-up for completeness, but let's keep the discussion on that topic in a separate PR:
#2476

Pull Request Checklist

Read and understand the contribution guidelines
Include appropriate tests
Set a descriptive title and thorough description
Add a reference to the issue this PR will solve, if appropriate
Add a line to the release notes, if appropriate
Normal production system change, include purpose of change in description

NOTE: CI is not automatically run on non-members pull-requests for security
reasons. The reviewer will have to comment with /AzurePipelines run to
trigger the build.

bame-da

Very nice piece overall. Just a few bits that confused me given that Canton ledgers can split and merge.

bame-da · 2019-08-15T09:30:23Z

docs/source/concepts/identity-and-package-management.rst

+Identity and Package Management
+###############################
+
+A DAML Ledger is a software system that enables parties to automate the management of their rights and obligations through smart contract code.


Don't think you need to introduce DAML at this point.

True - just needed an overture for why we need identity and package management. Rephrased

bame-da · 2019-08-15T09:41:14Z

docs/source/concepts/identity-and-package-management.rst

+
+By definition, identifiers identify parties, and are thus unique for a ledger.
+They do not, however, have to be unique across different ledgers.
+That is, two parties with identical identifiers in two different ledgers are not the same party.


Is this true in a Canton world?
If I have two parties with the same identifier in two unconnected Canton domains, this would suggest they are different parties. However, as soon as these domains somehow get connected, even if through several domain-hops, they are the same? Or do you consider all Canton domains in the world to be part of one partitioned ledger?

I consider Canton the latter (we call it the "virtual global ledger"). I weakened the statement to "not necessarily", though, as what I wrote was not true in general.

bame-da · 2019-08-15T09:43:25Z

docs/source/concepts/identity-and-package-management.rst

+For example, a party with a display name "Attorney of Nigerian Prince" might well be controlled by a real-world entity without a bar exam.
+However, particular ledger deployments might make stronger guarantees about this link.
+Finally, the association of identifiers to display names may change over time.
+For example, a party might change its display name from "Bruce" to "Caitlyn" -- as long as the identifier remains the same, so does the party.


Are "display names" defined at ledger-level, domain-level or participant-level?

AFAIK, participant level - added.

bame-da · 2019-08-15T09:46:20Z

docs/source/concepts/identity-and-package-management.rst

+The ``AllocateParty`` call can take the desired identifier and displayed name as optional parameters, but these are merely hints and the ledger implementation may completely ignore them.
+
+If the call returns a new identifier, the :ref:`participant node <participant-node-def>` serving this call is ready to host the party with this identifier.
+The returned identifier is guaranteed to be **unique** in the ledger; namely, no other call of the ``AllocateParty`` method at this or any other ledger participant may return the same identifier.


Again, this doesn't make sense to me in a Canton world where ledgers can merge and split.

I'm not sure what issue you're raising here: is the statement too weak, or too strong?

I don't see any stronger statements we could make: we can only guarantee that the returned identifier is unique within "the ledger". I can easily have two Sandbox instances that return the same identifiers when AllocateParty is called.

I don't think it's too weak either; I believe it should hold for all ledgers with central operators, and also for Canton. Even when Canton is viewed as a single virtual global ledger, AllocateParty will never return the same identifier, even when initiated at two participants who are cut off from each other (at least with the current Canton identity management system).

After discussing with Bernhard: it's too strong; in Canton we can have two participants that share the same private key, and thus allocate the same party identifiers. I'll rephrase to something weaker.
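
A toy model of the conclusion above. It assumes, hypothetically, that a participant derives a party identifier from the caller's hint plus a fingerprint of its own private key. This is not Canton's actual identity scheme, and `Participant` and `allocate_party` are illustrative names only.

```python
import hashlib


class Participant:
    """Toy participant; the identifier derivation is an assumption for illustration."""

    def __init__(self, private_key: bytes):
        # Short fingerprint of the key, used to namespace identifiers.
        self._fingerprint = hashlib.sha256(private_key).hexdigest()[:12]
        self._allocated = set()

    def allocate_party(self, hint: str) -> str:
        # Unique within this participant: add a counter suffix on local collisions.
        counter = 0
        while True:
            suffix = "" if counter == 0 else f"-{counter}"
            identifier = f"{hint}{suffix}::{self._fingerprint}"
            if identifier not in self._allocated:
                self._allocated.add(identifier)
                return identifier
            counter += 1
```

In this model, two participants with different keys never return the same identifier, even for the same hint. Two participants constructed from the same key bytes, however, can return identical identifiers without ever communicating, which is the case that makes the original "unique in the ledger" wording too strong.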

bame-da · 2019-08-15T09:52:55Z

docs/source/concepts/identity-and-package-management.rst

+The unit of packaging for DAML-LF is the :ref:`.dalf <dar-file-dalf-file>` file.
+Each ``.dalf`` file is uniquely identified by its **package identifier**, which is the hash of its contents.
+A :ref:`.dar <dar-file-dalf-file>` file is a simple archive containing multiple ``.dalf`` files, and has no identifier of its own.
+DAML ledgers support uploading only ``.dar`` files.


This seems rather bizarre. We are saying that the deployable unit of code is a dalf file with a unique identifier, but to actually deploy it, you must wrap it in an otherwise function-less container.
Plus, both Sandbox and Canton do allow deployment of dalf files, don't they?

They do, but the official API only takes .dar files. I added an explanation as to why .dars are still useful.
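
To make the relationship between the two file types concrete, here is a minimal sketch. It assumes, purely for illustration, that the package identifier is a SHA-256 hex digest of the raw `.dalf` bytes and that a `.dar` is a plain zip archive. The function names `package_id`, `build_dar`, and `package_ids_in_dar` are hypothetical and not part of any DAML tooling.

```python
import hashlib
import io
import zipfile
from typing import Dict


def package_id(dalf_bytes: bytes) -> str:
    """Content hash standing in for a package identifier (SHA-256 is assumed)."""
    return hashlib.sha256(dalf_bytes).hexdigest()


def build_dar(dalfs: Dict[str, bytes]) -> bytes:
    """Bundle several .dalf payloads into one archive; the archive has no identifier."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as archive:
        for name, payload in dalfs.items():
            archive.writestr(name, payload)
    return buf.getvalue()


def package_ids_in_dar(dar_bytes: bytes) -> Dict[str, str]:
    """Recover the per-package identifiers from the archive's members."""
    with zipfile.ZipFile(io.BytesIO(dar_bytes)) as archive:
        return {name: package_id(archive.read(name)) for name in archive.namelist()}
```

The archive is only a transport container here: the identifiers are computed per member, so a package keeps the same identifier regardless of which archive it is shipped in.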

bame-da · 2019-08-15T09:55:43Z

docs/source/concepts/identity-and-package-management.rst

+However, the operators of the ledger infrastructure nodes may still wish to review and vet any DAML code before allowing it to execute.
+One reason for this is that the DAML interpreter currently lacks a notion of reproducible resource limits.
+Thus, executing a DAML contract might result in high memory or CPU usage.
+Furthermore, security bugs in the DAML interpreter or JVM might enable malicious code to break out of the sandbox.


Sure, that's possible, but I wouldn't suggest this in our docs.

meiersi-da

@oggy- : this is good stuff! Forgive my very direct language in the comments... I was minimizing typing 😊

I haven't yet completed the review, but need to make a save-point. See all comments as suggestions and hints at how one reader perceived your text when reading without context top-down. Your call on how to address.

meiersi-da · 2019-08-16T14:36:40Z

docs/source/concepts/ledger-topologies.rst

+The topologies can impact both the functional and non-functional properties of the resulting ledger.
+This document:
+
+1. Provides one useful categorization of the existing implementations' topologies.


add headings of the two topologies to give a preview and easy link targets

meiersi-da · 2019-08-16T14:36:59Z

docs/source/concepts/ledger-topologies.rst

+This document:
+
+1. Provides one useful categorization of the existing implementations' topologies.
+   The categorization is not the only one possible, and it is not always clear-cut.


Suggest dropping this line; it is covered by "one useful" above.

meiersi-da · 2019-08-16T14:37:18Z

docs/source/concepts/ledger-topologies.rst

+
+1. Provides one useful categorization of the existing implementations' topologies.
+   The categorization is not the only one possible, and it is not always clear-cut.
+   Its main aim is to group the implementations according to their high-level properties.


Drop as well. Not clear how it helps reader.

meiersi-da · 2019-08-16T14:38:29Z

docs/source/concepts/ledger-topologies.rst

+   The categorization is not the only one possible, and it is not always clear-cut.
+   Its main aim is to group the implementations according to their high-level properties.
+
+2. Describes the general ledger properties of each category.


Having written the above comments: consider replacing the two items with a short summary of the two categories, and add a forward link to Package and Party Management where they are helpful to know.

meiersi-da · 2019-08-16T14:39:40Z

docs/source/concepts/ledger-topologies.rst

+
+.. _trust-domain:
+
+We call a system operated by a single real-world entity a **trust domain**.


Add the definition to the Glossary of Background concepts and reference it from here; or, perhaps better, add it to the intro of the topologies page.

meiersi-da · 2019-08-16T14:41:40Z

docs/source/concepts/ledger-topologies.rst

+
+- it provides no scaling
+
+- it is not highly available


these two properties are "not not true" ;-) all the Google Apps are examples of highly available and scalable services operated in a single trust domain.

==> remove these two bullet points

meiersi-da · 2019-08-16T14:43:45Z

docs/source/concepts/ledger-topologies.rst

+
+- the real-world entity operating the physical shared ledger is fully trusted with preserving the ledger's integrity
+
+- the real-world entity operating the physical shared ledger has full insight into the entire ledger, and is thus fully trusted with privacy


real-world entity vs. single physical copy vs. trust domain. Consider defining "operating entity" as the entity operating the trust domain, and use that in the definition of the Fully Centralized Ledger topology.

meiersi-da · 2019-08-16T14:46:06Z

docs/source/concepts/ledger-topologies.rst

+The :ref:`DAML Sandbox <sandbox-manual>` uses this topology.
+While simple, this topology has certain downsides:
+
+- it provides no scaling


suggestion: use ordered lists to make it easy to reference items in discussions

meiersi-da · 2019-08-16T14:46:20Z

docs/source/concepts/ledger-topologies.rst

+
+The first four problems can be solved or mitigated as follows:
+
+- scaling by splitting the system up into separate functional components and parallelizing execution


consider using ordered list for ease of reference

meiersi-da

Another save-point. Looking forward to reviewing the remainder of the pkg and identity mngt. Thanks for the good work @oggy- !

meiersi-da · 2019-08-16T15:31:32Z

docs/source/concepts/ledger-topologies.rst

+
+.. _participant-node-def:
+
+2. **Participant nodes**, (also called Client nodes in some platforms) which serve the ledger API to a subset of the system parties, which we say are hosted by this participant.


s/ledger API/Ledger API

meiersi-da · 2019-08-16T15:37:10Z

docs/source/concepts/ledger-topologies.rst

+`Canton <http://canton.io>`__ and DAML on `R3 Corda <https://www.corda.net>`__ are two such implementations.
+The main drawback of this topology is that availability can be influenced by the participant nodes.
+In particular, transactions cannot be committed if they use data that is only stored on unresponsive nodes.
+Spreading the data among additional trusted entities can mitigate the problem.


Completed review of topology section: very nice decomposition of topologies @oggy- ! This is going to be a good basis for providing a "Which DAML ledger should I use?" section!

meiersi-da · 2019-08-16T15:44:05Z

docs/source/concepts/identity-and-package-management.rst

+
+#. the minimal behavioral guarantees for identity and package services across all ledger implementations, and
+
+#. guidelines to understand how the :ref:`ledger's topology <daml-ledger-topologies>` influences the unspecified part of the behavior.


It would be good if this section clarified

- who the audience of the document is
- what questions the section will answer.

My take is that the audience is ledger implementors and ledger operators. There seem to be multiple questions:

- How should I implement party and package management?
- Which ledger should I select given my requirements on party and package management?

meiersi-da

@oggy- I've completed the review :) Good stuff! Thanks a lot!

I've added extensive comments, but none of them are critical. Please address as you see fit!

meiersi-da · 2019-08-19T07:47:55Z

docs/source/concepts/identity-and-package-management.rst

+The access to these services is usually more restricted compared to the other Ledger API services, as they are part of the administrative API.
+Any implementation of the services is guaranteed to accept inputs and provide outputs of the format specified by these services.
+However, the services' *behavior* -- the relationship between the inputs and outputs that the various parties observe -- is largely implementation dependent.
+The remainder of the document will presents both:


s/will presents/presents

meiersi-da · 2019-08-19T07:54:42Z

docs/source/concepts/identity-and-package-management.rst

+In fact, in such a ledger, the participants ``P1`` and ``P2`` might not have a way to communicate to each other, or might not even be aware of each other's existence.
+
+For diagnostics, the ledger also provides a ``ListKnownParties`` method which lists parties known to the participant node.
+The parties can be local (i.e., hosted by the participant) or not.


Good section! Thanks @oggy- !
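
As a rough illustration of the listing behavior described in the quoted text, the sketch below models a participant-local party store. The `PartyDetails` fields and the `ParticipantPartyStore` class are assumptions made for this example, not the Ledger API's actual message definitions.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class PartyDetails:
    identifier: str               # stable; identifies the party
    display_name: Optional[str]   # participant-level, may change over time
    is_local: bool                # True if this participant hosts the party


class ParticipantPartyStore:
    """Hypothetical in-memory view of the parties one participant knows about."""

    def __init__(self) -> None:
        self._parties: Dict[str, PartyDetails] = {}

    def record(self, identifier: str, display_name: Optional[str], is_local: bool) -> None:
        # Re-recording an identifier only updates its details; the party is unchanged.
        self._parties[identifier] = PartyDetails(identifier, display_name, is_local)

    def list_known_parties(self) -> List[PartyDetails]:
        # Returns both local and non-local parties, sorted for stable output.
        return sorted(self._parties.values(), key=lambda p: p.identifier)
```

Changing a display name in this model replaces the stored details but leaves the identifier, and therefore the party, unchanged.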

meiersi-da · 2019-08-19T08:17:38Z

docs/source/concepts/identity-and-package-management.rst

+For example, if the operator is a stock exchange, it might guarantee that a real-world exchange participant whose legal name is "Bank Inc." is represented by a ledger party with the identifier "Bank Inc.".
+Alternatively, it might use a random identifier, but guarantee that the display name is "Bank Inc.".
+Ledgers with :ref:`partitioned topologies <partitioned-topologies>` in general might not have such a single store of identities.
+The solutions for linking the identifiers to real-world identities could rely on certificate chains, `verifiable credentials <https://www.w3.org/TR/vc-data-model/>`__, or other mechanisms.


Consider adding a sentence like

An additional option is for a DAML-based application to include a Know-Your-Customer workflow in DAML to establish the link from a party to a real-world identity.

meiersi-da · 2019-08-19T08:19:12Z

docs/source/concepts/identity-and-package-management.rst

+Templates in a ``.dalf`` file can references templates from other ``.dalf`` files, i.e., ``.dalf`` files can depend on other ``.dalf`` files.
+A :ref:`.dar <dar-file-dalf-file>` file is a simple archive containing multiple ``.dalf`` files, and has no identifier of its own.
+The archive provides a convenient way to package ``.dalf`` files together with their dependencies.
+The Ledger API supports only ``.dar`` file uploads.


Add: However, storage of data on-ledger typically happens using .dalf files, as only they have globally unique identifiers.

Co-Authored-By: Bernhard Elsner <40762178+bame-da@users.noreply.github.com>

oggy- · 2019-08-19T15:19:32Z

@meiersi-da thanks for the review and I hope it saves you some discussions in the future :) I think I was able to address all your comments in a satisfactory way

oggy- requested review from meiersi-da, mziolekda and bame-da August 14, 2019 10:05

bame-da approved these changes Aug 15, 2019

meiersi-da reviewed Aug 16, 2019

meiersi-da approved these changes Aug 19, 2019

oggy- and others added 5 commits August 19, 2019 15:21

Identity and package management

b3d5786

Apply suggestions from code review

577fb56

Co-Authored-By: Bernhard Elsner <40762178+bame-da@users.noreply.github.com>

Bernhard's comments

de28b4f

Simon's comments

553fd7d

Update license

c0867c0

oggy- force-pushed the identity-and-package-management branch from 52b2763 to c0867c0 Compare August 19, 2019 15:15

Oh noes - I broked teh linkz!

4eb2b55

oggy- merged commit c71237b into master Aug 19, 2019

oggy- deleted the identity-and-package-management branch August 19, 2019 17:09


