Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Draft version constraint generation #5472

Merged
merged 51 commits into from
May 20, 2020
Merged

Conversation

Gertjan423
Copy link
Contributor

@Gertjan423 Gertjan423 commented Apr 7, 2020

First draft version of static verification tool.

  • Reads DAR files.
  • Partially evaluates the code.
  • Generates constraints for all fields, as the tool does not yet read annotations regarding which fields should be tracked.

Main TODOs at the moment:

  • Implement some undefined's and TODO's throughout the code.
  • Take additional inputs or read annotations to determine which fields should be tracked.
  • Extend with additional language constructs (recursion).
  • Additional testing.

@Gertjan423 Gertjan423 requested a review from hurryabit April 7, 2020 16:42
@Gertjan423 Gertjan423 requested a review from cocreature as a code owner April 7, 2020 16:42
Copy link
Contributor

@cocreature cocreature left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, congrats on your first PR @Gertjan423 🎉 A few preliminary comments but I very much like the direction this is going in!

@@ -0,0 +1,57 @@
cabal-version: 2.4
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don’t need a cabal file, we build our code with Bazel.


-- | Static verification of DAML packages.
module DA.Daml.LF.Verify
( module LF
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you reexport LF if this is main?

genForExercise tem ch cid par arg = do
cidOut <- genExpr cid
argOut <- genExpr arg
-- TODO: Take possibility into account that the choice is not found?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be caught by the LF typechecker, so no need to do more than crash here.

-- TODO: This is a random error, as we do not have access to the expected
-- type, which we need to constructed the error we really want.
-- Perhaps we do need to define our own set of errors.
-- _ -> throwError (EExpectedRecordType ty)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We generally format to be somewhat friendly to diffs. So no alignment, line breaks and indentation such that the amount of indentation does not depend on identifier length. Not a hard rule but I like being somewhat consistent here.

ETyApp expr typ -> genForTyApp expr typ
EVar name -> genForVar name
EVal w -> genForVal w
EUpdate (UCreate tem arg) -> genForCreate tem arg
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn’t seem quite right to me. You are conflating evaluation and execution afaict. Update works like IO in Haskell. Evaluating an Update expression does not do anything. The way this is implemented at runtime is by passing around an opaque token (similar to the RealWorld token in Haskell) so Update expressions are actually functions. You can probably do something similar for your analysis: If you start from a choice you look at the update expression evaluated to the token. If you just see an Update expression that is not applied to the token, you don’t look inside.

-- + or create our own world environment to store values. This also makes
-- sense as these values work similar to how they work in the type checker,
-- except that we need to store a partially evaluated definition as well.
-- Both approaches make sense, but both imply a lot of code duplication, so they
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn’t worry too much about duplicating code for now.

-- abstract over it. Same for this returning foldM.
genPackage' :: MonadDelta m => Delta
-> (PackageId, (Package, Maybe PackageName)) -> m Delta
genPackage' deli (idi, (paci, _)) =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do all variable names end with i? 🙂

-> (PackageId, (Package, Maybe PackageName)) -> m Delta
genPackage' deli (idi, (paci, _)) =
-- TODO: This is getting quite unreadable.
foldM (\deli' modi -> genModule' deli' (PRImport idi, modi))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m probably missing something but it looks to me like this only works if modules are sorted topologically. Otherwise, you might not be able to resolve a cross-module reference since you haven’t yet included the delta for that module?

let qname = Qualified pac mod (fst $ dvalBinder val)
return $ set devals (singleton qname (expOut ^. goExp, expOut ^. goUpd)) emptyDelta

-- TODO: Handle annotated choices, by returning a set of annotations.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is an annotated choice? Presumably that’s some new annotation that you want to add to express constraints that should be verified?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m still not quite sure what an annotated choice is 🙂

(set usCre [UpdCreate tem fs] emptyUpdateSet)
(argOut ^. goDel))
_ -> throwError EEnumTypeWithParams
-- TODO: This is a random error, as we do not have access to the expected
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m perfectly happy with just calling error for now for anything that isn’t well-typed LF.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I usually use something ala error "IMPOSSIBLE: Ill-typed DAML-LF" in situations like this one.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The impossible happened 😮

-- ^ The skolemised term variables.
, _devals :: !(HashMap (Qualified ExprValName) (Expr, UpdateSet))
-- ^ The bound values.
, _dchs :: !(HashMap ((Qualified TypeConName), ChoiceName) UpdateSet)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a few more unnecessary parentheses around, but hlint should catch that in the end anyway.

Suggested change
, _dchs :: !(HashMap ((Qualified TypeConName), ChoiceName) UpdateSet)
, _dchs :: !(HashMap (Qualified TypeConName, ChoiceName) UpdateSet)

Comment on lines 109 to 110
lookupDExprVar x = ask >>= \ del -> unless (elem x $ del ^. devars)
$ throwError $ EUnknownExprVar x
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think lenses buy us anything here. I also think a >>= \x -> ... is a sign to use do notation.

Suggested change
lookupDExprVar x = ask >>= \ del -> unless (elem x $ del ^. devars)
$ throwError $ EUnknownExprVar x
lookupDExprVar x = do
del <- ask
unless (x `elem` _devars del) $ throwError $ EUnknownExprVar x

Comment on lines 41 to 43
= extendGOUpds (genOut2 ^. goUpd)
$ extendGODelta (genOut2 ^. goDel)
genOut1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't that just plain function application?

Suggested change
= extendGOUpds (genOut2 ^. goUpd)
$ extendGODelta (genOut2 ^. goDel)
genOut1
= extendGOUpds (_goUpd genOut2)
$ extendGODelta (_goDel genOut2)
genOut1

-- TODO: This is getting quite unreadable.
foldM (\deli' modi -> genModule' deli' (PRImport idi, modi))
deli (NM.toList $ packageModules paci)
>>= return . (concatDelta deli)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

act >>= return . f is the same as fmap f act or f <$> act, depending on what you prefer.

ETmApp fun arg -> genForTmApp fun arg
ETyApp expr typ -> genForTyApp expr typ
EVar name -> genForVar name
EVal w -> genForVal w
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't align the arrows in case expressions. This will lead to a lot of "diff noise" down the line.

genValue pac mod val = do
expOut <- genExpr (dvalBody val)
let qname = Qualified pac mod (fst $ dvalBinder val)
return $ set devals (singleton qname (expOut ^. goExp, expOut ^. goUpd)) emptyDelta
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return $ set devals (singleton qname (expOut ^. goExp, expOut ^. goUpd)) emptyDelta
return $ emptyDelta{_devals = singleton qname (_goExp expOut, _goUpd expOut)}

Options{..} <- execParser optionsParserInfo
pkgs <- readPackages optInputDars
let _delta = runDelta $ genPackages pkgs
putStrLn "Constraints generated."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to deeply force _delta here, otherwise nothing will really happen in the test runs. Maybe just print it?

concatDelta (Delta vars1 vals1 chs1) (Delta vars2 vals2 chs2) =
Delta (vars1 ++ vars2) (vals1 `union` vals2) (chs1 `union` chs2)
-- TODO: union makes me slightly nervous, as it allows overlapping keys
-- (and just uses the first)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess you want unionWith concatUpdateSet.

introDelta delta = local (concatDelta delta)

-- TODO: This is a bit strange in a reader monad.
-- Figure out a way to extend, instead of overwrite every time.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe withDelta would communicate your intent better?

setDelta delta = local (const delta)

extVarDelta :: MonadDelta m => ExprVarName -> m a -> m a
extVarDelta x = local (over devars ((:) x))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to have a function

varDelta :: ExprVarName -> Delta

and use introDelta (varDelta x) instead of extVarDelta x? At least to me, these primitives feel "more orthogonal" to me.

-- TODO: Random error.

-- | Helper functions mirrored from Env.
-- TODO: Reduce duplication by abstracting over MonadGamma and MonadDelta?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nah, I wouldn't bother with it for now. You don't want to tie yourself into unnecessary constraints. 😃

Comment on lines 41 to 43
= extendGOUpds (_goUpd genOut2)
$ extendGODelta (_goDel genOut2)
genOut1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you write this in a more direct style, you can remove a fair bit of code below:

Suggested change
= extendGOUpds (_goUpd genOut2)
$ extendGODelta (_goDel genOut2)
genOut1
= GenOutput
{ _goExp = _goExp genOut1
, _goDel = _goDel genOut1 `concatUpdateSet` _goDel genOut2
, _goUpd = _goUpd genOut1 `concatUpdateSet` _goUpd genOut2
}

-- ^ The updates, performed by this expression.
, _goDel :: Delta
-- ^ The context extension, made by this expression.
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my experience, spelling out identifiers has proven beneficial most of the time, ie., _goExpr, _goUpdate and _goDelta. (Expr is such a common unambiguous abbreviation that it's ok, whereas Del could also abbreviate "delete".)

Comment on lines 68 to 71
buildDelta :: MonadDelta m => Delta -> (a -> m Delta) -> [a] -> m Delta
buildDelta del op args = foldM step del args
where
step d x = concatDelta d <$> setDelta d (op x)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My suspicion, fuelled by your comment on setDelta, is that you want something like this:

introDeltas :: MonadDelta m => (a -> m Delta) -> [a] -> m b -> m b
introDeltas _ [] act = act
introDeltas f (x:xs) act = do
  delta <- f x
  introDelta delta (introDeltas f xs act)

and then have functions like

introModule :: MonadDelta m => PackageRef -> Module -> m b -> m b

Is that correct?

(set usCre [UpdCreate tem fs] emptyUpdateSet)
(argOut ^. goDel))
_ -> throwError EEnumTypeWithParams
-- TODO: This is a random error, as we do not have access to the expected
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I usually use something ala error "IMPOSSIBLE: Ill-typed DAML-LF" in situations like this one.

-- ^ The list of archive updates.
, _usCho :: ![UpdChoice]
-- ^ The list of choice updates.
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we need to account for fetches as well?

@hurryabit hurryabit marked this pull request as draft April 9, 2020 09:17
@cocreature
Copy link
Contributor

@Gertjan423 CI is currently failing due to the format check. You can easily fix that by running ./fmt.sh locally or if you want to have it be a bit faster ./fmt.sh --diff. The latter is fast enough that some people even set it up for a precommit or at least prepush hook.

@@ -784,6 +784,15 @@ nixpkgs_package(
repositories = dev_env_nix_repos,
)

nixpkgs_package(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that once we want to actually make this a feature we ship to users we do need to support Windows. But let’s worry about this after we get this merged.


data Options = Options
{ optInputDar :: FilePath
, optChoiceTmpl :: String
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be easier to understand if you already use the proper LF type here.

choiceName = ChoiceName (pack optChoiceName)
fieldTmpl = TypeConName [pack optFieldTmpl]
fieldName = FieldName (pack optFieldName)
solver <- locateRunfiles "z3_nix/bin/z3"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this only works in our Bazel setup. We need to figure out what we do if we ship things to users. Not sure if we want to ship an SMT solver so a reasonable option might be to let users pass a path to the SMT solver they want to use.

compiler/daml-lf-verify/src/DA/Daml/LF/Verify.hs Outdated Show resolved Hide resolved
readPackages :: [FilePath] -> IO [(PackageId, (Package, Maybe PackageName))]
readPackages dars = concatMapM darToPackages dars
where
darToPackages :: FilePath -> IO [(PackageId, (Package, Maybe PackageName))]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We seem to be repeating that piece of code over and over. We should probably move it into some helper library at some point (not in this PR).

-- ^ Function for looking up references in the update set.
-> (updset -> ([Cond ref], updset))
-- ^ Function popping the references from the update set.
-> (updset -> updset -> updset)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we give this a Semigroup instance?

compiler/daml-lf-verify/src/DA/Daml/LF/Verify/Context.hs Outdated Show resolved Hide resolved

-- | Analyse a case expression.
-- TODO: Atm only boolean cases are supported
genForCase :: (GenPhase ph, MonadEnv m ph)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SMT does have ADTs so we could in principle use those but you can also encode pattern matches without using them.

log <- S.newLogger 1
sol <- S.newSolver spath ["-in"] (Just log)
vars <- declareVars sol $ filterVars _cVars (_cCres ++ _cArcs)
cre <- foldl S.add (S.real 0) <$> mapM (cexp2sexp vars) _cCres
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this work? You are working with integers in some places.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it works somewhat by accident because 0 is both a real and an integer literal. We should check the type.

verify condDar False tmpl choice tmpl field >>= \case
Success -> assertFailure "Verification wrongfully passed for FailA - content"
Unknown -> assertFailure "Verification inconclusive for FailA - content"
Fail _ -> return ()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ar the results not deterministic or why are you not just using assertEqual on the full Fail expression?

Copy link
Contributor

@hurryabit hurryabit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a few files that should be deleted.

There's also the change to Numeric. I'd like to understand why we do this before we merge. If you can undo this change and everything still works, please do so and feel free to merge.

compiler/daml-lf-verify/daml/conditionals/.dlint.yaml Outdated Show resolved Hide resolved
compiler/daml-lf-verify/daml/conditionals/daml/Main.daml Outdated Show resolved Hide resolved
compiler/daml-lf-verify/daml/conditionals/daml/Setup.daml Outdated Show resolved Hide resolved
compiler/daml-lf-verify/daml/conditionals/pom.xml Outdated Show resolved Hide resolved
compiler/daml-lf-verify/daml/conditionals/.gitignore Outdated Show resolved Hide resolved
compiler/daml-lf-verify/daml/conditionals/daml.yaml Outdated Show resolved Hide resolved
compiler/daml-lf-verify/daml/quickstart/.gitignore Outdated Show resolved Hide resolved
@Gertjan423 Gertjan423 marked this pull request as ready for review May 19, 2020 16:26
@hurryabit
Copy link
Contributor

@Gertjan423 You need to run ./fmt.sh in the root of the repo again.

@Gertjan423 Gertjan423 merged commit 2af134c into master May 20, 2020
@Gertjan423 Gertjan423 deleted the daml-lf-verification branch May 20, 2020 06:08
garyverhaegen-da added a commit that referenced this pull request May 20, 2020
CI currently errors with:

```
Subprocess:
git checkout efe6545
 -- docs/source/support/release-notes.rst
failed with exit code 127; output:
---

---
err:
---
Previous HEAD position was 2af134c... WIP: Draft version constraint
generation (#5472)
HEAD is now at efe6545... 1.2.0-snapshot.20200520.4224.0.2af134ca
(#6040)
/bin/sh: 2: --: not found

---

```

because the line

```
latest_release_notes_sha <- shell "git log -n1 --format=%H HEAD -- LATEST"
```

will assign a string that ends in a newline, and then when we try to
construct the shell command:

```
(shell_ $ "git checkout " <> latest_sha <> " -- docs/source/support/release-notes.rst")
```

we actually get two lines for Bash to execute.

CHANGELOG_BEGIN
CHANGELOG_END
garyverhaegen-da added a commit that referenced this pull request May 20, 2020
CI currently errors with:

```
Subprocess:
git checkout efe6545
 -- docs/source/support/release-notes.rst
failed with exit code 127; output:
---

---
err:
---
Previous HEAD position was 2af134c... WIP: Draft version constraint
generation (#5472)
HEAD is now at efe6545... 1.2.0-snapshot.20200520.4224.0.2af134ca
(#6040)
/bin/sh: 2: --: not found

---

```

because the line

```
latest_release_notes_sha <- shell "git log -n1 --format=%H HEAD -- LATEST"
```

will assign a string that ends in a newline, and then when we try to
construct the shell command:

```
(shell_ $ "git checkout " <> latest_sha <> " -- docs/source/support/release-notes.rst")
```

we actually get two lines for Bash to execute.

CHANGELOG_BEGIN
CHANGELOG_END
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants