Support nested steps workflow parallelism #1046
Related issue: argoproj#1035
What is solved:
- Parallelism for nested steps (StepGroups)
What is not solved:
- Parallelism for nested DAG
This commit makes the parallelism setting of an outer workflow limit the execution of nested StepGroup workflows. To do this, checkParallelism, when called for the inner workflow, checks whether the number of its running siblings (the nodes with the same parent node) is >= the parent node's parallelism.
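The sibling check described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `node`, `countActiveSiblings`, `parallelismReached`, and `sampleNodes` are hypothetical names, and the `node` struct is a simplified stand-in for `wfv1.NodeStatus` that keys children by name.

```go
package main

import "fmt"

// node is a hypothetical, simplified stand-in for wfv1.NodeStatus.
type node struct {
	Phase    string   // "Pending", "Running", "Succeeded", ...
	Children []string // names of child nodes (an assumption made for this sketch)
}

// countActiveSiblings counts the Pending/Running children of parentName.
func countActiveSiblings(nodes map[string]node, parentName string) int64 {
	var active int64
	for _, child := range nodes[parentName].Children {
		switch nodes[child].Phase {
		case "Pending", "Running":
			active++
		}
	}
	return active
}

// parallelismReached reports whether another child of parentName has to wait.
// A nil limit means that no parallelism is configured.
func parallelismReached(nodes map[string]node, parentName string, limit *int64) bool {
	if limit == nil {
		return false
	}
	return countActiveSiblings(nodes, parentName) >= *limit
}

// sampleNodes builds a StepGroup with two active children and one finished child.
func sampleNodes() map[string]node {
	return map[string]node{
		"wf[0]":   {Phase: "Running", Children: []string{"wf[0].a", "wf[0].b", "wf[0].c"}},
		"wf[0].a": {Phase: "Running"},
		"wf[0].b": {Phase: "Pending"},
		"wf[0].c": {Phase: "Succeeded"},
	}
}

func main() {
	nodes := sampleNodes()
	limit := int64(2)
	fmt.Println(countActiveSiblings(nodes, "wf[0]"))        // active siblings
	fmt.Println(parallelismReached(nodes, "wf[0]", &limit)) // limit of 2
	fmt.Println(parallelismReached(nodes, "wf[0]", nil))    // no limit configured
}
```

With a limit of 2 and two active siblings, a third sibling would have to wait until one of them finishes.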
workflow/controller/operator.go
Outdated
@@ -866,8 +887,9 @@ func (woc *wfOperationCtx) getLastChildNode(node *wfv1.NodeStatus) (*wfv1.NodeSt
 // for the created node (if created). Nodes may not be created if parallelism or deadline exceeded.
 // nodeName is the name to be used as the name of the node, and boundaryID indicates which template
 // boundary this node belongs to.
-func (woc *wfOperationCtx) executeTemplate(templateName string, args wfv1.Arguments, nodeName string, boundaryID string) (*wfv1.NodeStatus, error) {
-	woc.log.Debugf("Evaluating node %s: template: %s", nodeName, templateName)
+func (woc *wfOperationCtx) executeTemplate(templateName string, args wfv1.Arguments, nodeName string, boundaryID string, parentName string) (*wfv1.NodeStatus, error) {
It shouldn't be necessary to change the method signature to include parentName. parentName is the same as woc.wf.Status.Nodes[boundaryID].Name.
Thanks for the reply. I tried to use boundaryID to find the parent node, but found that the boundary is not necessarily the parent. The example workflow I used in issue #1035 shows this.
The boundaryID of seq-step (template B) is the top node (template A), while its parent node is a StepGroup node that sits between it and the top node. I could find the children nodes only through the parent node, not through the top node.
I'm afraid I can't express it precisely, so let me show the figure.
Some related logs are here if it helps (see the lines starting with "Evaluating node "):
time="2018-10-27T03:53:03Z" level=info msg="Processing workflow" namespace=default workflow=seq-test-ms4fj
time="2018-10-27T03:53:03Z" level=info msg="Updated phase -> Running" namespace=default workflow=seq-test-ms4fj
time="2018-10-27T03:53:03Z" level=debug msg="Evaluating node seq-test-ms4fj: template: A, boundaryID: , parentName: " namespace=default workflow=seq-test-ms4fj
time="2018-10-27T03:53:03Z" level=info msg="tmpl type: Steps, parallelism:0xc4207ea8c0, node:<nil>" namespace=default workflow=seq-test-ms4fj
time="2018-10-27T03:53:03Z" level=info msg="Steps node seq-test-ms4fj (seq-test-ms4fj) initialized Running" namespace=default workflow=seq-test-ms4fj
time="2018-10-27T03:53:03Z" level=info msg="StepGroup node seq-test-ms4fj[0] (seq-test-ms4fj-3412503640) initialized Running" namespace=default workflow=seq-test-ms4fj
time="2018-10-27T03:53:03Z" level=error msg="shouldExecute , proceed: true, error: <nil>" namespace=default workflow=seq-test-ms4fj
time="2018-10-27T03:53:03Z" level=debug msg="Evaluating node seq-test-ms4fj[0].seq-step(0:a): template: B, boundaryID: seq-test-ms4fj, parentName: seq-test-ms4fj[0]" namespace=default workflow=seq-test-ms4fj
time="2018-10-27T03:53:03Z" level=info msg="tmpl type: Steps, parallelism:<nil>, node:<nil>" namespace=default workflow=seq-test-ms4fj
time="2018-10-27T03:53:03Z" level=debug msg="counted 0/1 active children in boundary seq-test-ms4fj of parent seq-test-ms4fj[0]" namespace=default workflow=seq-test-ms4fj
time="2018-10-27T03:53:03Z" level=info msg="Steps node seq-test-ms4fj[0].seq-step(0:a) (seq-test-ms4fj-643033778) initialized Running" namespace=default workflow=seq-test-ms4fj
time="2018-10-27T03:53:03Z" level=info msg="StepGroup node seq-test-ms4fj[0].seq-step(0:a)[0] (seq-test-ms4fj-3735306292) initialized Running" namespace=default workflow=seq-test-ms4fj
time="2018-10-27T03:53:03Z" level=error msg="shouldExecute , proceed: true, error: <nil>" namespace=default workflow=seq-test-ms4fj
time="2018-10-27T03:53:03Z" level=debug msg="Evaluating node seq-test-ms4fj[0].seq-step(0:a)[0].jobs(0:1): template: one-job, boundaryID: seq-test-ms4fj-643033778, parentName: seq-test-ms4fj[0].seq-step(0:a)[0]" namespace=default workflow=seq-test-ms4fj
time="2018-10-27T03:53:03Z" level=info msg="tmpl type: Container, parallelism:<nil>, node:<nil>" namespace=default workflow=seq-test-ms4fj
time="2018-10-27T03:53:03Z" level=debug msg="Executing node seq-test-ms4fj[0].seq-step(0:a)[0].jobs(0:1) with container template: &{one-job {[{seq-id <nil> 0xc42088bdc0 <nil> }] []} {[] [] <nil>} map[] nil {map[] map[]} <nil> [] &Container{Name:,Image:alpine,Command:[/bin/sh -c],Args:[echo a; sleep 10],WorkingDir:,Ports:[],Env:[],Resources:ResourceRequirements{Limits:ResourceList{},Requests:ResourceList{},},VolumeMounts:[],LivenessProbe:nil,ReadinessProbe:nil,Lifecycle:nil,TerminationMessagePath:,ImagePullPolicy:,SecurityContext:nil,Stdin:false,StdinOnce:false,TTY:false,EnvFrom:[],TerminationMessagePolicy:,VolumeDevices:[],} <nil> <nil> <nil> <nil> [] <nil> <nil> <nil> <nil> []}\n" namespace=default workflow=seq-test-ms4fj
time="2018-10-27T03:53:03Z" level=debug msg="Creating Pod: seq-test-ms4fj[0].seq-step(0:a)[0].jobs(0:1) (seq-test-ms4fj-2529398666)" namespace=default workflow=seq-test-ms4fj
time="2018-10-27T03:53:03Z" level=info msg="Created pod: seq-test-ms4fj[0].seq-step(0:a)[0].jobs(0:1) (seq-test-ms4fj-2529398666)" namespace=default workflow=seq-test-ms4fj
time="2018-10-27T03:53:03Z" level=info msg="Pod node seq-test-ms4fj[0].seq-step(0:a)[0].jobs(0:1) (seq-test-ms4fj-2529398666) initialized Pending" namespace=default workflow=seq-test-ms4fj
time="2018-10-27T03:53:03Z" level=error msg="shouldExecute , proceed: true, error: <nil>" namespace=default workflow=seq-test-ms4fj
time="2018-10-27T03:53:03Z" level=debug msg="Evaluating node seq-test-ms4fj[0].seq-step(0:a)[0].jobs(1:2): template: one-job, boundaryID: seq-test-ms4fj-643033778, parentName: seq-test-ms4fj[0].seq-step(0:a)[0]" namespace=default workflow=seq-test-ms4fj
time="2018-10-27T03:53:03Z" level=info msg="tmpl type: Container, parallelism:<nil>, node:<nil>" namespace=default workflow=seq-test-ms4fj
time="2018-10-27T03:53:03Z" level=debug msg="Executing node seq-test-ms4fj[0].seq-step(0:a)[0].jobs(1:2) with container template: &{one-job {[{seq-id <nil> 0xc420b09c30 <nil> }] []} {[] [] <nil>} map[] nil {map[] map[]} <nil> [] &Container{Name:,Image:alpine,Command:[/bin/sh -c],Args:[echo a; sleep 10],WorkingDir:,Ports:[],Env:[],Resources:ResourceRequirements{Limits:ResourceList{},Requests:ResourceList{},},VolumeMounts:[],LivenessProbe:nil,ReadinessProbe:nil,Lifecycle:nil,TerminationMessagePath:,ImagePullPolicy:,SecurityContext:nil,Stdin:false,StdinOnce:false,TTY:false,EnvFrom:[],TerminationMessagePolicy:,VolumeDevices:[],} <nil> <nil> <nil> <nil> [] <nil> <nil> <nil> <nil> []}\n" namespace=default workflow=seq-test-ms4fj
time="2018-10-27T03:53:03Z" level=debug msg="Creating Pod: seq-test-ms4fj[0].seq-step(0:a)[0].jobs(1:2) (seq-test-ms4fj-765137104)" namespace=default workflow=seq-test-ms4fj
time="2018-10-27T03:53:03Z" level=info msg="Created pod: seq-test-ms4fj[0].seq-step(0:a)[0].jobs(1:2) (seq-test-ms4fj-765137104)" namespace=default workflow=seq-test-ms4fj
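To make the boundary/parent distinction from the reply above concrete, here is a small sketch that rebuilds the first three nodes from these logs and looks up both the boundary node and the actual parent of seq-step. The node names and IDs are copied from the log lines; the `node` struct, the `parentOf` helper, and the assumption that the StepGroup's own boundary is the top node are illustrative and not taken from the PR.

```go
package main

import "fmt"

// node is a hypothetical, simplified stand-in for wfv1.NodeStatus.
type node struct {
	ID         string
	Name       string
	Type       string
	BoundaryID string
	Children   []string // IDs of child nodes
}

// parentOf returns the node that lists id among its children.
func parentOf(nodes map[string]node, id string) (node, bool) {
	for _, n := range nodes {
		for _, c := range n.Children {
			if c == id {
				return n, true
			}
		}
	}
	return node{}, false
}

// sampleNodes mirrors the top node, its StepGroup, and seq-step from the logs.
func sampleNodes() map[string]node {
	return map[string]node{
		"seq-test-ms4fj": {
			ID: "seq-test-ms4fj", Name: "seq-test-ms4fj", Type: "Steps",
			Children: []string{"seq-test-ms4fj-3412503640"},
		},
		"seq-test-ms4fj-3412503640": {
			ID: "seq-test-ms4fj-3412503640", Name: "seq-test-ms4fj[0]", Type: "StepGroup",
			BoundaryID: "seq-test-ms4fj", // assumed; the logs do not print it for StepGroups
			Children:   []string{"seq-test-ms4fj-643033778"},
		},
		"seq-test-ms4fj-643033778": {
			ID: "seq-test-ms4fj-643033778", Name: "seq-test-ms4fj[0].seq-step(0:a)", Type: "Steps",
			BoundaryID: "seq-test-ms4fj",
		},
	}
}

func main() {
	nodes := sampleNodes()
	step := nodes["seq-test-ms4fj-643033778"]

	boundary := nodes[step.BoundaryID]
	parent, _ := parentOf(nodes, step.ID)

	fmt.Println("boundary:", boundary.Name) // the top node (template A)
	fmt.Println("parent:  ", parent.Name)   // the StepGroup in between
}
```

In this sketch, looking up Nodes[boundaryID].Name yields the top node, whereas the parent found through the Children links is the StepGroup node, which matches the boundaryID and parentName values printed in the "Evaluating node" log lines.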
workflow/controller/operator.go
Outdated
boundaryTemplate := woc.wf.GetTemplate(boundaryNode.TemplateName)
if boundaryTemplate.Parallelism != nil {
	// for stepgroups, parent is different from boundary
	activeSiblings := woc.countActiveChildren(parentName)
It looks like it's possible to miscalculate the parallelism since we are checking active pods independently of child steps/dag template invocations, and not summing up the counts of each to compare against the parallelism limit. I think the calculation needs to be the summation of both pods, as well as dag/step templates for the parallelism calculation to be accurate.
Good idea. I combined the two counts.
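The miscalculation raised in this thread can be illustrated with a tiny sketch. The two functions below are hypothetical and only contrast the two ways of comparing counts against a limit; whether the earlier revision of the PR compared the counts in exactly this way is an assumption.

```go
package main

import "fmt"

// separateChecksAllow compares each count against the limit on its own.
// This can admit a new node even though the combined load already hits the limit.
func separateChecksAllow(activePods, activeTemplates, limit int64) bool {
	return activePods < limit && activeTemplates < limit
}

// summedCheckAllows compares the sum of both counts against the limit.
func summedCheckAllows(activePods, activeTemplates, limit int64) bool {
	return activePods+activeTemplates < limit
}

func main() {
	// One active pod plus one active steps/DAG invocation, with a limit of 2.
	var pods, templates, limit int64 = 1, 1, 2
	fmt.Println("separate checks allow another node:", separateChecksAllow(pods, templates, limit))
	fmt.Println("summed check allows another node:  ", summedCheckAllows(pods, templates, limit))
}
```

With one active pod and one active template invocation under a limit of 2, the separate comparison would still admit a third node, while the summed comparison would not.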
workflow/controller/operator.go
Outdated
@@ -495,6 +495,27 @@ func (woc *wfOperationCtx) countActivePods(boundaryIDs ...string) int64 {
 	return activePods
 }
 
+// countActiveChildren counts the number of active (Pending/Running) children nodes of parent parentName
+func (woc *wfOperationCtx) countActiveChildren(parentName string) int64 {
I think countActiveChildren() should replace the use of countActivePods() and sum the number of all NodeTypePod, NodeTypeSteps, NodeTypeDAG nodes within a boundaryID as an aggregate. It can have similar logic to countActivePods() but modified slightly. Something like:
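var activeChildren int64 // int64 to match the function's return type
for _, node := range woc.wf.Status.Nodes {
	// An empty boundaryID matches nodes in every boundary.
	if boundaryID != "" && node.BoundaryID != boundaryID {
		continue
	}
	// Only pods and steps/DAG template invocations count toward the limit.
	switch node.Type {
	case wfv1.NodeTypePod, wfv1.NodeTypeSteps, wfv1.NodeTypeDAG:
	default:
		continue
	}
	switch node.Phase {
	case wfv1.NodePending, wfv1.NodeRunning:
		activeChildren++
	}
}
return activeChildren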
Then, the existing calls to countActivePods() in checkParallelism() would be replaced with the call to countActiveChildren().
We would then only use countActivePods() when checking against the global parallelism limit.
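For readers who want to try the suggestion outside the controller, below is a self-contained sketch of the same counting logic. The `NodeStatus`, `NodeType`, and `NodePhase` types here are simplified stand-ins defined only for this example (they are not the real `wfv1` types), and `sampleNodes` with its node and boundary names is invented test data.

```go
package main

import "fmt"

// Simplified stand-ins for the wfv1 types; defined only for this sketch.
type NodeType string
type NodePhase string

const (
	NodeTypePod       NodeType = "Pod"
	NodeTypeSteps     NodeType = "Steps"
	NodeTypeStepGroup NodeType = "StepGroup"
	NodeTypeDAG       NodeType = "DAG"

	NodePending   NodePhase = "Pending"
	NodeRunning   NodePhase = "Running"
	NodeSucceeded NodePhase = "Succeeded"
)

type NodeStatus struct {
	Name       string
	Type       NodeType
	Phase      NodePhase
	BoundaryID string
}

// countActiveChildren sums the Pending/Running Pod, Steps, and DAG nodes
// within boundaryID. An empty boundaryID matches every boundary.
func countActiveChildren(nodes map[string]NodeStatus, boundaryID string) int64 {
	var active int64
	for _, n := range nodes {
		if boundaryID != "" && n.BoundaryID != boundaryID {
			continue
		}
		switch n.Type {
		case NodeTypePod, NodeTypeSteps, NodeTypeDAG:
		default:
			continue // StepGroup and other node types are not counted
		}
		switch n.Phase {
		case NodePending, NodeRunning:
			active++
		}
	}
	return active
}

// sampleNodes is invented data: an outer boundary "outer" with one nested
// steps invocation whose own boundary is "inner".
func sampleNodes() map[string]NodeStatus {
	return map[string]NodeStatus{
		"group": {Name: "wf[0]", Type: NodeTypeStepGroup, Phase: NodeRunning, BoundaryID: "outer"},
		"inner": {Name: "wf[0].a", Type: NodeTypeSteps, Phase: NodeRunning, BoundaryID: "outer"},
		"pod-b": {Name: "wf[0].b", Type: NodeTypePod, Phase: NodePending, BoundaryID: "outer"},
		"pod-c": {Name: "wf[0].c", Type: NodeTypePod, Phase: NodeSucceeded, BoundaryID: "outer"},
		"pod-d": {Name: "wf[0].a[0].d", Type: NodeTypePod, Phase: NodeRunning, BoundaryID: "inner"},
	}
}

func main() {
	nodes := sampleNodes()
	fmt.Println("active in outer:", countActiveChildren(nodes, "outer"))
	fmt.Println("active in inner:", countActiveChildren(nodes, "inner"))
}
```

Filtering by BoundaryID rather than walking Children links sidesteps the StepGroup layer entirely, since the StepGroup node shares the boundary of the steps it groups but is skipped by the type switch.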
Thanks! Finding children nodes with the same boundaryID should solve my problem in #1046 (comment). (I was stuck trying to use the Children field to find children 😅). I'll let you know if it works.
It's working! 😀
Update: After the changes, thanks to @jessesuen's suggestion, parallelism for nested DAG is supported too!
* Updated ARTIFACT_REPO.md (argoproj#1049) * Updated examples/README.md (argoproj#1051) * Support for K8s API based Executor (argoproj#1010) * Submodules are dirty after checkout -- need to update (argoproj#1052) * Parameter and Argument names should support snake case (argoproj#1048) * Add namespace explicitly to pod metadata (argoproj#1059) * Update dependencies to K8s v1.12 and client-go 9.0 * Adding SAP Hybris in Who uses Argo (argoproj#1064) * Add Cratejoy to list of users (argoproj#1063) * Raise not implemented error when artifact saving is unsupported (argoproj#1062) * Adding native GCS support for artifact storage and retrieval * Support nested steps workflow parallelism (argoproj#1046) * Auto-complete workflow names (argoproj#1061) * Auto-complete workflow names * Use cobra revision at fe5e611709b0c57fa4a89136deaa8e1d4004d053 * Fix string format arguments in workflow utilities. (argoproj#1070) * fix argoproj#1078 Azure AKS authentication issues (argoproj#1079) * Issue argoproj#740 - System level workflow parallelism limits & priorities (argoproj#1065) * Issue argoproj#740 - System level workflow parallelism limits & priorities * Apply reviewer notes * Add new article and minor edits. (argoproj#1083) * Update docs to outline bare minimum set of privileges for a workflow * Use relative links on README file (argoproj#1087) * Fix typo in demo.md (argoproj#1089) Fix a small typo in demo.md that I encounted when reading through the getting started guide. * Drop reference to removed `argo install` command. (argoproj#1074) * Initialize child node before marking phase. Fixes panic on invalid `When` (argoproj#1075) * argoproj#1081 added retry logic to s3 load and save function (argoproj#1082) * adding logo to be used by the OS Site (argoproj#1099) * Update ROADMAP.md * Update docs with examples using the K8s REST API * Issue argoproj#1114 - Set FORCE_NAMESPACE_ISOLATION env variable in namespace install manifests (argoproj#1116) * Fix examples docs of parameters. 
(argoproj#1110) * Remove docker_lib mount volume which is not needed anymore (argoproj#1115) * Remove docker_lib mount volume which is not needed anymore * Remove unused hostPathDir * add support for ppc64le and s390x (argoproj#1102) * Install mime-support in argoexec to set proper mime types for S3 artifacts (resolves argoproj#1119) * Adding Quantibio in Who uses Argo (argoproj#1111) * Adding Quantibio in Who uses Argo * fix spelling mistake * Fix output artifact and parameter conflict (argoproj#1125) `SaveArtifacts` deletes the files that `SaveParameters` might still need, so we're calling `SaveParameters` first. Fixes argoproj#1124 * Update generated swagger to fix verify-codegen (argoproj#1131) * Allow owner reference to be set in submit util (argoproj#1120) * Issue argoproj#1104 - Remove container wait timeout from 'argo logs --follow' (argoproj#1142) * Issue argoproj#1132 - Fix panic in ttl controller (argoproj#1143) * Issue argoproj#1040 - Kill daemoned step if workflow consist of single daemoned step (argoproj#1144) * Fix global artifact overwriting in nested workflow (argoproj#1086) * Fix issue where steps with exhausted retires would not complete (argoproj#1148) * add support for other archs (argoproj#1137) * Reflect minio chart changes in documentation (argoproj#1147) * Issue argoproj#1136 - Fix metadata for DAG with loops (argoproj#1149) * Issue argoproj#1136 - Fix metadata for DAG with loops * Add slack badge to README (argoproj#1164) * Fix failing TestAddGlobalArtifactToScope unit test * Fix tests compilation error (argoproj#1157) * Replace exponential retry with poll (argoproj#1166) * add support for hostNetwork & dnsPolicy config (argoproj#1161) * Support HDFS Artifact (argoproj#1159) Support HDFS Artifact (argoproj#1159) * Update codegen for network config (argoproj#1168) * Add GitHub to users in README.md (argoproj#1151) * Add Preferred Networks to users in README.md (argoproj#1172) * Add missing patch in namespace kustomization.yaml 
(argoproj#1170) * Validate ArchiveLocation artifacts (argoproj#1167) * Update README and preview notice in CLA. * Update README. (argoproj#1173) (argoproj#1176) * Argo users: Equinor (argoproj#1175) * Do not mount unnecessary docker socket (argoproj#1178) * Issue argoproj#1113 - Wait for daemon pods completion to handle annotations (argoproj#1177) * Issue argoproj#1113 - Wait for daemon pods completion to handle annotations * Add output artifacts to influxdb-ci example * Increased S3 artifact retry time and added log (argoproj#1138) * Issue argoproj#1123 - Fix 'kubectl get' failure if resource namespace is different from workflow namespace (argoproj#1171) * Refactor Makefile/Dockerfile to remove volume binding in favor of build stages (argoproj#1189) * Add Docker Hub build hooks * Add documentation how to use parameter-file's (argoproj#1191) * Issue argoproj#988 - Submit should not print logs to stdout unless output is 'wide' (argoproj#1192) * Fix missing docker binary in argoexec image. Improve reuse of image layers * Fischerjulian adds ruby to rest docs (argoproj#1196) * Adds link to ruby kubernetes library. * Links to a ruby example on how to start a workflow * Updated OWNERS (argoproj#1198) * Update community/README (argoproj#1197) * Issue argoproj#1128 - Use polling instead of fs notify to get annotation changes (argoproj#1194) * Minor spelling, formatting, and style updates. 
(argoproj#1193) * Dockerfile: argoexec base image correction (fixes argoproj#1209) (argoproj#1213) * Set executor image pull policy for resource template (argoproj#1174) * Add schedulerName to workflow and template spec (argoproj#1184) * Issue argoproj#1190 - Fix incorrect retry node handling (argoproj#1208) * fix dag retries (argoproj#1221) * Executor can access the k8s apiserver with a out-of-cluster config file (argoproj#1134) Executor can access the k8s apiserver with a out-of-cluster config file * Update README with typo fixes (argoproj#1220) * Update README.md (argoproj#1236) * Remove extra quotes around output parameter value (argoproj#1232) Ensure we do not insert extra single quotes when using valueFrom: jsonPath to set the value of an output parameter for resource templates. Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com> * Update README.md (argoproj#1224) * Include stderr when retrieving docker logs (argoproj#1225) * Add Gardener to "Who uses Argo" (argoproj#1228) * Add feature to continue workflow on failed/error steps/tasks (argoproj#1205) * Fix the Prometheus address references (argoproj#1237) * Fixed Issue#1223 Kubernetes Resource action: patch is not supported (argoproj#1245) * Fixed Issue#1223 Kubernetes Resource action: patch is not supported This PR is fixed the Issue#1223 reported by @shanesiebken . Argo kubernetes resource workflow failed on patch action. --patch or -p option is required for kubectl patch action. This PR is including the manifest yaml as patch argument for kubectl. This Fix will support the Patch action in Argo kubernetes resource workflow. This Fix will support only JSON merge strategic in patch action * udpated formating * typo, executo -> executor (argoproj#1243) * Issue#1165 fake outputs don't notify and task completes successfully (argoproj#1247) * Issue#1165 fake outputs don't notify and task completes successfully This PR is addressing the Issue#1165 reported by @alexfrieden. 
Issue/Bug: Argo is finishing the task successfully even artifact /file does exist. Fix: Validate the created gzip contains artifact or file. if file/artifact doesn't exist, Current step/stage/task will be failed with log message . Sample Log: ''' INFO[0029] Updating node artifact-passing-lkvj8[0].generate-artifact (artifact-passing-lkvj8-1949982165) status Running -> Error INFO[0029] Updating node artifact-passing-lkvj8[0].generate-artifact (artifact-passing-lkvj8-1949982165) message: failed to save outputs: File or Artifact does not exist. /tmp/hello_world.txt INFO[0029] Step group node artifact-passing-lkvj8[0] (artifact-passing-lkvj8-1067333159) deemed failed: child 'artifact-passing-lkvj8-1949982165' failed namespace=default workflow=artifact-passing-lkvj8 INFO[0029] node artifact-passing-lkvj8[0] (artifact-passing-lkvj8-1067333159) phase Running -> Failed namespace=default workflow=artifact-passing-lkvj8 ''' * fixed gometalinter errcheck issue * Git cloning via SSH was not verifying host public key (argoproj#1261) * Update versions (argoproj#1218) * Proxy Priority and PriorityClassName to pods (argoproj#1179) * Error running 1000s of tasks: "etcdserver: request is too large" argoproj#1186 (argoproj#1264) * Error running 1000s of tasks: "etcdserver: request is too large" argoproj#1186 This PR is addressing the feature request argoproj#1186. Issue: Nodestatus element keeps growing for big workflow. Workflow will fail once the workflow total size reachs 1 MB (maz size limit in ETCD) . Solution: Compressing the Nodestatus once size reachs the 1 MB which increasing 60% to 80% more steps to execute in compress mode. Latest: Argo cli and Argo UI will able to decode and print nodestatus from compressednoode. 
Limitation: Kubectl willl not decode the compressedNode element * added Operator.go * revert the testing yaml * Fixed the lint issue * fixed * fixed lint * Fixed Testcase * incorporated the review comments * Reverted the change * incorporated review comments * fixing gometalinter checks * incorporated review comments * Update pod-limits.yaml * updated few comments * updated error message format * reverted unwanted files * Reduce redundancy pod label action (argoproj#1271) * Add the `mergeStrategy` option to resource patching (argoproj#1269) * This adds the ability to pass a mergeStrategy to a patch resource. this is valuable because the default merge strategy for kubernetes is 'strategic', which does not work with Custom Resources. * This also updates the resource example to demonstrate how it is used * Fix bug with DockerExecutor's CopyFile (argoproj#1275) The check to see if the source path was in the tgz archive was wrong when source path was a folder, the arguments to strings.Contains were inverted. * Add workflow labels and annotations global vars (argoproj#1280) * Argo CI is current inactive (argoproj#1285) * Issue#896 Workflow steps with non-existant output artifact path will succeed (argoproj#1277) * Issue#896 Workflow steps with non-existant output artifact path will succeed Issue: argoproj#897 Solution: Added new element "optional" in Artifact. The default is false. This flag will make artifact as optional and existence check will be ignored if input/output artifact has optional=true. 
Output Artifact ( optional=true ): Artifact existence check will be ignored during the save artifact in destination and continued workflow Input Artifact ( optional=true ): Artifact exist check will be ignored during load artifact from source and continued workflow * added end of line * removed unwanted whitespace * Deleted test code * go formatted * added formatting directives * updated Codegen * Fixed format on merge conflict * format fix * updated comments * improved error case * Fix for Resource creation where template has same parameter templating (argoproj#1283) * Fix for Resource creation where template has same parameter templating This PR will enable to support the custom template variable reference. Soulltion: Workflow variable reference resolve will check the Workflow variable prefix. * added test * fixed gofmt issue * fixed format * fixed gofmt on common.go * fixed testcase * fixed gofmt * Added unit testcase and documented * fixed Gofmt format * updated comments * Admiralty: add link to blog post, add user (argoproj#1295) * Add dns config support (argoproj#1301) * Speed up podReconciliation using parallel goroutine (argoproj#1286) * Speed up podReconciliation using parallel goroutine * Fix make lint issue * put checkandcompress back * Add community meeting notes link (argoproj#1304) * Add Karius to users in README.md (argoproj#1305) * Added support for artifact path references (argoproj#1300) * Added support for artifact path references Adds new `{{inputs.artifacts.<NAME>.path}}` and `{{outputs.artifacts.<NAME>.path}}` placeholders. 
* Add support for init containers (argoproj#1183) * Secrets should be passed to pods using volumes instead of API calls (argoproj#1302) * Secrets should be passed to pods using downward API instead of API calls * Fixed Gogfmt format * fixed file close Gofmt * updated review comments * fixed gofmt * updated review comments * CheckandEstimate implementation to optimize podReconciliation (argoproj#1308) * CheckandEstimate implementation * fixed variable rename * fixed gofmt * fixed feedbacks * Update operator.go * Update operator.go * Add alibaba cloud to officially using argo list (argoproj#1313) * Refactor checkandEstimate to optimize podReconciliation (argoproj#1311) * Refactor checkandEstimate to optimize podReconciliation * Move compress function to persistUpdates * Fix formatting issues in examples documentation (argoproj#1310) * Fix nil pointer dereference with secret volumes (argoproj#1314) * Archive location should conditionally be added to template only when needed * Fix SIGSEGV in watch/CheckAndDecompress. Consolidate duplicate code (resolves argoproj#1315) * Implement support for PNS (Process Namespace Sharing) executor (argoproj#1214) * Implements PNS (Process Namespace Sharing) executor * Adds limited support for Kubelet/K8s API artifact collection by mirroring volume mounts to wait sidecar * Adds validation to detect when output artifacts are not supported by the executor * Adds ability to customize executor from workflow-controller-configmap (e.g. 
add environment variables, append command line args such as loglevel) * Fixes an issue where daemon steps were not getting terminated properly * Reorganize manifests to kustomize 2 and update version to v2.3.0-rc1 * Update v2.3.0 CHANGELOG.md * Export the methods of `KubernetesClientInterface` (argoproj#1294) All calls to these methods previously generated a panic at runtime because the calls resolved to the default, panic-always implementation, not to the overrides provided by `k8sAPIClient` and `kubeletClient`. Embedding an exported interface with unexported methods into a struct is the only way to implement that interface in another package. When doing this, the compiler generates default, panic-always implementations for all methods from the interface. Implementors can override exported methods, but it's not possible to override an unexported method from the interface. All invocations that go through the interface will come to the default implementation, even if the struct tries to provide an override. * Update README.md (argoproj#1321) * Issue1316 Pod creation with secret volumemount (argoproj#1318) * CheckandEstimate implementation * fixed variable rename * fixed gofmt * fixed feedbacks * Fixed the duplicate mountpath issue * Support parameter substitution in the volumes attribute (argoproj#1238) * `argo list` was not displaying non-zero priorities correctly * Fix regression where argoexec wait would not return when podname was too long * wait will conditionally become privileged if main/sidecar privileged (resolves argoproj#1323) * Update version to v2.3.0-rc2. Update changelog * Add documentation on releasing * use a secret selector for getting credentials * fixing build issues * linter issues * fixing jenkinsfile(?) * jenkins * jenkins * jenkins * jenkins * jenkins? 
* jenkins :( * jenkins :( * jenkins * jenkins * jenkins * jenkins * gopkg * use GetSecretFromVolMount instead of GetSecrets * actually build argoexec * Fix argoproj#1340 parameter substitution bug Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com>
Limitation: Kubectl willl not decode the compressedNode element * added Operator.go * revert the testing yaml * Fixed the lint issue * fixed * fixed lint * Fixed Testcase * incorporated the review comments * Reverted the change * incorporated review comments * fixing gometalinter checks * incorporated review comments * Update pod-limits.yaml * updated few comments * updated error message format * reverted unwanted files * Reduce redundancy pod label action (argoproj#1271) * Add the `mergeStrategy` option to resource patching (argoproj#1269) * This adds the ability to pass a mergeStrategy to a patch resource. this is valuable because the default merge strategy for kubernetes is 'strategic', which does not work with Custom Resources. * This also updates the resource example to demonstrate how it is used * Fix bug with DockerExecutor's CopyFile (argoproj#1275) The check to see if the source path was in the tgz archive was wrong when source path was a folder, the arguments to strings.Contains were inverted. * Add workflow labels and annotations global vars (argoproj#1280) * Argo CI is current inactive (argoproj#1285) * Issue#896 Workflow steps with non-existant output artifact path will succeed (argoproj#1277) * Issue#896 Workflow steps with non-existant output artifact path will succeed Issue: argoproj#897 Solution: Added new element "optional" in Artifact. The default is false. This flag will make artifact as optional and existence check will be ignored if input/output artifact has optional=true. 
Output Artifact ( optional=true ): Artifact existence check will be ignored during the save artifact in destination and continued workflow Input Artifact ( optional=true ): Artifact exist check will be ignored during load artifact from source and continued workflow * added end of line * removed unwanted whitespace * Deleted test code * go formatted * added formatting directives * updated Codegen * Fixed format on merge conflict * format fix * updated comments * improved error case * Fix for Resource creation where template has same parameter templating (argoproj#1283) * Fix for Resource creation where template has same parameter templating This PR will enable to support the custom template variable reference. Soulltion: Workflow variable reference resolve will check the Workflow variable prefix. * added test * fixed gofmt issue * fixed format * fixed gofmt on common.go * fixed testcase * fixed gofmt * Added unit testcase and documented * fixed Gofmt format * updated comments * Admiralty: add link to blog post, add user (argoproj#1295) * Add dns config support (argoproj#1301) * Speed up podReconciliation using parallel goroutine (argoproj#1286) * Speed up podReconciliation using parallel goroutine * Fix make lint issue * put checkandcompress back * Add community meeting notes link (argoproj#1304) * Add Karius to users in README.md (argoproj#1305) * Added support for artifact path references (argoproj#1300) * Added support for artifact path references Adds new `{{inputs.artifacts.<NAME>.path}}` and `{{outputs.artifacts.<NAME>.path}}` placeholders. 
* Add support for init containers (argoproj#1183) * Secrets should be passed to pods using volumes instead of API calls (argoproj#1302) * Secrets should be passed to pods using downward API instead of API calls * Fixed Gogfmt format * fixed file close Gofmt * updated review comments * fixed gofmt * updated review comments * CheckandEstimate implementation to optimize podReconciliation (argoproj#1308) * CheckandEstimate implementation * fixed variable rename * fixed gofmt * fixed feedbacks * Update operator.go * Update operator.go * Add alibaba cloud to officially using argo list (argoproj#1313) * Refactor checkandEstimate to optimize podReconciliation (argoproj#1311) * Refactor checkandEstimate to optimize podReconciliation * Move compress function to persistUpdates * Fix formatting issues in examples documentation (argoproj#1310) * Fix nil pointer dereference with secret volumes (argoproj#1314) * Archive location should conditionally be added to template only when needed * Fix SIGSEGV in watch/CheckAndDecompress. Consolidate duplicate code (resolves argoproj#1315) * Implement support for PNS (Process Namespace Sharing) executor (argoproj#1214) * Implements PNS (Process Namespace Sharing) executor * Adds limited support for Kubelet/K8s API artifact collection by mirroring volume mounts to wait sidecar * Adds validation to detect when output artifacts are not supported by the executor * Adds ability to customize executor from workflow-controller-configmap (e.g. 
add environment variables, append command line args such as loglevel) * Fixes an issue where daemon steps were not getting terminated properly * Reorganize manifests to kustomize 2 and update version to v2.3.0-rc1 * Update v2.3.0 CHANGELOG.md * Export the methods of `KubernetesClientInterface` (argoproj#1294) All calls to these methods previously generated a panic at runtime because the calls resolved to the default, panic-always implementation, not to the overrides provided by `k8sAPIClient` and `kubeletClient`. Embedding an exported interface with unexported methods into a struct is the only way to implement that interface in another package. When doing this, the compiler generates default, panic-always implementations for all methods from the interface. Implementors can override exported methods, but it's not possible to override an unexported method from the interface. All invocations that go through the interface will come to the default implementation, even if the struct tries to provide an override. * Update README.md (argoproj#1321) * Issue1316 Pod creation with secret volumemount (argoproj#1318) * CheckandEstimate implementation * fixed variable rename * fixed gofmt * fixed feedbacks * Fixed the duplicate mountpath issue * Support parameter substitution in the volumes attribute (argoproj#1238) * `argo list` was not displaying non-zero priorities correctly * Fix regression where argoexec wait would not return when podname was too long * wait will conditionally become privileged if main/sidecar privileged (resolves argoproj#1323) * Update version to v2.3.0-rc2. Update changelog * Add documentation on releasing * use a secret selector for getting credentials * fixing build issues * linter issues * fixing jenkinsfile(?) * jenkins * jenkins * jenkins * jenkins * jenkins? 
* jenkins :( * jenkins :( * jenkins * jenkins * jenkins * jenkins * gopkg * use GetSecretFromVolMount instead of GetSecrets * actually build argoexec * Fix argoproj#1340 parameter substitution bug Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com> * fixing gcs upload method
* wip * wip * initial working version of gcs artifact storage * addressing pr feedback * updating codegen * wip * fixing issue with workflow saving * check to see if stat result is nil * adding a jenkinsfile * a small change which will hopefully speed up jenkins builds a lot * cleanup of docker push logic * cleanup of docker push logic * cleanup of docker push logic * cleanup of docker push logic * cleanup of docker push logic * changing the import path * preserving original link in a readme * use semantic version tagging (#9) * [CSE-11] adding config file loader (#10) * adding configmap loader * PR #10 should have been a minor version not a patch (#11) * adding autodeploy to jenkinsfile (#12) * fixing autodeployments (#13) * [CSE-13] extended error handling for workflows (#16) * wip * mixed case imports cause all sorts of problems, switch to lowercase * fixing build issu * fixing error deserialization * fixing error deserialization * unmatched string logic * make workflows fail on error trigger * properly evaluate workflow failures * dev version bump * serialize errors and warnings into wf crd * ugh go types * rewriting error handling to support file sources * temporarily commenting out a test * fixing warning handler * fixing error handling fixing executor fixing executor Fixing executor fixing executor fixing executor fixing executor asdf fixing executor fixing operator operator fixing executor cleanup * cleaning up types * updating codegen * fixing version * updating codegen * add podname and stage name to error result * version 2.5.0->2.4.0 * ErrorCondition->ExceptionCondition * codegen * [CS-14] merging UI update into rc (#18) * update node * UI tweaks * fixing a comment * versionbump * fixing build errors * [CSE-57] Upgrade argo (#24) * Updated ARTIFACT_REPO.md (argoproj#1049) * Updated examples/README.md (argoproj#1051) * Support for K8s API based Executor (argoproj#1010) * Submodules are dirty after checkout -- need to update (argoproj#1052) * Parameter 
and Argument names should support snake case (argoproj#1048) * Add namespace explicitly to pod metadata (argoproj#1059) * Update dependencies to K8s v1.12 and client-go 9.0 * Adding SAP Hybris in Who uses Argo (argoproj#1064) * Add Cratejoy to list of users (argoproj#1063) * Raise not implemented error when artifact saving is unsupported (argoproj#1062) * Adding native GCS support for artifact storage and retrieval * Support nested steps workflow parallelism (argoproj#1046) * Auto-complete workflow names (argoproj#1061) * Auto-complete workflow names * Use cobra revision at fe5e611709b0c57fa4a89136deaa8e1d4004d053 * Fix string format arguments in workflow utilities. (argoproj#1070) * fix argoproj#1078 Azure AKS authentication issues (argoproj#1079) * Issue argoproj#740 - System level workflow parallelism limits & priorities (argoproj#1065) * Issue argoproj#740 - System level workflow parallelism limits & priorities * Apply reviewer notes * Add new article and minor edits. (argoproj#1083) * Update docs to outline bare minimum set of privileges for a workflow * Use relative links on README file (argoproj#1087) * Fix typo in demo.md (argoproj#1089) Fix a small typo in demo.md that I encounted when reading through the getting started guide. * Drop reference to removed `argo install` command. (argoproj#1074) * Initialize child node before marking phase. Fixes panic on invalid `When` (argoproj#1075) * argoproj#1081 added retry logic to s3 load and save function (argoproj#1082) * adding logo to be used by the OS Site (argoproj#1099) * Update ROADMAP.md * Update docs with examples using the K8s REST API * Issue argoproj#1114 - Set FORCE_NAMESPACE_ISOLATION env variable in namespace install manifests (argoproj#1116) * Fix examples docs of parameters. 
* jenkins :( * jenkins :( * jenkins * jenkins * jenkins * jenkins * gopkg * use GetSecretFromVolMount instead of GetSecrets * actually build argoexec * Fix argoproj#1340 parameter substitution bug Signed-off-by: Ilias Katsakioris <elikatsis@arrikto.com> * fixing gcs upload method * disable autodeploy
* Adding the link to the video that explains Events Signed-off-by: Viktor Farcic <viktor@farcic.com>
Related issue: #1035
What is solved:
- Parallelism for nested steps (StepGroups)
What is not solved:
- Parallelism for nested DAG

This commit makes the parallelism set on an outer StepGroup workflow
limit the execution of its inner workflows. This is done by making
checkParallelism, when called by the inner workflow, check whether the number of
its running siblings (the nodes with the same parent node) is >=
the parent node's parallelism.
Update 2018/10/27: after the changes made at @jessesuen's suggestion, parallelism for nested DAG is supported too!
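The sibling-counting check described above can be illustrated with a minimal Go sketch. This is not the controller's actual code: the `Node` type, its fields, and the helpers `countRunningSiblings` and `parallelismReached` are hypothetical names chosen for illustration, and the real `checkParallelism` works on the workflow's node status map rather than on this simplified structure.

```go
package main

// NodePhase is a simplified stand-in for a workflow node phase (assumption).
type NodePhase string

const (
	NodeRunning   NodePhase = "Running"
	NodeSucceeded NodePhase = "Succeeded"
)

// Node is a hypothetical, pared-down node record used only for this sketch.
type Node struct {
	Name        string
	ParentName  string    // empty for the root node
	Phase       NodePhase // current phase of the node
	Parallelism *int64    // nil means the node sets no limit on its children
}

// countRunningSiblings returns how many nodes with the given parent are running.
func countRunningSiblings(nodes map[string]Node, parentName string) int64 {
	var running int64
	for _, n := range nodes {
		if n.ParentName == parentName && n.Phase == NodeRunning {
			running++
		}
	}
	return running
}

// parallelismReached reports whether a new child of parentName has to wait:
// the number of running children is >= the parent's parallelism.
// A missing parent or an unset limit never blocks.
func parallelismReached(nodes map[string]Node, parentName string) bool {
	parent, ok := nodes[parentName]
	if !ok || parent.Parallelism == nil {
		return false
	}
	return countRunningSiblings(nodes, parentName) >= *parent.Parallelism
}
```

With a parent whose parallelism is 2, one running child leaves room for another, while a second running child makes `parallelismReached` return true, so further children would be held back until a sibling finishes.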