Onetechnical/beta2.0.14 #1116

Merged
algojohnlee merged 296 commits into algorand:rel/beta from onetechnical/beta2.0.14
Jun 2, 2020

@onetechnical onetechnical commented Jun 1, 2020

tsachiherman and others added 30 commits January 2, 2020 18:26
…679)

* Gracefully exit a node that is waiting on a WaitForBlock call.

* Add comment.
* Remove tput where not supported by terminal.

* send tput errors to dev/null

* Fix bad constants.
-- Adding receiver function to transaction that returns the receiver of a transaction
-- Fix indexer to show received transactions
* Better lockFile error handling.

* Make blocking locker.

* Fix F_OFD_GETLK constant.

* bugfix.

* Try platform specific code.

* use unix package to include F_OFD_SETLKW

* remove unused imports.

* Rename files.
* A fix for arm64 failures

One observation from the failures is that test timeouts could be the cause.
Expect scripts behave strangely (slowly) when called from go test using CombinedOutput.
Replacing CombinedOutput with Run.
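
As a rough illustration of the CombinedOutput-to-Run swap described above, here is a minimal sketch. The helper name `runScript` is hypothetical and not taken from the repository; only `exec.Command` and `Cmd.Run` are real standard-library calls.

```go
package main

import (
	"fmt"
	"os/exec"
)

// runScript is a hypothetical helper. It runs a command and reports only
// whether it succeeded, without collecting the command's output.
func runScript(name string, args ...string) error {
	cmd := exec.Command(name, args...)
	// Run starts the command and waits for it to finish. Stdout and Stderr
	// are left nil, so output is discarded rather than buffered in memory
	// the way CombinedOutput would buffer it.
	if err := cmd.Run(); err != nil {
		return fmt.Errorf("running %s: %w", name, err)
	}
	return nil
}

func main() {
	// A binary that should not exist, so the error path is visible.
	err := runScript("definitely-not-a-real-binary-xyz")
	fmt.Println("error:", err)
}
```

The trade-off is that the script's output is no longer available for failure messages; whether that matters depends on how the tests report errors.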

* DRAFT: this PR is a draft to experiment with test failures on ARM systems.

Disabling tests that fail sporadically on mac on ARM as well.
Adding a utility to control test skips.

adding missing file and change.

* DRAFT: this PR is a draft to experiment with test failures on ARM systems.

Disabling tests that fail sporadically on mac on ARM as well.
Adding a utility to control test skips.

adding missing file and change.
Fixing errors and adding comments.

* Fixing merge and comment.

* added comment
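
A utility for controlling test skips, as mentioned in the draft commits above, could look roughly like the following. This is a sketch under assumptions: the `flakyPlatforms` table and the `shouldSkip` name are invented for illustration and do not come from the repository.

```go
package main

import (
	"fmt"
	"runtime"
)

// flakyPlatforms lists GOOS or GOARCH values on which sporadically failing
// tests are skipped. The entries are assumptions chosen for illustration.
var flakyPlatforms = map[string]bool{
	"darwin": true, // macOS
	"arm":    true,
	"arm64":  true,
}

// shouldSkip reports whether a flaky test should be skipped for the given
// operating system and architecture.
func shouldSkip(goos, goarch string) bool {
	return flakyPlatforms[goos] || flakyPlatforms[goarch]
}

func main() {
	fmt.Println("skip on this machine:", shouldSkip(runtime.GOOS, runtime.GOARCH))
	fmt.Println("skip on linux/arm64:", shouldSkip("linux", "arm64"))
	fmt.Println("skip on linux/amd64:", shouldSkip("linux", "amd64"))
}
```

Inside a test, the check would typically guard a call such as `t.Skip("flaky on this platform")`.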

* Stop catchup on unapproved protocol round

Catchup to stop before fetching the next round if the round protocol is not approved by the node

* Some fixes. Review comments from Tsachi.

* File accidentally added here. removing.

* Reverting changes mistakenly added to this branch.

* Adding comment changes.

* Partially working test

* Adding test to catchup stop on unsupported block
Using s.cancel we are dropping the last block.

* More tests and development to the catchup service

* Stop the catchup before fetching the round with an unapproved protocol.

The catchup service saves the round at which an unapproved protocol
update will take effect. Then, before creating a task to fetch a round,
it checks whether the next round is the one where the unapproved protocol
begins and, if so, stops the catchup service.

The ledger should have the round with NextProtocolSwitchOn to stop
the un-approved round from getting fetched.

The added test covers the edge cases which may or may not happen
when the service runs.

* Stop the catchup before fetching the round with an unapproved protocol.

The catchup service saves the round at which an unapproved protocol
update will take effect. Then, before creating a task to fetch a round,
it checks whether the next round is the one where the unapproved protocol
begins and, if so, stops the catchup service.

The ledger should have the round with NextProtocolSwitchOn to stop
the un-approved round from getting fetched.

The added test covers the edge cases which may or may not happen
when the service runs.

Addressing Tsachi's review comments.

* Combine condition blocks

* Fixing an error in the log info statement.

* Draft:

Test for upgrading a node while keeping another node not upgradable
goal node status field for informing if the node is upgradable

* Catchup service stop on unsupported, node status message, and e2e test

In this change:
Updated catchup service to stop on unsupported and not unupgradable.
Updated goal node status to inform when the catchup service is stopped.
Updated goal node status by removing last synced information.
Added e2e test for stopped catchup service on unsupported protocol.

* Separating goal changes from this PR.

Separating goal changes from this PR.
goal changes are in PR: #686

* review comment: use NotEqual instead of True
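
The stop condition described in the catchup commits above can be sketched as follows. This is a simplified model, not the service's actual code: the `Round` type, `catchupGate`, and both method names are hypothetical, and only the idea (remember the round where an unsupported protocol begins, then refuse to fetch it) comes from the commit messages.

```go
package main

import "fmt"

// Round is a hypothetical stand-in for the ledger's round number type.
type Round uint64

// catchupGate remembers the round at which an unsupported protocol begins.
type catchupGate struct {
	unsupportedFrom Round // meaningful only when known is true
	known           bool
}

// noteUpgrade records an upcoming protocol switch. nextProtocolSwitchOn
// mirrors the NextProtocolSwitchOn value mentioned in the commit message;
// supported says whether this node's binary implements the next protocol.
func (g *catchupGate) noteUpgrade(nextProtocolSwitchOn Round, supported bool) {
	if !supported && !g.known {
		g.unsupportedFrom = nextProtocolSwitchOn
		g.known = true
	}
}

// shouldFetch is consulted before creating a fetch task for round r.
func (g *catchupGate) shouldFetch(r Round) bool {
	return !g.known || r < g.unsupportedFrom
}

func main() {
	var g catchupGate
	g.noteUpgrade(100, false)
	for r := Round(98); r <= 101; r++ {
		fmt.Printf("round %d: fetch=%v\n", r, g.shouldFetch(r))
	}
}
```

Checking before the fetch task is created, rather than after the block arrives, means the node never downloads a block it cannot validate.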
* Updates to the goal node status

This change is splitting the goal section from PR: #685
Updated goal node status to inform when the catchup service is stopped.
Updated goal node status by removing "Synced Since Startup" field.

* Adding parameter StoppedAtUnsupportedRound to v1.NodeStatus and node.StatusReport

* Adding check to libgoal Client StoppedAtUnsupportedRound in v1.NodeStatus true and false values.

* Review comments from Tsachi: using the timeout in select

* Updating the test to reflect the removal of: has synced since startup.
config.json: {"TelemetryToLog":true}
logging.config: {"Enable":false,"SendToLog":true}
* relax StartNetwork regex.

* Another attempt.
…conflict (#697)

* Updates to the goal node status

This change is splitting the goal section from PR: #685
Updated goal node status to inform when the catchup service is stopped.
Updated goal node status by removing "Synced Since Startup" field.

* Adding parameter StoppedAtUnsupportedRound to v1.NodeStatus and node.StatusReport

* Adding check to libgoal Client StoppedAtUnsupportedRound in v1.NodeStatus true and false values.

* Review comments from Tsachi: using the timeout in select

* Two fixes to basicCatchup_test: cloned node not terminated and env var collision

1) TestBasicCatchup and the newly added TestStoppedCatchupOnUnsupported
create a new node by cloning one of the network nodes. When
fixture.Shutdown() stops the original network nodes, it leaves the cloned
node running. This change adds the function shutDownClonedNode to stop the
cloned nodes.

2) In TestStoppedCatchupOnUnsupported, an env variable is used to
delete ConsensusCurrentVersion, so that the cloned node behaves as if
its binary does not support the consensus version. However, when the
TestBasicCatchup runs in parallel, it also picks up the env variable,
and consequently deletes ConsensusCurrentVersion from the Consensus
map. When this happens, TestBasicCatchup sporadically fails.

In this change, instead of having ConsensusTestUnupgradedProtocol
upgrade to ConsensusCurrentVersion, or deleting
ConsensusCurrentVersion so it cannot be upgraded, it sets up
ConsensusTestUnupgradedProtocol to upgrade to
ConsensusTestUnupgradedToProtocol. Hence, the env variable is used to
delete ConsensusTestUnupgradedToProtocol. This way the conflict with
other tests is eliminated.

* Fixing golint by adding a comment.

* Tsachi's review comment: unsetting the env var.
if remote telemetry is not enabled, do not start uri update service
add a nil check
* Test changes.

* Better error reporting on goalFixture

* Add version query for kmd startup.

* Few more test cases to cover.

* try to wait.

* changes

* Update.

* Move KMD shutdown to network.

* Add some debug messages to figure out what's going on.

* Fix script bug.

* Fix proper KMD shutdown via the KMDFixture

* Run the tests one at a time only on arm64

* Updating according to review.
* enable go profiler for netdeploy

* add EnableProfiler to ConfigJSONOverride
* Fail test on panic

* few more touchups.

* sync

* bugfix.

* Update a few more use cases.

* Refactoring

* Simplify.

* undo network referencing.

* undo few func-ptr.

* undo some more stuff.

* Update method names

* Few more touchups.
* Initial commit

* Added Jenkinsfile

* Updated Jenkinsfile

* Works until GPG IPC

* Move build files into new release/ dir

Also, renamed files {build_,}release.sh and {build_,}setup.sh

* Path issues

* Use t2.xlarge instance type (4 vCPUs, 16GB ram)

* Restructuring

* shellchecked

* fix bug

* Added new `socket.sh` file

* Trying to build rpm

* Bump up disk size of ec2 instance

* more attempts to make rpm

* more fixes

* move /stuff -> /root/stuff

* wip

* moved to correct paths

* Have `release` have its own start and kill ec2 instance scripts

* use buildhost scripts after all

* Make sure the gpg key name matches!!!!!

-%_gpg_name Algorand RPM <rpm@algorand.com>
+%_gpg_name rpm algorand <rpm@algorand.com>

* fixes

* Add upload stage to pipeline

* Add tag stage to pipeline

* more fixes

* Move start/stop ec2 instance scripts back into release/

* Add ability to dynamically set branch

* Added controller/ subdir

* Some cleanup

* Adding tag support

Moved `Jenkinsfile` into controller/ subdir.

* Move build_env build.sh -> setup.sh

Moved socket.sh -> controller/socket.sh

* Revert buildhost changes

* some cleanup

* fix build

* test packages locally

* upload packages to s3 test bucket

* restructure

* misc

* fix build

* Add Jenkins parameters

* fix build

* Move commands into Jenkinsfile into stages/

* fix build

* Make test stage more explicit

* fix build

* Implementing reviewer suggestions

* Added debug info

* fix build

* Merge into master

* implement reviewer suggestions

* turn off test stage

* fix build

* fix build

* fix build

* Update readme

* removed unneeded archive/ dir
* Switch from default logger to pre-configured logger
  in some components of agreement service
* Mark some of the slow e2e tests as such.

* Move shorttest flag to be set at top level.
* Move some more tests to be "slow tests", and modify the short test condition so that we run
the long tests on nightly builds only.

* Fix elif -> else
btoll and others added 5 commits May 27, 2020 17:02
Add mule deploy task to enable deployments in the new pipeline.
Fast Catchup

Fast catchup is an alternative, user-driven, complementary way to have a local node catch up in a realistic and (relatively) constant time.

This PR contains the entire backend of the fast catchup feature previously reviewed in #970, without the goal changes that allow the end user to invoke fast catchup.
@onetechnical onetechnical self-assigned this Jun 1, 2020
@CLAassistant

CLAassistant commented Jun 1, 2020

CLA assistant check
All committers have signed the CLA.

@onetechnical onetechnical marked this pull request as ready for review June 1, 2020 19:13
@onetechnical onetechnical requested a review from algojohnlee June 1, 2020 19:14
tsachiherman previously approved these changes Jun 1, 2020

@tsachiherman tsachiherman left a comment

looks good to me. Let's give it a try!

onetechnical and others added 4 commits June 1, 2020 18:30
The AMI for the ARM host builder was updated in rel/beta, but not in master. This fixes that.
Summary

Compiling on ARM takes a long time. As a partial workaround, we want to avoid regenerating the message-pack encoders/decoders, as these are already generated on other platforms.
Summary

This PR addresses a potential data race detected by our automated testing. The data race occurs when a catchpoint label is being written to disk and a new block is added to the ledger at the same time.

Test Plan

Repurpose one of the accountUpdates unit tests to verify the fix works as expected.

@tsachiherman tsachiherman left a comment

looks good.

@algojohnlee algojohnlee merged commit a6fdbad into algorand:rel/beta Jun 2, 2020
@onetechnical onetechnical deleted the onetechnical/beta2.0.14 branch October 23, 2020 14:05
PhearZero pushed a commit to PhearNet/crypto that referenced this pull request Jan 17, 2025