Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expose periods of environment thread in waagent.conf #1891

Merged
merged 7 commits into from
May 22, 2020

Conversation

narrieta
Copy link
Member

Added parameters in waagent.conf to control how often operations in the environment thread are performed.

@codecov
Copy link

codecov bot commented May 21, 2020

Codecov Report

Merging #1891 into develop will decrease coverage by 0.05%.
The diff coverage is 58.26%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #1891      +/-   ##
===========================================
- Coverage    69.44%   69.38%   -0.06%     
===========================================
  Files           83       83              
  Lines        11445    11499      +54     
  Branches      1590     1598       +8     
===========================================
+ Hits          7948     7979      +31     
- Misses        3166     3183      +17     
- Partials       331      337       +6     
Impacted Files Coverage Δ
azurelinuxagent/ga/env.py 50.86% <27.77%> (-0.54%) ⬇️
azurelinuxagent/common/conf.py 77.04% <57.69%> (-1.20%) ⬇️
azurelinuxagent/ga/update.py 87.71% <63.63%> (-0.95%) ⬇️
azurelinuxagent/ga/monitor.py 71.62% <80.00%> (-2.91%) ⬇️
azurelinuxagent/common/event.py 86.34% <100.00%> (+0.03%) ⬆️
azurelinuxagent/ga/periodic_operation.py 97.05% <100.00%> (+1.22%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d62c4c5...a989520. Read the comment docs.

README.md Show resolved Hide resolved
Copy link
Contributor

@larohra larohra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some comments

@@ -164,6 +171,18 @@ def enable_firewall(conf=__conf__):
return conf.get_switch("OS.EnableFirewall", False)


def get_enable_firewall_period(conf=__conf__):
return conf.get_int("OS.EnableFirewallPeriod", 30)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot of the new flags dont have an entry in the README, any particular reason for that?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be the only one that is not in README. EnableFirewall is not there either so I am following the same

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So there is a yml list and then there is a description for each flag in the conf file. I found a couple that have a description in the README but have not been added to the yml list in the README -

OS.RemovePersistentNetRulesPeriod
OS.RootDeviceScsiTimeoutPeriod
Provisioning.MonitorHostNamePeriod

Missing completely from README -

OS.MonitorDhcpClientRestartPeriod

Also I personally feel we should add the Firewall rules to the readme too, it makes more sense to be as transparent as possible especially because its an open source project.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add MonitorDhcpClientRestartPeriod to README. The other list is not in sync with README, I think it is just meant as a sample, not a comprehensive list of options. I'm not planning on listing all the missing options there.

for op in self._periodic_operations:
op.run()
except Exception as e:
logger.error("An error occurred in the environment thread main loop; will skip the current iteration.\n{0}", ustr(e))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might never hit this path as no operation has their own error handling so the PeriodicOperation.run() would just log it and let it go.

I wonder if it would be cleaner to just keep only the uber except @line:119 to ensure that we log incase the thread goes down

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a typo (for example) could produce an exception. since this is the main loop i am protecting it

op.run()

except Exception as e:
logger.error("An error occurred in the monitor thread main loop; will skip the current iteration.\n{0}", ustr(e))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same for this, all functions have their own error handling or the PeriodicFunction.run will take care of it, this code path will never hit

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW - the reason i added the try/except here is because sleep_until_next_operation in the finally was throwing on Python 2.6 (initially I was using timedelta.totalseconds() in it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahhh that makes sense, finally can throw too didnt think of that lol. Thanks for adding it! :)

#### __Extensions.GoalStatePeriod__

_Type: Integer_
_Default: 6_
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also just curious, why did you change the default from 3 to 6secs? I think Joe had a dependency on checking the artifact blob every 3 secs for Fasttrack.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to slow down status reporting. until it is moved to a different thread i'm slowing it down here. can you post a comment in joe's PR to not hardcode 3 sec

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I'll add a comment once I do that PR

@@ -421,6 +422,28 @@ def _emit_restart_event(self):

return

@staticmethod
def _emit_changes_in_default_configuration():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding this, much needed!
I have a suggestion, why not just log all parameters that changed from default, that way we will always have an idea of what the customers are using and maybe later on we can leverage that to change our defaults if needed.

Copy link
Contributor

@larohra larohra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

sleep_seconds = ((sleep_timedelta.days * 24 * 3600 + sleep_timedelta.seconds) * 10.0 ** 6 + sleep_timedelta.microseconds) / 10.0 ** 6

if sleep_seconds > 0:
time.sleep(sleep_seconds)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had no idea sleep() could take floats...

def log_if_int_changed_from_default(name, current):
default = conf.get_int_default_value(name)
if default != current:
msg = "{0} changed from its default; new value: {1}".format(name, current)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion, log the default value here too?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i had it initially; too verbose, so i removed it

@narrieta narrieta merged commit de81f1b into Azure:develop May 22, 2020
@narrieta narrieta deleted the env-thread branch May 22, 2020 15:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants