-
Notifications
You must be signed in to change notification settings - Fork 376
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Expose periods of environment thread in waagent.conf #1891
Conversation
Codecov Report
@@ Coverage Diff @@
## develop #1891 +/- ##
===========================================
- Coverage 69.44% 69.38% -0.06%
===========================================
Files 83 83
Lines 11445 11499 +54
Branches 1590 1598 +8
===========================================
+ Hits 7948 7979 +31
- Misses 3166 3183 +17
- Partials 331 337 +6
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some comments
@@ -164,6 +171,18 @@ def enable_firewall(conf=__conf__): | |||
return conf.get_switch("OS.EnableFirewall", False) | |||
|
|||
|
|||
def get_enable_firewall_period(conf=__conf__): | |||
return conf.get_int("OS.EnableFirewallPeriod", 30) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A lot of the new flags dont have an entry in the README, any particular reason for that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be the only one that is not in README. EnableFirewall is not there either so I am following the same
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So there is a yml
list and then there is a description for each flag in the conf file. I found a couple that have a description in the README but have not been added to the yml
list in the README -
OS.RemovePersistentNetRulesPeriod
OS.RootDeviceScsiTimeoutPeriod
Provisioning.MonitorHostNamePeriod
Missing completely from README -
OS.MonitorDhcpClientRestartPeriod
Also I personally feel we should add the Firewall rules to the readme too, it makes more sense to be as transparent as possible especially because its an open source project.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll add MonitorDhcpClientRestartPeriod to README. The other list is not in sync with README, I think it is just meant as a sample, not a comprehensive list of options. I'm not planning on listing all the missing options there.
for op in self._periodic_operations: | ||
op.run() | ||
except Exception as e: | ||
logger.error("An error occurred in the environment thread main loop; will skip the current iteration.\n{0}", ustr(e)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might never hit this path as no operation has their own error handling so the PeriodicOperation.run()
would just log it and let it go.
I wonder if it would be cleaner to just keep only the uber except @line:119 to ensure that we log incase the thread goes down
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a typo (for example) could produce an exception. since this is the main loop i am protecting it
op.run() | ||
|
||
except Exception as e: | ||
logger.error("An error occurred in the monitor thread main loop; will skip the current iteration.\n{0}", ustr(e)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same for this, all functions have their own error handling or the PeriodicFunction.run will take care of it, this code path will never hit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW - the reason i added the try/except here is because sleep_until_next_operation in the finally was throwing on Python 2.6 (initially I was using timedelta.totalseconds() in it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahhh that makes sense, finally can throw too didnt think of that lol. Thanks for adding it! :)
#### __Extensions.GoalStatePeriod__ | ||
|
||
_Type: Integer_ | ||
_Default: 6_ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also just curious, why did you change the default from 3 to 6secs? I think Joe had a dependency on checking the artifact blob every 3 secs for Fasttrack.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to slow down status reporting. until it is moved to a different thread i'm slowing it down here. can you post a comment in joe's PR to not hardcode 3 sec
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I'll add a comment once I do that PR
@@ -421,6 +422,28 @@ def _emit_restart_event(self): | |||
|
|||
return | |||
|
|||
@staticmethod | |||
def _emit_changes_in_default_configuration(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding this, much needed!
I have a suggestion, why not just log all parameters that changed from default, that way we will always have an idea of what the customers are using and maybe later on we can leverage that to change our defaults if needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
sleep_seconds = ((sleep_timedelta.days * 24 * 3600 + sleep_timedelta.seconds) * 10.0 ** 6 + sleep_timedelta.microseconds) / 10.0 ** 6 | ||
|
||
if sleep_seconds > 0: | ||
time.sleep(sleep_seconds) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had no idea sleep() could take floats...
def log_if_int_changed_from_default(name, current): | ||
default = conf.get_int_default_value(name) | ||
if default != current: | ||
msg = "{0} changed from its default; new value: {1}".format(name, current) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion, log the default value here too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i had it initially; too verbose, so i removed it
Added parameters in waagent.conf to control how often operations in the environment thread are performed.