Small fixes for metrics (#13807)
* Check category names in YAML files. Fix some.
* Sort out short names in metrics labels.
* allMetrics.yaml
* CHANGELOG.

Co-authored-by: jsteemann <jan@arangodb.com>
neunhoef and jsteemann authored Mar 25, 2021
1 parent 0bf3168 commit 3a76351
Showing 9 changed files with 36 additions and 10 deletions.
2 changes: 2 additions & 0 deletions CHANGELOG
@@ -1,6 +1,8 @@
devel
-----

+* Fix shortName labels in metrics, in particular for agents.
+
* Fix a race in LogAppender::haveAppenders.
`haveAppenders` is called as part of audit logging. It accesses internal maps
but previously did not hold a lock while doing so.
6 changes: 3 additions & 3 deletions Documentation/Metrics/allMetrics.yaml
@@ -691,7 +691,7 @@
temporarily, \nwhich will lead to an increase in lock acquisition times.\n"
type: histogram
unit: s
-- category: Transaction
+- category: Transactions
complexity: advanced
description: 'Number of transactions using sequential locking of collections to
avoid deadlocking.
@@ -724,7 +724,7 @@
run on multiple shards on different servers.\n"
type: counter
unit: number
-- category: Transaction
+- category: Transactions
complexity: medium
description: 'Number of timeouts when trying to acquire collection exclusive locks.
@@ -749,7 +749,7 @@
for the same locks.\n"
type: counter
unit: number
-- category: Transaction
+- category: Transactions
complexity: medium
description: "Number of timeouts when trying to acquire collection write locks.\nThis
counter will be increased whenever a collection write lock\ncannot be acquired
@@ -4,7 +4,7 @@ help: |
Number of transactions using sequential locking of collections to avoid deadlocking.
unit: number
type: counter
-category: Transaction
+category: Transactions
complexity: advanced
exposedBy:
- coordinator
@@ -4,7 +4,7 @@ help: |
Number of timeouts when trying to acquire collection exclusive locks.
unit: number
type: counter
-category: Transaction
+category: Transactions
complexity: medium
exposedBy:
- dbserver
@@ -4,7 +4,7 @@ help: |
Number of timeouts when trying to acquire collection write locks.
unit: number
type: counter
-category: Transaction
+category: Transactions
complexity: medium
exposedBy:
- dbserver
1 change: 1 addition & 0 deletions arangod/Agency/State.cpp
@@ -1049,6 +1049,7 @@ bool State::loadOrPersistConfiguration() {
}
}
_agent->id(uuid);
+ServerState::instance()->setId(uuid);

auto ctx = std::make_shared<transaction::StandaloneContext>(*_vocbase);
SingleCollectionTransaction trx(ctx, "configuration", AccessMode::Type::WRITE);
10 changes: 8 additions & 2 deletions arangod/Cluster/ServerState.cpp
@@ -789,10 +789,16 @@ bool ServerState::registerAtAgencyPhase1(AgencyComm& comm, ServerState::RoleEnum
}

std::string ServerState::getShortName() const {
+  if (_role == ROLE_AGENT) {
+    return getId().substr(0, 13);
+  }
   std::stringstream ss;  // ShortName
   auto num = getShortId();
-  size_t width = std::max(std::to_string(num + 1).size(), static_cast<size_t>(4));
-  ss << roleToAgencyKey(getRole()) << std::setw(width) << std::setfill('0') << num + 1;
+  if (num == 0) {
+    return std::string{};  // not yet known
+  }
+  size_t width = std::max(std::to_string(num).size(), static_cast<size_t>(4));
+  ss << roleToAgencyKey(getRole()) << std::setw(width) << std::setfill('0') << num;
return ss.str();
}
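
The patched `getShortName()` distinguishes three cases: agents, servers whose short ID is not yet known, and servers with an assigned short ID. The following is a minimal Python sketch of that logic, not ArangoDB code. The function `short_name` and its parameters are hypothetical stand-ins for `getShortName()`, `roleToAgencyKey(getRole())`, `getShortId()`, and `getId()`; the role key and agent ID used in any call are assumed sample values.

```python
def short_name(role_key: str, short_id: int, server_id: str, is_agent: bool) -> str:
    """Mirror the three cases of the patched getShortName() (illustrative only)."""
    if is_agent:
        # Agents have no short ID here; use the first 13 characters of the server ID.
        return server_id[:13]
    if short_id == 0:
        # Short ID not assigned yet: return an empty name so callers can retry later.
        return ""
    # Zero-pad to at least four digits; larger numbers keep all of their digits.
    width = max(len(str(short_id)), 4)
    return f"{role_key}{short_id:0{width}d}"
```

Returning an empty string for the "not yet known" case lets the caller tell a missing name apart from a real one, which the metrics code relies on.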

11 changes: 9 additions & 2 deletions arangod/RestServer/MetricsFeature.cpp
@@ -456,8 +456,15 @@ void MetricsFeature::toPrometheus(std::string& result, bool v2) const {

std::lock_guard<std::recursive_mutex> guard(_lock);
if (_globalLabels.find("shortname") == _globalLabels.end()) {
-  _globalLabels.try_emplace("shortname", ServerState::instance()->getShortName());
-  changed = true;
+  std::string shortName = ServerState::instance()->getShortName();
+  // Very early after a server start it is possible that the
+  // short name is not yet known. This check here is to prevent
+  // that the label is permanently empty if metrics are requested
+  // too early.
+  if (!shortName.empty()) {
+    _globalLabels.try_emplace("shortname", shortName);
+    changed = true;
+  }
}
if (_globalLabels.find("role") == _globalLabels.end() &&
ServerState::instance() != nullptr &&
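
The change to `toPrometheus` caches the `shortname` label only once a non-empty value is available, so an early metrics request cannot pin the label to an empty string. A minimal Python sketch of that pattern follows, under stated assumptions: `ensure_shortname_label`, the `labels` dictionary, and the `get_short_name` callback are hypothetical names standing in for `_globalLabels` and `ServerState::instance()->getShortName()`, and locking is omitted.

```python
from typing import Callable, Dict


def ensure_shortname_label(labels: Dict[str, str],
                           get_short_name: Callable[[], str]) -> bool:
    """Cache the 'shortname' label once a non-empty value is available.

    Returns True if the label set changed, False otherwise.
    """
    if "shortname" in labels:
        # Already cached; never overwrite an existing label.
        return False
    name = get_short_name()
    if not name:
        # Too early: leave the label unset so a later request can fill it in.
        return False
    labels["shortname"] = name
    return True
```

Before the fix, the equivalent of the `if not name` branch was missing, so the first request after startup could store an empty label permanently.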
10 changes: 10 additions & 0 deletions utils/generateAllMetricsDocumentation.py
@@ -8,6 +8,12 @@

import os, re, sys

+# Some data:
+categoryNames = ["Health", "AQL", "Transactions", "Foxx", "Pregel", \
+                 "Statistics", "Replication", "Disk", "Errors", \
+                 "RocksDB", "Hotbackup", "k8s", "Connectivity", "Network",\
+                 "V8", "Agency", "Scheduler", "Maintenance", "kubearangodb"]
+
# Check that we are in the right place:
lshere = os.listdir(".")
if not("arangod" in lshere and "arangosh" in lshere and \
@@ -118,6 +124,10 @@
if not isinstance(y["exposedBy"], list):
print("YAML file '" + filename + "' has an attribute 'exposedBy' whose value must be a list but isn't.")
bad = True
+if not bad:
+    if not y["category"] in categoryNames:
+        print("YAML file '" + filename + "' has an unknown category '" + y["category"] + "', please fix.")
+        bad = True

if bad:
missing = True
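
The new check rejects any metric description whose `category` is not in the fixed list, which is how the `Transaction` versus `Transactions` mismatch fixed in this commit would be caught. A standalone sketch of the same check, assuming the YAML document has already been parsed into a dictionary: the function `category_errors` and the constant `CATEGORY_NAMES` are illustrative names, not part of the script, and the list is copied from the diff above.

```python
from typing import Dict, List

# Copied from the categoryNames list added in this commit.
CATEGORY_NAMES = frozenset([
    "Health", "AQL", "Transactions", "Foxx", "Pregel",
    "Statistics", "Replication", "Disk", "Errors",
    "RocksDB", "Hotbackup", "k8s", "Connectivity", "Network",
    "V8", "Agency", "Scheduler", "Maintenance", "kubearangodb",
])


def category_errors(filename: str, doc: Dict[str, str]) -> List[str]:
    """Return error messages for an unknown category (empty list if valid)."""
    category = doc["category"]
    if category in CATEGORY_NAMES:
        return []
    return ["YAML file '" + filename + "' has an unknown category '"
            + category + "', please fix."]
```

Collecting messages in a list instead of printing them is a choice made for this sketch so the check can be tested in isolation; the script itself prints and sets a flag.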