netwolfuk edited this page Jan 11, 2023 · 6 revisions

Since 1.2.0

As the developer of tcWebHooks, I would really appreciate some information on how the plugin is used. In version 1.2.0 and later, analytics can be enabled from the WebHook option in the Administration section of TeamCity.

There are two buttons: one controls statistics collection, and the other controls analytics sharing.

Statistics collection

This enables a task that runs every 60 minutes to collect statistics from webhook executions and write them to an XML file in config/webhook-statistics/. Some information in the statistics file is generalised. For example, a URL like https://api.slack.com/hooks/some-token is stored as slack.com. The statistics contain the following sorts of values.

  1. The number of webhooks configured, the build events enabled, and the template they are configured to use.
  2. The number of webhook executions, the generalised URL, and the result code returned when the webhook triggered.

The statistics in these files are shown in the admin UI on the graph in the WebHook Execution History Statistics section.
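
As a rough illustration of the URL generalisation described above, the sketch below reduces a webhook URL to a coarse domain. It is written in Python for brevity and is not the plugin's code. The function name `generalise_url` and the "keep the last two host labels" rule are assumptions; the real rule may differ, and this naive version mishandles suffixes such as co.uk.

```python
from urllib.parse import urlsplit


def generalise_url(url: str) -> str:
    """Reduce a webhook URL to a coarse domain, dropping path, query and token."""
    host = urlsplit(url).hostname or ""
    labels = host.split(".")
    # Assumption: keep only the last two labels of the host name.
    # This is a simplification and is wrong for suffixes like "co.uk".
    return ".".join(labels[-2:])


print(generalise_url("https://api.slack.com/hooks/some-token"))  # slack.com
```

Only the generalised value would be stored, so the token in the path never reaches the statistics file.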

Analytics sharing

The data in the statistics files is then hashed to obscure any private information. The salt used to hash the values is chosen in one of two ways.

  1. The salt used to hash the Server UUID is the same every time. This means that the hashed UUID will be the same every time, and can be used to filter out duplicate analytics from the same instance. I have no interest in what your TeamCity UUID really is; I just want a way to avoid duplicated data from the same instance on the same date.
  2. All other string data in the analytics message is hashed using a different salt on every invocation. This prevents the use of rainbow tables to analyse the statistics sent for URLs or template names. The intention is to make the data so expensive to brute force that I will only attempt to determine common URLs like Slack or Teams and known templateIds (e.g. those bundled with tcWebHooks).
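
The difference between the two salting strategies can be sketched as follows. This is an illustration only: the use of SHA-256, the way the salt is combined with the value, the 16-byte salt length, and all names (`FIXED_SALT`, `salted_hash`, `hash_server_uuid`, `hash_report_strings`) are assumptions, not the plugin's actual implementation.

```python
import hashlib
import secrets
from typing import Dict, List, Tuple

# Hypothetical fixed salt: the same bytes on every run.
FIXED_SALT = b"example-fixed-salt"


def salted_hash(value: str, salt: bytes) -> str:
    """Return the hex SHA-256 digest of salt + value."""
    return hashlib.sha256(salt + value.encode("utf-8")).hexdigest()


def hash_server_uuid(server_uuid: str) -> str:
    # Fixed salt: the same UUID always yields the same digest, so
    # repeated reports from one instance can be de-duplicated.
    return salted_hash(server_uuid, FIXED_SALT)


def hash_report_strings(values: List[str]) -> Tuple[bytes, Dict[str, str]]:
    # Fresh random salt per invocation: the same URL hashes to a
    # different digest in every report, so a precomputed rainbow
    # table built for one salt is useless for the next report.
    salt = secrets.token_bytes(16)
    return salt, {v: salted_hash(v, salt) for v in values}
```

Calling `hash_server_uuid` twice with the same input returns the same digest, while two calls to `hash_report_strings(["slack.com"])` return different digests for the same string.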

Please consider enabling the Analytics sharing option.

Note: No payload content, variables or full URLs are shared in the analytics data.

What do I gain from the analytics?

I hope to determine the following data.

  1. Common generalised URLs for which webhooks are used, e.g. Slack, Teams, etc.
  2. Which bundled templates are most commonly used.
  3. Whether users use the bundled templates, modify them or create their own.
  4. Common return codes. For example, if the analytics are showing 999 errors, I might have a bug.
  5. Roughly how many webhooks are configured, and how many webhook POSTs are undertaken.

More detail about the WebHook History, Statistics and Analytics

Every webhook execution result is stored in the WebHooks History Memory store, which holds the most recent 10,000 execution results. Once the 10,000 item limit is reached, each new result that is added expires the oldest result. The store is kept in memory, so restarting the plugin (or restarting TeamCity) will clear it. The history page and the error list on the WebHooks admin page are populated from the Memory store.
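
The eviction behaviour of a bounded store like this can be sketched with a fixed-length queue. This is a minimal Python illustration, not the plugin's code; the class name `HistoryStore` and the shape of a result (a dict with `url` and `status`) are assumptions.

```python
from collections import deque

MAX_RESULTS = 10_000  # limit stated in the text above


class HistoryStore:
    """In-memory store that keeps only the most recent results."""

    def __init__(self, max_results: int = MAX_RESULTS) -> None:
        # A deque with maxlen drops the oldest item when a new one
        # is appended to a full queue.
        self._results = deque(maxlen=max_results)

    def add(self, result: dict) -> None:
        self._results.append(result)

    def recent(self) -> list:
        return list(self._results)


# Small limit so the eviction is visible.
store = HistoryStore(max_results=3)
for code in (200, 200, 500, 404):
    store.add({"url": "slack.com", "status": code})
print([r["status"] for r in store.recent()])  # [200, 500, 404]
```

Because nothing here is written to disk, a new `HistoryStore` starts empty, which mirrors the store being cleared on restart.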

Build Events
flowchart LR;
    A[Build Runs] --> B[Build Event Triggers WebHook] --> C[WebHook Executes] --> D[Result stored in Memory store];
    A --> E[Build Event Triggers WebHook] --> F[WebHook Executes] --> G[Result stored in Memory store];


Statistics are the analysis of the items in the Memory store: the totals of each type of webhook configuration and the totals of each type of webhook execution. The statistics task runs every 60 minutes and recalculates the statistics. The updated statistics are then written to XML files in .BuildServer/config/webhooks-statistics/. One file is created for each day and is updated each time the task runs. There is also a shutdown hook, so that any new results in the Memory store can be persisted to the XML file before TeamCity shuts down.
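
The "totals per type, one set per day" idea can be sketched as below. This is an assumption-laden Python illustration rather than the plugin's logic: the function name `update_daily_totals`, the result fields (`day`, `url`, `status`) and the choice of (generalised URL, status code) as the counting key are all hypothetical, and writing the XML file is left out.

```python
from collections import Counter, defaultdict
from datetime import date


def update_daily_totals(totals, new_results):
    """Fold new execution results into per-day counters."""
    for r in new_results:
        # Hypothetical key: generalised URL plus returned status code.
        totals[r["day"]][(r["url"], r["status"])] += 1
    return totals


# One Counter per calendar day, mirroring "one file for each day".
totals = defaultdict(Counter)
update_daily_totals(totals, [
    {"day": date(2023, 1, 10), "url": "slack.com", "status": 200},
    {"day": date(2023, 1, 10), "url": "slack.com", "status": 200},
    {"day": date(2023, 1, 10), "url": "slack.com", "status": 500},
])
print(totals[date(2023, 1, 10)][("slack.com", 200)])  # 2
```

Calling `update_daily_totals` again on a later run adds to the same day's counter, which corresponds to the daily file being updated each time the task runs.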

Scheduled Task (Hourly or on TeamCity shutdown)
flowchart LR;
    H[New events fetched from Memory store] --> I[URLs are generalised] --> J[Events are normalised] --> K[Normalised statistics written to disk];

Once a day (shortly after midnight), a task runs to finalise any statistics from the previous day and start a new file for the new day's statistics. Statistics files that are more than one year old are removed. The statistics files are also analysed to see if any are ready to be shared. If sharing is enabled and there are 5 or more days of statistics to share, then a webhook is executed to send the analytics report.
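
The decision made by the daily task can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the plugin's code: the function name `daily_task`, representing each statistics file by its date, treating "one year" as 365 days, and counting only completed, unreported days towards the threshold are all guesses at details the text does not specify.

```python
from datetime import date, timedelta

MIN_DAYS_TO_SHARE = 5          # threshold stated in the text
MAX_AGE = timedelta(days=365)  # assumption: "one year" taken as 365 days


def daily_task(stat_days, reported, today, sharing_enabled):
    """Return (days_to_keep, days_to_report) for one daily run.

    stat_days: set of dates that have a statistics file
    reported:  set of dates whose statistics were already sent
    """
    # Drop statistics older than the maximum age.
    keep = {d for d in stat_days if today - d <= MAX_AGE}
    # Assumption: only finished days not yet reported are candidates.
    pending = sorted(d for d in keep if d < today and d not in reported)
    if sharing_enabled and len(pending) >= MIN_DAYS_TO_SHARE:
        return keep, pending
    return keep, []
```

A caller would send the report for the returned days and then add them to `reported`, which corresponds to the "Mark as reported" step in the flowchart.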

Scheduled Task Daily
flowchart LR;
    P[Statistics loaded from disk] --> O{5 days of stats to report} -- No --> Q[Finish];
    O -- Yes --> R[Hash data] --> S[Report Analytics] --> T[Mark as reported] --> Q