Using the zabbix_utils Library for Tool Development

In this article, we will explore a practical example of using the zabbix_utils library to solve a non-trivial task – obtaining a list of alert recipients for triggers associated with a specific Zabbix host. You will learn how to easily automate the process of collecting this information, and see examples of real code that can be adapted to your needs.

Over the last year, the zabbix_utils library has become one of the most popular tools for working with the Zabbix API. It is a convenient tool that simplifies interacting with the Zabbix server, proxy, or agent, especially for those who automate monitoring and management tasks.

Due to its ease of use and extensive functionality, zabbix_utils has found a following among system administrators, monitoring engineers, and DevOps engineers. According to data from PyPI, the library has already been downloaded over 140,000 times since its release, confirming the demand for it within the community. It’s all thanks to you and your attention to zabbix_utils!

Task Description

Administrators often need to check which Zabbix users receive alerts for specific triggers in the Zabbix monitoring system. This can be useful for auditing, configuring new notifications, or simply for a quick diagnosis of issues. The task becomes especially relevant when you have many hosts with numerous triggers, and manually checking the recipients for each trigger through the Zabbix interface becomes very time-consuming.

In such cases, it is advisable to use a custom solution based on the Zabbix API. You can directly access all the required data using the API, and then use additional logic to determine the final alert recipients. The zabbix_utils library makes working with the Zabbix API more convenient and allows you to automate this process. In this project, we use the zabbix_utils library to write a Python script that collects a list of alert recipients for the triggers of the selected Zabbix host. This will allow you to obtain the necessary information faster and with minimal effort.

Environment Setup and Installation

To get started with zabbix_utils, you need to install the library and configure the connection to the Zabbix API. An earlier article provides more details and examples on getting started with the library, but the basic steps to prepare the environment are described here as well.

The library supports several installation methods described in the official README, making it convenient for use in different environments.

1. Installation via pip

The simplest and most common installation method is using the pip package manager. To do this, execute the command:

~$ pip install zabbix_utils

To install all necessary dependencies for asynchronous work, you can use the command:

~$ pip install zabbix_utils[async]

This method is suitable for most users, as pip automatically installs all required dependencies.

2. Installation from Zabbix Repository

Since writing the previous articles, we have added one more installation method – from the official Zabbix repository. First and foremost, you need to add the repository to your system if it has not been added yet. Official Zabbix packages for Red Hat Enterprise Linux and Debian-based distributions are available on the Zabbix website.

For Red Hat Enterprise Linux and derivatives:

~# dnf install python3-zabbix-utils

For Debian / Ubuntu and derivatives:

~# apt install python3-zabbix-utils

3. Installation from Source Code

If you require the latest version of the library that has not yet been published on PyPI, or you want to customize the code, you can install the library directly from GitHub:

1. Clone the repository from GitHub:

~$ git clone https://github.com/zabbix/python-zabbix-utils

2. Navigate to the project folder:

~$ cd python-zabbix-utils/

3. Install the library by executing the command:

~$ python3 setup.py install

4. Testing the Connection to Zabbix API

After installing zabbix_utils, it is a good idea to check the connection to your Zabbix server via the API. To do this, use the URL of your Zabbix server along with either an API token or the username and password of a user who has permission to access the Zabbix API.

Example code for checking the connection:

from zabbix_utils import ZabbixAPI

ZABBIX_AUTH = {
    "url": "your_zabbix_server",
    "user": "your_username",
    "password": "your_password"
}

api = ZabbixAPI(**ZABBIX_AUTH)

hosts = api.host.get(
    output=['hostid', 'name']
)

print(hosts)

api.logout()

Main Steps of the Task Solution

Now that the environment is set up, let’s look at the main steps for solving the task of retrieving the list of alert recipients for triggers associated with a specific Zabbix host in Zabbix.

In zabbix_utils, asynchronous API interaction support is built in through the AsyncZabbixAPI class. This allows multiple requests to be sent simultaneously and their results to be handled as they become ready, significantly reducing latencies when making multiple API calls. Therefore, we will use the AsyncZabbixAPI class and the asynchronous approach in this project.
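
As a minimal sketch of what that looks like (the connection details are placeholders), two independent API calls can be sent concurrently with asyncio.gather():

import asyncio
from zabbix_utils import AsyncZabbixAPI

async def main():
    # Connect and authenticate (placeholder credentials)
    api = AsyncZabbixAPI(url="your_zabbix_server")
    await api.login(user="your_username", password="your_password")

    # Both requests are sent at once; results are handled as they arrive
    hosts, actions = await asyncio.gather(
        api.host.get(output=["hostid", "name"]),
        api.action.get(output=["actionid", "name"], filter={"eventsource": 0}),
    )
    print(hosts, actions)

    await api.logout()

asyncio.run(main())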

Below are the main steps for solving the task, and code examples for each step. Please note that the code in this project is for demonstration purposes, may not be optimal, or could contain errors. Use it as an example or a base for your project, but not as a complete tool.

Step 1. Obtain Host ID

The first step is to identify the host for which we will retrieve information about triggers and alerts. To do this, we need to find the hostid using the host’s name. The Zabbix API provides a method to obtain this information, and using zabbix_utils makes this process much simpler.

Example of obtaining the host ID by its name:

host = api.host.get(
    output=["hostid"],
    filter={"name": "your_host_name"}
)

This method returns a unique identifier for the host, which can be used further. However, for our test project, we will use a manually specified host identifier.

Step 2. Retrieve Host Triggers

With the hostid in hand, the next step is to retrieve all triggers associated with this host. Triggers define the conditions under which alerts are generated. We need to collect information about all triggers so that we can then use it to select the actions whose conditions all match.

Example of retrieving host triggers:

triggers = api.trigger.get(
    hostids=[hostid],
    selectTags="extend",
    selectHosts=["hostid"],
    selectHostGroups=["groupid"],
    selectDiscoveryRule=["templateid"],
    output="extend",
)

This request returns complete information about the triggers for the host. We get not only the triggers but also their tags, associated host and host groups, and discovery rule information. All this information will be necessary to check the conditions of the actions.

Step 3. Initialize Trigger Metadata

At this stage, objects for each trigger are created to store their metadata. This is done using the Trigger class, which includes information about the trigger such as its name, ID, associated host groups, hosts, tags, templates, and operations.

Here’s the code defining the Trigger class:

import copy  # required by check_operations()


class Trigger:
    def __init__(self, trigger):
        self.name = trigger["description"]
        self.triggerid = trigger["triggerid"]
        self.hostgroups = [g["groupid"] for g in trigger["hostgroups"]]
        self.hosts = [h["hostid"] for h in trigger["hosts"]]
        self.tags = {t["tag"]: t["value"] for t in trigger["tags"]}
        self.tmpl_triggerid = self.triggerid
        self.lld_rule = trigger["discoveryRule"] or {}
        if trigger["templateid"] != "0":
            self.tmpl_triggerid = trigger["templateid"]
        self.templates = []
        self.messages = []
        self._conditions = {
            "0": self.hostgroups,
            "1": self.hosts,
            "2": [self.triggerid],
            "3": trigger["event_name"] or trigger["description"],
            "4": trigger["priority"],
            "13": self.templates,
            "25": self.tags.keys(),
            "26": self.tags,
        }

    def eval_condition(self, operator, value, trigger_data):
        # equals or does not equal
        if operator in ["0", "1"]:
            equals = operator == "0"
            if isinstance(value, dict) and isinstance(trigger_data, dict):
                if value["tag"] in trigger_data:
                    if value["value"] == trigger_data[value["tag"]]:
                        return equals
            elif value in trigger_data and isinstance(trigger_data, list):
                return equals
            elif value == trigger_data:
                return equals
            return not equals
        # contains or does not contain
        if operator in ["2", "3"]:
            contains = operator == "2"
            if isinstance(value, dict) and isinstance(trigger_data, dict):
                if value["tag"] in trigger_data:
                    if value["value"] in trigger_data[value["tag"]]:
                        return contains
            elif value in trigger_data:
                return contains
            return not contains
        # is greater/less than or equals
        if operator in ["5", "6"]:
            greater = operator != "5"
            try:
                if int(value) < int(trigger_data):
                    return not greater
                if int(value) == int(trigger_data):
                    return True
                if int(value) > int(trigger_data):
                    return greater
            except (TypeError, ValueError):
                raise ValueError(
                    "Values must be numbers to compare them"
                )

    def select_templates(self, templates):
        for template in templates:
            if self.tmpl_triggerid in [
                t["triggerid"] for t in template["triggers"]
            ]:
                self.templates.append(template["templateid"])
            if self.lld_rule.get("templateid") in [
                d["itemid"] for d in template["discoveries"]
            ]:
                self.templates.append(template["templateid"])

    def select_actions(self, actions):
        selected_actions = []
        for action in actions:
            conditions = []
            if "filter" in action:
                conditions = action["filter"]["conditions"]
                eval_formula = action["filter"]["eval_formula"]
            # Add actions without conditions directly
            if not conditions:
                selected_actions.append(action)
                continue
            condition_check = {}
            for condition in conditions:
                if (
                    condition["conditiontype"] != "6"
                    and condition["conditiontype"] != "16"
                ):
                    if (
                        condition["conditiontype"] == "26"
                        and isinstance(condition["value"], str)
                    ):
                        condition["value"] = {
                            "tag": condition["value2"],
                            "value": condition["value"],
                        }
                    if condition["conditiontype"] in self._conditions:
                        condition_check[
                            condition["formulaid"]
                        ] = self.eval_condition(
                            condition["operator"],
                            condition["value"],
                            self._conditions[condition["conditiontype"]],
                        )
                else:
                    condition_check[condition["formulaid"]] = True
            for formulaid, bool_result in condition_check.items():
                eval_formula = eval_formula.replace(formulaid, str(bool_result))
            # Evaluate the final condition formula
            if eval(eval_formula):
                selected_actions.append(action)
        return selected_actions

    def select_operations(self, actions, mediatypes):
        messages_metadata = []
        for action in self.select_actions(actions):
            messages_metadata += self.check_operations(
                "operations", action, mediatypes
            )
            messages_metadata += self.check_operations(
                "update_operations", action, mediatypes
            )
            messages_metadata += self.check_operations(
                "recovery_operations", action, mediatypes
            )
        return messages_metadata

    def check_operations(self, optype, action, mediatypes):
        messages_metadata = []
        optype_mapping = {
            "operations": "0",  # Problem event
            "recovery_operations": "1",  # Recovery event
            "update_operations": "2",  # Update event
        }
        operations = copy.deepcopy(action[optype])
        # Processing "notify all involved" scenarios
        for idx, _ in enumerate(operations):
            if operations[idx]["operationtype"] not in ["11", "12"]:
                continue
            # Copy operation as a template for reuse
            op_template = copy.deepcopy(operations[idx])
            del operations[idx]
            # Checking for message sending operations
            for key in [
                k for k in ["operations", "update_operations"] if k != optype
            ]:
                if not action[key]:
                    continue
                # Checking for message sending type operations
                for op in [
                    o for o in action[key] if o["operationtype"] == "0"
                ]:
                    # Copy template for the current operation
                    operation = copy.deepcopy(op_template)
                    operation.update(
                        {
                            "operationtype": "0",
                            "opmessage_usr": op["opmessage_usr"],
                            "opmessage_grp": op["opmessage_grp"],
                        }
                    )
                    operation["opmessage"]["mediatypeid"] = op[
                        "opmessage"
                    ]["mediatypeid"]
                    operations.append(operation)
        for operation in operations:
            if operation["operationtype"] != "0":
                continue
            # Processing "all mediatypes" scenario
            if operation["opmessage"]["mediatypeid"] == "0":
                for mediatype in mediatypes:
                    operation["opmessage"]["mediatypeid"] = mediatype[
                        "mediatypeid"
                    ]
                    messages_metadata.append(
                        self.create_messages(
                            optype_mapping[optype], action, operation,
                            [mediatype]
                        )
                    )
            else:
                messages_metadata.append(
                    self.create_messages(
                        optype_mapping[optype], action, operation, mediatypes
                    )
                )
        return messages_metadata

    def create_messages(self, optype, action, operation, mediatypes):
        message = Message(optype, action, operation)
        message.select_mediatypes(mediatypes)
        self.messages.append(message)
        return message

The code for creating Trigger class objects for each of the retrieved triggers:

for trigger in triggers:
    triggers_metadata[trigger["triggerid"]] = Trigger(trigger)

This loop iterates through all triggers and saves them in a dictionary called triggers_metadata, where the key is the triggerid and the value is the trigger object.

Step 4. Retrieve Template Information

The next step is to obtain data about the templates associated with all the triggers:

templates = api.template.get(
    triggerids=list(set([t.tmpl_triggerid for t in triggers_metadata.values()])),
    selectTriggers=["triggerid"],
    selectDiscoveries=["itemid"],
    output=["templateid"],
)

This request returns information about all templates linked to the triggers of the host being examined. Executing a single query for all triggers is more efficient than making individual requests for each trigger. This information will be needed for evaluating the “Template” condition in actions.

Step 5. Get Actions and Media Types

Next, we obtain the list of actions and media types configured in the system:

actions = api.action.get(
    selectFilter="extend",
    selectOperations="extend",
    selectRecoveryOperations="extend",
    selectUpdateOperations="extend",
    filter={"eventsource": 0, "status": 0},
    output=["actionid", "esc_period", "eval_formula", "name"],
)

mediatypes = api.mediatype.get(
    selectUsers="extend",
    selectActions="extend",
    selectMessageTemplates="extend",
    filter={"status": 0},
    output=["mediatypeid", "name"],
)

Here we retrieve actions that define how and to whom alerts are sent, and mediatypes through which users can receive notifications (for example, email or SMS).

Step 6. Match Triggers with Templates and Actions

At this stage, each trigger is associated with the corresponding templates and actions:

for trigger in triggers_metadata.values():
    trigger.select_templates(templates)
    messages += trigger.select_operations(actions, mediatypes)

Here, for each trigger, we update information about its templates and configured actions for sending notifications. The list of associated actions is determined by checking the conditions specified in them against the accumulated data for each trigger.

For each operation of the corresponding trigger action, a Message class object is created:

import copy
import re  # required by multiply_time()


class Message:
    def __init__(self, optype, action, operation):
        self.optype = optype
        self.mediatypename = ""
        self.actionid = action["actionid"]
        self.actionname = action["name"]
        self.operationid = operation["operationid"]
        self.mediatypeid = operation["opmessage"]["mediatypeid"]
        self.subject = operation["opmessage"]["subject"]
        self.message = operation["opmessage"]["message"]
        self.default_msg = operation["opmessage"]["default_msg"]
        self.users = [u["userid"] for u in operation["opmessage_usr"]]
        self.groups = [g["usrgrpid"] for g in operation["opmessage_grp"]]
        self.recipients = []
        # Escalation period set to action's period if not specified
        self.esc_period = operation.get("esc_period", "0")
        if self.esc_period == "0":
            # Use action's escalation period if unset
            self.esc_period = action["esc_period"]
        self.esc_step_from = self.multiply_time(
            self.esc_period, int(operation.get("esc_step_from", "1")) - 1
        )
        if operation.get("esc_step_to", "0") != "0":
            self.repeat_count = str(
                int(operation["esc_step_to"]) - int(operation["esc_step_from"]) + 1
            )
        # If not a problem event, set repeat count to 1
        elif self.optype != "0":
            self.repeat_count = "1"
        # Infinite repeat count if esc_step_to is 0
        else:
            self.repeat_count = "∞"

    def multiply_time(self, time_str, multiplier):
        # Multiply numbers within the time string
        result = re.sub(
            r"(\d+)",
            lambda m: str(int(m.group(1)) * multiplier),
            time_str
        )
        if result[0] == "0":
            return "0"
        return result

    def select_mediatypes(self, mediatypes):
        for mediatype in mediatypes:
            if mediatype["mediatypeid"] == self.mediatypeid:
                self.mediatypename = mediatype["name"]
                # Select message templates related to operation type
                msg_template = [
                    m
                    for m in mediatype["message_templates"]
                    if (
                        m["recovery"] == self.optype
                        and m["eventsource"] == "0"
                    )
                ]
                # Use default message if applicable
                if msg_template and self.default_msg == "1":
                    self.subject = msg_template[0]["subject"]
                    self.message = msg_template[0]["message"]

    def select_recipients(self, user_groups, recipients):
        for groupid in self.groups:
            if groupid in user_groups:
                self.users += user_groups[groupid]
        for userid in self.users:
            if userid in recipients:
                recipient = copy.deepcopy(recipients[userid])
                if self.mediatypeid in recipient.sendto:
                    recipient.mediatype = True
                self.recipients.append(recipient)

Each such object represents a separate message sent to users (recipients) and will contain all message information – its subject, text, recipients, and escalation parameters.

Step 7. Collect User and Group Identifiers

After matching the triggers with actions, the process of collecting unique identifiers for users and groups starts:

userids = set()
groupids = set()

for message in messages:
    userids.update(message.users)
    groupids.update(message.groups)

This code snippet collects the IDs of all users and groups involved in the operations for each trigger. This is necessary to perform only one request to the Zabbix API for all involved users and their groups, rather than making separate requests for each trigger.

Step 8. Obtain User and Group Information

The next step is to collect detailed information about users and user groups:

usergroups = {
    group["usrgrpid"]: group
    for group in api.usergroup.get(
        selectUsers=["userid"],
        selectHostGroupRights="extend",
        output=["usrgrpid", "role"],
    )
}

users = {
    user["userid"]: user
    for user in api.user.get(
        selectUsrgrps=["usrgrpid"],
        selectMedias=["mediatypeid", "active", "sendto"],
        selectRole=["roleid", "type"],
        filter={"status": 0},
        output=["userid", "username", "name", "surname"],
    )
}

Here we gather data about users, including their role and media types through which they receive notifications, as well as data about user groups, including access rights to host groups and the list of users in each group. All this information will be needed to check access to the host with the triggers we are working with.

Step 9. Match Users and Groups with Triggers

After obtaining user information, we match users and groups with their respective rights to receive notifications. Here we also link users with groups, updating the information regarding rights and groups for each user.

for userid in userids:
    if userid in users:
        user = users[userid]
        recipients[userid] = Recipient(user)
        for group in user["usrgrps"]:
            if group["usrgrpid"] in usergroups:
                recipients[userid].permissions.update([
                    h["id"]
                    for h in usergroups[group["usrgrpid"]]["hostgroup_rights"]
                    if int(h["permission"]) > 1
                ])

for groupid in groupids:
    if groupid in usergroups:
        group = usergroups[groupid]
        user_groups[group["usrgrpid"]] = []
        for user in group["users"]:
            user_groups[group["usrgrpid"]].append(user["userid"])
            if user["userid"] in recipients:
                recipients[user["userid"]].groups.add(group["usrgrpid"])
            elif user["userid"] in users:
                recipients[user["userid"]] = Recipient(users[user["userid"]])
            recipients[user["userid"]].permissions.update([
                h["id"]
                for h in group["hostgroup_rights"]
                if int(h["permission"]) > 1
            ])

This code fragment connects each user with their groups and vice versa, creating a complete list of users with their access rights to the host, and thus their eligibility to receive notifications about events for this host.

For each recipient, a Recipient class object is created containing data about the recipient, such as the notification address, access rights to hosts, configured mediatypes, etc.

Here’s the code that describes the Recipient class:

class Recipient:
    def __init__(self, user):
        self.userid = user["userid"]
        self.username = user["username"]
        self.fullname = "{name} {surname}".format(**user).strip()
        self.type = user["role"]["type"]
        self.groups = set([g["usrgrpid"] for g in user["usrgrps"]])
        self.has_right = False
        self.permissions = set()
        self.sendto = {
            m["mediatypeid"]: m["sendto"]
            for m in user["medias"] if m["active"] == "0"
        }
        # Check if the user is a super admin (type 3)
        if self.type == "3":
            self.has_right = True

Step 10. Match Messages with Recipients

Finally, we match recipients with specific messages from Step 6:

for message in messages:
    message.select_recipients(user_groups, recipients)

This step completes the main process – each message is assigned to the relevant recipients.

Step 11. Check Recipient Access Rights and Output the Result

Before outputting the final list of recipients, we can check each recipient’s rights and keep only those who have the corresponding permissions to receive notifications for events related to the trigger, or those who have all configured media types specified and active. After that, the information can be output in any convenient way – whether exported to a file or displayed on the screen:

# show_unavail is a script option that includes recipients without rights
for trigger in triggers_metadata.values():
    for message in trigger.messages:
        for recipient in message.recipients:
            recipient.show = True
            if not recipient.has_right:
                recipient.has_right = (len([
                    gid
                    for gid in trigger.hostgroups
                    if gid in recipient.permissions
                ]) > 0)
            if not recipient.has_right and not show_unavail:
                recipient.show = False

Example Implementation

All the examples and code snippets described above have been compiled to create a solution demonstrating the algorithm for obtaining notification recipients for triggers associated with the selected host. We have implemented this algorithm as a simple web interface to make the result more illustrative and convenient for familiarization.

This interface allows users to enter the host’s ID. The script then processes the data and provides a list of notification recipients associated with the triggers on that host. The web interface uses asynchronous requests to the Zabbix API and the zabbix_utils library to ensure fast data processing and ease of use with many triggers and users.

This lets you familiarize yourself with the theoretical steps and code examples and also try to put this solution into action.

Please note once again that the code in this project is for demonstration purposes, may not be optimal, or could contain errors. Use it as an example or a base for your project, but not as a complete tool.

The web interface’s complete source code and installation instructions can be found on GitHub.

Conclusion

In this article, we explored a practical example of using the zabbix_utils library to solve the task of obtaining alert recipients for triggers associated with a selected Zabbix host using the Zabbix API. We detailed the key steps, from setting up the environment and initializing trigger metadata to working with notification recipients and optimizing performance with asynchronous requests.

Using zabbix_utils allowed us to optimize and accelerate interaction with the Zabbix API, expanding the capabilities of the Zabbix web interface and increasing efficiency when working with large volumes of data. Thanks to support for asynchronous processing and selective API requests, it is possible to significantly reduce the load on the server and improve system performance when working with Zabbix, which is especially important in large infrastructures.

We hope this example will assist you in implementing your own solutions based on the Zabbix API and zabbix_utils, and demonstrate the possibilities for optimizing your interaction with the Zabbix API.

Monitoring Zabbix Security Advisories

Zabbix plays a crucial role in monitoring all kinds of “things” – IoT devices, domains, cloud infrastructures and more. It can also be integrated with third-party solutions – for example, with Oxidized for configuration backup monitoring. Given the nature of Zabbix, it usually contains a lot of confidential information as well as (more importantly) some kind of elevated access to network elements while being used by operators, engineers, and customers. This requires that Zabbix as a product should be as secure as possible.

Zabbix has upped their security game and is actively working with HackerOne to take full advantage of the reach of their global community by providing a bug bounty program. And though it doesn’t happen too often, from time to time a security issue arises in Zabbix or one of its dependencies, warranting the release of a Security Advisory.

The issue

Zabbix typically releases a Security Advisory and might even assign a CVE to the issue. Cool, that is what we expect from reputable software developers. They even inform their customers with support contracts before publishing the advisory to allow them to patch installations beforehand. 

If you don’t have a support contract, you can find out about these security advisories on your own by monitoring the Security Advisory page or the published CVEs for Zabbix. NIST has a public API that can be used and that works well, but CVEs are often incomplete.

Zabbix currently publishes no API for their Security Advisories – there is the public tracker which contains all entries and can be queried via API, but it’s only available as unstructured text.

The solution

We want to be notified automatically of new security advisories, and the only data source that contains all the data in a structured way is the Zabbix Security Advisory page. However, structured doesn’t mean easily parseable – in fact, it is just raw HTML. We could try to solve this issue in Zabbix itself, but the easier solution in this case is to scrape the page and generate a JSON file that Zabbix can then parse, achieving our goal of automated notifications of new advisories.

Webscraping

We’ve chosen to scrape the Zabbix site using Rust, utilizing the Scraper crate to parse the HTML and extract the relevant parts we want. Without going into too much detail, the interesting information is stored in 2 tables, one with the table-simple class applied and one with the table-vertical class applied. Using CSS selectors (which is what the Scraper crate requires), we can retrieve the information we want.

This information is then stored in a struct, which gets added to a hashmap. The result is stored in a vector, which is added to a struct, which eventually is used to generate the JSON we require. Phew.
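
The production tool is written in Rust, but the idea is easy to prototype. Below is a minimal Python sketch of the same approach, assuming the requests and BeautifulSoup libraries; the page URL, selectors, and field handling are illustrative, not the production code:

# Python prototype of the scraping idea (the production tool is Rust).
import json
import requests
from bs4 import BeautifulSoup

html = requests.get("https://www.zabbix.com/security_advisories").text
soup = BeautifulSoup(html, "html.parser")

reports = []
# Each advisory's details are rendered as a table with the table-vertical class
for table in soup.select("table.table-vertical"):
    report = {}
    for row in table.select("tr"):
        cells = [c.get_text(strip=True) for c in row.select("th, td")]
        if len(cells) == 2:  # label/value pairs
            report[cells[0].lower()] = cells[1]
    if report:
        reports.append(report)

print(json.dumps({"reports": reports}, indent=2))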

The resulting JSON is easily parseable by Zabbix:

{
  "last_updated": {
    "secs": _seconds_,
    "nanos": _nanoseconds_
  },
  "reports": [
    _list of reports_
  ]
}

The ‘reports’ array contains one entry per advisory, and each entry has the following layout. Unsurprisingly, this closely matches the information that is available on the Zabbix Security Advisory page:

    {
      "_zbxref_": {
        "zbxref": "_zbxref_",
        "cveref": "CVE-XXXX-XXXX",
        "score": X.X,
        "synopsis": "_synopsis_",
        "description": "_description_",
        "vectors": "_vectors_",
        "resolution": "_resolution_",
        "workaround": "_workaround_",
        "acknowledgement": "_acknowledgement_",
        "components": [
          _list of components_,
          _list of components_
        ],
        "affected_version": [
          {
            "affected": "_version_",
            "fixed": "_version_"
          }
        ]
      }
    }

Now, we could provide you with the code of the scraping tool and wish you good luck with making sure the tool runs every X hours and somehow, somewhere stores the resulting JSON for Zabbix to parse. That would be the easy way out, right?

Instead, we’ve chosen to host the Rust program as an AWS Lambda function, triggered every 2 hours by the AWS EventBridge Scheduler and with some code added to the Rust program (function?) to upload the resulting JSON to an AWS S3 bucket. This chain of AWS products not only makes sure that our cloud bill increases, but also guarantees we don’t have to host (and maintain!) anything ourselves.
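
For illustration, the upload step at the end of that chain boils down to a single S3 call. Here it is sketched in Python with boto3 (the real function does the equivalent in Rust; the bucket and key names are placeholders):

# Upload the generated JSON to S3 (bucket and key are placeholders).
import json
import boto3

def upload_report(report: dict) -> None:
    s3 = boto3.client("s3")
    s3.put_object(
        Bucket="example-zbx-advisories",
        Key="advisories.json",
        Body=json.dumps(report).encode("utf-8"),
        ContentType="application/json",
    )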

The result? Just one HTTP GET away…

Template

TL;DR: Download the template here.

Now that the data is available in JSON, it’s fairly easy to parse it using Zabbix. Using an HTTP agent item, we download the JSON from AWS. The URI is stored in the {$ZBX_ADVISORY_URI} macro, which allows for easy modification; by default, it points to the JSON file hosted on AWS S3. This retrieval is done by the Retrieve the Zabbix Security Advisories item, which acts as the source for every other operation. It retrieves the JSON every hour, and with the JSON being generated every 2 hours, the maximum delay between Zabbix publishing a new advisory and you getting it into Zabbix is 3 hours.

The Retrieve the Zabbix Security Advisories item acts as a master item for the Last Updated item. This item uses a JSONPath preprocessing step to extract the information we want: $.last_updated.secs. The resulting data is stored as unixtime so that we mere mortals can easily read when the last update of the JSON file was performed.

A trigger is configured for this item to ensure that the JSON file isn’t too old. The trigger JSON Feed is out of date has the following expression:
now()-last(/Zabbix Security Advisories/zbx_sec.last_updated)>{$ZBX_ADVISORY_UPDATE_INTERVAL}*{$ZBX_ADVISORY_UPDATE_THRESHOLD}

By default, {$ZBX_ADVISORY_UPDATE_INTERVAL} is set to 2 hours (the interval at which the file gets updated by our tool) and {$ZBX_ADVISORY_UPDATE_THRESHOLD} is set to 3. So, when the JSON file hasn’t been updated within the last 6 hours, this trigger will fire.

The item Number of advisories uses the same principle: a JSONPath preprocessing step extracts the information we want: $.reports. However, as $.reports is an array, we can use functions on it – in this case .length(), which returns an integer. This number is used in the associated trigger A new Zabbix Security Advisory has been published, which simply fires when the value changes.

This is all very cool, but the JSON has a lot more information, including details about each report. In order to get these details into Zabbix, we use a discovery rule to ‘loop’ through the JSON and create items based on what we’ve discovered: Discover Advisories. This rule again uses a JSONPath preprocessing step to get the details we want: $.reports[*][*]. Based on the resulting data (which is a single report in this case), 2 LLD macros are assigned: {#ZBXREF} – based on the JSONPath $.zbxref – and {#CVEREF} – based on the JSONPath $.cveref.

For each discovered report, 8 items are created. They all work using the same principle, so I will only describe one: Advisory {#ZBXREF} / {#CVEREF} – Acknowledgement. This item uses the master item Zabbix Security Advisories, just like all other items described so far. JSONPath is once again used to get the information we want. The expression $.reports[*]["{#ZBXREF}"].acknowledgement.first() provides exactly what we need: we combine an LLD macro ({#ZBXREF}) with a JSONPath function (.first()) to first ‘select’ the correct advisory in the JSON and then retrieve the value.

All 7 other items work like this, with only one exception: Advisory {#ZBXREF} / {#CVEREF} – Components. The ‘components’ value in the JSON file is actually an array with 1 or more items, describing which components might be affected. But we cannot store arrays in Zabbix, so we use another preprocessing step to convert the array into a string. A few lines of JavaScript are all we need:

components = JSON.parse(value);
return components.toString();

First, we parse the JSON input (‘value’) into an array, then apply the JavaScript .toString() function to it. The toString method of arrays calls join() internally, which joins the array elements and returns one string with each element separated by commas – exactly what we want: a comma-separated string.

To make working with these advisories easier, each item has the component tag applied, with the value zabbix_security. If the item belongs to an advisory, the advisory tag is added with the value of {#ZBXREF} (the advisory number/name). The type tag is also applied, with the actual type being ‘workaround’, ‘description’, ‘score’, et cetera. This allows for filtering on all Zabbix Security items, on all items for a single advisory, or on all items of a given type, to easily gain insight into the different advisories and their score, synopsis, description, components, and so on.

Dashboard

The tags on the items allow for filtering, but with Zabbix 7.0 we can use all the great new nifty features, such as the Item Navigator widget combined with the Item Value widget. Let’s take a look at what configuring such a dashboard might look like if you set up the Item Navigator widget as follows:

Item Navigator configuration

And then ‘link’ the Item Value widget to it:

You should get a somewhat decent dashboard. It isn’t perfect (given that the Item Value widget only seems to be able to display a single line of text) but it’s something.

Disclaimer

Though we use this functionality ourselves, this all comes without any guarantee. The technology used to retrieve data (screen scraping) is mediocre at best and could break at any moment if and when Zabbix changes the layout of their page.

NOTE: Because of recent changes to the Zabbix vulnerabilities page, some information presented in this article may no longer apply. We sincerely apologize for any inconvenience.

Monitoring Configuration Backups with Zabbix and Oxidized

As a Zabbix partner, we help customers worldwide with all their Zabbix needs, ranging from building a simple template all the way to (massive) turn-key implementations, trainings, and support contracts. Quite often during projects, we get the question, “How about making configuration backups of our network equipment? We need this, as another tool was also capable of doing this!”

The answer is always the same – yes, but no. Yes, technically it is possible to get your configuration backups into Zabbix. It’s not even that hard to set up initially. However, you really should not want configuration backups there – Zabbix is not made for them, and you will run into limitations within minutes. As you can imagine, the customer is never happy with this answer, and some actively question where exactly the limitation lies to see whether it applies to them as well. So we simply set up an SSH agent item and get that config out:

Voila! Once per hour Zabbix will log in to that device, execute the command ‘show full-configuration,’ and get back the result. Frankly, it just works. You check the Monitoring -> Latest data section of the host and see that there is data for this item. Problem solved, right?

No. As a matter of fact, this is where the problems start. Zabbix allows us to store up to 64KB of data in an item value. The above screenshot is of a (small) FortiGate firewall, and the config, if stored in a text file, is just over 1.1MB. So Zabbix truncates the data, which renders the backup useless – a restore will never work. At the same time, Zabbix does not sanitize the output, so all secrets are left in it.

To make it even worse, it’s challenging to make a diff of different config versions/revisions – that feature is just not there. Most of the time, the customer is at this point convinced that Zabbix is not the right tool, and the next question pops up – “Now what? How can we fix this?” This is where we can add value, as we have a solution that is rather affordable (free) as well.

The solution is Oxidized, which is basically Rancid on steroids. This project started years ago and is released under the Apache 2.0 license. We found it by accident, started playing around with it, and never left it. The project is available on Github (https://github.com/ytti/oxidized) and written in Ruby. Incidentally, if you (or your company) have Ruby devs and want to give something back to the community, the Oxidized project is looking for extra maintainers!

At this point, we show our customers the GUI of Oxidized, which in our case involves just making backups of a few firewalls:

So we have the name, the model, and (in this case) just one group. The status shows whether the backup was successful or not, along with the last update and when the last change was detected. At the same time, under actions, we can get the full config file, look at previous revisions (and diff them), and use an ‘execute now’ option.

Looking at the versions, it’s simply showing this:

This is already giving us a nice idea of what is going on. We see the versions and dates at a glance, but the moment we check the diff option, we can easily see what was actually changed:

The perfect solution, except that it is not integrated with Zabbix. That means double administration and a lot of extra work, combined with the inevitable errors – devices not added, credential mismatches, connection errors, etc. Luckily, we can easily change the format of the above information from GUI to JSON by just adding ‘.json’ at the end of the URL:

http://<IP/DNS>:<PORT>/nodes.json

This will give the following output:

[
  {
    "name": "fw-mid-01",
    "full_name": "fw-mid-01",
    "ip": "192.168.4.254",
    "group": "default",
    "model": "FortiOS",
    "last": {
      "start": "2024-06-13 06:46:14 UTC",
      "end": "2024-06-13 06:46:54 UTC",
      "status": "no_connection",
      "time": 40.018852483
    },
    "vars": null,
    "mtime": "unknown",
    "status": "no_connection",
    "time": "2024-06-13 06:46:54 UTC"
  },
  {
    "name": "FW-HUNZE",
    "full_name": "FW-HUNZE",
    "ip": "192.168.0.254",
    "group": "default",
    "model": "FortiOS",
    "last": {
      "start": "2024-06-13 06:46:54 UTC",
      "end": "2024-06-13 06:47:04 UTC",
      "status": "success",
      "time": 10.029043912
    },
    "vars": null,
    "mtime": "2024-06-13 06:47:05 UTC",
    "status": "success",
    "time": "2024-06-13 06:47:04 UTC"
  }
]

As you might know, Zabbix is perfectly capable of parsing JSON and creating items and triggers out of it. A master item, dependent LLD (https://blog.zabbix.com/low-level-discovery-with-dependent-items/13634/), and within minutes you’ve got Oxidized making configuration backups while Zabbix monitors and alerts on the status.
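
To illustrate, here is a rough sketch of what such a setup could look like. The item keys, macro names, host name, and JSONPath expressions below are our own illustrative choices, not a published template:

Master item (HTTP agent):
  URL:            http://<IP/DNS>:<PORT>/nodes.json
  Key:            oxidized.nodes

Discovery rule (dependent on the master item):
  Preprocessing:  JSONPath  $[*]
  LLD macros:     {#NODE}  <- $.name
                  {#GROUP} <- $.group

Item prototype (dependent on the master item):
  Name:           Oxidized backup status of {#NODE}
  Key:            oxidized.status[{#NODE}]
  Preprocessing:  JSONPath  $[?(@.name == "{#NODE}")].status.first()

Trigger prototype:
  last(/Oxidized nodes/oxidized.status[{#NODE}])<>"success"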

At this point we’re getting close to a nice integration, but we haven’t overcome the double configuration management yet.

Oxidized can read its configuration from multiple sources, including a CSV file, SQL, SQLite, MySQL or HTTP. The easiest is a CSV file – just make sure you’ve got all information in the correct column and it works. An example:

Oxidized config:

source:
  default: csv
  csv:
    file: /var/lib/oxidized/router.db
    delimiter: !ruby/regexp /:/
    map:
      name: 0
      ip: 1
      model: 2
      username: 3
      password: 4
    vars_map:
      enable: 5

CSV file:

rtr01.local:192.168.1.1:ios:oxidized:5uP3R53cR3T:T0p53cR3t

Great, now we have to configure 2 places (Zabbix and Oxidized) and keep a cleartext username/password in a CSV file. What about SQL as a source, letting Oxidized connect to Zabbix? From there we should be able to get information regarding the hostname, but we somehow need the credentials as well. That’s not a default piece of information in Zabbix, but user macros can save us here.

So on our host we add 2 extra macros:

At the same time, we need to tell Oxidized what kind of device it is. There are multiple ways of doing this, obviously. A tag, a usermacro, hostgroups, you name it. In order to do this, we place a tag on the host:

Now we make sure that Oxidized only takes hosts with the tag ‘oxidized’ and extracts from them the host name, IP address, model, username, and password:

+-----------+----------------+---------+------------+------------------------------+
| host      | ip             | model   | username   | password                     |
+-----------+----------------+---------+------------+------------------------------+
| fw-mid-01 | 192.168.4.254  | fortios | <redacted> | <redacted>                   |
| FW-HUNZE  | 192.168.0.254  | fortios | <redacted> | <redacted>                   |
+-----------+----------------+---------+------------+------------------------------+
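
How you wire this up is up to you – Oxidized’s SQL source can connect to the Zabbix database directly. As a rough illustration of the idea, here is a minimal Python sketch that runs a similar query and writes the router.db CSV shown earlier. The table and column names follow a recent Zabbix schema, and the {$SSH_USER}/{$SSH_PASS} macro names are assumptions – verify both against your own setup:

# Minimal sketch: pull Oxidized-tagged hosts from the Zabbix database and
# write Oxidized's router.db. Table/column names ('hosts', 'interface',
# 'host_tag', 'hostmacro') follow a recent Zabbix schema; the macro names
# {$SSH_USER}/{$SSH_PASS} are assumptions - adjust to your setup.
import pymysql

QUERY = """
SELECT h.host, i.ip, t.value AS model,
       mu.value AS username, mp.value AS password
FROM hosts h
JOIN interface i  ON i.hostid = h.hostid AND i.main = 1
JOIN host_tag t   ON t.hostid = h.hostid AND t.tag = 'oxidized'
JOIN hostmacro mu ON mu.hostid = h.hostid AND mu.macro = '{$SSH_USER}'
JOIN hostmacro mp ON mp.hostid = h.hostid AND mp.macro = '{$SSH_PASS}'
WHERE h.status = 0
"""

conn = pymysql.connect(host="localhost", user="zabbix",
                       password="secret", database="zabbix")
cur = conn.cursor()
cur.execute(QUERY)

# One line per device: name:ip:model:username:password (matches the CSV map)
with open("/var/lib/oxidized/router.db", "w") as f:
    for host, ip, model, username, password in cur.fetchall():
        f.write(f"{host}:{ip}:{model}:{username}:{password}\n")

conn.close()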

This way, we simply add our host in Zabbix, add the SSH credentials, and Oxidized will pick it up the next time a backup is scheduled. Zabbix will immediately start monitoring the status of those jobs and alert you if something fails.

This blog post is not meant as a complete integration write-up, but rather as a way to give some insight into how we as a partner operate in the field, taking advantage of the flexibility of the products we work with. It should give you enough information to build this yourself, but of course we’re always available to help you, or to build it as part of our consultancy offering.

Enhancing Network Synergy: rConfig’s Native Integration with Zabbix

Native integration between two leading open-source tools – Zabbix for network monitoring and rConfig for configuration management – delivers substantial benefits to organizations. On one side, Zabbix offers a platform that maintains a Single Source of Truth for network device inventories. It provides real-time monitoring, problem detection, alerting, and other critical features that are essential for day-to-day operations, ensuring the smooth and reliable network connectivity that is crucial for business continuity.

On the other side, there’s rConfig, renowned for its robust and reliable network automation, configuration backup, and compliance management. Integrating rConfig with Zabbix enhances its capabilities, allowing for seamless Device Inventory synchronization. This union not only simplifies the management of network configurations but also introduces more advanced Network Automation Platform features. Together, they form a powerhouse toolset that streamlines network management tasks, reduces operational overhead, and boosts overall network performance, making it easier for businesses to focus on growth and innovation without being hindered by network reliability concerns.

Optimizing Network Management with Unified Inventory

At rConfig, we are deeply embedded with our customers, and our main mission is to work with them to solve their real-world problems. One significant challenge that consistently surfaces – both from client feedback and our own experiences – is managing and accurately locating a trusted and reliable central network inventory. This challenge brings to the forefront a classic dilemma in Enterprise Architecture circles: In our scenario of network inventory, which system ought to act as the System of Record, and which should function as the System of Engagement to optimize interactions with records for various purposes, such as Network Management Systems (NMS) and Network Configuration Management (NCM)?

Enterprise Architecture circles illustrating systems of record, insight and engagement. Credit: Sharon Moore – https://samoore.me/

At rConfig, from a product perspective we’ve chosen to focus on what we do best and love most: Network Configuration Management. Therefore, integrating with an upstream Network Management System (NMS) that can act as the System of Record for network device inventory was a logical step for us. Given that many of our customers also use Zabbix for network operations, it was a natural choice to begin our integration journey with them. Our platforms are highly complementary, which streamlines the integration process and enhances our ability to serve our customers better. This strategic decision allows us to offer a seamless and efficient management solution that not only meets current needs but also scales to address future challenges in network management.

Enhanced Integration Through ETL

You might be wondering how this integration works and whether it’s straightforward or challenging to set up. Setting up the integration between rConfig and Zabbix is relatively straightforward, but, as with any complex data driven systems, it requires careful planning and diligence to ensure that the data flow between the systems is fully optimized and automated. This is where ETL – or Extract, Transform, Load – plays a crucial role. ETL is a process that involves extracting data from the Zabbix API in its raw form, transforming it into a format that rConfig can readily process and validate, and then loading it into the rConfig production database. This process also efficiently handles any data conflicts and updates.
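
To make the ETL flow concrete, here is a minimal Python sketch of the three stages against the Zabbix API. It uses the zabbix_utils library for the extract step; the tag filter, the field mapping, and the load_to_staging() helper are hypothetical – rConfig’s actual implementation differs:

# Minimal ETL sketch. The tag filter, target field names, and the
# load_to_staging() helper are hypothetical; rConfig's real code differs.
from zabbix_utils import ZabbixAPI

api = ZabbixAPI(url="your_zabbix_server", token="your_api_token")

# Extract: raw host records, optionally filtered by tag or host group
raw_hosts = api.host.get(
    output=["hostid", "host", "name"],
    selectInterfaces=["ip"],
    tags=[{"tag": "rconfig", "value": "sync"}],  # example filter
)

# Transform: map Zabbix fields onto the device schema and validate
staged = []
for h in raw_hosts:
    ip = h["interfaces"][0]["ip"] if h.get("interfaces") else None
    if ip:  # skip records that fail validation
        staged.append({
            "device_name": h["name"],
            "device_ip": ip,
            "zabbix_hostid": h["hostid"],  # kept for conflict handling on re-sync
        })

# Load: hand the validated rows to the staging table
load_to_staging(staged)  # hypothetical helper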

The advantages of using ETL are significant, enhancing data quality and making the data more accessible, thereby enabling rConfig to analyze information more effectively and make well-informed, data-driven decisions. At rConfig, our user interface is designed to aid in the development and troubleshooting of features, though we’re also fond of using the CLI for those who prefer it. Below is a screenshot from our lab showing the end-to-end ETL process with Zabbix in action. It illustrates the steps rConfig takes to connect to Zabbix, extract, validate, transform and map the data, load it to staging, and finally, move it to the production environment for a small set of devices.

While the screenshot below displays just a few devices as a sample integration in our lab, the most extensive integration we’ve achieved in a production environment with this new rConfig feature involved syncing a single Zabbix instance with over 5,000 host/device records. This highlights its efficiency and reliability in a real-world environment.

Screenshot of rConfig Zabbix Integration on the Command Line

Going Deeper: Understanding the Integration Process

To grasp the integration process more clearly, let’s dive into the details that will help you understand how to set everything up before we automate the task. Our documentation website, docs.rconfig.com, provides comprehensive details, and our YouTube channel features a great demonstration video of the entire process.

Initial Setup: The first step involves configuring rConfig to connect and authenticate with the Zabbix API. This setup is managed through the Configuration page in the rConfig user interface. During this phase, you can also apply filters to select specific Zabbix tags or host groups, refining exactly which host records you want to synchronize.

Screenshot of Zabbix Configuration page in rConfig V7 Professional

Data Extraction and Validation: Once the connection is established, rConfig extracts host records in raw JSON format. This stage involves validating the data to ensure that the correct tags and data mappings are in place.

Screenshot of Zabbix Raw Host Extract page in rConfig V7 Professional

Staging for Review: After validation, the data is loaded into a staging table. This allows for a thorough review to confirm that the mapped rConfig data fields are correct, ensuring that the newly imported devices are associated with the appropriate connection templates, categories, and tags.

Screenshot of Zabbix host staging table in rConfig V7 Professional

Final Loading: The final step involves transferring the staged devices to the main production devices table. After this transfer, the staging table is cleared. The devices then appear in the main device table, marked with a special icon indicating that they are synced through integration.

Screenshot of Zabbix host fully loaded to production devices table in rConfig V7 Professional

Seamless Operational Integration: Once the devices are loaded into the production table, they are automatically incorporated into standard rConfig scheduled tasks, automations, or any other rConfig feature that utilizes the device data (like categories and tags). This integration facilitates a seamless operational workflow between the platforms. Users can even access these devices directly in Zabbix from within the rConfig UI, streamlining operations management.

After all the above steps are completed and the initial setup is done, future loads run on a scheduled, automated basis using the rConfig Task Manager.

Screenshot of rConfig Device detail view for a Zabbix integrated host

This detailed setup and validation process ensures that the integration between rConfig and Zabbix is not only effective but also enhances the functionality and efficiency of managing network devices across platforms.

Case Study: Enhancing Network Management for a Las Vegas Entertainment Organization

  1. Challenge: A prominent Las Vegas entertainment organization faced significant difficulties in managing the diverse and complex network that supports their extensive operations, including gaming, security, and hospitality services. The primary issues were outdated network inventories and inefficient management of network configurations across numerous devices, leading to operational disruptions and security vulnerabilities.
  2. Solution: To address these challenges, the organization implemented the integration of rConfig with Zabbix, focusing on automating and centralizing the network management process. This solution aimed to synchronize network device inventories across the organization’s extensive operations, ensuring accurate and real-time data availability.
  3. Implementation: The integration process began with setting up Zabbix to continuously monitor and gather data from network devices across different venues and services. This data was then extracted, standardized, and loaded into rConfig, where it could be used for automated configuration management and backup. The setup also included sophisticated mapping and validation to ensure all data transferred between Zabbix and rConfig was accurate and relevant.

Benefits:

  • Improved Network Reliability: The automated synchronization of network inventories reduced the frequency of network failures and minimized downtime, which is crucial in the high-stakes environment of Las Vegas entertainment.
  • Enhanced Security: With more accurate and timely network data, the organization could better identify and respond to security threats, protecting sensitive information and ensuring the safety of both guests and operations.
  • Operational Efficiency: The IT team was able to shift their focus from routine network maintenance to strategic initiatives that enhanced overall business operations, including integrating new technologies and improving guest experiences.
  • Scalability: The integration provided a scalable solution that could accommodate future expansion, whether adding new devices or incorporating new technologies or venues into the network.

Outcome: The implementation of the rConfig and Zabbix integration dramatically transformed the organization’s network management capabilities. The IT department noted a substantial reduction in the manpower and time required for routine maintenance, while operational uptime improved significantly. The organization now enjoys a robust, streamlined network management system that supports its dynamic environment, ensuring that both guests and staff benefit from reliable and secure network services.

This case study highlights the power of effective network management solutions in supporting complex operations and enhancing business efficiency and security within the entertainment industry.

Conclusion: Forging Ahead with Innovative Partnerships

In conclusion, the Zabbix platform stands out as a cornerstone in network monitoring, renowned for its extensive capabilities in real-time monitoring, problem detection, and alerting. Its robust architecture not only supports a broad range of network environments but also offers the flexibility and scalability necessary for today’s diverse technological landscapes. The platform’s ability to provide detailed and accurate network insights is crucial for organizations aiming to maintain optimal operational continuity and security.

The integration of Zabbix with rConfig, a globally reliable and robust network configuration management (NCM) solution, enhances these benefits significantly, creating a synergistic relationship that leverages the strengths of both platforms. For customers and partners, this integration means not only smoother and more efficient network management but also the assurance that they are supported by two of the leading solutions in the industry. Together, Zabbix and rConfig deliver a comprehensive network management experience that drives efficiency, reduces costs, and ensures a higher level of network reliability and security, positioning them as indispensable tools in the toolkit of any organization serious about its network infrastructure.

About rConfig

rConfig is an industry leader in network configuration management and automation. Founded in 2010 and based in Ireland, rConfig has been at the forefront of delivering innovative solutions that simplify the complexities of network management. Our software is designed to be both powerful and user-friendly, making it an ideal choice for IT professionals across a variety of sectors, including education, government, manufacturing, and large global enterprises.

With the capability to manage tens of thousands of devices, rConfig offers robust functionalities such as automated config backups, compliance management, and network automation. Our platform is vendor-agnostic, which allows seamless integration with a diverse range of network devices and systems, from traditional IT to IoT and OT environments. This flexibility ensures that our clients can manage all aspects of their network configurations, regardless of the underlying technology.

rConfig is committed to continuous innovation and customer-centric solutions, with industry-first solutions such as API backups and our Script Integration Engine. Our native integration with platforms like Zabbix exemplifies our dedication to enhancing network management through strategic partnerships. This collaboration not only streamlines operations but also amplifies the benefits provided, ensuring that our customers have access to the most advanced tools in the industry.

 

Make your interaction with Zabbix API faster: Async zabbix_utils. https://blog.zabbix.com/make-your-interaction-with-zabbix-api-faster-async-zabbix_utils/27837/ Tue, 30 Apr 2024

In this article, we will explore the capabilities of the new asynchronous modules of the zabbix_utils library. Thanks to asynchronous execution, users can expect improved efficiency, reduced latency, and increased flexibility in interacting with Zabbix components, ultimately enabling them to create efficient and reliable monitoring solutions that meet their specific requirements.

There is a high demand for the Python library zabbix_utils. Since its release and up to the moment of writing this article, zabbix_utils has been downloaded from PyPI more than 15,000 times. Over the past week, the library has been downloaded more than 2,700 times. The first article about the zabbix_utils library has already gathered around 3,000 views. Among the array of tools available, the library has emerged as a popular choice, offering developers and administrators a comprehensive set of functions for interacting with Zabbix components such as Zabbix server, proxy, and agents.

Considering the demand from users, as well as the potential of asynchronous programming to optimize interaction with Zabbix, we are pleased to present a new version of the library with new asynchronous modules in addition to the existing synchronous ones. The new zabbix_utils modules are designed to provide a significant performance boost by taking advantage of the inherent benefits of asynchronous programming to speed up communication between Zabbix and your service or script.

You can read the introductory article about zabbix_utils for a more comprehensive understanding of working with the library.

Benefits and Usage Scenarios

From expedited data retrieval and real-time event monitoring to enhanced scalability, asynchronous programming empowers you to build highly efficient, flexible, and reliable monitoring solutions adapted to meet your specific needs and challenges.

The new version of zabbix_utils and its asynchronous components may be useful in the following scenarios:

  • Mass data gathering from multiple hosts: When it’s necessary to retrieve data from a large number of hosts simultaneously, asynchronous programming allows requests to be executed in parallel, significantly speeding up the data collection process;
  • Mass resource exporting: When templates, hosts or problems need to be exported in parallel. This parallel execution reduces the overall export time, especially when dealing with a large number of resources;
  • Sending alerts from or to your system: When certain actions need to be performed based on monitoring conditions, such as sending alerts or running scripts, asynchronous programming provides rapid condition processing and execution of corresponding actions;
  • Scaling the monitoring system: With an increase in the number of monitored resources or the volume of collected data, asynchronous programming provides better scalability and efficiency for the monitoring system.

Installation and Configuration

If you already use the zabbix_utils library, simply updating the library to the latest version and installing all necessary dependencies for asynchronous operation is sufficient. Otherwise, you can install the library with asynchronous support using the following methods:

  • By using pip:

~$ pip install zabbix_utils[async]

Using [async] allows you to install additional dependencies (extras) needed for the operation of asynchronous modules.

  • By cloning from GitHub:

~$ git clone https://github.com/zabbix/python-zabbix-utils
~$ cd python-zabbix-utils/
~$ pip install -r requirements.txt
~$ python setup.py install

The process of working with the asynchronous version of the zabbix_utils library is similar to the synchronous one, aside from the syntactic differences inherent to asynchronous code in Python.

Working with Zabbix API

To work with the Zabbix API in asynchronous mode, you need to import the AsyncZabbixAPI class from the zabbix_utils library:

from zabbix_utils import AsyncZabbixAPI

Similar to the synchronous ZabbixAPI, the new AsyncZabbixAPI can use the following environment variables: ZABBIX_URL, ZABBIX_TOKEN, ZABBIX_USER, ZABBIX_PASSWORD. However, when creating an instance of the AsyncZabbixAPI class you cannot specify a token or a username and password, unlike the synchronous version. They can only be passed when calling the login() method. The following usage scenarios are available here:

  • Use preset values of environment variables, i.e., not pass any parameters to AsyncZabbixAPI:

~$ export ZABBIX_URL="https://zabbix.example.local"

api = AsyncZabbixAPI()

  • Pass only the Zabbix API address as input, which can be specified as either the server IP/FQDN address or DNS name (in this case, the HTTP protocol will be used) or as a URL of the Zabbix API:

api = AsyncZabbixAPI(url="127.0.0.1")

After declaring an instance of the AsyncZabbixAPI class, you need to call the login() method to authenticate with the Zabbix API. There are two ways to do this:

  • Using environment variable values:

~$ export ZABBIX_USER="Admin"
~$ export ZABBIX_PASSWORD="zabbix"

or

~$ export ZABBIX_TOKEN="xxxxxxxx"

and then:

await api.login()

  • Passing the authentication data when calling login():

await api.login(user="Admin", password="zabbix")

Like ZabbixAPI, the new AsyncZabbixAPI class supports version getting and comparison:

# ZabbixAPI version field
ver = api.version
print(type(ver).__name__, ver) # APIVersion 6.0.29

# Method to get ZabbixAPI version
ver = api.api_version()
print(type(ver).__name__, ver) # APIVersion 6.0.29

# Additional methods
print(ver.major)     # 6.0
print(ver.minor)     # 29
print(ver.is_lts())  # True

# Version comparison
print(ver < 6.4)        # True
print(ver != 6.0)       # False
print(ver != "6.0.24")  # True

After authentication, you can make any API requests described for all supported versions in the Zabbix documentation.

The format for calling API methods looks like this:

await api_instance.zabbix_object.method(parameters)

For example:

await api.host.get()

After completing all needed API requests, it is necessary to call logout() to close the API session if authentication was done using username and password, and also close the asynchronous sessions:

await api.logout()
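
Putting these pieces together, here is a minimal, self-contained sketch that logs in, runs two API calls concurrently, and closes the session. It assumes the environment variables described above are set; the output fields are arbitrary examples:

import asyncio
from zabbix_utils import AsyncZabbixAPI

async def main():
    # Assumes ZABBIX_URL and ZABBIX_TOKEN (or ZABBIX_USER/ZABBIX_PASSWORD)
    # are set as environment variables, as described above
    api = AsyncZabbixAPI()
    await api.login()

    # Independent API calls can be awaited concurrently with asyncio.gather()
    hosts, templates = await asyncio.gather(
        api.host.get(output=["host"]),
        api.template.get(output=["name"])
    )
    print(len(hosts), "hosts,", len(templates), "templates")

    # Close the API session and the underlying asynchronous session
    await api.logout()

asyncio.run(main())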

More examples of usage can be found here.

Sending Values to Zabbix Server/Proxy

The asynchronous class AsyncSender has been added, which also helps to send values to the Zabbix server or proxy for items of the Zabbix Trapper data type.

AsyncSender can be imported as follows:

from zabbix_utils import AsyncSender

Values can be sent as a group; to do this, you need to additionally import ItemValue:

import asyncio
from zabbix_utils import ItemValue, AsyncSender


items = [
    ItemValue('host1', 'item.key1', 10),
    ItemValue('host1', 'item.key2', 'Test value'),
    ItemValue('host2', 'item.key1', -1, 1702511920),
    ItemValue('host3', 'item.key1', '{"msg":"Test value"}'),
    ItemValue('host2', 'item.key1', 0, 1702511920, 100)
]

async def main():
    sender = AsyncSender('127.0.0.1', 10051)
    response = await sender.send(items)
    # processing the received response

asyncio.run(main())

As in the synchronous version, it is possible to specify the size of chunks when sending values in a group using the parameter chunk_size:

sender = AsyncSender('127.0.0.1', 10051, chunk_size=2)
response = await sender.send(items)

In the example, the chunk size is set to 2. So, 5 values passed in the code above will be sent in three requests of two, two, and one value, respectively.

Also it is possible to send a single value:

sender = AsyncSender(server='127.0.0.1', port=10051)
resp = await sender.send_value('example_host', 'example.key', 50, 1702511920)

If your server has multiple network interfaces, and values need to be sent from a specific one, the AsyncSender provides the option to specify a source_ip for sent values:

sender = AsyncSender(
    server='zabbix.example.local',
    port=10051,
    source_ip='10.10.7.1'
)
resp = await sender.send_value('example_host', 'example.key', 50, 1702511920)

AsyncSender also supports reading connection parameters from the Zabbix agent/agent2 configuration file. To do this, you need to set the use_config flag and specify the path to the configuration file if it differs from the default /etc/zabbix/zabbix_agentd.conf:

sender = AsyncSender(
    use_config=True,
    config_path='/etc/zabbix/zabbix_agent2.conf'
)

More usage examples can be found here.

Getting values from Zabbix Agent/Agent2 by item key

In cases where you need the functionality of the standard zabbix_get utility, but native to your Python project and working asynchronously, consider using the AsyncGetter class. A simple example of its usage looks like this:

import asyncio
from zabbix_utils import AsyncGetter

async def main():
    agent = AsyncGetter('10.8.54.32', 10050)
    resp = await agent.get('system.uname')
    print(resp.value) # Linux zabbix_server 5.15.0-3.60.5.1.el9uek.x86_64

asyncio.run(main())

Like AsyncSender, the AsyncGetter class supports specifying the source_ip address:

agent = AsyncGetter(
    host='zabbix.example.local',
    port=10050,
    source_ip='10.10.7.1'
)

More usage examples can be found here.

Conclusions

The new version of the zabbix_utils library provides users with the ability to implement efficient and scalable monitoring solutions, ensuring fast and reliable communication with the Zabbix components. The asynchronous mode of interaction offers a lot of room for performance improvements and flexible task management when handling a large volume of requests to Zabbix components such as the Zabbix API.

We have no doubt that the new version of zabbix_utils will become an indispensable tool for developers and administrators, helping them create more efficient, flexible, and reliable monitoring solutions that best meet their requirements and expectations.

Monitoring Oracle Cloud Infrastructure (OCI) with Zabbix https://blog.zabbix.com/monitoring-oracle-cloud-infrastructure-oci-with-zabbix/27638/ Tue, 23 Apr 2024

Monitoring Oracle Cloud Infrastructure resources is crucial for maintaining optimal performance, security, and cost-efficiency. By continuously monitoring resources such as compute instances, storage, databases, and networking components, you can proactively identify and address potential issues before they escalate, ensuring uninterrupted service delivery.

Monitoring also provides valuable insights into resource utilization patterns, enabling capacity planning and optimization efforts. Furthermore, effective monitoring enhances security posture by detecting anomalies or suspicious activities that could indicate potential security breaches.

This integration consists of multiple templates, where each template covers a specific OCI resource. At the time of writing, the following services are supported:

  • OCI Compute;
  • OCI Autonomous Database (serverless);
  • OCI Object Storage;
  • OCI Virtual Cloud Networks (VCNs);
  • OCI Block Volumes;
  • OCI Boot Volumes.

The structure of these templates is designed so that you do not need to configure each service separately. There is a master template, Oracle Cloud by HTTP, which needs to be configured; the rest of the services are monitored automatically using low-level discovery and host prototypes. Overall, the template structure looks like this:

Prepare OCI for monitoring with Zabbix

In all of these examples, the OCI web console is used; keep in mind that some of the UI elements or navigation menus can change in the future.

User creation

First of all, you need to set up a user in OCI that Zabbix will use to access the REST API and gather data. To do so, navigate to 'Identity & Security' -> 'Domains' from the main menu in the top-left of the screen. Select the domain whose resources you wish to monitor and head over to the Users view on the left sidebar. Here, create a new user.

API key generation

After this, OCI should automatically redirect you to the newly created user page. If it does not, go back to the list of users in the domain and open the previously created user.

You need to assign API keys to this user. To do so, on the left sidebar, press API keys and then Add API key. If you already have an RSA key pair that you wish to use for this monitoring user, you can select an option to either upload or paste the public key. However, in this example, I will generate a completely new key pair.

After pressing button Add, OCI will show you details about your environment and the API key. Save this information somewhere safe for now, as we will need to use this data later inside Zabbix. After saving this, you can now close this pop-up.

User group creation

To make user management more flexible, it is a good idea to create a user group specifically for monitoring and assign the monitoring user to it. That way, if you need more monitoring users in the future, you can simply assign them to a common monitoring user group, and all access rights will apply automatically.

To do this, navigate back to 'Identity & Security' -> 'Domains' and select Groups. Here, press the Create group button. Give the group a name and assign your monitoring user to it.

Policy creation

Now that you have successfully created a monitoring user, generated API keys, and created a user group for monitoring, you need to create a new OCI policy. Policies in OCI regulate which resources users can access and what they can do with those resources. To create a new policy, open the top-left menu, navigate to 'Identity & Security' -> 'Policies', and press the Create Policy button.

 

This will open a new view where you can define policy settings. Give this policy a name and description and select the appropriate compartment. Inside the policy builder, enable Show manual editor and copy-paste these rules, just remember to replace zabbix-mon with your monitoring group name:

Allow group 'zabbix-mon' to read metrics in tenancy
Allow group 'zabbix-mon' to read instances in tenancy
Allow group 'zabbix-mon' to read subnets in tenancy
Allow group 'zabbix-mon' to read vcns in tenancy
Allow group 'zabbix-mon' to read vnic-attachments in tenancy
Allow group 'zabbix-mon' to read volumes in tenancy
Allow group 'zabbix-mon' to read objectstorage-namespaces in tenancy
Allow group 'zabbix-mon' to read buckets in tenancy
Allow group 'zabbix-mon' to read autonomous-databases in tenancy

This set of rules will give your monitoring group users read-only access to only those resources that Zabbix Oracle Cloud Infrastructure templates use.

After you are done, press the Create button and the policy will be created.

If you decide to modify our template to access more data from OCI, you will need to append appropriate rules here as well.

That is it for the configuration on the OCI side.

Prepare Zabbix

Create host

First, open up your Zabbix web interface, head over to the Hosts section, and create a new host. Here, you will need to specify a host name of your choice, attach the Oracle Cloud by HTTP template, and assign the host to a group.

Before pressing the Add button, you should configure macros. Open the Macros tab and select Inherited and host macros. There will be quite a few of them, each with its own use case. Some of these macros are discussed in the Optional configuration section, so for now let’s focus on the ones that are necessary for the template to work at all.

Essentially, there are 3 steps to take here:

  1. This is the moment when the OCI configuration file values from the API key generation section come into play. You will need to set each macro to the value that OCI gave you:
    • {$OCI.API.TENANCY} – here set the tenancy OCID value;
    • {$OCI.API.USER} – here set the user OCID value;
    • {$OCI.API.FINGERPRINT} – here set the fingerprint value;
  2. Now you need to find where you saved the private key for your OCI monitoring user from the same API key generation section. Copy the contents of that private key file and paste them into the {$OCI.API.PRIVATE.KEY} macro.
  3. Once the authentication credentials are entered, you need to identify the correct OCI API endpoints that match your region (the one that the OCI configuration file gave you in the API key generation section). One way to do this is to go to the Oracle Cloud Infrastructure Documentation, where all the API services are listed on the left side. Each service has a dedicated home page listing the respective API endpoints. The API service endpoints required by Zabbix are:

    In my case the OCI region is eu-stockholm-1, therefore the above API service endpoints look like this:

    OCI API service            | URL                                          | Zabbix macro name
    Core Services API          | iaas.eu-stockholm-1.oraclecloud.com          | {$OCI.API.CORE.HOST}
    Database Service API       | database.eu-stockholm-1.oraclecloud.com      | {$OCI.API.AUTONOMOUS.DB.HOST}
    Object Storage Service API | objectstorage.eu-stockholm-1.oraclecloud.com | {$OCI.API.OBJECT.STORAGE.HOST}
    Monitoring API             | telemetry.eu-stockholm-1.oraclecloud.com     | {$OCI.API.TELEMETRY.HOST}

    Now that you have the necessary host names, you need to save them as values for their respective Zabbix macros.

     

    IMPORTANT! API Endpoint URLs need to be entered without the HTTP scheme (https://).

After these 3 steps, your Macros tab should look something like this.

After pressing the button Add, this host will be added to Zabbix.

This host will not have any metric items; instead, it will discover all the OCI resources and add them as separate hosts in Zabbix. This happens automatically, sometime between the moment the Zabbix server reloads its configuration cache and one hour after that (because the discovery rules have an update interval of 1 hour). However, if you want to speed this up, you can manually refresh the Zabbix server config cache with the CLI command zabbix_server -R config_cache_reload and execute all discovery rules manually.
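
If you prefer to trigger this from code rather than the UI, the "Execute now" action is also exposed through the Zabbix API as task.create. A minimal hedged sketch using the zabbix_utils library (the item ID is a placeholder you would look up first, e.g. via item.get):

from zabbix_utils import ZabbixAPI

api = ZabbixAPI(url="127.0.0.1", user="Admin", password="zabbix")

# Task type 6 corresponds to "Check now" for an item or LLD rule
api.task.create([{"type": 6, "request": {"itemid": "12345"}}])

api.logout()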

After that, in the Hosts view, you should see all of your monitored OCI resources.

Also, each of these hosts will have their own dashboards created automatically that will give a good overview of resource utilization.

Optional configuration

When creating the OCI host in Zabbix, there were a few macros that required modification of values. As you might have noticed, there were quite a few other macros that we skipped over. In the following sections, usage of these macros will be explained. These are optional macros, so depending on your requirements, you might not even need to change them.

Use macros for low-level discovery filtering

In official Zabbix templates, you might find macros that end with MATCHES and NOT_MATCHES. These are used in low-level discovery rules (LLDs) to help you filter resources that should or should not be discovered. The values are regular expressions, so you can use wildcard patterns for matching.

Usage of these macros can be found in the Filters tab under discovery rules.

Usually, the default value for MATCHES is .* and for NOT_MATCHES it is CHANGE_IF_NEEDED. This means that any value will be discovered as long as it is not equal to CHANGE_IF_NEEDED. For example, if such macros are filtering the compute instance name:

  • {"instance-name": "compute_instance_1"} will be discovered;
  • {"instance-name": "web-server"} will be discovered;
  • {"instance-name": "CHANGE_IF_NEEDED"} will not be discovered.

In OCI templates, such discovery macros are also used. One example is compute instance discovery where filters are checking the instance state:

  • Macro {$OCI.COMPUTE.DISCOVERY.STATE.MATCHES} has a value of .*;
  • Macro {$OCI.COMPUTE.DISCOVERY.STATE.NOT_MATCHES} has a value of TERMINATED.

This means that any instance in the TERMINATED state will not be discovered. When you terminate an instance in OCI, it is still kept around for a while, but there is no reason to monitor such an instance.

Now that you have an idea how these filters work, you can adjust them based on your requirements.

Low-level discovery resource filtering by free-form tags of OCI resources

To provide additional resource discovery filtering options, every discovery script (except VCN discovery) gathers free-form tag data about each resource. Since free-form tags are completely custom and their format and usage vary between users, free-form tag filters are not included under the LLD filters by default, but they can easily be added, as the tag data is already collected by the scripts.

To show an example, let’s assume that you want to categorize OCI compute instances by their usage utilizing free-form tags.

For one of the instances, add a new free-form tag, for example usage with the value web-server.

Repeat this for the other compute instances, setting the usage free-form tag values as needed.

Now switch over to Zabbix, open the Oracle Cloud by HTTP template, and go to Discovery rules. Find Compute instances discovery and open it.

Under the LLD macros tab, add a new macro that will represent this tag, for example, {#USAGE} with the JSONPath value $.tags.usage.

Under the Filters tab, there will already be filters for the compute instance name and state. Click Add to create a new filter, select the previously created LLD macro, and define a matching pattern and value, for example, {#USAGE} matches web-server.

The next time Compute instances discovery is executed, it will only discover OCI compute instances that have the free-form tag usage that matches the regex of web-server. You can also experiment with the LLD filter pattern matching value to receive different matching results for a specified value.

IMPORTANT! The free-form tag must be present on every OCI resource on which you use this method. In this example, you would need to set the usage free-form tag on every compute instance, or Zabbix LLD will stop working; if a tag does not exist when Zabbix LLD searches for it, an error is thrown. The free-form tag does not need to have a value, but make sure that the tag key is there.

HTTP proxy usage

If needed, you can specify an HTTP proxy for the template to use by changing the value of the {$OCI.HTTP.PROXY} user macro. Every request will use this proxy.

Custom OK HTTP response

If you are using a proxy, the returned OK HTTP response code could differ from 200. In that case, change the {$OCI.HTTP.RETURN.CODE.OK} user macro to the appropriate value. By default it is set to 200, so all request items will only accept data when the response code is 200.

That is about it. Good luck and have fun!

Introducing zabbix_utils – the official Python library for Zabbix API https://blog.zabbix.com/python-zabbix-utils/27056/ Thu, 01 Feb 2024

In this article, we will introduce you to the main capabilities of the newly released official zabbix_utils library and provide examples of how to use it with Zabbix components.

Zabbix is a flexible and universal monitoring solution that integrates with a wide variety of different systems right out of the box.

Despite actively expanding the list of natively supported systems for integration (via templates or webhook integrations), there may still be a need to integrate with custom systems and services that are not yet supported. In such cases, a library taking care of implementing interaction protocols with the Zabbix API, Zabbix server/proxy, or Agent/Agent2 becomes extremely useful. Given that Python is widely adopted among DevOps and SRE engineers as well as server administrators, we decided to release a library for this programming language first.

We are pleased to introduce zabbix_utils – a Python library for seamless interaction with Zabbix API, Zabbix server/proxy, and Zabbix Agent/Agent2. Of course, there are popular community solutions for working with these Zabbix components in Python. Keeping this fact in mind, we have tried to consolidate popular issues and cases along with our experience to develop as convenient a tool as possible. Furthermore, we made sure that transitioning to the tool is as straightforward and clear as possible. Thanks to official support, you can be confident that the current version of the library is compatible with the latest Zabbix release.

Usage Scenarios

The zabbix_utils library can be used in the following scenarios, but is not limited to them:

  • Zabbix automation
  • Integration with third-party systems
  • Custom monitoring solutions
  • Data export (hosts, templates, problems, etc.)
  • Integration into your Python application for Zabbix monitoring support
  • Anything else that comes to mind

You can use zabbix_utils for automating Zabbix tasks, such as scripting the automatic monitoring setup of your IT infrastructure objects. This can involve using ZabbixAPI for the direct management of Zabbix objects, Sender for sending values to hosts, and Getter for gathering data from Agents. We will discuss Sender and Getter in more detail later in this article.

For example, let’s imagine you have an infrastructure consisting of different branches. Each server or workstation is deployed from an image with an automatically configured Zabbix Agent and each branch is monitored by a Zabbix proxy since it has an isolated network. Your custom service or script can fetch a list of this equipment from your CMDB system, along with any additional information. It can then use this data to create hosts in Zabbix and link the necessary templates using ZabbixAPI based on the received information. If the information from CMDB is insufficient, you can request data directly from the configured Zabbix Agent using Getter and then use this information for further configuration and decision-making during setup. Another part of your script can access AD to get a list of branch users to update the list of users in Zabbix through the API and assign them the appropriate permissions and roles based on information from AD or CMDB (e.g., editing rights for server owners).
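
As a rough illustration of the first part of that workflow, the sketch below creates hosts from hypothetical inventory data; the records, group ID, and template ID are made-up placeholders, not a real CMDB format:

from zabbix_utils import ZabbixAPI

# Hypothetical records as they might come from a CMDB export
cmdb_hosts = [
    {"name": "branch1-server1", "ip": "192.168.10.5"},
    {"name": "branch1-server2", "ip": "192.168.10.6"}
]

api = ZabbixAPI(url="127.0.0.1", user="Admin", password="zabbix")

for record in cmdb_hosts:
    api.host.create(
        host=record["name"],
        groups=[{"groupid": "2"}],           # target host group ID (placeholder)
        templates=[{"templateid": "10001"}], # template to link (placeholder)
        interfaces=[{
            "type": 1,   # Zabbix agent interface
            "main": 1,
            "useip": 1,
            "ip": record["ip"],
            "dns": "",
            "port": "10050"
        }]
    )

api.logout()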

Another use case of the library may be when you regularly export templates from Zabbix for subsequent import into a version control system. You can also establish a mechanism for loading changes and rolling back to previous versions of templates. Here a variety of other use cases can also be implemented – it’s all up to your requirements and the creative usage of the library.
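
For instance, exporting a template for version control might look like the following sketch (the template ID and file name are placeholders; YAML export is available in recent Zabbix versions):

from zabbix_utils import ZabbixAPI

api = ZabbixAPI(url="127.0.0.1", user="Admin", password="zabbix")

# Export one template in YAML format for committing to a version control system
exported = api.configuration.export(
    options={"templates": ["10001"]},
    format="yaml"
)

with open("template_10001.yaml", "w") as f:
    f.write(exported)

api.logout()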

Of course, if you are a developer and there is a requirement to implement Zabbix monitoring support for your custom system or tool, you can implement sending data describing any events generated by your custom system/tool to Zabbix using Sender.
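
A minimal sketch of that idea, assuming a trapper item with the key myapp.event already exists on a host named myapp-host (both names are placeholders):

from zabbix_utils import Sender

# Report an event generated by a custom tool to a Zabbix trapper item
sender = Sender(server='127.0.0.1', port=10051)
response = sender.send_value('myapp-host', 'myapp.event', 'backup finished OK')
print(response)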

Installation and Configuration

To begin with, you need to install the zabbix_utils library. You can do this in two main ways:

  • By using pip:

~$ pip install zabbix_utils

  • By cloning from GitHub:

~$ git clone https://github.com/zabbix/python-zabbix-utils
~$ cd python-zabbix-utils/
~$ python setup.py install

No additional configuration is required. But you can specify values for the following environment variables: ZABBIX_URL, ZABBIX_TOKEN, ZABBIX_USER, ZABBIX_PASSWORD if you need. These use cases are described in more detail below.

Working with Zabbix API

To work with Zabbix API, it is necessary to import the ZabbixAPI class from the zabbix_utils library:

from zabbix_utils import ZabbixAPI

If you are using one of the existing popular community libraries, in most cases, it will be sufficient to simply replace the ZabbixAPI import statement with an import from our library.

At that point you need to create an instance of the ZabbixAPI class. There are several usage scenarios:

  • Use preset values of environment variables, i.e., not pass any parameters to ZabbixAPI:

~$ export ZABBIX_URL="https://zabbix.example.local"
~$ export ZABBIX_USER="Admin"
~$ export ZABBIX_PASSWORD="zabbix"

from zabbix_utils import ZabbixAPI


api = ZabbixAPI()

  • Pass only the Zabbix API address as input, which can be specified as either the server IP/FQDN address or DNS name (in this case, the HTTP protocol will be used) or as a URL, and the authentication data should still be specified as values for environment variables:

~$ export ZABBIX_USER="Admin"
~$ export ZABBIX_PASSWORD="zabbix"

from zabbix_utils import ZabbixAPI

api = ZabbixAPI(url="127.0.0.1")

  • Pass only the Zabbix API address to ZabbixAPI, as in the example above, and pass the authentication data later using the login() method:

from zabbix_utils import ZabbixAPI

api = ZabbixAPI(url="127.0.0.1")
api.login(user="Admin", password="zabbix")

  • Pass all parameters at once when creating an instance of ZabbixAPI; in this case, there is no need to subsequently call login():

from zabbix_utils import ZabbixAPI

api = ZabbixAPI(
    url="127.0.0.1",
    user="Admin",
    password="zabbix"
)

The ZabbixAPI class supports working with various Zabbix versions, automatically checking the API version during initialization. You can also work with the Zabbix API version as an object as follows:

from zabbix_utils import ZabbixAPI

api = ZabbixAPI()

# ZabbixAPI version field
ver = api.version
print(type(ver).__name__, ver) # APIVersion 6.0.24

# Method to get ZabbixAPI version
ver = api.api_version()
print(type(ver).__name__, ver) # APIVersion 6.0.24

# Additional methods
print(ver.major)    # 6.0
print(ver.minor)    # 24
print(ver.is_lts()) # True

As a result, you will get an APIVersion object that has major and minor fields returning the respective major and minor parts of the current version, as well as the is_lts() method, returning true if the current version is LTS (Long Term Support), and false otherwise. The APIVersion object can also be compared to a version represented as a string or a float number:

# Version comparison
print(ver < 6.4)      # True
print(ver != 6.0)     # False
print(ver != "6.0.5") # True

If the account and password (or starting from Zabbix 5.4 – token instead of login/password) are not set as environment variable values or during the initialization of ZabbixAPI, then it is necessary to call the login() method for authentication:

from zabbix_utils import ZabbixAPI

api = ZabbixAPI(url="127.0.0.1")
api.login(token="xxxxxxxx")

After authentication, you can make any API requests described for all supported versions in the Zabbix documentation.

The format for calling API methods looks like this:

api_instance.zabbix_object.method(parameters)

For example:

api.host.get()

After completing all the necessary API requests, it’s necessary to execute logout() if authentication was done using login and password:

api.logout()

More examples of usage can be found here.

Sending Values to Zabbix Server/Proxy

There is often a need to send values to Zabbix Trapper. For this purpose, the zabbix_sender utility is provided. However, if your service or script sending this data is written in Python, calling an external utility may not be very convenient. Therefore, we have developed the Sender, which will help you send values to Zabbix server or proxy one by one or in groups. To work with Sender, you need to import it as follows:

from zabbix_utils import Sender

After that, you can send a single value:

from zabbix_utils import Sender

sender = Sender(server='127.0.0.1', port=10051)
resp = sender.send_value('example_host', 'example.key', 50, 1702511920)

Alternatively, you can put them into a group for simultaneous sending, for which you need to additionally import ItemValue:

from zabbix_utils import ItemValue, Sender


items = [
    ItemValue('host1', 'item.key1', 10),
    ItemValue('host1', 'item.key2', 'Test value'),
    ItemValue('host2', 'item.key1', -1, 1702511920),
    ItemValue('host3', 'item.key1', '{"msg":"Test value"}'),
    ItemValue('host2', 'item.key1', 0, 1702511920, 100)
]

sender = Sender('127.0.0.1', 10051)
response = sender.send(items)

For cases when there is a necessity to send more values than Zabbix Trapper can accept at one time, there is an option for fragmented sending, i.e. sequential sending in separate fragments (chunks). By default, the chunk size is set to 250 values. In other words, when sending values in bulk, the 400 values passed to the send() method for sending will be sent in two stages. 250 values will be sent first, and the remaining 150 values will be sent after receiving a response. The chunk size can be changed, to do this, you simply need to specify your value for the chunk_size parameter when initializing Sender:

from zabbix_utils import ItemValue, Sender


items = [
    ItemValue('host1', 'item.key1', 10),
    ItemValue('host1', 'item.key2', 'Test value'),
    ItemValue('host2', 'item.key1', -1, 1702511920),
    ItemValue('host3', 'item.key1', '{"msg":"Test value"}'),
    ItemValue('host2', 'item.key1', 0, 1702511920, 100)
]

sender = Sender('127.0.0.1', 10051, chunk_size=2)
response = sender.send(items)

In the example above, the chunk size is set to 2. So, 5 values passed will be sent in three requests of two, two, and one value, respectively.

If your server has multiple network interfaces, and values need to be sent from a specific one, the Sender provides the option to specify a source_ip for the sent values:

from zabbix_utils import Sender

sender = Sender(
    server='zabbix.example.local',
    port=10051,
    source_ip='10.10.7.1'
)
resp = sender.send_value('example_host', 'example.key', 50, 1702511920)

It also supports reading connection parameters from the Zabbix Agent/Agent2 configuration file. To do this, set the use_config flag, after which it is not necessary to pass connection parameters when creating an instance of Sender:

from zabbix_utils import Sender

sender = Sender(
    use_config=True,
    config_path='/etc/zabbix/zabbix_agent2.conf'
)
response = sender.send_value('example_host', 'example.key', 50, 1702511920)

Since the Zabbix Agent/Agent2 configuration file can specify one or even several Zabbix clusters consisting of multiple Zabbix server instances, Sender will send data to the first available server of each cluster specified in the ServerActive parameter in the configuration file. In case the ServerActive parameter is not specified in the Zabbix Agent/Agent2 configuration file, the server address from the Server parameter with the standard Zabbix Trapper port – 10051 will be taken.

By default, Sender returns the aggregated result of sending across all clusters. But it is possible to get more detailed information about the results of sending for each chunk and each cluster:

print(response)
# {"processed": 2, "failed": 0, "total": 2, "time": "0.000108", "chunk": 2}

if response.failed == 0:
    print(f"Value sent successfully in {response.time}")
else:
    print(response.details)
    # {
    #     127.0.0.1:10051: [
    #         {
    #             "processed": 1,
    #             "failed": 0,
    #             "total": 1,
    #             "time": "0.000051",
    #             "chunk": 1
    #         }
    #     ],
    #     zabbix.example.local:10051: [
    #         {
    #             "processed": 1,
    #             "failed": 0,
    #             "total": 1,
    #             "time": "0.000057",
    #             "chunk": 1
    #         }
    #     ]
    # }
    for node, chunks in response.details.items():
        for resp in chunks:
            print(f"processed {resp.processed} of {resp.total} at {node.address}:{node.port}")
            # processed 1 of 1 at 127.0.0.1:10051
            # processed 1 of 1 at zabbix.example.local:10051

More usage examples can be found here.

Getting values from Zabbix Agent/Agent2 by item key

Sometimes it can also be useful to directly retrieve values from the Zabbix Agent. To assist with this task, zabbix_utils provides the Getter. It performs the same function as the zabbix_get utility, allowing you to work natively within Python code. Getter is straightforward to use; just import it, create an instance by passing the Zabbix Agent’s address and port, and then call the get() method, providing the data item key for the value you want to retrieve:

from zabbix_utils import Getter

agent = Getter('10.8.54.32', 10050)
resp = agent.get('system.uname')

In cases where your server has multiple network interfaces, and requests need to be sent from a specific one, you can specify the source_ip for the Agent connection:

from zabbix_utils import Getter

agent = Getter(
    host='zabbix.example.local',
    port=10050,
    source_ip='10.10.7.1'
)
resp = agent.get('system.uname')

The response from the Zabbix Agent will be processed by the library and returned as an object of the AgentResponse class:

print(resp)
# {
#     "error": null,
#     "raw": "Linux zabbix_server 5.15.0-3.60.5.1.el9uek.x86_64",
#     "value": "Linux zabbix_server 5.15.0-3.60.5.1.el9uek.x86_64"
# }

print(resp.error)
# None

print(resp.value)
# Linux zabbix_server 5.15.0-3.60.5.1.el9uek.x86_64

More usage examples can be found here.

Conclusions

The zabbix_utils library for Python allows you to take full advantage of monitoring using Zabbix, without limiting yourself to the integrations available out of the box. It can be valuable for both DevOps and SRE engineers, as well as Python developers looking to implement monitoring support for their system using Zabbix.

In the next article, we will thoroughly explore integration with an external service using this library to demonstrate the capabilities of zabbix_utils more comprehensively.

Questions

Q: Which Agent versions are supported for Getter?

A: Supported versions of Zabbix Agents are the same as Zabbix API versions, as specified in the readme file. Our goal is to create a library with full support for all Zabbix components of the same version.

Q: Does Getter support Agent encryption?

A: Encryption support is not yet built into Sender and Getter, but you can create your wrapper using third-party libraries for both.

from zabbix_utils import Sender

def psk_wrapper(sock, tls):
    # ...
    # Implementation of TLS PSK wrapper for the socket
    # ...
    pass

sender = Sender(
    server='zabbix.example.local',
    port=10051,
    socket_wrapper=psk_wrapper
)

More examples can be found here.

Q: Is it possible to set a timeout value for Getter?

A: The response timeout value can be set for the Getter, as well as for ZabbixAPI and Sender. In all cases, the timeout is set for waiting for any responses to requests.

# Example of setting a timeout for Sender
sender = Sender(server='127.0.0.1', port=10051, timeout=30)

# Example of setting a timeout for Getter
agent = Getter(host='127.0.0.1', port=10050, timeout=30)

Q: Is parallel (asynchronous) mode supported?

A: Currently, the library does not include asynchronous classes and methods, but we plan to develop asynchronous versions of ZabbixAPI and Sender.

Q: Is it possible to specify multiple servers when sending through Sender without specifying a configuration file (for working with an HA cluster)?

A: Yes, it is possible, in the following way:

from zabbix_utils import Sender


zabbix_clusters = [
    [
        'zabbix.cluster1.node1',
        'zabbix.cluster1.node2:10051'
    ],
    [
        'zabbix.cluster2.node1:10051',
        'zabbix.cluster2.node2:20051',
        'zabbix.cluster2.node3'
    ]
]

sender = Sender(clusters=zabbix_clusters)
response = sender.send_value('example_host', 'example.key', 10, 1702511922)

print(response)
# {"processed": 2, "failed": 0, "total": 2, "time": "0.000103", "chunk": 2}

print(response.details)
# {
#     "zabbix.cluster1.node1:10051": [
#         {
#             "processed": 1,
#             "failed": 0,
#             "total": 1,
#             "time": "0.000050",
#             "chunk": 1
#         }
#     ],
#     "zabbix.cluster2.node2:20051": [
#         {
#             "processed": 1,
#             "failed": 0,
#             "total": 1,
#             "time": "0.000053",
#             "chunk": 1
#         }
#     ]
# }

Monitoring AWS Cost Explorer with Zabbix https://blog.zabbix.com/monitoring-aws-cost-explorer-with-zabbix/26159/ Tue, 12 Dec 2023

Cloud-based service platforms are becoming increasingly popular, and one of the most widely adopted is Amazon Web Services (AWS). Like many cloud services, AWS charges usage fees, which has led many users to look for a breakdown of which specific services they are being charged for. Fortunately, Zabbix has an AWS Cost Explorer by HTTP template that is ready to run right out of the box and provides a breakdown of daily and monthly costs.

Why monitor AWS costs?

While AWS stores cost data for 12 months, Zabbix allows data to be stored for up to 25 years (see Keep lost resources period). The Keep lost resources period is a vital parameter for storing data longer than 12 months, since cost data removed from AWS will result in the discovered items becoming lost. Therefore, if we want to keep our cost data for longer than 12 months, the Keep lost resources period parameter needs to be adjusted accordingly.

In addition, Zabbix can show fees charged for unavailable services, such as test deployments for a cluster in the us-east-1 region.

Preparing to monitor in a few easy steps

I recommend visiting zabbix.com/integrations/aws for any sources referred to in this tutorial. You can also find a link to all Zabbix templates there. For the most part, we will follow the steps outlined in the readme.

The AWS Cost Explorer by HTTP template can use key-based or role-based authorization. Set the {$AWS.AUTH_TYPE} macro accordingly; possible values are role_base and access_key (used by default).

If you are using access key-based authorization, be sure to set the {$AWS.ACCESS.KEY.ID} and {$AWS.SECRET.ACCESS.KEY} macros.

Create or use an existing access key, which you can get from Identity and Access Management (IAM).

Accessing the IAM Console:
  • Log in to your AWS Management Console.
  • Navigate to the IAM service.
  • Next, go to the Users tab and select the required user.

Creating an access key for monitoring:
  • After that, go to the Security credentials tab.
  • Select Create access key.

Add the following required permissions to your Zabbix IAM policy in order to collect metrics.

Defining Permissions through IAM Policies:
  • Access the “Policies” section within IAM.
  • Click on “Create Policy”.
  • Select the JSON tab to define policy permissions.
  • Provide a meaningful name and description for the policy.
  • Structure the policy document based on the permissions needed for the AWS Cost Explorer by HTTP template.
{ "Version": "2012-10-17", "Statement": [ { "Action": [ "ce:GetDimensionValues", "ce:GetCostAndUsage" ], "Effect": "Allow", "Resource": "*" } ] }

Attaching Policies to the User:
  • Go back to the “Users” section within IAM.
  • Click on “Add Permissions”.
  • Search for and select the policy created in the previous step.
  • Review the attached policies to ensure they align with the intended permissions for the user.

Creating a host in Zabbix

Now, let’s create a host that will represent the metrics available via the Cost Explorer API:

  • Create a Host Group in which to put hosts related to AWS. For this example, let’s create one that we’ll call AWS Cloud.
  • Head to the host page under Configuration and click Create host. Give this host the name AWS Cost. We’ll also assign this host to the AWS Cloud group we created and attach the AWS Cost Explorer by HTTP template.
  • Click the Macros tab and select Inherited and host macros. In this case, we need to change the first two macros. The first, {$AWS.ACCESS.KEY.ID}, should be set to the received access key ID. For the second, {$AWS.SECRET.ACCESS.KEY}, the secret access key should be set to the previously retrieved value from the Security credentials tab.
  • Click Add. The AWS Cost Explorer template has three low-level discovery rules that use master items. The low-level discovery rules will start discovering resources only after the master item has collected the required data.

    The best practice is to always test such items for data. Don’t forget to fill in the required macros!

  In the AWS daily costs by services and AWS monthly costs by services discovery rules, you can filter by service, which can be specified in macros.
  • Let’s execute the master items to collect the required data on-demand. Choose both items to get data and click Execute now.

    In a few minutes, you should receive cost metrics by services for 12 months plus the current month, as well as by day. If you want the information to be stored longer, remember to change the Keep lost resources period in the LLD rule, as it’s set to 30 days by default.

Good luck!

Maintaining Zabbix API token via JavaScript https://blog.zabbix.com/maintaining-zabbix-api-token-via-javascript/15561/ Tue, 14 Sep 2021

In this blog post, we will talk about maintaining and storing the Zabbix API session key in an automated fashion. The blog post builds upon the Close problem automatically via Zabbix API subject and can be used as extra configuration for this particular use-case. The blog post also shares a great example of synthetic monitoring by way of JavaScript preprocessing – how to emulate a scenario in an automated fashion and get alerted in case of any problems.

Prerequisites

First, let us create the Zabbix API user and user macros where we will store our username, password, Zabbix URL and the API session key.

1) Open “Administration” => “Users”. Create a new user ‘api’ with password ‘zabbix’. At the permissions tab set User Type “Zabbix Super Admin”.

2) Go to “Administration” => “General” => “Macros”. Configure base characteristics:

      {$Z_API_PHP} = http://demo.zabbix.demo/api_jsonrpc.php
     {$Z_API_USER} = api
 {$Z_API_PASSWORD} = zabbix
{$Z_API_SESSIONID} =

It’s OK to leave {$Z_API_SESSIONID} empty for now.

3) Let’s check if the Zabbix backend server can reach the Zabbix frontend server. Make sure that you are logged into the Zabbix backend server by looking up the zabbix_server process:

ps auxww | grep "[z]abbix_server.*conf"

Ensure that we can reach the Zabbix frontend by curling the Zabbix frontend server from the Zabbix backend server:

curl -s "http://demo.zabbix.demo/index.php" | grep Zabbix

4) Download the template “Check and repair Zabbix API token” and import it into your instance.

5) Create a new host with an Agent interface and link the template. The IP address of the host does not matter; the template does the monitoring with an agentless check, using an “HTTP agent” item.

How it works

Our goal for today is to figure out a way to keep the Zabbix API authentication token up to date in a user macro. This way we can reuse the macro repeatedly for items, action operations and scripts that require for us to use the Zabbix API. We need to ensure that even if the token changes, the macro gets automatically updated with the new token value! Let’s try and understand each step of the underlying workflow required for us to achieve this goal.

The first component of our workflow is the “Validate session key raw” item. This is an HTTP agent item that performs a POST request with an arbitrary method – proxy.get in this case, but we could have used ANY other method. We simply want to check if an arbitrary Zabbix API call can be executed with the current {$Z_API_SESSIONID} macro value.
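
To make this step concrete, here is a rough Python equivalent of what such an HTTP agent item does (the URL matches the {$Z_API_PHP} macro from the prerequisites; the session key is a placeholder):

import json
import urllib.request

# Rough equivalent of the "Validate session key raw" item: call an arbitrary
# API method with the stored session key and inspect the reply
url = "http://demo.zabbix.demo/api_jsonrpc.php"
payload = {
    "jsonrpc": "2.0",
    "method": "proxy.get",
    "params": {"output": ["proxyid"]},
    "auth": "PUT-SESSION-KEY-HERE",  # value of {$Z_API_SESSIONID}
    "id": 1
}

req = urllib.request.Request(
    url,
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"}
)
print(urllib.request.urlopen(req).read().decode())
# A "Session terminated" error in the reply means the key must be renewed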

The second part of the workflow is the “Repair session key” dependent item. This item utilizes the JavaScript preprocessing step with custom JavaScript code to check the values obtained by the previous item and generate a new authentication token if that is necessary.

The third item – “Status”, is another dependent item that uses regular expression preprocessing steps to check for different error messages or status codes in the value of the “Validate session key raw” item. Most of the triggers defined in this template will react to the values obtained by this item.

 

Below you can see the full underlying workflow:

Code-wise, the magic is implemented with the following JavaScript code snippet:

if (value.match(/Session terminated/)) {

    var req = new CurlHttpRequest();

    // Zabbix API endpoint, taken from the {$Z_API_PHP} user macro
    var json_rpc = '{$Z_API_PHP}';

    // libcurl header
    req.AddHeader('Content-Type: application/json');

    // First request: obtain a new authentication token
    var token = JSON.parse(req.Post(json_rpc,
        '{"jsonrpc":"2.0","method":"user.login","params":{"user":"{$Z_API_USER}","password":"{$Z_API_PASSWORD}"},"id":1,"auth":null}'
    ));

    // If authentication was unsuccessful
    if (token.error) {
        // Login name or password is incorrect
        return 32500;
    }

    else {
        // Update the macro

        // Get the global macro ID.
        // We cannot put a native Zabbix macro here because it would be automatically expanded;
        // a workaround with '+' concatenation distinguishes the dollar sign from the actual macro name
        var id = JSON.parse(req.Post(json_rpc,
            '{"jsonrpc":"2.0","method":"usermacro.get","params":{"output":["globalmacroid"],"globalmacro":true,"filter":{"macro":"{$'+'Z_API_SESSIONID'+'}"}},"auth":"'+token.result+'","id":1}'
        )).result[0].globalmacroid;

        // Overwrite the macro value with the freshly obtained token
        var overwrite = JSON.parse(req.Post(json_rpc,
            '{"jsonrpc":"2.0","method":"usermacro.updateglobal","params":{"globalmacroid":"'+id+'","value":"'+token.result+'"},"auth":"'+token.result+'","id":1}'
        ));

        // Return the ID (an integer) of the macro which was updated
        return overwrite.result.globalmacroids[0];
    }

} else {
    return 0;
}

Throughout the JavaScript code, we are extracting a value from one step and using that as an input for the next step.

After the data has been collected, the values are analyzed by 4 triggers. 3 out of these 4 triggers report a misconfiguration problem that requires human investigation. The remaining trigger – “Repairing Zabbix API session key in background” – is the main one: it fires when the token has expired and is being repaired automatically.

And that’s it – now any integration that requires a Zabbix API authentication token can receive the current token value by referencing the user macro that we created in this article. We’ve ensured that the token is always going to stay up to date, and in case of any issues, we will receive an alert from Zabbix!
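
As a final sketch, this is what such an integration might look like – for example, a snippet executed in a context where Zabbix expands the macros before execution (the method and parameters here are purely illustrative):

curl -s -X POST -H 'Content-Type: application/json' \
  -d '{"jsonrpc":"2.0","method":"host.get","params":{"output":["host"]},"auth":"{$Z_API_SESSIONID}","id":1}' \
  '{$Z_API_PHP}'

Note that the macros are expanded by Zabbix, not by the shell, so this only works in places where user macros are supported, such as script actions or HTTP agent items.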

How to deploy Zabbix on PostgreSQL with Timescale DB plugin
https://blog.zabbix.com/how-to-deploy-zabbix-on-postgresql-with-timescale-db-plugin/13668/
Tue, 23 Mar 2021


PostgreSQL is one of the supported database engines that Zabbix uses to store all configuration data and history. The popularity of Postgres makes it a very sought-after database engine for Zabbix. TimescaleDB is a great extension to Postgres that empowers Zabbix with native partitioning functionality and data compression, which saves a lot of disk space for our users.

This post and the video are here to help you get Zabbix deployed on Postgres with TimescaleDB.

Contents

I. Why Zabbix? (1:17)
II. Architecture (4:06)
III. Getting started with PostgreSQL (9:00)
IV. TimescaleDB (14:44)
V. Conclusion (21:52)
VI. Questions & Answers (22:42)

Why Zabbix?

The first question is why Zabbix? Or, in other words, what is Zabbix? There are many monitoring solutions out there, but there are quite a few great and important benefits to Zabbix that make it stand out.

  • Without diving too deep into details, Zabbix is a true open-source monitoring solution.
  • Zabbix is absolutely enterprise-ready (scalable, with remote monitoring and a granular permission system, which was improved in the recent Zabbix 5.2 release). So you don’t have to worry whether Zabbix is able to monitor 1,000, 50,000, 500,000, or 1,000,000 hosts, as Zabbix can grow with your enterprise.
  • You don’t have to think of Zabbix as large-scale software only meant for extremely big companies and workloads – absolutely not. If you have a small home project or a small office with a few desktops, sensors, network devices, printers, or whatever else, you can easily monitor them with Zabbix. And you don’t have to spend a lot of resources on hardware on top of that – Zabbix can run even on a simple Raspberry Pi device!
  • Another important factor is that Zabbix is 100% free software. There are no paid add-ons or per-host payments, and you don’t have to pay for any metrics or any hosts that you wish to monitor. At any time, you can go to zabbix.com, click the Download button, get the product, and start using it with all the latest functionality, including all the native PostgreSQL monitoring templates.
  • Zabbix offers a continuously growing list of out-of-the-box templates.
  • One of the benefits of Zabbix is flexibility, which is appreciated by those who wish to gather data from complex and custom solutions. If there is something that cannot be monitored out of the box, you can easily extend Zabbix functionality and start collecting new data – see the small sketch after this list.
  • On top of that, Zabbix can easily adapt to your needs. We know that not everybody wants to actually spend extra time creating custom templates, writing plugins or models, etc. So the Zabbix team is constantly working on releasing new official out-of-the-box templates for databases, network devices, services, applications, and many other solutions.
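
As a minimal sketch of that flexibility (the item key and command here are made up for illustration), a custom metric can be exposed with a single UserParameter line in the agent configuration:

# /etc/zabbix/zabbix_agentd.conf - report the number of logged-in users as a custom item key
UserParameter=custom.users.count,who | wc -l

After an agent restart, custom.users.count can be used just like any built-in item key.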

Architecture

The basic Zabbix architecture lets you collect and monitor any metrics — hosts, servers, databases, applications, cloud services like AWS and Azure, IoT infrastructure, or whatever else.

With Zabbix, we can do much more than just collect the required metrics. The Zabbix architecture is built around the core server, which we call the Zabbix system. We don’t just collect the data and react to it based on our triggers – Zabbix is also responsible for data visualization, sends notifications, and lets users create their own custom dashboards and flexible graphs. A native REST API significantly helps create customizations and integrations with any third-party software.

If we go a bit deeper into what Zabbix is, the core Zabbix software consists of three components:

  • Zabbix Frontend that we normally open in our web browser and which is used as a centralized visualization and configuration platform.

Since Zabbix is enterprise-ready and scalable, you can monitor remote branches. It doesn’t matter if you have an office in the US, somewhere in Europe, or in Japan: you can see all the data and all of the graphs in one single centralized frontend. The same applies to configuration: if you’re sitting in Riga, Latvia, and you want to add some new metrics to be monitored in Japan, you can just open your frontend, add a new host, add an item, and start monitoring.

  • The next component — Zabbix Server — is the brain of the entire system. It is responsible for gathering data from the hosts and proxies and analyzing it based on the triggers — the problem thresholds that we have previously created in our frontend. It detects problems arising in our environment or data centers right now, or can, for instance, use the native predictive functions and warn us 10 days in advance.

 

  • The final component — Zabbix Database — is the component that we will be focusing on today. At the end of the day, it’s basically the heart of Zabbix, since normally Zabbix Server and Zabbix Frontend cannot function without a properly configured database, which should be healthy in terms of performance. This is where Postgres comes into play.

Database

What do we have in the database? What is it used for?

  • The database is the storage of entity configurations. Whatever we configure in our frontend, be it items, hosts, proxies, triggers, actions, or whatever else, everything is stored in this single database.
  • We collect the data from our hosts and store this history data based on our settings for the period we need. This means that in situations where Zabbix is used to store a lot of data we might have a very large database. Depending on the size of the monitored environment, in more extreme cases we might be talking about a couple of terabytes of data.
  • The Zabbix Server communicates with the database about the stored data, writing new history data, getting configuration data to populate our configuration cache, etc.
  • On the other side, we have the Zabbix Frontend, where we might have 10, 15, 20, or 50 users working online at the same time, watching the dashboards, looking at the graphs, and adding configuration parameters.

In conclusion, the database has to be very stable and healthy in terms of performance to ensure the optimal performance of all of the Zabbix components.

Supported databases

Zabbix supports several databases, including:

  • PostgreSQL (9.2.24 or later),
  • MySQL (5.5.62 – 8.0.x) and MySQL forks,
  • Oracle DB (11.2 or later), and
  • TimescaleDB, which is an extension for Postgres.

Getting started with PostgreSQL

If you want to use Zabbix, or you are already using Zabbix and just want to deploy another Zabbix instance for your development environment, and you decide to use Postgres as your backend database, it’s very straightforward. When it comes to deploying Zabbix the difference between Postgres and MySQL is very minor – most of the steps can be duplicated for both of these database backends.

1. All we have to do is open the zabbix.com webpage, click the big green Download button and choose our platform: Zabbix version, OS distribution, OS version, database, and the web server, for instance, NGINX or Apache.

2. Then we need to follow full installation instructions to install and configure the Zabbix server for our platform.

Long story short, we have to install the packages for the core Zabbix components — Zabbix Server, Zabbix Frontend (the Zabbix Web packages), and Zabbix Agent — and adjust the configuration for the web frontend. The most important part here is the pgsql suffix, since we cannot download zabbix-server-mysql and use it with the PostgreSQL database.

Installing Zabbix server, frontend, and agent

Post-installation

  • After we have downloaded Zabbix, we need to create a Database (just an empty database without any additions at this step), and a user for our Zabbix instance.
# sudo -u postgres createuser --pwprompt zabbix
# sudo -u postgres createdb -O zabbix zabbix

Here it comes down to two options. For beginners, you can create just one single user that we called ‘zabbix‘ here, and use it for the Frontend and for the Server. If you want to have an installation more aligned with best practices, then you should create one user for the Zabbix Server and then a separate user for the Zabbix Frontend.
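
A minimal sketch of the second, best-practice option, assuming hypothetical user names zabbix_srv and zabbix_web (the exact privileges you grant should follow your own security policy):

# one user that owns the database for the Zabbix server, a second one for the frontend
sudo -u postgres createuser --pwprompt zabbix_srv
sudo -u postgres createuser --pwprompt zabbix_web
sudo -u postgres createdb -O zabbix_srv zabbix

# after the schema has been imported, give the frontend user access to the tables, e.g.:
echo "GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO zabbix_web;" | sudo -u postgres psql zabbix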

  • At this point, we have an empty database without any tables, but we need to import the default schema and the default data from our Zabbix instance.
# zcat /usr/share/doc/zabbix-server-pgsql*/create.sql.gz | sudo -u zabbix psql zabbix

When we install the Zabbix Server component — zabbix-server-pgsql — we can find the create.sql.gz file in the /usr/share/doc directory, and we can import it into our newly created Zabbix database with a pipe command. This will import all the tables necessary for Zabbix, including the default data on the Admin user, the Zabbix server host, and all of the templates.

  • The last part will be the configuration of our Zabbix Server, which requires opening the Zabbix Server configuration file — zabbix_server.conf — and specifying the DB name, DB user, and DB password. Very simple.
DBName=zabbix
DBUser=zabbix
DBPassword=password

Frontend configuration

If we have installed all the previously mentioned packages, we can just open a web browser and enter the IP address of the web frontend (virtual machine or physical machine), and we will be redirected to the first-time configuration wizard.

Simple configuration wizard

Here, the database type will be PostgreSQL, since we downloaded the Zabbix web PostgreSQL package. Then we just have to fill in a few simple parameters, such as the Database host: if our PostgreSQL database is on the same machine as the Zabbix Frontend, it will be localhost; for a remote machine, we will have to fill in the IP address or DNS name. We also need to specify the Database name, User name, and Password, and we’re good to go.

Native PostgreSQL encryption

If we want to be super secure and don’t want the communication between the Zabbix Frontend and the database, or between the Zabbix Server and the database, to go unencrypted, we can also enable native encryption between the Zabbix core components and our Postgres database.

Simple configuration wizard

To enable native encryption, we have to specify the location of our CA file, TLS key file, certificate file, and choose whether we want to do host verification in the simple configuration wizard on the first-time login page. But that’s not all.

Prerequisites

  • When we install the PostgreSQL database, it’s not yet ready to provide encryption out of the box. So we have to fill in a couple of parameters in the PostgreSQL configuration file — postgresql.conf:
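
A minimal sketch of the relevant postgresql.conf lines might look like this (the file paths are assumptions to adapt to your environment, and ssl_min_protocol_version is only available from PostgreSQL 12):

ssl = on
ssl_ca_file = '/var/lib/pgsql/ssl/root.crt'
ssl_cert_file = '/var/lib/pgsql/ssl/server.crt'
ssl_key_file = '/var/lib/pgsql/ssl/server.key'
ssl_min_protocol_version = 'TLSv1.2'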

The configuration might also differ depending on our security requirements. For instance, the lowest TLS version supported by our security policy might be TLS 1.3, or we might allow a wider list of ciphers for this encryption. In the end, it’s up to the user.

  • Then we have to prepare the pg_hba.conf file, where we must define that we will be using verify-ca or verify-full for the client certificate. That will enable encrypted communication between Postgres and the Zabbix core components thanks to the native Postgres encryption configuration.
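
A sketch of what such a pg_hba.conf entry might look like (the address range is a placeholder; on PostgreSQL 11 and older, clientcert=1 is used instead of the verify-ca/verify-full keywords):

# TYPE    DATABASE   USER     ADDRESS         METHOD
hostssl   zabbix     zabbix   192.0.2.0/24    md5 clientcert=verify-full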

TimescaleDB

Now, what is the Timescale database? Timescale is still a relatively new thing.

  • Timescale is not really a standalone database engine. It is an extension of the existing PostgreSQL database, which is absolutely free, so we don’t have to pay any money to use it.
  • Timescale requires PostgreSQL 11 or 12. If you are planning to use Timescale on the latest version, PostgreSQL 13, you have to check if it is supported.
  • The idea behind Timescale is working with time-series data. Right now, Zabbix has around 170 tables with five of them for history data and two for trends, where we can actually use the benefits of Timescale.
  • The greatest benefit in terms of Zabbix is that TimescaleDB offers native partitioning in Zabbix and lets us compress the heaviest tables.

Native partitioning

What is the greatest benefit of native partitioning? Normally, history tables are not partitioned. Zabbix ordinarily uses an internal single-threaded process — Housekeeper — which is responsible for deleting data older than what we specify in the Frontend. For instance, if we specify that we need to keep history for 90 days and trends for a year, the Housekeeper will delete data older than that.

The problem is that if we have an enterprise-scale instance of Zabbix, we could have terabytes of data, most of which will be taken up by our history tables. Housekeeper just runs delete statements on history and trends data, and since those tables are really big and every delete has to check the clock value of the historical rows, this takes a lot of time. As you can see, the history tables are the busiest tables in the Zabbix database, and we definitely don’t want to lock them with a Housekeeper process running for 20-30 minutes or even more than one hour just to delete the data that we don’t want to store.

So, the greatest benefit of partitioning is that you partition history tables based on days and then you just drop a whole partition that you don’t want to keep anymore. This is a lot faster and has a much smaller performance impact than deleting the data via native housekeeping functionality.

TimescaleDB is not the only solution. But all other solutions are based on third-party scripts or some internal procedures. That’s additional administration overhead, and you need to make sure that the script will be executed and all the dependencies for the script will be working. With TimescaleDB, we can have this as a native functionality without any third-party scripts and any dependencies.
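
For illustration, dropping old partitions manually boils down to a single call – note that the exact drop_chunks() signature differs between TimescaleDB 1.x and 2.x, and with Zabbix you normally never run this yourself, because the Housekeeper does it for you:

# TimescaleDB 2.x syntax: drop all 'history' chunks older than 90 days
echo "SELECT drop_chunks('history', INTERVAL '90 days');" | sudo -u postgres psql zabbix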

Native compression

  • We have to remember the terabytes of stored data and the performance impact it can have. To have good performance, we need to have good storage. In enterprise monitoring, we will most likely have all the data stored in disk arrays. So we can just calculate how much money will be spent on disks to support a multi-terabyte database and, for instance, 50,000 new values per second coming from Zabbix.

So the native compression in TimescaleDB lets us compress the heaviest tables inside Zabbix, such as history and trends. The most common question is how much disk space this will save us. In the use-case example below, a 1.396-TB Telco database was compressed down to 77 GB, which means that roughly 95% of the disk space was saved.

  • The limitation of compression is that you can’t modify compressed chunks. This means that we cannot edit, delete, or insert new data if it’s already compressed. But this is nothing to worry about. Normally, native Zabbix functionality will not try to edit history data, which is already collected and stored in a database. That’s why we can easily use compression, as it will not affect any native functionality at all.
  • You can’t change the schema for compressed tables either. But from Zabbix’s perspective, this is just a formal thing as we normally wouldn’t suggest doing any schema modifications anyway.
  • Compression can be administered from the Zabbix Frontend.

TimescaleDB installation

So how do we install TimescaleDB? It’s very easy, and all we have to do is open docs.timescale.com, choose where and how we want to install it (in our case, on Red Hat), choose the OS distribution, installation method, and PostgreSQL version. A couple of commands on top of that, and we’re good to go.

TimescaleDB configuration

To configure TimescaleDB, first we need to:

  • Enable the TimescaleDB extension for the Zabbix Database as specified in the Zabbix documentation about the TimescaleDB configuration by executing:
echo "CREATE EXTENSION IF NOT EXISTS timescaledb CASCADE;" | sudo -u postgres psql zabbix
  • Run the timescaledb.sql script located in database/postgresql by executing:
cat timescaledb.sql | sudo -u zabbix psql zabbix
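
To verify that everything went well, you can list the installed extensions – timescaledb should appear in the output:

echo "\dx" | sudo -u postgres psql zabbix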

The only thing you need to keep in mind is that if you already have a Postgres database and you’re planning to use TimescaleDB, or you have Oracle or MySQL and you plan to migrate to Postgres and use it with Timescale, you might already have a database hundreds of gigabytes or even terabytes in size.

So the initial process of applying TimescaleDB to your database will definitely take some time. We recommend that users stop the Zabbix Server and Frontend, schedule a maintenance period, and then let the migration run, reconfigure all the tables, and do the compression – and only then start everything back up and let the metric backlog flow from all the proxies and monitored hosts back into the database.

Conclusion

These are the basic steps to use PostgreSQL with TimescaleDB. If you are just starting out with Zabbix and thinking about which database to choose, PostgreSQL with TimescaleDB is one of the best choices available to you. Our best-practice opinion is that you should choose the database you are experienced with, since you will have to work with it. But TimescaleDB’s compression and native partitioning really give it a huge advantage over all the other supported database backends.

Questions & Answers

Question. Schema changes for compressed tables are not allowed, so how do we proceed with an upgrade in that case?

Answer: I think that it might cause some problems. The good thing is that the last schema change on the history and trends tables, if I remember correctly, was in version 3.2. So that’s something to keep in mind. However, since TimescaleDB is fully supported by Zabbix, I guess that if we know that the next major release will affect the history or trends tables, there will definitely be a solution.

In addition, you can decompress the tables, upgrade, and compress them again, though this is not the best approach in terms of time-saving.

Question. Is Timescale the only option for people who want to go with some sort of compression?

Answer: Currently, yes, if we’re talking about native compression, which is also supported in Zabbix. With Timescale, configuring, enabling, and disabling compression and other procedures must actually be performed on the table, so there must be a procedure that is executed to compress it. Currently, with TimescaleDB, this is done by the good old Housekeeper process, which is a native process integrated into Zabbix itself. So there’s nothing besides TimescaleDB that would work so well with Zabbix at the moment.

Question. How do we install our database into a non-default schema?

Answer. If you have a different schema in Postgres, there is a specific parameter in the server configuration file for it. I think it was also in the frontend wizard where you just specify which schema you want to use.

Question. Which user on the machine will be used to access configuration files for permissions?

Answer. It is a general question about permissions and configuration files. However, in the case of Zabbix, we can use the Zabbix user. It’s possible to change it, but that might not be the best practice in terms of Postgres. I guess, after a default installation of Postgres, there will be a user ‘postgres’ that has access to the configuration file. Zabbix works only with the Zabbix configuration file, and Postgres — only with the Postgres configuration file, so we don’t need to have them cross-related to one another.

Question. Do we provide or do we have plans to provide a script or a tool to migrate Zabbix 4.0 or 5.0 with MySQL to Timescale?

Answer: Initially, you need to migrate to Postgres and then apply Timescale on top of that. We don’t have any official scripts or utilities to do the migration. If you’re really interested, it’s actually not as complicated as it may seem at first sight, and you can do it by watching my video, for instance.

There is a native Postgres utility to import data that has been exported from MySQL or MariaDB. In this case, you need to make sure that you only import the data, and apply the constraints only after the data is imported.

The most important thing is that you should have proper data integrity from the beginning. If you have some constraint breaches in your MySQL instance, then you will encounter issues trying to import the data and upload it to the Postgres database. But it’s still not a disaster – it can be fixed, though it might be a challenge at first.

If the data integrity is fine, then it’s a pretty straightforward process, though this depends on the size of the database. But if we exclude the history, then it takes like a couple of minutes if you know what to do.

Question. So, you mentioned those constraint breaches, maybe just some issues with integrity. Could you tell us when and why those could happen?

Answer. The most common reason why those breaches happen is doing something manually, for instance, editing the database schema. I mentioned that we don’t really like it when users do that. But I still do believe that there are some installations that have been working with Zabbix since 1.x and have upgraded all the way to 5.2 across all of these years.

It’s really possible that in some older versions there was some bug that affected this database integrity and it was fixed later on. Or partitioning was not that easy. You couldn’t just turn it on, you actually had to drop some indexes, edit the tables, and do other stuff that we do not recommend.

Question. What if we have different history settings for different items?

Answer. That’s a limitation that does not apply only to TimescaleDB, but also to any currently used partitioning solution, at least with Zabbix. There is a functionality within the Zabbix Frontend where you can define that data for a certain item should be kept for two years, for instance, as it is extremely important, while data for some other item just collecting hostname should be kept only for one month.

If you choose the partitioning path, then this functionality disappears and you must check the Override item storage period box in the Housekeeper settings, and all of your items will be stored for the period that you specify in the Housekeeper settings page.

Question. What are the performance advantages of using PostgreSQL with Timescale over PostgreSQL with native partitioning or MySQL with a custom partitioning approach?

Answer: In general, if we are talking about the database engines themselves, without partitioning and compression, you would not say that one is specifically better than the other. It is always a question of the users’ knowledge and ability to tune and configure the database properly.

We still frequently see database servers that have 500 GB of memory and the default config file. If we check the server statistics, we’ll see that out of 500 GB only 1 GB is used, simply because they did not edit the config file. That’s why we are sure that you should choose the database you are experienced with, since you will have to tune it and ensure that the DB health and performance are up to par.

In terms of partitioning benefits, the greatest benefit of TimescaleDB comes from compression and partitioning. A couple of years back, we had a semi-official partitioning script, which was based on triggers in Postgres. That particular solution was not working very well with very big Zabbix installations just because of the trigger mechanism on the tables.

The existing one with TimescaleDB is based on Housekeeper, which executes the partitioning procedure once per day. In addition, there is the current MySQL solution with a third-party script. I think that their performance is the same, and I do not see any particular difference. However, Timescale offers compression, and there is definitely a difference working with 50 GB database versus 1 TB, no doubt about it.

 

Getting your notifications via Signal
https://blog.zabbix.com/getting-your-notifications-via-signal/13286/
Tue, 26 Jan 2021


Recently, WhatsApp pushed a new privacy policy in which they announced that more data will be shared with Facebook, causing an exodus to other platforms, with Signal and Telegram being among the more popular destinations. Both are great alternatives, but I prefer Signal because it is open source, end-to-end encrypted, and, last but not least, because of its business model (living on donations instead of selling your data).

Typically, Zabbix sends notifications to whatever medium you’ve chosen when a problem is detected. We all know the email messages, the various webhook integrations with Slack/MS Teams/Jira, etc., and perhaps even some text message integrations. Now, if we’re migrating to Signal, we suddenly have access to the Signal API and can utilize it to receive Zabbix notifications. Nice!

There is only one drawback: you need a separate phone number to register with Signal. Don’t use your own phone number – unless you want to lose the ability to use Signal on your phone ;(

There are various ways to get a phone number for this purpose:

  • Use the phone number of your current SMS gateway
  • Use the company phone number (a lot of cloud PBXs provide the option to receive the verification message)
  • Purchase a prepaid phone number.
  • Use a service like Twilio

You just need to receive one text message – the rest of the communication will go via the internet.

Time to get rid of Whatsapp and move to Signal! But… How to use Signal to get your notifications?

Signal-cli

Although we could build everything from scratch and talk to the Signal API directly, there is a nice implementation available that gets us talking to Signal within a few minutes: signal-cli.

The signal-cli GitHub page is very comprehensive when it comes to getting signal-cli installed, but of course it does not cover anything Zabbix-related.

Configuration tasks

For this guide, we’re using:

  • CentOS 8
  • Zabbix 5.2

signal-cli installation

First, let’s install the signal-cli utility. To do so, we need to resolve its Java dependency by installing OpenJDK:

dnf -y install java-11-openjdk-devel.x86_64

After this installation, we should be good to continue with the installation of signal-cli. According to their installation guide, this should be sufficient:

export VERSION="0.7.3"
wget https://github.com/AsamK/signal-cli/releases/download/v"${VERSION}"/signal-cli-"${VERSION}".tar.gz
sudo tar xf signal-cli-"${VERSION}".tar.gz -C /opt
sudo ln -sf /opt/signal-cli-"${VERSION}"/bin/signal-cli /usr/local/bin/

At the time of writing, the most recent version is 0.7.3, and that’s what we’re installing here. If in the future a new version is released, of course you should install that!
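
As a quick sanity check that the symlink works and signal-cli is on the PATH, you can ask for the version – it should print something like “signal-cli 0.7.3”:

signal-cli --version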

If everything went as expected, we should now be able to register ourselves with Signal.

signal-cli registration

Since these commands will be executed by Zabbix, we must make sure the registration is done with the correct user on the Zabbix server; otherwise, you will get the following error message:

Unregistered user error

(ERROR App – User +19293771253 is not registered.)

To prevent this error, let’s do the authentication against Signal as the zabbix user:

Important: The USERNAME (your phone number) must include the country calling code, i.e. the number must start with a “+” sign and you must replace everything between the  < > in the following examples with your own values

runuser -l zabbix -c 'signal-cli -u <NUMBER> register'

Now check for incoming text messages on this phone number. Within seconds, you should receive a 6-digit code in the following format: xxx-xxx

Once you’ve received the text, it’s time to complete the registration:

runuser -l zabbix -c 'signal-cli -u <NUMBER> verify <CODE>'

Since we’re running these commands as a different user, we won’t see their output. Let’s just test!

Sending messages from the command line is straightforward:

runuser -l zabbix -c 'signal-cli -u <NUMBER> send -m <MESSAGE> <RECEIVER NUMBER>'

You will see the message id as output. Simply ignore it, since it’s not relevant at this point.

Within seconds:

It works! Great.

So now we’ve got this part covered, time to get the AlertScript set up, before heading to the frontend.

Zabbix AlertScript setup

OK, so now we’ve got the registration done, and we need to make sure Zabbix can utilize it. To do so, we use a very old method. Although it would have made more sense to use the webhook option, that would have meant building the communication with Signal from scratch.

So AlertScripts it is. In your terminal/SSH session with the Zabbix server open a new file with this command: vi /usr/lib/zabbix/alertscripts/signal.sh and insert the following contents:

#!/bin/bash
# $1 = message text ({ALERT.MESSAGE}), $2 = recipient number ({ALERT.SENDTO})
signal-cli -u '+19293771253' send -m "$1" "$2"

That’s right – just a few lines. After saving the file, change the owner and set the permissions:

chown zabbix:zabbix /usr/lib/zabbix/alertscripts/signal.sh
chmod 700 /usr/lib/zabbix/alertscripts/signal.sh
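
Before heading to the frontend, it is worth testing the script exactly the way Zabbix will invoke it – as the zabbix user, with the message and the recipient as positional arguments:

runuser -l zabbix -c '/usr/lib/zabbix/alertscripts/signal.sh "Test from Zabbix" "<RECEIVER NUMBER>"'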

With that working, it’s time to move to our frontend.

Zabbix mediatype configuration

In the frontend, go to Administration -> Mediatypes and create a new mediatype:

Signal Mediatype

Name: Signal
Type: Script
Script name: signal.sh
Script parameters:
    {ALERT.MESSAGE}
    {ALERT.SENDTO}

Don’t forget to configure some Message templates as well (the second tab in the mediatype configuration). You can just use the defaults by clicking ‘Add’.

Zabbix media configuration

Next step. Navigate to Administration -> Users (or just open your own user profile) and create a new media:

new-media

Type: Signal
Sendto: <your number>
When active / severity as per needs

Important: The “Send to” phone number must include the country calling code, i.e. the number must start with a “+” sign

We’re almost there – just some configuration left on the actions.

Zabbix action configuration

This step is only needed if you are sending notifications right now via a specific mediatype. If you configured the ‘send only to’ option to ‘- All -‘ there is nothing to change, and it will work straight away!

Otherwise, navigate to Configuration -> Actions and find the action you want to change, and in the Operations, Recovery operations and Update operations change the ‘send only to’ option to ‘Signal’

Save your action and it’s time to test – Generate some problem to confirm the implementation actually works.

Wrap up

That’s it! By now you should have a working implementation where Zabbix is sending notifications to Signal. The setup was extremely straightforward and easy to configure. Nevertheless, if you need help getting this going, we (Opensource ICT Solutions) offer consultancy services as well and are more than happy to help you out!

 

SAF Products Integration into Zabbix
https://blog.zabbix.com/saf-products-integration-into-zabbix/12978/
Tue, 19 Jan 2021


Top of the line point-to-point microwave equipment manufacturer SAF Tehnika has partnered with Zabbix to provide NMS capabilities to its end customers. SAF Tehnika appreciates Zabbix’s customizability, scalability, ease of template design, and SAF products integration.

Content

I. SAF Tehnika (1:14)
II. SAF point-to-point microwave systems (3:20)
III. SAF product lines (5:37)

√ Integra (5:54)
√ PhoeniX-G2 (6:41)

IV. SAF services (7:21)
V. SAF partnership with Zabbix (8:48)

√  Zabbix templates for SAF equipment (10:50)
√ Zabbix Maps view for Phoenix G2 (15:00)

VI. Zabbix services provided by SAF (17:56)
VII. Questions & Answers (20:00)

SAF Tehnika

SAF Tehnika, like Zabbix, comes from a really small country — Latvia.

SAF Tehnika:

✓ has been around for over 20 years,
✓ is profitable and has a no-debt balance sheet,
✓ is present in 130+ countries,
✓ has manufacturing facilities in the European Union,
✓ is ISO 9001 certified,
✓ is Zabbix Certified Partner since August 2020,
✓ is publicly traded on NASDAQ Riga Stock Exchange,
✓ has flexible R&D, and is able to provide custom solutions based on customer requirements.

SAF Tehnika is primarily manufacturing:

  • point-to-point systems,
  • hand-held MW spectrum analyzers,
  • Aranet wireless sensors and solutions.

SAF Tehnika main product groups

SAF point-to-point microwave systems

Point-to-point microwave systems are an alternative to a fiber line. Instead of a fiber line, we have two radio systems with the antenna installed on two towers. The distance between those towers could be anywhere from a few km up to 50 or even 100 km. The data is transmitted from one point to another wirelessly.

SAF Tehnika point-to-point MW system technology provides:

  • long-distance wireless links;
  • free and excellent technical support;
  • fast & easy deployment;
  • a 5-year standard warranty for SAF products, as SAF Tehnika ensures their top quality by using high-quality materials and reliable chipsets, as well as by chamber testing of all products;
  • solutions for:

√ WISPs,
√ TV and broadcasting (No.1 in the USA),
√ public safety,
√ utilities & mining,
√ enterprise networks,
√ local government & military,
√ low-latency/HFT (No.1 globally).

SAF product lines

The primary PTP microwave product series manufactured by SAF Tehnika are Integra and Phoenix G2.

SAF Tehnika main radio products

Integra

Integra is a full-outdoor radio, which can be attached directly to the antenna, so there is nothing indoors besides the power supply.

Integra-E — wireless fiber solution specifically tailored for dense urban deployment:

  • operates in E-band range,
  • can achieve a throughput of up to 10 Gbps,
  • operates in 2GHz bandwidth.

Integra-X — a powerful dual-core system for network backbone deployment. It incorporates two radios in a single enclosure and two modem chains, allowing the system to operate with built-in 2+0 XPIC and reach a maximum data transmission capacity of up to 2.2 Gbps.

PhoeniX-G2

The PhoeniX G2 product line can be either a split-mount solution, with the modem installed indoors and the radio outdoors, or a full-indoor solution. For instance, the broadcast market mostly uses full-indoor solutions, as these customers prefer to have all the equipment indoors and then run a long elliptical line up to the antenna. The Phoenix G2 product line features native ASI transmission in parallel with IP traffic – a crucial requirement of our broadcast customers.

SAF services

SAF has also been offering different sets of services:

  • product training;
  • link planning;
  • technical support;
  • staging and configuration — enables the customers to have all the equipment labeled and configured before it gets to the customer;
  • FCC coordination — recently added to the SAF portfolio and offered only to customers in the USA. This provides an opportunity to save time on link planning, FCC coordination, pre-configuration, and hardware purchase from a one-stop shop – SAF Tehnika;
  • Zabbix deployment and support.

SAF partnership with Zabbix

Before partnering with Zabbix, SAF Tehnika had developed its own network management system and used it for many years. Unfortunately, this software was limited to SAF products. Adding other vendors’ products was a difficult and complicated process.

More and more SAF customers were inquiring about the possibility of adding other vendors’ products to the network management system. That is where Zabbix came in handy, as besides monitoring SAF products, Zabbix can also monitor other vendors’ products just by adding appropriate templates.

Zabbix is an open-source, advanced and robust platform with high customizability and scalability – there are virtually no limits to the number of devices Zabbix can monitor. I am confident our customers will appreciate all of these benefits and enjoy the ability to add SAF and other vendor products to the list of monitored devices.

Finally, SAF Tehnika and Zabbix are located in the same small town, so the partnership was easy and natural.

Zabbix templates for SAF equipment

Following training at Zabbix, SAF engineers obtained the certified specialist status and developed Zabbix SNMP-based templates for main product lines:

  • Integra-X, Integra-E, Integra-G, Integra-W, and Phoenix G2.

SAF main product line templates are available free of charge to all of SAF customers on the SAF webpage: https://www.saftehnika.com/en/downloads
(registration required).

Users proficient in Linux and familiar with Zabbix can definitely install and deploy these templates themselves. Otherwise, SAF specialists are ready to assist in the deployment and integration of Zabbix templates and tuning of the required parameters.

Zabbix dashboard for Integra X

We have an Integra-X link Zabbix dashboard shown as an example below. As Integra-X is a dual-core radio, we provide the monitoring parameters for two radios in a single enclosure.

Zabbix dashboard for Integra-X link

On the top, we display the main health parameters of the link, current received signal level, MSE or so-called noise level, the transmit power, and the IP address – a small summary of the link.

On the left, we display the parameters of the radio and the graphs for the last couple of minutes — the live graphs of the received signal level and MSE, the noise level of the RF link.

On the right, we have the same parameters for the remote side. In the middle, we have added a few parameters, which should be monitored, such as CPU load percentage, the current traffic over the link, and diagnostic parameters, such as the temperature of each of the modems.

At the bottom, we have added the alarm widget. In this example, the alarm of too low received signal level is shown. These alarms are also colored by their severity: red alarms are for disaster-level issues, blue alarms — for information.

From this dashboard, the customers are able to estimate the current status of the link and any issues that have appeared in the past. Note that Zabbix graphs can be easily customized to display the widgets or graphs of customer choice.

Zabbix Maps view for Phoenix G2

Zabbix Maps view for Phoenix G2 1+1 system

In the map, our full-indoor Phoenix G2 system is displayed in duplicate, as this is a 1+1 protected link. Each of the IDU, the ASI module, and the radio module is protected by a second respective module.

Zabbix allows for naming each of these modules and for monitoring every module’s performance individually. In this example, the ASI module is colored in red as one of the ASI ports has lost the connection, while the radio unit’s red color shows that the received signal is lower than expected.

Zabbix dashboard for Phoenix G2 1+1 system

Besides the maps view, the dashboard for Phoenix G2 1+1 system shows the historical data like alarm log and graphs. The data in red indicates that an issue hasn’t been cleared yet. The data in green – an issue was resolved, for instance, a low signal level was restored after going down for a short period of time.

In the middle we see a summary graph of all four radios’ performance — two on the local side and two on the remote side. Here, we are monitoring the most important parameters — the received signal level and MSE i.e., the noise level.

The graph at the bottom is important for broadcast customers as the majority of them transmit ASI traffic besides Ethernet and IP traffic. Here they’re able to monitor how much traffic was going through this link in the past.

Zabbix services provided by SAF

Since SAF Tehnika has experienced Zabbix-certified specialists, who have developed these multiple templates, we are ready to provide Zabbix-related services to our end customers, such as:

  • Zabbix deployment on the customer’s machine, integration and configuration of all the parameters, and fine-tuning according to our customers’ requirements;
  • consulting services and technical support on an annual contract basis.

NOTE. SAF Zabbix support services are limited to SAF products.

SAF Tehnika is ready and eager to provide Zabbix related services to our customers. If you already have a SAF network and would be willing to integrate it into Zabbix or plan to deploy a new SAF network integrated with Zabbix, you can contact our offices.

SAF contact details

Questions & Answers

Question. You told us that you provide Zabbix for your customers and create templates to pass to them, and so on. But do you use Zabbix in your own environment, for instance, in your offices, to monitor your own infrastructure?

Answer. SAF has been using Zabbix for almost 10 years, and we use it to monitor our internal infrastructure. Currently, SAF has three separate Zabbix networks: one for SAF IT system monitoring, another for Kubernetes system monitoring (Aranet Cloud services), and a separate Zabbix server for testing purposes, where we are able to test SAF equipment as well as experiment with Zabbix server deployments, templates, etc.

Question. You have passed the specialist courses. Do you have any plans on becoming certified professionals?

Answer. Our specialists are definitely interested in the Zabbix certified professional courses. We will make that decision based on the revenue Zabbix brings us and the interest of our customers.

Question. You have already provided a couple of templates for Zabbix and for your customers. Do you have any interesting templates you are working on? Do you have plans to create or upgrade some existing templates?

Answer. So far, we have released the templates for the main product lines — Integra and Phoenix G2. We have a few product lines that are more specialized, such as low-latency products, and some older products, such as CFIP Lumina. In case any of our customers are interested in integrating these older or low-latency products, we might create more templates.

Question. Do you plan to refine the templates to make them a part of the Zabbix out-of-the-box solution?

Answer. If Zabbix is going to approve our templates to make them a part of the out-of-the-box solution, it will benefit our customers in using and monitoring our products. We’ll be delighted to provide the templates for this purpose.

 
