[ZEPPELIN-4977] Metrics and healthcheck
### What is this PR for?
This PR includes:
 - Configurable Prometheus monitoring with an endpoint (`/metrics`)
 - A rewritten JMX metric endpoint
 - Two new healthcheck endpoints (`/health/readiness`, `/health/liveness`) and a ping endpoint (`/ping`)
 - Some default metrics (Jetty, JVM, interpreter)

### What type of PR is it?
 - Improvement

### What is the Jira issue?
* https://issues.apache.org/jira/browse/ZEPPELIN-4977
* https://issues.apache.org/jira/browse/ZEPPELIN-4976

### How should this be tested?
* Travis-CI: https://travis-ci.com/github/Reamer/zeppelin/builds/201787227

### Questions:
* Do the license files need an update? Yes, included in PR
* Are there breaking changes for older versions? Yes, the JMX output for metrics changes
* Does this need documentation? Yes, included in PR

Author: Philipp Dallig <philipp.dallig@gmail.com>

Closes apache#3971 from Reamer/metrics_and_healthcheck_micro and squashes the following commits:

6dd4113 [Philipp Dallig] Add cron properties in configuration.md
eed7ff5 [Philipp Dallig] Add CronJobs metrics
103241a [Philipp Dallig] Rewrite JMX metric endpoint
282865d [Philipp Dallig] Add jetty metrics
6d81060 [Philipp Dallig] Add Interpreter metrics
e075751 [Philipp Dallig] Add HDFS Healthcheck
91d735b [Philipp Dallig] Add Healthchecks with Dropwizard
14038c3 [Philipp Dallig] Add micrometer for metrics
Reamer committed Nov 23, 2020
1 parent 2355ce3 commit 1e177db
Showing 22 changed files with 653 additions and 99 deletions.
1 change: 1 addition & 0 deletions docs/index.md
@@ -112,6 +112,7 @@ limitations under the License.
* [MongoDB Storage](./setup/storage/storage.html#notebook-storage-in-mongodb)
* Operation
* [Configuration](./setup/operation/configuration.html): lists for Apache Zeppelin
* [Monitoring](./setup/operation/monitoring.html): monitoring instructions for Apache Zeppelin
* [Proxy Setting](./setup/operation/proxy_setting.html)
* [Upgrading](./setup/operation/upgrading.html): a manual procedure of upgrading Apache Zeppelin version
* [Trouble Shooting](./setup/operation/trouble_shooting.html)
24 changes: 21 additions & 3 deletions docs/setup/operation/configuration.md
@@ -61,13 +61,13 @@ If both are defined, then the **environment variables** will take priority.
</tr>
<tr>
<td><h6 class="properties">ZEPPELIN_JMX_ENABLE</h6></td>
<td><h6 class="properties">N/A</h6></td>
<td></td>
<td><h6 class="properties">zeppelin.jmx.enable</h6></td>
<td>false</td>
<td>Enable JMX by defining "true"</td>
</tr>
<tr>
<td><h6 class="properties">ZEPPELIN_JMX_PORT</h6></td>
<td><h6 class="properties">N/A</h6></td>
<td><h6 class="properties">zeppelin.jmx.port</h6></td>
<td>9996</td>
<td>Port number which JMX uses</td>
</tr>
@@ -443,6 +443,24 @@ If both are defined, then the **environment variables** will take priority.
<td>true</td>
<td>Value to enable/disable timeout handling when starting Interpreter Pods. Caution: This can lead to an infinite loop</td>
</tr>
<tr>
<td><h6 class="properties">ZEPPELIN_METRIC_ENABLE_PROMETHEUS</h6></td>
<td><h6 class="properties">zeppelin.metric.enable.prometheus</h6></td>
<td>false</td>
<td>Value to enable/disable the Prometheus metric endpoint on /metrics</td>
</tr>
<tr>
<td><h6 class="properties">ZEPPELIN_NOTEBOOK_CRON_ENABLE</h6></td>
<td><h6 class="properties">zeppelin.notebook.cron.enable</h6></td>
<td>false</td>
<td>Value to enable/disable Cron support in Notes</td>
</tr>
<tr>
<td><h6 class="properties">ZEPPELIN_NOTEBOOK_CRON_FOLDERS</h6></td>
<td><h6 class="properties">zeppelin.notebook.cron.folders</h6></td>
<td></td>
<td>Comma-separated list of folders where cron is allowed</td>
</tr>
</table>
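
The table pairs each environment variable with a property name, and the page states that the environment variable takes priority when both are defined. The following is a minimal sketch of that lookup order, not Zeppelin's actual `ZeppelinConfiguration` code. The class `ConfLookup` and its methods are hypothetical, and the name mapping (upper-case, dots replaced by underscores) is an assumption read off the rows shown above.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Properties;

/** Illustrative only: mirrors the documented "environment variable wins" rule. */
final class ConfLookup {
  private ConfLookup() {}

  /** Assumed naming rule, e.g. zeppelin.jmx.port -> ZEPPELIN_JMX_PORT. */
  static String envName(String propertyName) {
    return propertyName.toUpperCase(Locale.ROOT).replace('.', '_');
  }

  /** Returns the environment value if present, else the property value, else the default. */
  static String resolve(Map<String, String> env, Properties props,
                        String propertyName, String defaultValue) {
    String fromEnv = env.get(envName(propertyName));
    if (fromEnv != null) {
      return fromEnv;
    }
    return props.getProperty(propertyName, defaultValue);
  }
}
```

Calling `ConfLookup.resolve(System.getenv(), props, "zeppelin.jmx.port", "9996")` would return the value of `ZEPPELIN_JMX_PORT` when it is exported, and fall back to the property or the default otherwise.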


37 changes: 37 additions & 0 deletions docs/setup/operation/monitoring.md
@@ -0,0 +1,37 @@
---
layout: page
title: "Apache Zeppelin Monitoring"
description: "This page shows you the monitoring options you have in Apache Zeppelin"
---
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
# Apache Zeppelin Monitoring

<div id="toc"></div>

## Monitoring Options

Apache Zeppelin uses [Micrometer](https://micrometer.io/), a vendor-neutral application metrics facade.

### Prometheus Monitoring

[Prometheus](https://prometheus.io/) is a widely used monitoring solution for [Kubernetes](https://kubernetes.io/). The Prometheus endpoint is disabled by default and can be activated with the configuration property `zeppelin.metric.enable.prometheus`. Once enabled, the metrics are accessible via the unauthenticated endpoint `/metrics`.
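
The `/metrics` endpoint returns the Prometheus text exposition format, where each sample is a line of the form `name{labels} value` and lines starting with `#` are comments. A minimal sketch of reading such a response body is shown here. The class `PrometheusTextSketch` is hypothetical, and it assumes that samples carry no trailing timestamp.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative parser for the Prometheus text format; not part of Zeppelin. */
final class PrometheusTextSketch {
  private PrometheusTextSketch() {}

  /** Maps each sample (metric name plus label set) to its value; skips comments and blank lines. */
  static Map<String, Double> parse(String body) {
    Map<String, Double> samples = new LinkedHashMap<>();
    for (String line : body.split("\n")) {
      String trimmed = line.trim();
      if (trimmed.isEmpty() || trimmed.startsWith("#")) {
        continue;
      }
      // Assumption: no timestamp, so the value is the text after the last space.
      int split = trimmed.lastIndexOf(' ');
      if (split < 0) {
        continue;
      }
      samples.put(trimmed.substring(0, split).trim(), parseValue(trimmed.substring(split + 1)));
    }
    return samples;
  }

  /** The text format writes infinities as +Inf and -Inf, which Double.parseDouble rejects. */
  static double parseValue(String raw) {
    switch (raw) {
      case "+Inf":
        return Double.POSITIVE_INFINITY;
      case "-Inf":
        return Double.NEGATIVE_INFINITY;
      default:
        return Double.parseDouble(raw);
    }
  }
}
```

The body passed to `parse` would typically come from an HTTP GET against the Zeppelin server's `/metrics` path. The exact metric names that Zeppelin exposes are not listed on this page, so none are assumed here.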

### JMX Monitoring

[JMX](https://en.wikipedia.org/wiki/Java_Management_Extensions) is a general solution for monitoring Java applications. JMX can be activated with the configuration property `zeppelin.jmx.enable`. The default port 9996 can be changed with the configuration property `zeppelin.jmx.port`.
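
A JMX client needs a service URL to connect. The sketch below builds one for the default port, assuming that Zeppelin's JMX port is served by the JVM's standard RMI connector under the usual `jmxrmi` name. That is an assumption, since this page does not state the connector type. The class `JmxUrlSketch` is hypothetical.

```java
import java.net.MalformedURLException;
import javax.management.remote.JMXServiceURL;

/** Illustrative helper; assumes the standard RMI connector, which this page does not confirm. */
final class JmxUrlSketch {
  private JmxUrlSketch() {}

  /** Builds the conventional RMI-based JMX service URL for the given host and port. */
  static JMXServiceURL serviceUrl(String host, int port) {
    String url = "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi";
    try {
      return new JMXServiceURL(url);
    } catch (MalformedURLException e) {
      throw new IllegalArgumentException("Invalid JMX service URL: " + url, e);
    }
  }
}
```

The result of `JmxUrlSketch.serviceUrl("localhost", 9996)` can be passed to `JMXConnectorFactory.connect(...)`, or the same URL string can be entered in a tool such as JConsole.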

## Healthcheck Probe

Apache Zeppelin has two unauthenticated healthcheck endpoints (`/health/readiness` and `/health/liveness`) that can be used in proxy and/or cloud setups.
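
As an illustration of how a proxy or orchestrator might consume these endpoints, here is a minimal probe sketch. It follows the Kubernetes `httpGet` convention that any status code from 200 to 399 counts as success. How Zeppelin maps health states to status codes is not stated on this page, so that mapping is an assumption. The class `HealthProbeSketch` is hypothetical.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

/** Illustrative probe client; class and method names are hypothetical. */
final class HealthProbeSketch {
  private HealthProbeSketch() {}

  /** Joins a base URL and an endpoint path without doubling the slash. */
  static URI endpoint(String baseUrl, String path) {
    String base = baseUrl.endsWith("/") ? baseUrl.substring(0, baseUrl.length() - 1) : baseUrl;
    return URI.create(base + path);
  }

  /** Kubernetes httpGet convention: status codes 200-399 count as success. */
  static boolean isSuccess(int httpStatus) {
    return httpStatus >= 200 && httpStatus < 400;
  }

  /** Sends one GET request and reports whether the status code counts as success. */
  static boolean probe(URI uri, int timeoutMillis) {
    HttpURLConnection connection = null;
    try {
      connection = (HttpURLConnection) uri.toURL().openConnection();
      connection.setRequestMethod("GET");
      connection.setConnectTimeout(timeoutMillis);
      connection.setReadTimeout(timeoutMillis);
      return isSuccess(connection.getResponseCode());
    } catch (IOException e) {
      return false; // unreachable or timed out: treat the probe as failed
    } finally {
      if (connection != null) {
        connection.disconnect();
      }
    }
  }
}
```

For example, `HealthProbeSketch.probe(HealthProbeSketch.endpoint("http://localhost:8080", "/health/readiness"), 2000)` would check readiness of a local server. The host and port here are placeholders.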
2 changes: 2 additions & 0 deletions pom.xml
@@ -140,6 +140,8 @@
<joda.version>2.9.9</joda.version>
<bouncycastle.version>1.60</bouncycastle.version>
<maven.version>3.6.3</maven.version>
<dropwizard.version>4.1.14</dropwizard.version>
<micrometer.version>1.6.0</micrometer.version>

<hadoop2.7.version>2.7.7</hadoop2.7.version>
<hadoop2.6.version>2.6.5</hadoop2.6.version>
17 changes: 14 additions & 3 deletions zeppelin-distribution/src/bin_license/LICENSE
@@ -52,9 +52,9 @@ The following components are provided under Apache License.
(Apache 2.0) Codehaus Plexus Utils (org.codehaus.plexus:plexus-utils:3.2.1 - http://github.com/codehaus-plexus/plexus-utils)
(Apache 2.0) findbugs jsr305 (com.google.code.findbugs:jsr305:jar:1.3.9 - http://findbugs.sourceforge.net/)
(Apache 2.0) Google Guava (com.google.guava:guava:15.0 - https://code.google.com/p/guava-libraries/)
(Apache 2.0) Jackson (com.fasterxml.jackson.core:jackson-core:2.7.0 - https://github.com/FasterXML/jackson-core)
(Apache 2.0) Jackson (com.fasterxml.jackson.core:jackson-annotations:2.9.9 - https://github.com/FasterXML/jackson-core)
(Apache 2.0) Jackson (com.fasterxml.jackson.core:jackson-databind:2.9.9.1 - https://github.com/FasterXML/jackson-core)
(Apache 2.0) Jackson (com.fasterxml.jackson.core:jackson-core:2.9.10 - https://github.com/FasterXML/jackson-core)
(Apache 2.0) Jackson (com.fasterxml.jackson.core:jackson-annotations:2.9.10 - https://github.com/FasterXML/jackson-core)
(Apache 2.0) Jackson (com.fasterxml.jackson.core:jackson-databind:2.9.10.6 - https://github.com/FasterXML/jackson-core)
(Apache 2.0) Jackson Mapper ASL (org.codehaus.jackson:jackson-mapper-asl:1.9.13 - https://mvnrepository.com/artifact/org.codehaus.jackson/jackson-mapper-asl/1.9.13)
(Apache 2.0) javax.servlet (org.eclipse.jetty.orbit:javax.servlet:jar:3.1.0.v201112011016 - http://www.eclipse.org/jetty)
(Apache 2.0) Joda-Time (joda-time:joda-time:2.8.1 - http://www.joda.org/joda-time/)
@@ -221,6 +221,16 @@ The following components are provided under Apache License.
(Apache 2.0) Neo4j Java Driver (https://github.com/neo4j/neo4j-java-driver) - https://github.com/neo4j/neo4j-java-driver/blob/1.4.3/LICENSE.txt
(Apache 2.0) Hazelcast Jet (http://jet.hazelcast.org) - https://github.com/hazelcast/hazelcast-jet/blob/master/LICENSE
(Apache 2.0) RxJava (io.reactivex.rxjava2:rxjava:2.2.17) - https://github.com/ReactiveX/RxJava/blob/2.x/LICENSE
(Apache 2.0) Application monitoring instrumentation facade (io.micrometer:micrometer-core:1.6.0) - https://github.com/micrometer-metrics/micrometer/blob/master/LICENSE
(Apache 2.0) Application monitoring instrumentation facade (io.micrometer:micrometer-registry-prometheus:1.6.0) - https://github.com/micrometer-metrics/micrometer/blob/master/LICENSE
(Apache 2.0) Application monitoring instrumentation facade (io.micrometer:micrometer-registry-jmx:1.6.0) - https://github.com/micrometer-metrics/micrometer/blob/master/LICENSE
(Apache 2.0) Prometheus Java Simpleclient Common (io.prometheus:simpleclient_common:0.9.0) - https://github.com/prometheus/client_java/blob/master/LICENSE
(Apache 2.0) Prometheus Java Simpleclient (io.prometheus:simpleclient:0.9.0) - https://github.com/prometheus/client_java/blob/master/LICENSE
(Apache 2.0) Dropwizard Metrics Core (io.dropwizard.metrics:metrics-core:4.1.14) - https://github.com/dropwizard/metrics/blob/release/4.1.x/LICENSE
(Apache 2.0) Dropwizard Metrics Utility Servlets (io.dropwizard.metrics:metrics-servlets:4.1.14) - https://github.com/dropwizard/metrics/blob/release/4.1.x/LICENSE
(Apache 2.0) Dropwizard Jackson Integration for Metrics (io.dropwizard.metrics:metrics-json:4.1.14) - https://github.com/dropwizard/metrics/blob/release/4.1.x/LICENSE
(Apache 2.0) Dropwizard Metrics Health Checks (io.dropwizard.metrics:metrics-healthchecks:4.1.14) - https://github.com/dropwizard/metrics/blob/release/4.1.x/LICENSE
(Apache 2.0) Dropwizard Metrics Integration with JMX (io.dropwizard.metrics:metrics-jmx:4.1.14) - https://github.com/dropwizard/metrics/blob/release/4.1.x/LICENSE

========================================================================
MIT licenses
@@ -420,3 +430,4 @@ Creative Commons CC0 (http://creativecommons.org/publicdomain/zero/1.0/)
Multiple licenses
========================================================================
(LGPLv2) (GPLv2) (MPL 1.1) Jtransforms (com.github.rwl:jtransforms:2.4.0 - https://sourceforge.net/projects/jtransforms/)
(CC0 1.0) (BSD-2) HdrHistogram (org.hdrhistogram:HdrHistogram:2.1.12 - http://hdrhistogram.github.io/HdrHistogram/)
12 changes: 0 additions & 12 deletions zeppelin-interpreter-integration/pom.xml
@@ -77,12 +77,6 @@
<groupId>org.apache.zeppelin</groupId>
<artifactId>zeppelin-server</artifactId>
<version>${project.version}</version>
<exclusions>
<exclusion>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-annotations</artifactId>
</exclusion>
</exclusions>
</dependency>

<dependency>
@@ -128,12 +122,6 @@
<version>${project.version}</version>
<classifier>tests</classifier>
<scope>test</scope>
<exclusions>
<exclusion>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-annotations</artifactId>
</exclusion>
</exclusions>
</dependency>

<dependency>
@@ -428,6 +428,14 @@ public String getPemCAFile() {
return getString(ConfVars.ZEPPELIN_SSL_PEM_CA);
}

public boolean isJMXEnabled() {
return getBoolean(ConfVars.ZEPPELIN_JMX_ENABLE);
}

public int getJMXPort() {
return getInt(ConfVars.ZEPPELIN_JMX_PORT);
}

public String getNotebookDir() {
return getAbsoluteDir(ConfVars.ZEPPELIN_NOTEBOOK_DIR);
}
@@ -508,7 +516,7 @@ public boolean isS3ServerSideEncryption() {
public String getS3SignerOverride() {
return getString(ConfVars.ZEPPELIN_NOTEBOOK_S3_SIGNEROVERRIDE);
}

public boolean isS3PathStyleAccess() {
return getBoolean(ConfVars.ZEPPELIN_NOTEBOOK_S3_PATH_STYLE_ACCESS);
}
@@ -884,6 +892,10 @@ public String getDockerContainerImage() {
return getString(ConfVars.ZEPPELIN_DOCKER_CONTAINER_IMAGE);
}

public boolean isPrometheusMetricEnabled() {
return getBoolean(ConfVars.ZEPPELIN_METRIC_ENABLE_PROMETHEUS);
}

public Map<String, String> dumpConfigurations(Predicate<String> predicate) {
Map<String, String> properties = new HashMap<>();

@@ -958,6 +970,8 @@ public enum ConfVars {
ZEPPELIN_WAR("zeppelin.war", "zeppelin-web/dist"),
ZEPPELIN_ANGULAR_WAR("zeppelin.angular.war", "zeppelin-web-angular/dist"),
ZEPPELIN_WAR_TEMPDIR("zeppelin.war.tempdir", "webapps"),
ZEPPELIN_JMX_ENABLE("zeppelin.jmx.enable", false),
ZEPPELIN_JMX_PORT("zeppelin.jmx.port", 9996),

ZEPPELIN_INTERPRETER_JSON("zeppelin.interpreter.setting", "interpreter-setting.json"),
ZEPPELIN_INTERPRETER_DIR("zeppelin.interpreter.dir", "interpreter"),
@@ -1095,6 +1109,8 @@ public enum ConfVars {

ZEPPELIN_DOCKER_CONTAINER_IMAGE("zeppelin.docker.container.image", "apache/zeppelin:" + Util.getVersion()),

ZEPPELIN_METRIC_ENABLE_PROMETHEUS("zeppelin.metric.enable.prometheus", false),

ZEPPELIN_IMPERSONATE_SPARK_PROXY_USER("zeppelin.impersonate.spark.proxy.user", true),
ZEPPELIN_NOTEBOOK_GIT_REMOTE_URL("zeppelin.notebook.git.remote.url", ""),
ZEPPELIN_NOTEBOOK_GIT_REMOTE_USERNAME("zeppelin.notebook.git.remote.username", "token"),
63 changes: 44 additions & 19 deletions zeppelin-server/pom.xml
@@ -36,11 +36,12 @@
<properties>

<!--library versions-->
<jersey.version>2.27</jersey.version>
<jersey.version>2.30</jersey.version>
<jersey.servlet.version>1.13</jersey.servlet.version>
<javax.ws.rsapi.version>2.1</javax.ws.rsapi.version>
<libpam4j.version>1.11</libpam4j.version>
<jna.version>4.1.0</jna.version>
<jackson.version>2.9.10.6</jackson.version>

<!--test library versions-->
<selenium.java.version>2.48.2</selenium.java.version>
@@ -104,6 +105,42 @@
<artifactId>jcl-over-slf4j</artifactId>
</dependency>

<dependency>
<groupId>io.dropwizard.metrics</groupId>
<artifactId>metrics-servlets</artifactId>
<version>${dropwizard.version}</version>
<exclusions>
<exclusion>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-databind</artifactId>
</exclusion>
</exclusions>
</dependency>

<dependency>
<groupId>io.micrometer</groupId>
<artifactId>micrometer-registry-prometheus</artifactId>
<version>${micrometer.version}</version>
</dependency>
<dependency>
<groupId>io.micrometer</groupId>
<artifactId>micrometer-registry-jmx</artifactId>
<version>${micrometer.version}</version>
<exclusions>
<!-- manual loading to get the right version that fits to other Dropwizard libraries -->
<exclusion>
<groupId>io.dropwizard.metrics</groupId>
<artifactId>metrics-jmx</artifactId>
</exclusion>
</exclusions>
</dependency>
<!-- Used by io.micrometer:micrometer-registry-jmx -->
<dependency>
<groupId>io.dropwizard.metrics</groupId>
<artifactId>metrics-jmx</artifactId>
<version>${dropwizard.version}</version>
</dependency>

<dependency>
<groupId>org.glassfish.jersey.core</groupId>
<artifactId>jersey-client</artifactId>
@@ -133,17 +170,6 @@
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-databind</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-databind</artifactId>
<version>2.9.10.1</version>
<exclusions>
<exclusion>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-annotations</artifactId>
</exclusion>
<exclusion>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-core</artifactId>
@@ -161,6 +187,12 @@
<version>${jersey.version}</version>
</dependency>

<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-databind</artifactId>
<version>${jackson.version}</version>
</dependency>

<dependency>
<groupId>javax.ws.rs</groupId>
<artifactId>javax.ws.rs-api</artifactId>
@@ -223,13 +255,6 @@
<version>${jna.version}</version>
</dependency>

<!-- Needed for dependency conergence -->
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-annotations</artifactId>
<version>2.9.9</version>
</dependency>

<dependency>
<groupId>org.eclipse.jetty</groupId>
<artifactId>jetty-webapp</artifactId>
@@ -0,0 +1,36 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.zeppelin.metric;

import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.binder.MeterBinder;

public class JVMInfoBinder implements MeterBinder {
  private static final String UNKNOWN = "unknown";

  @Override
  public void bindTo(MeterRegistry registry) {
    Counter.builder("jvm.info")
        .description("JVM version info")
        .tags("version", System.getProperty("java.runtime.version", UNKNOWN),
            "vendor", System.getProperty("java.vm.vendor", UNKNOWN),
            "runtime", System.getProperty("java.runtime.name", UNKNOWN))
        .register(registry)
        .increment();
  }
}