Deferred creation of SkyDNS, monitoring and logging objects
This implements phase 1 of the proposal in #3579, moving the creation
of the pods, RCs, and services to the master after the apiserver is
available.

This is such a wide commit because our existing initial config story
is special:

* Add kube-addons service and associated salt configuration:
** We configure /etc/kubernetes/addons to be a directory of objects
that are appropriately configured for the current cluster.
** "/etc/init.d/kube-addons start" slurps up everything in that dir.
(Most of the difficulty is the business logic in salt around getting
that directory built at all.)
** We cheat and overlay cluster/addons into saltbase/salt/kube-addons
as config files for the kube-addons meta-service.
* Change .yaml.in files to salt templates
* Rename {setup,teardown}-{monitoring,logging} to
{setup,teardown}-{monitoring,logging}-firewall to properly reflect
their real purpose now (the purpose of these functions is now ONLY to
bring up the firewall rules, and possibly to relay the IP to the user).
* Rework GCE {setup,teardown}-{monitoring,logging}-firewall: Both
functions were improperly configuring global rules, yet used
lifecycles tied to the cluster. Use $NODE_INSTANCE_PREFIX with the
rule. The logging rule needed a $NETWORK specifier. The monitoring
rule tried gcloud describe first, but given the instancing, this feels
like a waste of time now.
* Plumb ENABLE_CLUSTER_MONITORING, ENABLE_CLUSTER_LOGGING,
ELASTICSEARCH_LOGGING_REPLICAS and DNS_REPLICAS down to the master,
since these are needed there now.

(Desperately want just a yaml or json file we can share between
providers that has all this crap. Maybe #3525 is an answer?)

Huge caveats: I've done pretty firm testing on GCE, including
twiddling the env variables and making sure the objects I expect to
come up, come up. I've tested that it doesn't break GKE bringup
somehow. But I haven't had a chance to test the other providers.
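
As a rough sketch of the "slurps up everything in that dir" step described above: the loop below walks an addons directory and hands each object file to kubectl. The function name `create_addons`, the `ADDONS_DIR` and `KUBECTL` variables, and the choice of `*.yaml`/`*.json` extensions are assumptions made for illustration; this is not the init script added by this commit.

```shell
# Hypothetical defaults; override them to point at a different directory
# or kubectl wrapper.
ADDONS_DIR="${ADDONS_DIR:-/etc/kubernetes/addons}"
KUBECTL="${KUBECTL:-kubectl}"

# Create every object file found under the given directory.
create_addons() {
  local dir="$1"
  local obj
  # Sort so that objects are created in a deterministic order.
  find "${dir}" -type f \( -name '*.yaml' -o -name '*.json' \) | sort |
    while IFS= read -r obj; do
      echo "Creating ${obj}"
      # Keep going on failure so one bad object does not block the rest.
      "${KUBECTL}" create -f "${obj}" || echo "Failed to create ${obj}" >&2
    done
}
```

A start action could then call `create_addons "${ADDONS_DIR}"` once the apiserver is reachable.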
zmerlynn committed Jan 21, 2015
1 parent 3c15427 commit a305269
Showing 21 changed files with 336 additions and 153 deletions.
8 changes: 8 additions & 0 deletions build/common.sh
@@ -551,6 +551,14 @@ function kube::release::package_salt_tarball() {

cp -R "${KUBE_ROOT}/cluster/saltbase" "${release_stage}/"

# TODO(#3579): This is a temporary hack. It gathers up the yaml,
# yaml.in files in cluster/addons (minus any demos) and overlays
# them into kube-addons, where we expect them. (This pipeline is a
# fancy copy, stripping anything but the files we want.)
local objects
objects=$(cd "${KUBE_ROOT}/cluster/addons" && find . -name \*.yaml -or -name \*.yaml.in | grep -v demo)
tar c -C "${KUBE_ROOT}/cluster/addons" ${objects} | tar x -C "${release_stage}/saltbase/salt/kube-addons"

local package_name="${RELEASE_DIR}/kubernetes-salt.tar.gz"
kube::release::create_tarball "${package_name}" "${release_stage}/.."
}
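
To see which files the find/tar pipeline above keeps, here is a small standalone sketch of the same filtered copy. The function name `overlay_addons` is hypothetical, and `-f -` is spelled out so the sketch does not depend on tar's default archive device; otherwise it mirrors the pipeline in the hunk.

```shell
# Hypothetical helper: copy only *.yaml and *.yaml.in files from src into
# dst, preserving relative paths and skipping any path containing "demo".
overlay_addons() {
  local src="$1" dst="$2"
  local objects
  objects=$(cd "${src}" && find . -name '*.yaml' -or -name '*.yaml.in' | grep -v demo)
  # Word splitting on ${objects} is intentional here; paths containing
  # whitespace would break, as they would in the original pipeline.
  tar -c -f - -C "${src}" ${objects} | tar -x -f - -C "${dst}"
}
```

Given a source tree containing `dns/skydns-rc.yaml.in`, `dns/README.md` and `demo-app/pod.yaml`, only `dns/skydns-rc.yaml.in` should land in the destination.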
25 changes: 8 additions & 17 deletions cluster/addons/dns/README.md
@@ -1,6 +1,6 @@
# DNS in Kubernetes
This directory holds an example of how to run
[SkyDNS](https://github.com/skynetservices/skydns) in a Kubernetes cluster.
[SkyDNS](https://github.com/skynetservices/skydns) can be configured
to automatically run in a Kubernetes cluster.

## What things get DNS names?
The only objects to which we are assigning DNS names are Services. Every
@@ -18,23 +18,14 @@ Of course, giving services a name is just half of the problem - DNS names need a
domain also. This implementation uses the variable `DNS_DOMAIN` (see below).
You can configure your docker daemon with the flag `--dns-search`.

## How do I run it?
The first thing you have to do is substitute the variables into the
configuration. You can then feed the result into `kubectl`.
## How do I configure it?
The following environment variables are used at cluster startup to create the SkyDNS pods and configure the kubelets. If you need to, you can reconfigure your provider as necessary (e.g. `cluster/gce/config-default.sh`):

```shell
DNS_SERVER_IP=10.0.0.10
DNS_DOMAIN=kubernetes.local
DNS_REPLICAS=2

sed -e "s/{DNS_DOMAIN}/$DNS_DOMAIN/g" \
-e "s/{DNS_REPLICAS}/$DNS_REPLICAS/g" \
./cluster/addons/dns/skydns-rc.yaml.in \
| ./cluster/kubectl.sh create -f -

sed -e "s/{DNS_SERVER_IP}/$DNS_SERVER_IP/g" \
./cluster/addons/dns/skydns-svc.yaml.in \
| ./cluster/kubectl.sh create -f -
ENABLE_CLUSTER_DNS=true
DNS_SERVER_IP="10.0.0.10"
DNS_DOMAIN="kubernetes.local"
DNS_REPLICAS=1
```

## How does it work?
6 changes: 3 additions & 3 deletions cluster/addons/dns/skydns-rc.yaml.in
@@ -5,7 +5,7 @@ namespace: default
labels:
k8s-app: skydns
desiredState:
replicas: {DNS_REPLICAS}
replicas: {{ pillar['dns_replicas'] }}
replicaSelector:
k8s-app: skydns
podTemplate:
@@ -28,15 +28,15 @@ desiredState:
image: kubernetes/kube2sky:1.0
command: [
# entrypoint = "/kube2sky",
"-domain={DNS_DOMAIN}",
"-domain={{ pillar['dns_domain'] }}",
]
- name: skydns
image: kubernetes/skydns:2014-12-23-001
command: [
# entrypoint = "/skydns",
"-machines=http://localhost:4001",
"-addr=0.0.0.0:53",
"-domain={DNS_DOMAIN}.",
"-domain={{ pillar['dns_domain'] }}.",
]
ports:
- name: dns
2 changes: 1 addition & 1 deletion cluster/addons/dns/skydns-svc.yaml.in
@@ -4,7 +4,7 @@ id: skydns
namespace: default
protocol: UDP
port: 53
portalIP: {DNS_SERVER_IP}
portalIP: {{ pillar['dns_server'] }}
containerPort: 53
labels:
k8s-app: skydns
2 changes: 1 addition & 1 deletion cluster/addons/fluentd-elasticsearch/es-controller.yaml.in
@@ -2,7 +2,7 @@ apiVersion: v1beta1
kind: ReplicationController
id: elasticsearch-logging-controller
desiredState:
replicas: {ELASTICSEARCH_LOGGING_REPLICAS}
replicas: {{ pillar['elasticsearch_replicas'] }}
replicaSelector:
name: elasticsearch-logging
podTemplate:
14 changes: 9 additions & 5 deletions cluster/aws/util.sh
@@ -106,14 +106,14 @@ function ensure-temp-dir {
fi
}

function setup-monitoring {
function setup-monitoring-firewall {
if [[ "${ENABLE_CLUSTER_MONITORING:-false}" == "true" ]]; then
# TODO: Implement this.
echo "Monitoring not currently supported on AWS"
fi
}

function teardown-monitoring {
function teardown-monitoring-firewall {
if [[ "${ENABLE_CLUSTER_MONITORING:-false}" == "true" ]]; then
# TODO: Implement this.
echo "Monitoring not currently supported on AWS"
@@ -296,10 +296,14 @@ function kube-up {
echo "readonly AWS_ZONE='${ZONE}'"
echo "readonly MASTER_HTPASSWD='${htpasswd}'"
echo "readonly PORTAL_NET='${PORTAL_NET}'"
echo "readonly ENABLE_CLUSTER_MONITORING='${ENABLE_CLUSTER_MONITORING:-false}'"
echo "readonly ENABLE_NODE_MONITORING='${ENABLE_NODE_MONITORING:-false}'"
echo "readonly ENABLE_CLUSTER_LOGGING='${ENABLE_CLUSTER_LOGGING:-false}'"
echo "readonly ENABLE_NODE_LOGGING='${ENABLE_NODE_LOGGING:-false}'"
echo "readonly LOGGING_DESTINATION='${LOGGING_DESTINATION:-}'"
echo "readonly ELASTICSEARCH_LOGGING_REPLICAS='${ELASTICSEARCH_LOGGING_REPLICAS:-}'"
echo "readonly ENABLE_CLUSTER_DNS='${ENABLE_CLUSTER_DNS:-false}'"
echo "readonly DNS_REPLICAS='${DNS_REPLICAS:-}'"
echo "readonly DNS_SERVER_IP='${DNS_SERVER_IP:-}'"
echo "readonly DNS_DOMAIN='${DNS_DOMAIN:-}'"
grep -v "^#" "${KUBE_ROOT}/cluster/aws/templates/create-dynamic-salt-files.sh"
@@ -498,10 +502,10 @@ function kube-down {
$AWS_CMD delete-vpc --vpc-id $vpc_id > $LOG
}

function setup-logging {
function setup-logging-firewall {
echo "TODO: setup logging"
}

function teardown-logging {
function teardown-logging-firewall {
echo "TODO: teardown logging"
}
}
8 changes: 4 additions & 4 deletions cluster/azure/util.sh
@@ -558,18 +558,18 @@ function restart-kube-proxy {
}

# Setup monitoring using heapster and InfluxDB
function setup-monitoring {
function setup-monitoring-firewall {
echo "not implemented" >/dev/null
}

function teardown-monitoring {
function teardown-monitoring-firewall {
echo "not implemented" >/dev/null
}

function setup-logging {
function setup-logging-firewall {
echo "TODO: setup logging"
}

function teardown-logging {
function teardown-logging-firewall {
echo "TODO: teardown logging"
}
4 changes: 4 additions & 0 deletions cluster/gce/templates/create-dynamic-salt-files.sh
@@ -22,10 +22,14 @@ mkdir -p /srv/salt-overlay/pillar
cat <<EOF >/srv/salt-overlay/pillar/cluster-params.sls
node_instance_prefix: '$(echo "$NODE_INSTANCE_PREFIX" | sed -e "s/'/''/g")'
portal_net: '$(echo "$PORTAL_NET" | sed -e "s/'/''/g")'
enable_cluster_monitoring: '$(echo "$ENABLE_CLUSTER_MONITORING" | sed -e "s/'/''/g")'
enable_node_monitoring: '$(echo "$ENABLE_NODE_MONITORING" | sed -e "s/'/''/g")'
enable_cluster_logging: '$(echo "$ENABLE_CLUSTER_LOGGING" | sed -e "s/'/''/g")'
enable_node_logging: '$(echo "$ENABLE_NODE_LOGGING" | sed -e "s/'/''/g")'
logging_destination: '$(echo "$LOGGING_DESTINATION" | sed -e "s/'/''/g")'
elasticsearch_replicas: '$(echo "$ELASTICSEARCH_LOGGING_REPLICAS" | sed -e "s/'/''/g")'
enable_cluster_dns: '$(echo "$ENABLE_CLUSTER_DNS" | sed -e "s/'/''/g")'
dns_replicas: '$(echo "$DNS_REPLICAS" | sed -e "s/'/''/g")'
dns_server: '$(echo "$DNS_SERVER_IP" | sed -e "s/'/''/g")'
dns_domain: '$(echo "$DNS_DOMAIN" | sed -e "s/'/''/g")'
EOF
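
Each pillar line in the heredoc above wraps the value in single quotes and doubles any embedded single quote, which is how YAML single-quoted scalars escape that character. A minimal sketch of that one step, using a hypothetical helper name `pillar_line` that does not appear in the commit:

```shell
# Hypothetical helper: emit one "key: 'value'" pillar line, doubling any
# single quote inside the value so the result stays a valid YAML scalar.
pillar_line() {
  local key="$1" value="$2"
  printf "%s: '%s'\n" "${key}" "$(printf '%s' "${value}" | sed -e "s/'/''/g")"
}
```

For example, `pillar_line dns_domain "kubernetes.local"` prints `dns_domain: 'kubernetes.local'`, and a value such as `it's` is emitted as `'it''s'`.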
136 changes: 52 additions & 84 deletions cluster/gce/util.sh
@@ -393,10 +393,14 @@ function kube-up {
echo "readonly SALT_TAR_URL='${SALT_TAR_URL}'"
echo "readonly MASTER_HTPASSWD='${htpasswd}'"
echo "readonly PORTAL_NET='${PORTAL_NET}'"
echo "readonly ENABLE_CLUSTER_MONITORING='${ENABLE_CLUSTER_MONITORING:-false}'"
echo "readonly ENABLE_NODE_MONITORING='${ENABLE_NODE_MONITORING:-false}'"
echo "readonly ENABLE_CLUSTER_LOGGING='${ENABLE_CLUSTER_LOGGING:-false}'"
echo "readonly ENABLE_NODE_LOGGING='${ENABLE_NODE_LOGGING:-false}'"
echo "readonly LOGGING_DESTINATION='${LOGGING_DESTINATION:-}'"
echo "readonly ELASTICSEARCH_LOGGING_REPLICAS='${ELASTICSEARCH_LOGGING_REPLICAS:-}'"
echo "readonly ENABLE_CLUSTER_DNS='${ENABLE_CLUSTER_DNS:-false}'"
echo "readonly DNS_REPLICAS='${DNS_REPLICAS:-}'"
echo "readonly DNS_SERVER_IP='${DNS_SERVER_IP:-}'"
echo "readonly DNS_DOMAIN='${DNS_DOMAIN:-}'"
grep -v "^#" "${KUBE_ROOT}/cluster/gce/templates/common.sh"
@@ -731,106 +735,70 @@ function restart-kube-proxy {
ssh-to-node "$1" "sudo /etc/init.d/kube-proxy restart"
}

# Setup monitoring using heapster and InfluxDB
function setup-monitoring {
if [[ "${ENABLE_CLUSTER_MONITORING}" == "true" ]]; then
echo "Setting up cluster monitoring using Heapster."

detect-project
if ! gcloud compute firewall-rules --project "${PROJECT}" describe monitoring-heapster &> /dev/null; then
if ! gcloud compute firewall-rules create monitoring-heapster \
--project "${PROJECT}" \
--target-tags="${MINION_TAG}" \
--network="${NETWORK}" \
--allow tcp:80 tcp:8083 tcp:8086; then
echo -e "${color_red}Failed to set up firewall for monitoring ${color_norm}" && false
fi
fi
# Setup monitoring firewalls using heapster and InfluxDB
function setup-monitoring-firewall {
if [[ "${ENABLE_CLUSTER_MONITORING}" != "true" ]]; then
return
fi

local kubectl="${KUBE_ROOT}/cluster/kubectl.sh"
local grafana_host=""
if "${kubectl}" create -f "${KUBE_ROOT}/cluster/addons/cluster-monitoring/" &> /dev/null; then
# wait for pods to be scheduled on a node.
echo "waiting for monitoring pods to be scheduled."
for i in `seq 1 10`; do
grafana_host=$("${kubectl}" get pods -l name=influxGrafana -o template -t {{range.items}}{{.currentState.hostIP}}:{{end}} | sed s/://g)
if [[ $grafana_host != *"<"* ]]; then
echo "Setting up firewalls to Heapster based cluster monitoring."

detect-project
gcloud compute firewall-rules create "${INSTANCE_PREFIX}-monitoring-heapster" --project "${PROJECT}" \
--allow tcp:80 tcp:8083 tcp:8086 --target-tags="${MINION_TAG}" --network="${NETWORK}"

local kubectl="${KUBE_ROOT}/cluster/kubectl.sh"
local grafana_host=""
echo "waiting for monitoring pods to be scheduled."
for i in `seq 1 10`; do
grafana_host=$("${kubectl}" get pods -l name=influxGrafana -o template -t {{range.items}}{{.currentState.hostIP}}:{{end}} | sed s/://g)
if [[ ${grafana_host} != *"<"* ]]; then
break
fi
sleep 10
done
if [[ $grafana_host != *"<"* ]]; then
echo
echo -e "${color_green}Grafana dashboard will be available at ${color_yellow}http://$grafana_host${color_green}. Wait for the monitoring dashboard to be online.${color_norm}"
echo
else
echo -e "${color_red}monitoring pods failed to be scheduled.${color_norm}"
fi
else
echo -e "${color_red}Failed to Setup Monitoring ${color_norm}"
teardown-monitoring
fi
sleep 10
done
if [[ ${grafana_host} != *"<"* ]]; then
echo
echo -e "${color_green}Grafana dashboard will be available at ${color_yellow}http://${grafana_host}${color_green}. Wait for the monitoring dashboard to be online.${color_norm}"
echo
else
echo -e "${color_red}Monitoring pods failed to be scheduled!${color_norm}"
fi
}

function teardown-monitoring {
if [[ "${ENABLE_CLUSTER_MONITORING}" == "true" ]]; then
detect-project

local kubectl="${KUBE_ROOT}/cluster/kubectl.sh"
local kubecfg="${KUBE_ROOT}/cluster/kubecfg.sh"
"${kubecfg}" resize monitoring-influxGrafanaController 0 &> /dev/null || true
"${kubecfg}" resize monitoring-heapsterController 0 &> /dev/null || true
"${kubectl}" delete -f "${KUBE_ROOT}/cluster/addons/cluster-monitoring/" &> /dev/null || true
if gcloud compute firewall-rules describe --project "${PROJECT}" monitoring-heapster &> /dev/null; then
gcloud compute firewall-rules delete \
--project "${PROJECT}" \
--quiet \
monitoring-heapster &> /dev/null || true
fi
function teardown-monitoring-firewall {
if [[ "${ENABLE_CLUSTER_MONITORING}" != "true" ]]; then
return
fi

detect-project
gcloud compute firewall-rules delete -q "${INSTANCE_PREFIX}-monitoring-heapster" --project "${PROJECT}" || true
}

function setup-logging {
function setup-logging-firewall {
# If logging with Fluentd to Elasticsearch is enabled then create pods
# and services for Elasticsearch (for ingesting logs) and Kibana (for
# viewing logs).
if [[ "${ENABLE_NODE_LOGGING-}" == "true" ]] && \
[[ "${LOGGING_DESTINATION-}" == "elasticsearch" ]] && \
[[ "${ENABLE_CLUSTER_LOGGING-}" == "true" ]]; then
local -r kubectl="${KUBE_ROOT}/cluster/kubectl.sh"
if sed -e "s/{ELASTICSEARCH_LOGGING_REPLICAS}/${ELASTICSEARCH_LOGGING_REPLICAS}/g" \
"${KUBE_ROOT}"/cluster/addons/fluentd-elasticsearch/es-controller.yaml.in | \
"${kubectl}" create -f - &> /dev/null && \
"${kubectl}" create -f "${KUBE_ROOT}"/cluster/addons/fluentd-elasticsearch/es-service.yaml &> /dev/null && \
"${kubectl}" create -f "${KUBE_ROOT}"/cluster/addons/fluentd-elasticsearch/kibana-controller.yaml &> /dev/null && \
"${kubectl}" create -f "${KUBE_ROOT}"/cluster/addons/fluentd-elasticsearch/kibana-service.yaml &> /dev/null; then
gcloud compute firewall-rules create fluentd-elasticsearch-logging --project "${PROJECT}" \
--allow tcp:5601 tcp:9200 tcp:9300 --target-tags "${INSTANCE_PREFIX}"-minion || true
local -r region="${ZONE::-2}"
local -r es_ip=$(gcloud compute forwarding-rules --project "${PROJECT}" describe --region "${region}" elasticsearch-logging | grep IPAddress | awk '{print $2}')
local -r kibana_ip=$(gcloud compute forwarding-rules --project "${PROJECT}" describe --region "${region}" kibana-logging | grep IPAddress | awk '{print $2}')
echo
echo -e "${color_green}Cluster logs are ingested into Elasticsearch running at ${color_yellow}http://${es_ip}:9200"
echo -e "${color_green}Kibana logging dashboard will be available at ${color_yellow}http://${kibana_ip}:5601${color_norm}"
echo
else
echo -e "${color_red}Failed to launch Elasticsearch and Kibana pods and services for logging.${color_norm}"
fi
if [[ "${ENABLE_NODE_LOGGING-}" != "true" ]] || \
[[ "${LOGGING_DESTINATION-}" != "elasticsearch" ]] || \
[[ "${ENABLE_CLUSTER_LOGGING-}" != "true" ]]; then
return
fi

detect-project
gcloud compute firewall-rules create "${INSTANCE_PREFIX}-fluentd-elasticsearch-logging" --project "${PROJECT}" \
--allow tcp:5601 tcp:9200 tcp:9300 --target-tags "${MINION_TAG}" --network="${NETWORK}"
}

function teardown-logging {
if [[ "${ENABLE_NODE_LOGGING-}" == "true" ]] && \
[[ "${LOGGING_DESTINATION-}" == "elasticsearch" ]] && \
[[ "${ENABLE_CLUSTER_LOGGING-}" == "true" ]]; then
local -r kubectl="${KUBE_ROOT}/cluster/kubectl.sh"
"${kubectl}" delete replicationController elasticsearch-logging-controller &> /dev/null || true
"${kubectl}" delete service elasticsearch-logging &> /dev/null || true
"${kubectl}" delete replicationController kibana-logging-controller &> /dev/null || true
"${kubectl}" delete service kibana-logging &> /dev/null || true
gcloud compute firewall-rules delete -q fluentd-elasticsearch-logging --project "${PROJECT}" || true
function teardown-logging-firewall {
if [[ "${ENABLE_NODE_LOGGING-}" != "true" ]] || \
[[ "${LOGGING_DESTINATION-}" != "elasticsearch" ]] || \
[[ "${ENABLE_CLUSTER_LOGGING-}" != "true" ]]; then
return
fi

detect-project
gcloud compute firewall-rules delete -q "${INSTANCE_PREFIX}-fluentd-elasticsearch-logging" --project "${PROJECT}" || true
}

# Perform preparations required to run e2e tests
12 changes: 6 additions & 6 deletions cluster/gke/util.sh
@@ -115,8 +115,8 @@ function kube-up() {
}

# Called during cluster/kube-up.sh
function setup-monitoring() {
echo "... in setup-monitoring()" >&2
function setup-monitoring-firewall() {
echo "... in setup-monitoring-firewall()" >&2
# TODO(mbforbes): This isn't currently supported in GKE.
}

@@ -239,8 +239,8 @@ function test-teardown() {
}

# Tears down monitoring.
function teardown-monitoring() {
echo "... in teardown-monitoring()" >&2
function teardown-monitoring-firewall() {
echo "... in teardown-monitoring-firewall()" >&2
# TODO(mbforbes): This isn't currently supported in GKE.
}

@@ -257,10 +257,10 @@ function kube-down() {
--zone="${ZONE}" "${CLUSTER_NAME}"
}

function setup-logging {
function setup-logging-firewall {
echo "TODO: setup logging"
}

function teardown-logging {
function teardown-logging-firewall {
echo "TODO: teardown logging"
}
4 changes: 2 additions & 2 deletions cluster/kube-down.sh
@@ -27,8 +27,8 @@ source "${KUBE_ROOT}/cluster/${KUBERNETES_PROVIDER}/util.sh"
echo "Bringing down cluster using provider: $KUBERNETES_PROVIDER"

verify-prereqs
teardown-monitoring
teardown-logging
teardown-monitoring-firewall
teardown-logging-firewall

kube-down


2 comments on commit a305269

@satnam6502
Contributor


Previously, the cluster-level logging setup reported the URL for the Kibana dashboard (just as the monitoring setup reports the Grafana dashboard URL). Is this no longer reported?

@zmerlynn
Member Author


Looks like I forgot to port the echo, sorry.
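
For reference, a sketch of what porting the echo might look like, reusing the forwarding-rule lookup from the removed `setup-logging` body. The names `extract_ip` and `report_logging_urls` are hypothetical, the color codes are left out, and `${ZONE%-*}` stands in for the original `${ZONE::-2}`. It also assumes that `PROJECT` and `ZONE` are set and that both forwarding rules already exist when it runs, which may not hold now that the services are created on the master.

```shell
# Hypothetical helper: read `gcloud ... describe` output on stdin and print
# the address, using the same grep/awk parsing as the removed code.
extract_ip() {
  grep IPAddress | awk '{print $2}'
}

# Hypothetical function: print the Elasticsearch and Kibana endpoints.
report_logging_urls() {
  local region es_ip kibana_ip
  # Strip the trailing zone letter, e.g. us-central1-b becomes us-central1.
  region="${ZONE%-*}"
  # The gcloud invocations are copied from the removed lines of this diff.
  es_ip=$(gcloud compute forwarding-rules --project "${PROJECT}" describe --region "${region}" elasticsearch-logging | extract_ip)
  kibana_ip=$(gcloud compute forwarding-rules --project "${PROJECT}" describe --region "${region}" kibana-logging | extract_ip)
  echo "Cluster logs are ingested into Elasticsearch running at http://${es_ip}:9200"
  echo "Kibana logging dashboard will be available at http://${kibana_ip}:5601"
}
```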
