|
4 | 4 | .. _platform_monitoring: |
5 | 5 |
|
6 | 6 | Platform monitoring |
7 | | -========================= |
| 7 | +=================== |
8 | 8 |
|
9 | | -Comprehensive platform monitoring is a goal for Deis 1.0. We are currently investigating solutions |
10 | | -for this, and progress can be tracked in GitHub issue `#981`_. |
| 9 | +While Deis itself doesn't have a built-in monitoring platform, Deis components and deployed |
| 10 | +applications alike run entirely within Docker containers. This means that monitoring tools and |
| 11 | +services which support Docker containers should work with Deis. A few tools and monitoring services |
| 12 | +which support Docker integrations are detailed below. |
11 | 13 |
|
12 | | -A requirement for this monitoring system would be to: |
| 14 | +Tools |
| 15 | +----- |
13 | 16 |
|
14 | | -* Track system metrics (CPU, memory, etc.) |
15 | | -* Report Deis component failure |
16 | | -* Track deployed application container states and metrics |
| 17 | +cAdvisor |
| 18 | +~~~~~~~~ |
17 | 19 |
|
18 | | -Platform monitoring is an ongoing discussion, and feedback is much appreciated in issue `#981`_. |
| 20 | +Google's Container Advisor (`cadvisor`_) runs inside a Docker container and shows memory and CPU |
| 21 | +usage for all containers running on the host. To run cAdvisor: |
19 | 22 |
|
20 | | -.. _`#981`: https://github.com/deis/deis/issues/981 |
| 23 | +.. code-block:: console |
| 24 | +
|
| 25 | + sudo docker run \ |
| 26 | + --volume=/:/rootfs:ro \ |
| 27 | + --volume=/var/run:/var/run:rw \ |
| 28 | + --volume=/sys:/sys:ro \ |
| 29 | + --volume=/var/lib/docker/:/var/lib/docker:ro \ |
| 30 | + --publish=8080:8080 \ |
| 31 | + --detach=true \ |
| 32 | + --name=cadvisor \ |
| 33 | + google/cadvisor:latest |
| 34 | +
|
| 35 | +To run cAdvisor on all hosts in the cluster, you can submit and start a fleet service: |
| 36 | + |
| 37 | +.. code-block:: ini |
| 38 | +
|
| 39 | + [Unit] |
| 40 | + Description=Google Container Advisor |
| 41 | + Requires=docker.socket |
| 42 | + After=docker.socket |
| 43 | +
|
| 44 | + [Service] |
| 45 | + ExecStartPre=/bin/sh -c "docker history google/cadvisor:latest >/dev/null || docker pull google/cadvisor:latest" |
| 46 | + ExecStart=/usr/bin/docker run --volume=/:/rootfs:ro --volume=/var/run:/var/run:rw --volume=/sys:/sys:ro --volume=/var/lib/docker/:/var/lib/docker:ro --publish=8080:8080 --name=cadvisor google/cadvisor:latest |
| 47 | +
|
| 48 | + [Install] |
| 49 | + WantedBy=multi-user.target |
| 50 | +
|
| 51 | + [X-Fleet] |
| 52 | + Global=true |
| 53 | +
|
| 54 | +Save the file as ``cadvisor.service``. Load and start the service with |
| 55 | +``fleetctl load cadvisor.service && fleetctl start cadvisor.service``. |
| 56 | + |
| 57 | +The web interface will be accessible at port 8080 on each host. |
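
One way to confirm this from a workstation is to request the web interface on each host.
The following is a minimal sketch, not part of Deis: the function names ``cadvisor_url`` and
``check_cadvisor`` and the example addresses are hypothetical, and it assumes ``curl`` is
installed and that port 8080 on each host is reachable from where the script runs.

.. code-block:: console

    # Build the cAdvisor web interface URL for one host (port 8080, as published above).
    cadvisor_url() {
        printf 'http://%s:8080/\n' "$1"
    }

    # Request the web interface on every host given as an argument and report the result.
    check_cadvisor() {
        for host in "$@"; do
            if curl --silent --fail --location --max-time 5 \
                    --output /dev/null "$(cadvisor_url "$host")"; then
                echo "$host: ok"
            else
                echo "$host: unreachable"
            fi
        done
    }

    # Example with placeholder addresses; substitute the IPs of your own hosts:
    # check_cadvisor 172.17.8.100 172.17.8.101
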
| 58 | + |
| 59 | +Running a cAdvisor instance on each CoreOS host gives a per-host view only. For a cluster-wide |
| 60 | +view, see `heapster`_ from the Google Cloud Platform team, which appears to be a cluster-aware cAdvisor. |
| 61 | + |
| 62 | +Monitoring services |
| 63 | +------------------- |
| 64 | + |
| 65 | +These are a few monitoring services which are known to provide Docker integrations. |
| 66 | +Additions to this reference guide are much appreciated! |
| 67 | + |
| 68 | +Datadog |
| 69 | +~~~~~~~ |
| 70 | + |
| 71 | +The `Datadog`_ cloud monitoring service offers a monitoring agent which runs on the host and reports |
| 72 | +metrics for all Docker containers (functionally similar to cAdvisor's implementation). |
| 73 | +See `this blog post`_ for details. The `Datadog agent`_ for Docker can be run on a single host as |
| 74 | +follows: |
| 75 | + |
| 76 | +.. code-block:: console |
| 77 | +
|
| 78 | + docker run -d --privileged --name dd-agent -h `hostname` -v /var/run/docker.sock:/var/run/docker.sock -v /proc/mounts:/host/proc/mounts:ro -v /sys/fs/cgroup/:/host/sys/fs/cgroup:ro -e API_KEY=YOUR_REAL_API_KEY datadog/docker-dd-agent |
| 79 | +
|
| 80 | +Be sure to replace ``YOUR_REAL_API_KEY`` with your Datadog API key. |
| 81 | + |
| 82 | +To run the Datadog agent on every host in the cluster, you can submit and start a fleet service (again, replace ``YOUR_REAL_API_KEY`` with your API key): |
| 83 | + |
| 84 | +.. code-block:: ini |
| 85 | +
|
| 86 | + [Unit] |
| 87 | + Description=Datadog |
| 88 | + Requires=docker.socket |
| 89 | + After=docker.socket |
| 90 | +
|
| 91 | + [Service] |
| 92 | + ExecStartPre=/bin/sh -c "docker history datadog/docker-dd-agent:latest >/dev/null || docker pull datadog/docker-dd-agent:latest" |
| 93 | + ExecStart=/usr/bin/docker run --privileged --name dd-agent -h %H -v /var/run/docker.sock:/var/run/docker.sock -v /proc/mounts:/host/proc/mounts:ro -v /sys/fs/cgroup/:/host/sys/fs/cgroup:ro -e API_KEY=YOUR_REAL_API_KEY datadog/docker-dd-agent |
| 94 | +
|
| 95 | + [Install] |
| 96 | + WantedBy=multi-user.target |
| 97 | +
|
| 98 | + [X-Fleet] |
| 99 | + Global=true |
| 100 | +
|
| 101 | +Save the file as ``datadog.service``. Load and start the service with |
| 102 | +``fleetctl load datadog.service && fleetctl start datadog.service``. |
| 103 | + |
| 104 | +Shortly thereafter, you should start to see metrics from your Deis cluster appear in your Datadog dashboard. |
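
If metrics do not appear, it may help to check that the global unit is actually running on
every machine. The sketch below is an illustration, not part of Deis: the helper name
``units_not_running`` is hypothetical, and it assumes the default ``fleetctl list-units``
column layout, in which the unit name is the first column and the sub-state is the last.

.. code-block:: console

    # Read `fleetctl list-units` output on stdin and print the rows for the named
    # unit whose last column (assumed to be the sub-state) is not "running".
    units_not_running() {
        awk -v unit="$1" '$1 == unit && $NF != "running"'
    }

    # Example:
    # fleetctl list-units | units_not_running datadog.service

No output means every instance of the unit reported ``running``. The same helper works for
``cadvisor.service``.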
| 105 | + |
| 106 | +New Relic |
| 107 | +~~~~~~~~~ |
| 108 | + |
| 109 | +The `New Relic`_ monitoring service's agent will run on the CoreOS host and report metrics to New Relic. |
| 110 | + |
| 111 | +Unlike Datadog, however, the agent running on the host doesn't send metrics for individual containers |
| 112 | +unless those containers have been built with a Dockerfile that installs their own instance of the agent. |
| 113 | + |
| 114 | +The Deis community's own Johannes Würbach has developed a fleet service for New Relic in his |
| 115 | +`newrelic-sysmond`_ repository. |
| 116 | + |
| 117 | +.. _`cadvisor`: https://github.com/google/cadvisor |
| 118 | +.. _`Datadog`: https://www.datadoghq.com |
| 119 | +.. _`Datadog agent`: https://github.com/DataDog/docker-dd-agent |
| 120 | +.. _`heapster`: https://github.com/GoogleCloudPlatform/heapster/blob/master/clusters/coreos/README.md |
| 121 | +.. _`this blog post`: https://www.datadoghq.com/2014/06/monitor-docker-datadog/ |
| 122 | +.. _`New Relic`: http://newrelic.com/ |
| 123 | +.. _`newrelic-sysmond`: https://github.com/johanneswuerbach/newrelic-sysmond-service |