Commit c0228e5

docs(managing_deis): flesh out platform monitoring docs
1 parent fe28cde commit c0228e5

1 file changed

Lines changed: 112 additions & 9 deletions

docs/managing_deis/platform_monitoring.rst

@@ -4,17 +4,120 @@
44
.. _platform_monitoring:
55

66
Platform monitoring
7-
=========================
7+
===================
88

9-
Comprehensive platform monitoring is a goal for Deis 1.0. We are currently investigating solutions
10-
for this, and progress can be tracked in GitHub issue `#981`_.
9+
While Deis itself doesn't have a built-in monitoring platform, Deis components and deployed
10+
applications alike run entirely within Docker containers. This means that monitoring tools and
11+
services which support Docker containers should work with Deis. A few tools and monitoring services
12+
which support Docker integrations are detailed below.
1113

12-
A requirement for this monitoring system would be to:
14+
Tools
15+
-----
1316

14-
* Track system metrics (CPU, memory, etc.)
15-
* Report Deis component failure
16-
* Track deployed application container states and metrics
17+
cAdvisor
18+
~~~~~~~~
1719

18-
Platform monitoring is an ongoing discussion, and feedback is much appreciated in issue `#981`_.
20+
Google's Container Advisor (`cadvisor`_) runs inside a Docker container and shows memory and CPU
21+
usage for all containers running on the host. To run cAdvisor:
1922

20-
.. _`#981`: https://github.com/deis/deis/issues/981
23+
.. code-block:: console
24+
25+
sudo docker run \
26+
--volume=/:/rootfs:ro \
27+
--volume=/var/run:/var/run:rw \
28+
--volume=/sys:/sys:ro \
29+
--volume=/var/lib/docker/:/var/lib/docker:ro \
30+
--publish=8080:8080 \
31+
--detach=true \
32+
--name=cadvisor \
33+
google/cadvisor:latest
34+
35+
To run cAdvisor on all hosts in the cluster, you can submit and start a fleet service:
36+
37+
.. code-block:: ini
38+
39+
[Unit]
40+
Description=Google Container Advisor
41+
Requires=docker.socket
42+
After=docker.socket
43+
44+
[Service]
45+
ExecStartPre=/bin/sh -c "docker history google/cadvisor:latest >/dev/null || docker pull google/cadvisor:latest"
46+
ExecStart=/usr/bin/docker run --volume=/:/rootfs:ro --volume=/var/run:/var/run:rw --volume=/sys:/sys:ro --volume=/var/lib/docker/:/var/lib/docker:ro --publish=8080:8080 --name=cadvisor google/cadvisor:latest
47+
48+
[Install]
49+
WantedBy=multi-user.target
50+
51+
[X-Fleet]
52+
Global=true
53+
54+
Save the file as ``cadvisor.service``. Load and start the service with
55+
``fleetctl load cadvisor.service && fleetctl start cadvisor.service``.
56+
57+
The web interface will be accessible at port 8080 on each host.
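A quick way to confirm this is to probe port 8080 on each host. The following is a minimal sketch, not part of Deis: the function names ``cadvisor_url`` and ``check_cadvisor`` are hypothetical, and the host addresses are assumed to be supplied by you (for example, from the output of ``fleetctl list-machines``).

```shell
# Sketch only: helper names are hypothetical, host IPs are supplied by the caller.

# Build the cAdvisor web interface URL for one host (port 8080, as published above).
cadvisor_url() {
    printf 'http://%s:8080/\n' "$1"
}

# Probe each host given as an argument and report whether cAdvisor answers.
check_cadvisor() {
    for host in "$@"; do
        # --fail makes curl exit non-zero on HTTP errors; --location follows redirects.
        if curl --silent --fail --location --max-time 5 --output /dev/null \
                "$(cadvisor_url "$host")"; then
            echo "$host: cAdvisor reachable"
        else
            echo "$host: cAdvisor NOT reachable"
        fi
    done
}

# Usage (example addresses, substitute your own):
#   check_cadvisor 172.17.8.100 172.17.8.101 172.17.8.102
```
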
58+
59+
In addition to starting a cAdvisor instance on each CoreOS host, there's also a project called
60+
`heapster`_ from the Google Cloud Platform team, which aggregates cAdvisor data across a cluster.
61+
62+
Monitoring services
63+
-------------------
64+
65+
These are a few monitoring services which are known to provide Docker integrations.
66+
Additions to this reference guide are much appreciated!
67+
68+
Datadog
69+
~~~~~~~
70+
71+
The `Datadog`_ cloud monitoring service provides a monitoring agent which runs on the host and reports
72+
metrics for all Docker containers (functionally similar to cAdvisor's implementation).
73+
See `this blog post`_ for details. The `Datadog agent`_ for Docker can be run on a single host as
74+
follows:
75+
76+
.. code-block:: console
77+
78+
docker run -d --privileged --name dd-agent -h `hostname` -v /var/run/docker.sock:/var/run/docker.sock -v /proc/mounts:/host/proc/mounts:ro -v /sys/fs/cgroup/:/host/sys/fs/cgroup:ro -e API_KEY=YOUR_REAL_API_KEY datadog/docker-dd-agent
79+
80+
Be sure to replace ``YOUR_REAL_API_KEY`` with your Datadog API key.
81+
82+
To run Datadog for the entire cluster, you can submit and start a fleet service (again, replace ``YOUR_REAL_API_KEY`` with your API key):
83+
84+
.. code-block:: ini
85+
86+
[Unit]
87+
Description=Datadog
88+
Requires=docker.socket
89+
After=docker.socket
90+
91+
[Service]
92+
ExecStartPre=/bin/sh -c "docker history datadog/docker-dd-agent:latest >/dev/null || docker pull datadog/docker-dd-agent:latest"
93+
ExecStart=/usr/bin/docker run --privileged --name dd-agent -h %H -v /var/run/docker.sock:/var/run/docker.sock -v /proc/mounts:/host/proc/mounts:ro -v /sys/fs/cgroup/:/host/sys/fs/cgroup:ro -e API_KEY=YOUR_REAL_API_KEY datadog/docker-dd-agent
94+
95+
[Install]
96+
WantedBy=multi-user.target
97+
98+
[X-Fleet]
99+
Global=true
100+
101+
Save the file as ``datadog.service``. Load and start the service with
102+
``fleetctl load datadog.service && fleetctl start datadog.service``.
103+
104+
Shortly thereafter, you should start to see metrics from your Deis cluster appear in your Datadog dashboard.
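If you would rather not hand-edit the unit file, the API key placeholder can be filled in with ``sed`` before loading the service. This is a minimal sketch under stated assumptions: the ``render_unit`` helper, the ``datadog.service.template`` file name, and the ``DD_API_KEY`` variable are all hypothetical, and the key is assumed to contain no ``|``, ``&``, or backslash characters.

```shell
# Sketch only: render_unit, the template name, and DD_API_KEY are hypothetical.

# Replace every YOUR_REAL_API_KEY placeholder in a template and write the result.
#   $1 = template path, $2 = API key, $3 = output path
render_unit() {
    # '|' is used as the sed delimiter; the key must not contain '|', '&', or '\'.
    sed "s|YOUR_REAL_API_KEY|$2|g" "$1" > "$3"
}

# Usage (assumes the unit above was saved as datadog.service.template):
#   render_unit datadog.service.template "$DD_API_KEY" datadog.service
#   fleetctl load datadog.service && fleetctl start datadog.service
```

Keeping the template free of the real key also makes it safer to commit alongside other cluster configuration.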
105+
106+
New Relic
107+
~~~~~~~~~
108+
109+
The `New Relic`_ monitoring service's agent will run on the CoreOS host and report metrics to New Relic.
110+
111+
Unlike Datadog, however, the agent running on the host doesn't send metrics for individual containers
112+
unless those containers have been built with a Dockerfile that installs their own instance of the agent.
113+
114+
The Deis community's own Johannes Würbach has developed a fleet service for New Relic in his
115+
`newrelic-sysmond`_ repository.
116+
117+
.. _`cadvisor`: https://github.com/google/cadvisor
118+
.. _`Datadog`: https://www.datadoghq.com
119+
.. _`Datadog agent`: https://github.com/DataDog/docker-dd-agent
120+
.. _`heapster`: https://github.com/GoogleCloudPlatform/heapster/blob/master/clusters/coreos/README.md
121+
.. _`this blog post`: https://www.datadoghq.com/2014/06/monitor-docker-datadog/
122+
.. _`New Relic`: http://newrelic.com/
123+
.. _`newrelic-sysmond`: https://github.com/johanneswuerbach/newrelic-sysmond-service
