src/managing-workflow/platform-monitoring.md
14 additions & 9 deletions
@@ -1,6 +1,7 @@
 # Platform Monitoring
 
 ## Description
+
 With the release of Workflow Beta4 we now include a monitoring stack for introspection on a running Kubernetes cluster. The stack includes 4 components:
 
 *[Telegraf](https://docs.influxdata.com/telegraf/v0.12/) - Metrics collection daemon written by team behind InfluxDB.
@@ -36,21 +37,22 @@ With the release of Workflow Beta4 we now include a monitoring stack for introsp
 
 ### Grafana
 
-We expose Grafana through the router using [service annotations](https://github.com/deis/router#how-it-works). This
-allows users to access the Grafana UI by accessing `grafana.mydomain.com`. While we provide a default username/password
-of `admin/admin` this can be overridden at any time by setting the following environment variables in
+Deis Workflow exposes Grafana through the router using [service annotations](https://github.com/deis/router#how-it-works). This
+allows users to access the Grafana UI at `http://grafana.mydomain.com`. The default username/password of
+`admin/admin` can be overridden at any time by setting the following environment variables in
 `$CHART_HOME/workspace/workflow-$WORKFLOW_RELEASE/manifests/deis-monitor-grafana-rc.yaml`: `GRAFANA_USER` and
 `GRAFANA_PASSWD`.
 
-It will preload several dashboards that we've created to help operators get started with monitoring their Kubernetes and
-Workflow installations. Each dashboard is meant to be a starting place for the operator and is not representative of all
-the dashboards needed to monitor a production installation.
+Grafana will preload several dashboards to help operators get started with monitoring Kubernetes and Deis Workflow.
+These dashboards are meant as starting points and don't include every item that might be desirable to monitor in a
+production installation.
 
-We are currently not writing the data to the host file system or to longterm storage. Therefore, if the Grafana
-instance dies you will lose all custom and modified dashboards. It is recommended that you export your dashboards and
-store them in version control until a solution is implemented for long term storage.
+Deis Workflow monitoring does not currently write data to the host filesystem or to long-term storage. If the Grafana
+instance fails, modified dashboards are lost. Until there is a solution to persist this, export dashboards and store
+them separately in version control.
 
 ### InfluxDB
+
 As of the Beta4 release InfluxDB is writing data to the host disk, however, if the InfluxDB pod dies and comes back on
 another host the data will not be recovered. We intend to fix this in a future release. The InfluxDB Admin UI is also
 exposed through the router allowing users to access the query engine by going to `influx.mydomain.com`. You will need to
@@ -65,6 +67,7 @@ You can choose to not expose the Influx UI and API to the world by updating
 following line - `router.deis.io/routable: "true"`.
 
 ### Telegraf
+
 Telegraf is the metrics collection daemon used within the monitoring stack. It will collect and send the following metrics to InfluxDB:
 
 * System level metrics such as CPU, Load Average, Memory, Disk, and Network stats
@@ -74,9 +77,11 @@ Telegraf is the metrics collection daemon used within the monitoring stack. It w
 It is possible to send these metrics to other endpoints besides InfluxDB. For more information please consult the following [file](https://github.com/deis/monitor/blob/master/telegraf/rootfs/config.toml.tpl)
 
 ### Stdout-Metrics
+
 Stdout-Metrics is a custom tool built by the Deis team to provide metrics that are reported via standard out - like Nginx. It consumes the log stream from FluentD filtering out messages that are not from the [Deis Router](https://github.com/deis/router). Once it finds a message it can parse it will turn that into a metric and send it directly to InfluxDB.
 
 ### Customizing
+
 Each of these components allows for customization via environment variables. If you would like to learn more please visit the following github repositories:
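
Note on the Stdout-Metrics paragraph in this diff: the filter, parse, and emit flow it describes can be sketched roughly as below. This is only an illustration, not the actual Stdout-Metrics implementation. The record keys (`source`, `message`), the source name `deis-router`, the log layout, and the measurement name `router_request` are all hypothetical, since the document does not specify them. The output string follows InfluxDB line protocol (measurement, tags, then fields), and nothing is sent over the network.

```python
import re

# Hypothetical router log layout: "<status> <request_time_seconds> <host>".
# The real Deis Router log format is not given in this document.
ROUTER_LINE = re.compile(
    r"^(?P<status>\d{3}) (?P<request_time>\d+\.\d+) (?P<host>\S+)$"
)


def to_line_protocol(record):
    """Return an InfluxDB line-protocol string, or None if the record is skipped."""
    # Drop anything that did not come from the router ("source" key is assumed).
    if record.get("source") != "deis-router":
        return None
    match = ROUTER_LINE.match(record.get("message", ""))
    if match is None:
        # Router messages that cannot be parsed are ignored rather than raising.
        return None
    # measurement,tag_set field_set (no timestamp, so the server would assign one)
    return (
        "router_request,host={host},status={status} "
        "request_time={request_time}"
    ).format(**match.groupdict())


records = [
    {"source": "deis-router", "message": "200 0.012 app.mydomain.com"},
    {"source": "deis-builder", "message": "200 0.500 ignored.mydomain.com"},
    {"source": "deis-router", "message": "not a request line"},
]
lines = [line for line in map(to_line_protocol, records) if line is not None]
```

Only the first record survives here: the second is not from the router and the third does not match the assumed layout. Returning `None` for both cases keeps the filter and the parser in one place, which mirrors the "filter out, then parse what it can" wording of the paragraph.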