This is a series. You can find part 1 here and part 2 here.
In this last post I am going to cover monitoring OpenSearch and FluentD with Prometheus and Grafana.
FluentD
If you recall from part 1, we set up a specific Prometheus configuration in FluentD:
main-fluentd-conf.yaml
kind: ConfigMap
apiVersion: v1
metadata:
  name: fluentd-es-config
  namespace: logging
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
data:
  fluent.conf: |-
    <source>
      @type forward
      bind 0.0.0.0
      port 32000
    </source>
    [...]
    @include /fluentd/etc/prometheus.conf
    [...]
Here, we’re adding a Prometheus configuration that will allow us to monitor some metrics coming from FluentD.
prometheus-conf.yaml
kind: ConfigMap
apiVersion: v1
metadata:
  name: fluentd-prometheus-config
  namespace: logging
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
data:
  prometheus.conf: |-
    <source>
      @type prometheus
      bind "#{ENV['FLUENTD_PROMETHEUS_BIND'] || '0.0.0.0'}"
      port "#{ENV['FLUENTD_PROMETHEUS_PORT'] || '24231'}"
      metrics_path "#{ENV['FLUENTD_PROMETHEUS_PATH'] || '/metrics'}"
    </source>
    <source>
      @type prometheus_output_monitor
      interval 10
    </source>
    <filter kube.**>
      @type prometheus
      <metric>
        name fluentd_input_status_num_records_total
        type counter
        desc The total number of incoming records
        <labels>
          tenant_id ${tenant_id}
        </labels>
      </metric>
    </filter>
In addition to FluentD's basic metrics, this configuration exposes a counter telling us how many records, grouped by tenant_id, we are receiving from FluentBit.
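As a rough idea of what this looks like on FluentD's /metrics endpoint (the tenant names and values below are made up for illustration, and your output may carry additional default labels):

# HELP fluentd_input_status_num_records_total The total number of incoming records
# TYPE fluentd_input_status_num_records_total counter
fluentd_input_status_num_records_total{tenant_id="team-a"} 12345.0
fluentd_input_status_num_records_total{tenant_id="team-b"} 678.0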
Then, we define the ServiceMonitor that will allow Prometheus to scrape metrics from FluentD:
fluentd-servicemonitor.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: fluentd
  namespace: monitoring
  labels:
    k8s-app: fluentd-logging
spec:
  selector:
    matchLabels:
      k8s-app: fluentd-logging
  namespaceSelector:
    matchNames:
      - logging
  endpoints:
    - port: metrics
      path: /metrics
Be sure that the FluentD Service carries the label k8s-app: fluentd-logging, otherwise the ServiceMonitor selector will not match it.
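For reference, here is a minimal sketch of such a Service; the name and selector are assumptions, and the port matches the Prometheus source defined in prometheus.conf above.

fluentd-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: fluentd
  namespace: logging
  labels:
    k8s-app: fluentd-logging    # the label the ServiceMonitor selector matches
spec:
  selector:
    k8s-app: fluentd-logging    # assumes the FluentD pods carry the same label
  ports:
    - name: metrics             # must match the port name in the ServiceMonitor endpoint
      port: 24231               # Prometheus exporter port from prometheus.conf
      targetPort: 24231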
Finally, we can import this dashboard into our Grafana instance.
Once we have this dashboard, we can customize it by adding a panel for the metric we exposed with the tenant_id label; this panel queries Prometheus with sum(rate(fluentd_input_status_num_records_total[1m])) by (tenant_id).
OpenSearch
In order to monitor OpenSearch with Prometheus and Grafana, we need to install a plugin on top of the OpenSearch Docker base image:
FROM opensearchproject/opensearch:1.2.0
RUN /usr/share/opensearch/bin/opensearch-plugin install -b https://github.com/aparo/opensearch-prometheus-exporter/releases/download/1.2.0/prometheus-exporter-1.2.0.zip
and then use this image for the OpenSearch StatefulSet.
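Assuming the custom image has been pushed to a registry we control (the registry and image name below are placeholders), only the image field in the StatefulSet's pod template needs to change:

# fragment of the OpenSearch StatefulSet pod template
spec:
  template:
    spec:
      containers:
        - name: opensearch
          # placeholder: point this at wherever you pushed the custom image
          image: my-registry.example.com/opensearch-prometheus:1.2.0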
Once the new pods are up and running, we can define the ServiceMonitor that will scrape the metrics from OpenSearch.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: opensearch
  namespace: monitoring
  labels:
    app: opensearch
spec:
  endpoints:
    - basicAuth:
        password:
          name: opensearch-admin-credentials
          key: password
        username:
          name: opensearch-admin-credentials
          key: user
      port: http
      interval: 5s
      scheme: https
      tlsConfig:
        insecureSkipVerify: true
      path: /_prometheus/metrics
  selector:
    matchLabels:
      app.kubernetes.io/name: opensearch
      prometheus.io/scrape: "true"
  namespaceSelector:
    matchNames:
      - "logging"
---
apiVersion: v1
kind: Secret
metadata:
  name: opensearch-admin-credentials
  namespace: monitoring
type: Opaque
data:
  password: <omitted>
  user: <omitted>
As you can see, this ServiceMonitor is a bit different from the previous one, as we need to define:
- basic auth credentials: for this demo I am using the admin credentials used by OpenSearch Dashboards, stored in the Secret above (see the note on the Secret after this list). Be sure to use a dedicated user for Prometheus (with the right privileges).
- scheme: inside a Kubernetes cluster we usually scrape metrics over http; in this case OpenSearch exposes https, hence the scheme.
- tlsConfig: we need to skip TLS certificate verification since the certificates are self-signed.
- path: as stated in the plugin documentation, the metrics path is /_prometheus/metrics.
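A quick note on the Secret referenced by basicAuth: the values under data must be base64-encoded. As a minimal sketch with placeholder credentials, stringData can be used instead to let Kubernetes handle the encoding:

apiVersion: v1
kind: Secret
metadata:
  name: opensearch-admin-credentials
  namespace: monitoring
type: Opaque
stringData:
  # placeholder values: replace with the credentials of a dedicated Prometheus user
  user: prometheus
  password: changeme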
Then we can use this dashboard to display those metrics in Grafana.
Conclusion
With this post we have finished the setup and monitoring of our logging infrastructure. By leveraging OpenSearch, FluentBit, and FluentD, we created an alternative (to Elasticsearch) multi-tenant logging stack that allows dev teams to search and filter their logs. Finally, we also saw how to monitor this infrastructure with Prometheus and display its metrics with Grafana.