This is a series. You can find part 1 here and part 3 here.
Part 1 left us with a fully functional log pipeline. As promised, in this part we're going to look at how to achieve a single shared index in OpenSearch.
Shared Index
If you recall, in the previous part we set a field on each document we indexed in OpenSearch by leveraging the FluentD record_transformer plugin.
main-fluentd-conf.yaml
kind: ConfigMap
apiVersion: v1
metadata:
  name: fluentd-es-config
  namespace: logging
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
data:
  fluent.conf: |-
    <source>
      @type forward
      bind 0.0.0.0
      port 32000
    </source>
    <filter kube.**>
      @type record_transformer
      remove_keys $.kubernetes.annotations, $.kubernetes.labels, $.kubernetes.pod_id, $.kubernetes.docker_id, logtag
    </filter>
    <filter kube.tenant-1.**>
      @type record_transformer
      <record>
        tenant_id "tenant-1"
      </record>
    </filter>
    <filter kube.tenant-2.**>
      @type record_transformer
      <record>
        tenant_id "tenant-2"
      </record>
    </filter>
    @include /fluentd/etc/prometheus.conf
    @include /fluentd/etc/tenant-1.conf
    @include /fluentd/etc/tenant-2.conf
    [...]
Basically, here we are adding the field “tenant_id” to each log coming from the different tenants (Fluent Bit instances). This field serves two purposes: it restricts what each tenant can access in OpenSearch’s “Discover” section, and it enables custom routing, so that each search request for a given “tenant_id” is forwarded to one specific shard instead of hitting the primary or a replica of all N shards. Put simply: all documents with the same routing value are stored on the same shard.
shard = hash(routing) % number_of_primary_shards
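The formula above can be sketched in a few lines of Python. Note that OpenSearch internally uses a murmur3 hash; the `zlib.crc32` below is just a stand-in, so the actual shard numbers will differ, but the property we rely on holds either way: equal routing values always map to the same shard.

```python
import zlib

NUMBER_OF_PRIMARY_SHARDS = 5  # example value; set per index at creation time

def shard_for(routing_value: str) -> int:
    """Illustrate shard = hash(routing) % number_of_primary_shards.

    crc32 stands in for OpenSearch's internal murmur3 hash.
    """
    return zlib.crc32(routing_value.encode("utf-8")) % NUMBER_OF_PRIMARY_SHARDS

# Every document routed by the same tenant_id lands on the same shard,
# so a search routed by "tenant-1" only needs to query one shard.
assert shard_for("tenant-1") == shard_for("tenant-1")
```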
By default the routing value is equal to the document’s _id, but we can override it by supplying a custom field. Let’s start by refactoring the above FluentD configuration:
main-fluentd-conf.yaml
[...]
@include /fluentd/etc/prometheus.conf
@include /fluentd/etc/general-tenant.conf
We just provide a single FluentD general-tenant.conf file that matches all the logs carrying the kube tag.
general-tenant-conf.yaml
kind: ConfigMap
apiVersion: v1
metadata:
  name: general-tenant-config
  namespace: logging
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
data:
  general-tenant.conf: |-
    <match kube.**>
      @type elasticsearch
      @id out_es_general_tenant
      @log_level "info"
      routing_key tenant_id # <- Added here
      include_tag_key true
      host "#{ENV['FLUENT_ELASTICSEARCH_HOST'] || 'localhost'}"
      port "#{ENV['FLUENT_ELASTICSEARCH_PORT'] || '9200'}"
      user "#{ENV['FLUENT_ELASTICSEARCH_USER'] || 'admin'}"
      password "#{ENV['FLUENT_ELASTICSEARCH_PASSWORD'] || 'admin'}"
      scheme "#{ENV['FLUENT_ELASTICSEARCH_SCHEME'] || 'http'}"
      ssl_verify false
      reload_connections false
      reconnect_on_error true
      reload_on_failure true
      logstash_prefix "application-logs"
      logstash_dateformat "%Y.%m"
      logstash_format true
      type_name "_doc"
      suppress_type_name true
      template_overwrite true
      request_timeout 30s
      <buffer>
        @type file
        path /var/log/fluentd-buffers/general-tenant/kubernetes.system.buffer
        retry_type exponential_backoff
        flush_thread_count 2
        flush_interval 10s
        retry_max_interval 30
        retry_forever true
        chunk_limit_size 8M
        queue_limit_length 512
        overflow_action block
      </buffer>
    </match>
Note that this file matches every log stream in this way, but we can add more specific matching later, if we ever need it, simply by specifying another configuration before this general one. Also note that we added routing_key tenant_id to this configuration and ended up with a single monthly index named “application-logs”.
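With logstash_format, logstash_prefix and logstash_dateformat set as above, the plugin writes to one index per month. The naming logic lives inside fluent-plugin-elasticsearch; this small Python sketch just illustrates the pattern that results, and why the security roles later use the index pattern “application-logs-*”:

```python
from datetime import date

def monthly_index(prefix: str, day: date) -> str:
    # Mirrors logstash_prefix + "-" + logstash_dateformat ("%Y.%m")
    return f"{prefix}-{day.strftime('%Y.%m')}"

print(monthly_index("application-logs", date(2022, 3, 15)))  # application-logs-2022.03
print(monthly_index("application-logs", date(2022, 4, 1)))   # application-logs-2022.04
```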
OpenSearch Security
Now we need to isolate the tenants from each other: we need a way to let the users of a tenant see only their own logs. We can achieve this with the OpenSearch Security plugin (installed by default) by configuring the related roles, role mappings and tenants:
create-role-conf.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: create-role
data:
  tenant-1-create-role.json: |-
    {
      "cluster_permissions": [],
      "index_permissions": [{
        "index_patterns": [
          "application-logs-*"
        ],
        "dls": "{\"bool\": {\"must\": {\"match\": { \"tenant_id\":\"tenant-1\"}}}}",
        "fls": [],
        "masked_fields": [],
        "allowed_actions": [
          "read",
          "get",
          "search"
        ]
      }],
      "tenant_permissions": [{
        "tenant_patterns": [
          "tenant-1"
        ],
        "allowed_actions": []
      }]
    }
  tenant-2-create-role.json: |-
    {
      "cluster_permissions": [],
      "index_permissions": [{
        "index_patterns": [
          "application-logs-*"
        ],
        "dls": "{\"bool\": {\"must\": {\"match\": { \"tenant_id\":\"tenant-2\"}}}}",
        "fls": [],
        "masked_fields": [],
        "allowed_actions": [
          "read",
          "get",
          "search"
        ]
      }],
      "tenant_permissions": [{
        "tenant_patterns": [
          "tenant-2"
        ],
        "allowed_actions": []
      }]
    }
Basically, here we’re telling OpenSearch to create two roles: one for tenant-1 and one for tenant-2. For each role, the DLS (document-level security) query restricts OpenSearch to returning only the documents whose tenant_id field matches the corresponding tenant, for indices matching the “application-logs-*” pattern.
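The DLS query is a standard query-DSL filter that OpenSearch applies server-side to every search the role performs. The real filtering happens inside the cluster; the following in-memory sketch only illustrates the semantics of the DLS query {"bool": {"must": {"match": {"tenant_id": "tenant-1"}}}} (the sample documents are hypothetical):

```python
# What document-level security effectively does for a tenant-1 user:
# every search result set is pre-filtered on the tenant_id field.
docs = [
    {"tenant_id": "tenant-1", "log": "payment service started"},
    {"tenant_id": "tenant-2", "log": "auth service started"},
    {"tenant_id": "tenant-1", "log": "payment failed"},
]

def dls_filter(documents, tenant_id):
    """Keep only the documents whose tenant_id matches (a match on a keyword field)."""
    return [d for d in documents if d.get("tenant_id") == tenant_id]

visible = dls_filter(docs, "tenant-1")
assert all(d["tenant_id"] == "tenant-1" for d in visible)
print(len(visible))  # 2
```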
Then we can create the tenant configuration: each user that logs in to OpenSearch will be assigned to the corresponding tenant.
apiVersion: v1
kind: ConfigMap
metadata:
  name: create-tenant
data:
  tenant-1-create-tenant.json: |-
    {
      "description": "A tenant for the Tenant-1 dev team."
    }
  tenant-2-create-tenant.json: |-
    {
      "description": "A tenant for the Tenant-2 dev team."
    }
And the role mappings (here I use a backend role coming from LDAP, since I configured LDAP integration in OpenSearch):
apiVersion: v1
kind: ConfigMap
metadata:
  name: create-role-mapping
data:
  tenant-1-create-role-mapping.json: |-
    {
      "backend_roles" : [ "Tenant-1_ROLE_FROM_LDAP" ],
      "hosts" : [],
      "users" : []
    }
  tenant-2-create-role-mapping.json: |-
    {
      "backend_roles" : [ "Tenant-2_ROLE_FROM_LDAP" ],
      "hosts" : [],
      "users" : []
    }
So how do we send these JSON documents to OpenSearch? We could have created these resources through the OpenSearch Dashboards UI, but since we want to keep everything reproducible, I use a Kubernetes Job that calls the Security plugin’s REST API. (NB: mount the username and password into the container and pass them to the script below.)
apiVersion: batch/v1
kind: Job
metadata:
  name: init-opensearch
spec:
  ttlSecondsAfterFinished: 100
  backoffLimit: 1
  template:
    spec:
      containers:
        - name: curl
          image: curlimages/curl
          command:
            - /bin/sh
            - -c
            - |
              curl -XPUT -u $USERNAME:$PASSWORD --insecure --header "Content-Type: application/json" --data-binary "@/var/tenant/tenant-1-create-tenant.json" https://opensearch-cluster-master:9200/_plugins/_security/api/tenants/tenant-1
              curl -XPUT -u $USERNAME:$PASSWORD --insecure --header "Content-Type: application/json" --data-binary "@/var/tenant/tenant-2-create-tenant.json" https://opensearch-cluster-master:9200/_plugins/_security/api/tenants/tenant-2
              curl -XPUT -u $USERNAME:$PASSWORD --insecure --header "Content-Type: application/json" --data-binary "@/var/role/tenant-1-create-role.json" https://opensearch-cluster-master:9200/_plugins/_security/api/roles/tenant-1
              curl -XPUT -u $USERNAME:$PASSWORD --insecure --header "Content-Type: application/json" --data-binary "@/var/role/tenant-2-create-role.json" https://opensearch-cluster-master:9200/_plugins/_security/api/roles/tenant-2
              curl -XPUT -u $USERNAME:$PASSWORD --insecure --header "Content-Type: application/json" --data-binary "@/var/role-mapping/tenant-1-create-role-mapping.json" https://opensearch-cluster-master:9200/_plugins/_security/api/rolesmapping/tenant-1
              curl -XPUT -u $USERNAME:$PASSWORD --insecure --header "Content-Type: application/json" --data-binary "@/var/role-mapping/tenant-2-create-role-mapping.json" https://opensearch-cluster-master:9200/_plugins/_security/api/rolesmapping/tenant-2
          volumeMounts:
            - mountPath: /var/tenant/tenant-1-create-tenant.json
              name: create-tenant
              subPath: tenant-1-create-tenant.json
            - mountPath: /var/tenant/tenant-2-create-tenant.json
              name: create-tenant
              subPath: tenant-2-create-tenant.json
            - mountPath: /var/role/tenant-1-create-role.json
              name: create-role
              subPath: tenant-1-create-role.json
            - mountPath: /var/role/tenant-2-create-role.json
              name: create-role
              subPath: tenant-2-create-role.json
            - mountPath: /var/role-mapping/tenant-1-create-role-mapping.json
              name: create-role-mapping
              subPath: tenant-1-create-role-mapping.json
            - mountPath: /var/role-mapping/tenant-2-create-role-mapping.json
              name: create-role-mapping
              subPath: tenant-2-create-role-mapping.json
      restartPolicy: Never
      volumes:
        - name: create-tenant
          configMap:
            name: create-tenant
        - name: create-role
          configMap:
            name: create-role
        - name: create-role-mapping
          configMap:
            name: create-role-mapping
After applying the Job above, we get multi-tenancy on OpenSearch: users are restricted to accessing their own logs, since we filter based on the “tenant_id” field.
Conclusion
In this post we managed to share a single index among all our tenants by specifying a routing key, and with a handful of OpenSearch roles we were also able to restrict which documents each tenant can see. This saves memory and CPU, since we ended up with just one index instead of a dedicated index per tenant.
In the next post we will check how to monitor OpenSearch and FluentD with Prometheus.