mattermost/server/build/docker-compose.common.yml

164 lines
5.1 KiB
YAML

MM-67668: Replace Promtail with OpenTelemetry collector (#35381)

* Add container name to Docker logs
  This will allow for querying Loki by container's name:
  {job="docker",container_name="mattermost-postgres"}

* Configure Loki to prepare for OTLP ingestion
  - Add a volume to the Loki container to get the config
  - Configure Loki with the expected labels so that we can query by job, app, container.name...

* Add OpenTelemetry collector configuration
  There are three pipelines:
  1. logs/mattermost scrapes the logs from mattermost.log, parsing the timestamp and severity, and pushes them to Loki.
  2. logs/docker scrapes the Docker logs from *-json.log, parsing the timestamp, the log itself and the container name, and pushes them to Loki.
  3. metrics/docker scrapes the Docker socket to retrieve the containers' uptime values and pushes them to Prometheus.

* Replace Promtail with OpenTelemetry collector

* Update build tooling for OpenTelemetry collector
  1. Make sure that the logs directory is created
  2. Swap Promtail with OpenTelemetry collector

* Scrape collector to get Docker stats
  Prometheus needs to scrape the OpenTelemetry collector on the exposed port to get the Docker stats, so that we can query the uptime with the metric container_uptime_seconds, which has a container_name label to filter by container.

* Update Grafana dashboard for Docker health checks
  1. Use Prometheus as the datasource in all queries
  2. Simplify the mappings to either 0 (offline, red) or 1 (online, green).
  3. Unify all queries on container_uptime_seconds, filtering by container_name, and making sure that the latest value we got is at most 15 seconds old, so that it does not show stale data.
  4. Add the Redis health check, which was missing
  5. Update the dashboard title to Docker containers

* Tune Loki and OTel collector configs for local dev
  - Switch filelog receivers to start_at: beginning so existing logs are ingested on collector startup, not just new entries.
  - Fix Docker log timestamp layout to use 9s (variable-length nanos) instead of 0s (fixed-width), matching actual Docker JSON log format.
  - Add ingester max_chunk_age to keep chunks open longer in the single-instance dev setup, so that we can ingest older logs (the window is max_chunk_age/2).
  - Relax Loki limits for local development: allow unordered writes, disable old-sample rejection, and raise ingestion rate/burst to 64 MB to avoid throttling during bulk ingest.

2026-02-27 10:48:17 -05:00
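
The three pipelines named in the commit message above could look roughly like the following collector configuration. This is only a sketch under stated assumptions, not the configuration added by the commit: the file paths, the `loki` hostname, port 8889, the receiver and exporter instance names, and the Mattermost timestamp layout are all illustrative guesses. Only the pipeline names, `start_at: beginning`, the variable-length-nanos Docker layout, and the uptime metric come from the commit message.

```yaml
# Hypothetical OpenTelemetry Collector (contrib distribution) config.
# Paths, hostnames, ports, and component instance names are assumptions.
receivers:
  filelog/mattermost:
    include: [/logs/mattermost.log]          # assumed mount point
    start_at: beginning                      # ingest existing entries on startup
    operators:
      - type: json_parser
        timestamp:
          parse_from: attributes.timestamp
          layout_type: gotime
          layout: "2006-01-02 15:04:05.000 Z07:00"   # assumed Mattermost layout
        severity:
          parse_from: attributes.level
  filelog/docker:
    include: [/var/lib/docker/containers/*/*-json.log]
    start_at: beginning
    operators:
      - type: json_parser
        timestamp:
          parse_from: attributes.time
          layout_type: gotime
          # 9s accept a variable number of fractional digits; 0s would not
          layout: "2006-01-02T15:04:05.999999999Z07:00"
  docker_stats:
    endpoint: unix:///var/run/docker.sock
    collection_interval: 10s
    metrics:
      container.uptime:
        enabled: true                        # optional metric, off by default

exporters:
  otlphttp/loki:
    endpoint: http://loki:3100/otlp          # Loki's OTLP ingestion endpoint
  prometheus:
    endpoint: 0.0.0.0:8889                   # scraped by Prometheus

service:
  pipelines:
    logs/mattermost:
      receivers: [filelog/mattermost]
      exporters: [otlphttp/loki]
    logs/docker:
      receivers: [filelog/docker]
      exporters: [otlphttp/loki]
    metrics/docker:
      receivers: [docker_stats]
      exporters: [prometheus]
```

The sketch omits the step that extracts the container name from the Docker log record (the `tag: "{{.Name}}"` logging option in the compose file below is what makes that name available in each JSON log line).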
x-logging: &default-logging
  driver: "json-file"
  options:
    tag: "{{.Name}}"
services:
  postgres:
    image: "postgres:14"
    logging: *default-logging
    restart: always
    networks:
      - mm-test
    environment:
      POSTGRES_USER: mmuser
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-mostest}
      POSTGRES_DB: mattermost_test
      POSTGRES_INITDB_ARGS: "--auth-host=scram-sha-256 --auth-local=scram-sha-256"
    command: postgres -c 'config_file=/etc/postgresql/postgresql.conf'
    volumes:
      - "./docker/postgres.conf:/etc/postgresql/postgresql.conf:Z"
      - "./docker/postgres_node_database.sql:/docker-entrypoint-initdb.d/postgres_node_database.sql:Z"
    healthcheck:
      test: [ "CMD", "pg_isready", "-h", "localhost" ]
      interval: 5s
      timeout: 10s
      retries: 3
  minio:
    image: "minio/minio:RELEASE.2024-06-22T05-26-45Z"
    logging: *default-logging
    command: "server /data --console-address :9002"
    networks:
      - mm-test
    environment:
      MINIO_ROOT_USER: minioaccesskey
      MINIO_ROOT_PASSWORD: miniosecretkey
      MINIO_KMS_SECRET_KEY: my-minio-key:OSMM+vkKUTCvQs9YL/CVMIMt43HFhkUpqJxTmGl6rYw=
  azurite:
    image: "mcr.microsoft.com/azure-storage/azurite:3.34.0"
    logging: *default-logging
    command: "azurite-blob --blobHost 0.0.0.0 --blobPort 10000 --skipApiVersionCheck"
    networks:
      - mm-test
  inbucket:
    image: "inbucket/inbucket:3.1.1"
    logging: *default-logging
    restart: always
    environment:
      INBUCKET_WEB_ADDR: "0.0.0.0:9001"
      INBUCKET_POP3_ADDR: "0.0.0.0:10110"
      INBUCKET_SMTP_ADDR: "0.0.0.0:10025"
    networks:
      - mm-test
  openldap:
    image: "osixia/openldap:1.4.0"
    logging: *default-logging
    restart: always
    networks:
      - mm-test
    environment:
      LDAP_TLS_VERIFY_CLIENT: "never"
      LDAP_ORGANISATION: "Mattermost Test"
      LDAP_DOMAIN: "mm.test.com"
      LDAP_ADMIN_PASSWORD: "mostest"
  elasticsearch:
    build:
      context: .
      dockerfile: ./Dockerfile.elasticsearch
      args:
        ELASTICSEARCH_VERSION: ${ELASTICSEARCH_VERSION:-9.0.0}
    networks:
      - mm-test
    environment:
      http.host: "0.0.0.0"
      http.port: 9200
      http.cors.enabled: "true"
      http.cors.allow-origin: "http://localhost:1358,http://127.0.0.1:1358"
      http.cors.allow-headers: "X-Requested-With,X-Auth-Token,Content-Type,Content-Length,Authorization"
      http.cors.allow-credentials: "true"
      transport.host: "127.0.0.1"
      xpack.security.enabled: "false"
      action.destructive_requires_name: "false"
      ES_JAVA_OPTS: "-Xms512m -Xmx512m"
  opensearch:
    build:
      context: .
      dockerfile: ./Dockerfile.opensearch
    networks:
      - mm-test
    environment:
      http.host: "0.0.0.0"
      http.port: 9201
      http.cors.enabled: "true"
      http.cors.allow-origin: "http://localhost:1358,http://127.0.0.1:1358"
      http.cors.allow-headers: "X-Requested-With,X-Auth-Token,Content-Type,Content-Length,Authorization"
      http.cors.allow-credentials: "true"
      transport.host: "127.0.0.1"
      discovery.type: single-node
      plugins.security.disabled: "true"
      ES_JAVA_OPTS: "-Xms512m -Xmx512m"
  redis:
    image: "redis:7.4.0"
    logging: *default-logging
    networks:
      - mm-test
  dejavu:
    image: "appbaseio/dejavu:3.4.2"
    logging: *default-logging
    networks:
      - mm-test
  keycloak:
    image: "quay.io/keycloak/keycloak:23.0.7"
    logging: *default-logging
    restart: always
    entrypoint: /opt/keycloak/bin/kc.sh start --import-realm
    networks:
      - mm-test
    environment:
      KEYCLOAK_ADMIN: admin
      KEYCLOAK_ADMIN_PASSWORD: admin
      KC_HOSTNAME_STRICT: 'false'
      KC_HOSTNAME_STRICT_HTTPS: 'false'
      KC_HTTP_ENABLED: 'true'
    volumes:
      - "./docker/keycloak/realm-export.json:/opt/keycloak/data/import/realm-export.json:Z"
  prometheus:
    image: "prom/prometheus:v2.46.0"
    logging: *default-logging
    user: root
    volumes:
      - "./docker/prometheus.yml:/etc/prometheus/prometheus.yml:Z"
      - "/var/run/docker.sock:/var/run/docker.sock"
    networks:
      - mm-test
    extra_hosts:
      - "host.docker.internal:host-gateway"
  grafana:
    image: "grafana/grafana:10.4.2"
    logging: *default-logging
    volumes:
      - "./docker/grafana/grafana.ini:/etc/grafana/grafana.ini:Z"
      - "./docker/grafana/provisioning:/etc/grafana/provisioning:Z"
      - "./docker/grafana/dashboards:/var/lib/grafana/dashboards:Z"
    networks:
      - mm-test
  loki:
    image: "grafana/loki:3.0.0"
    logging: *default-logging
    volumes:
      - "./docker/loki/loki-config.yaml:/etc/loki/local-config.yaml:Z"
    networks:
      - mm-test
  otel-collector:
    image: "otel/opentelemetry-collector-contrib:0.145.0"
    logging: *default-logging
    user: "0:0"
    volumes:
      - "./docker/otel-collector/otel-collector-config.yaml:/etc/otelcol-contrib/config.yaml:Z"
      - "/var/lib/docker/containers:/var/lib/docker/containers:ro"
      - "/var/run/docker.sock:/var/run/docker.sock:ro"
      - "../logs:/logs:ro,Z"
    command: ["--config=/etc/otelcol-contrib/config.yaml"]
    networks:
      - mm-test