Overview

This template provides a production‑ready Prometheus + Grafana monitoring stack as Monk runnables. You can:
  • Run it directly to get a complete monitoring solution with metrics collection and visualization
  • Inherit it in your own infrastructure to add observability to your applications
Prometheus is a time-series database and monitoring system that collects metrics via HTTP pulls. Grafana is a visualization platform that creates dashboards from Prometheus and other data sources. Together, they provide a powerful, open-source monitoring stack.

What this template manages

  • Prometheus server for metrics collection
  • Grafana for visualization and dashboards
  • Alertmanager for alerting (optional)
  • Service discovery and scrape configuration
  • Persistent storage for metrics and dashboards
  • Pre-configured data sources

Quick start (run directly)

  1. Load templates
monk load MANIFEST
  2. Run the monitoring stack
monk run prometheus-grafana/stack
  3. Customize credentials (recommended via inheritance)
Running directly uses the defaults defined in this template’s variables. Secrets added with monk secrets add will not affect this runnable unless you inherit it and reference those secrets.
  • Preferred: inherit and replace variables with secret("...") as shown below.
  • Alternative: fork/clone and edit the variables in the template, then monk load MANIFEST and run.
Once started:
  • Prometheus UI: http://localhost:9090
  • Grafana UI: http://localhost:3000 (default: admin/admin)

Configuration

Key variables you can customize in this template:
variables:
  # Prometheus
  prometheus-image-tag: "latest"      # Prometheus image tag
  prometheus-port: "9090"             # Prometheus UI/API port
  scrape-interval: "15s"              # metrics scrape interval
  retention-time: "15d"               # metrics retention period
  
  # Grafana
  grafana-image-tag: "latest"         # Grafana image tag
  grafana-port: "3000"                # Grafana UI port
  grafana-admin-user: "admin"         # admin username
  grafana-admin-password: "..."       # admin password
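As a rough illustration of how scrape-interval and retention-time interact, the sketch below computes how many samples each series keeps and a ballpark disk estimate. The helper names and the 10,000-series count are hypothetical, and the 2 bytes per sample figure is an assumed average (the Prometheus documentation cites roughly 1-2 bytes per sample); real usage depends on the workload.

```python
import re

# Seconds per unit for simple single-unit durations such as "15s" or "15d".
_UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}


def parse_duration(text: str) -> int:
    """Convert a single-unit duration string to seconds."""
    match = re.fullmatch(r"(\d+)([smhdw])", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def samples_per_series(scrape_interval: str, retention_time: str) -> int:
    """Number of samples one series accumulates over the retention period."""
    return parse_duration(retention_time) // parse_duration(scrape_interval)


def estimated_bytes(series_count: int, scrape_interval: str,
                    retention_time: str, bytes_per_sample: float = 2.0) -> float:
    """Ballpark disk usage; bytes_per_sample is an assumed average."""
    return series_count * samples_per_series(scrape_interval, retention_time) * bytes_per_sample


# Template defaults: 15s scrape interval, 15d retention.
per_series = samples_per_series("15s", "15d")
total = estimated_bytes(10_000, "15s", "15d")  # 10,000 series is a made-up figure
print(per_series, total)
```

Halving the scrape interval doubles the sample count for the same retention, so the two variables are best tuned together.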
Data is persisted under ${monk-volume-path}/prometheus and ${monk-volume-path}/grafana on the host.
Inherit the stack to monitor your own applications. Example:
namespace: myapp
monitoring:
  defines: runnable
  inherits: prometheus-grafana/stack
  variables:
    grafana-admin-password: <- secret("grafana-password")
api:
  defines: runnable
  containers:
    api:
      image: myorg/api
      labels:
        prometheus.scrape: "true"
        prometheus.port: "8080"
        prometheus.path: "/metrics"
  connections:
    monitor:
      runnable: monitoring
      service: prometheus
Then add the secret once and run your app:
monk secrets add -g grafana-password="STRONG_PASSWORD"
monk run myapp/api

Ports and connectivity

  • Service: prometheus on TCP port 9090
  • Service: grafana on TCP port 3000
  • Service: alertmanager on TCP port 9093 (if enabled)
  • From other runnables in the same process group, use connection-hostname("<connection-name>") to resolve service hosts.
  • Prometheus scrapes metrics from monitored services over HTTP.
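To make the pull model concrete, here is a minimal sketch of a scrape target built only on the Python standard library. It assumes a hypothetical service on port 8080 that counts its own requests; the function and variable names are illustrative and not part of this template.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

REQUEST_COUNT = 0  # hypothetical in-process counter


def render_metrics(request_count: int) -> str:
    """Render one counter in the Prometheus text exposition format."""
    return (
        "# HELP http_requests_total Total HTTP requests\n"
        "# TYPE http_requests_total counter\n"
        f"http_requests_total {request_count}\n"
    )


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        global REQUEST_COUNT
        if self.path == "/metrics":
            # Prometheus pulls this path on every scrape interval.
            body = render_metrics(REQUEST_COUNT).encode("utf-8")
            content_type = "text/plain; version=0.0.4; charset=utf-8"
        else:
            # Any other path counts as application traffic.
            REQUEST_COUNT += 1
            body = b"ok\n"
            content_type = "text/plain; charset=utf-8"
        self.send_response(200)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080) -> None:
    """Block and serve requests; the port is an assumption matching the example labels."""
    HTTPServer(("0.0.0.0", port), Handler).serve_forever()
```

Calling serve() starts the endpoint. The service never pushes anything; Prometheus connects to it and reads the current counter values.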

Persistence and configuration

  • Prometheus data: ${monk-volume-path}/prometheus:/prometheus
  • Grafana data: ${monk-volume-path}/grafana:/var/lib/grafana
  • Prometheus config: ${monk-volume-path}/prometheus/config
  • You can customize Prometheus scrape configs and Grafana dashboards via the mounted volumes.

Features

Prometheus

  • Time-series metrics database
  • Powerful PromQL query language
  • Service discovery (Kubernetes, Docker, Consul, etc.)
  • Pull-based metrics collection
  • Alerting with Alertmanager
  • High availability and federation

Grafana

  • Beautiful, customizable dashboards
  • Multiple data source support
  • Templating and variables
  • Alerting and notifications
  • User management and RBAC
  • Dashboard sharing and versioning

Metrics Exposition

Expose metrics from your applications:
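# Python example with prometheus_client (assumes a Flask app)
from flask import Flask
from prometheus_client import CONTENT_TYPE_LATEST, Counter, generate_latest

app = Flask(__name__)
requests = Counter('http_requests_total', 'Total HTTP requests')

@app.route('/metrics')
def metrics():
    # Serve all registered metrics in the Prometheus text exposition format
    return generate_latest(), 200, {'Content-Type': CONTENT_TYPE_LATEST}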
Configure Prometheus to scrape:
scrape_configs:
  - job_name: 'myapp'
    static_configs:
      - targets: ['api:8080']

Alerting

Configure alerts in Prometheus:
groups:
  - name: example
    rules:
      - alert: HighErrorRate
        expr: rate(http_errors_total[5m]) > 0.05
        for: 10m
        annotations:
          summary: "High error rate detected"
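The rule above compares a per-second rate against 0.05 (that is, 0.05 errors per second, not 5% of requests) and fires only after the condition has held for 10 minutes. The sketch below is a simplified model of that behavior, not Prometheus's implementation: simple_rate ignores the window-boundary extrapolation that the real rate() performs, and the 15-second evaluation interval is an assumption.

```python
def simple_rate(samples):
    """Approximate per-second rate of a counter.

    samples: list of (timestamp_seconds, counter_value), sorted by time.
    Returns None when the rate cannot be computed.
    """
    if len(samples) < 2:
        return None
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        # A drop means the counter reset (e.g. a restart): count from zero.
        increase += cur - prev if cur >= prev else cur
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else None


def is_firing(rates, threshold=0.05, for_seconds=600, eval_interval=15):
    """True if the latest run of evaluations above threshold spans for_seconds.

    rates: one rate per rule evaluation, oldest first.
    """
    streak = 0
    for rate in reversed(rates):
        if rate is None or rate <= threshold:
            break
        streak += 1
    # The first true evaluation starts the pending period; the alert fires
    # once the condition has stayed true for at least for_seconds.
    return streak > 0 and (streak - 1) * eval_interval >= for_seconds


window = [(0, 100), (150, 130), (300, 160)]  # made-up samples over 5 minutes
print(simple_rate(window))
```

A single evaluation at or below the threshold resets the pending period, which is why brief spikes do not trigger the alert.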

Use cases

This stack excels at:
  • Application performance monitoring
  • Infrastructure monitoring
  • Real-time alerting
  • Capacity planning
  • SLA monitoring
  • DevOps observability
Related templates:
  • Use alertmanager/ for advanced alerting and notification routing
  • Integrate with node-exporter/ for system metrics collection
  • Combine with loki/ for log aggregation and correlation

Troubleshooting

  • Access Prometheus targets at http://localhost:9090/targets to verify scrape status
  • Check Grafana data sources in Settings → Data Sources
  • Verify metrics are being scraped:
# Query Prometheus API
curl 'http://localhost:9090/api/v1/query?query=up'
  • Check logs:
monk logs -l 500 -f prometheus-grafana/prometheus
monk logs -l 500 -f prometheus-grafana/grafana
  • For missing metrics, verify:
    • Service is exposing metrics on the configured port
    • Prometheus can reach the target (check firewalls)
    • Scrape configuration is correct
  • For Grafana dashboard issues, check data source configuration and time ranges
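The up query shown above returns one series per scrape target, with value 1 when the last scrape succeeded and 0 when it failed. As a sketch of how that response could be checked in a script, the helper below lists targets that are down. The function name and the sample payload are illustrative; the payload follows the instant-query response shape of the Prometheus HTTP API.

```python
import json

# Made-up response body in the shape returned by /api/v1/query?query=up
SAMPLE = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"__name__": "up", "job": "prometheus", "instance": "localhost:9090"},
             "value": [1700000000.0, "1"]},
            {"metric": {"__name__": "up", "job": "myapp", "instance": "api:8080"},
             "value": [1700000000.0, "0"]},
        ],
    },
})


def down_targets(response_text: str):
    """Return (job, instance) pairs whose last scrape failed."""
    payload = json.loads(response_text)
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    down = []
    for series in payload["data"]["result"]:
        _timestamp, value = series["value"]  # sample values arrive as strings
        if float(value) == 0.0:
            labels = series["metric"]
            down.append((labels.get("job", ""), labels.get("instance", "")))
    return down


print(down_targets(SAMPLE))
```

Against a live stack, the response text could be fetched with urllib.request.urlopen using the same URL as the curl command.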