Prometheus + Grafana

Overview

This template provides a production‑ready Prometheus + Grafana monitoring stack as Monk runnables. You can:

Run it directly to get a complete monitoring solution with metrics collection and visualization
Inherit it in your own infrastructure to add observability to your applications

Prometheus is a time-series database and monitoring system that collects metrics by scraping HTTP endpoints (a pull model). Grafana is a visualization platform that builds dashboards from Prometheus and other data sources. Together they form an open-source monitoring stack.

What this template manages

Prometheus server for metrics collection
Grafana for visualization and dashboards
AlertManager for alerting (optional)
Service discovery and scrape configuration
Persistent storage for metrics and dashboards
Pre-configured data sources

Quick start (run directly)

Load templates

monk load MANIFEST

Run the monitoring stack

monk run prometheus-grafana/stack

Customize credentials (recommended via inheritance)

Running directly uses the defaults defined in this template’s variables. Secrets added with monk secrets add will not affect this runnable unless you inherit it and reference those secrets.

Preferred: inherit and replace variables with secret("...") as shown below.
Alternative: fork/clone and edit the variables in the template, then monk load MANIFEST and run.

Once started:

Prometheus UI: http://localhost:9090
Grafana UI: http://localhost:3000 (default: admin/admin)

Configuration

Key variables you can customize in this template:

variables:
  # Prometheus
  prometheus-image-tag: "latest"      # Prometheus image tag
  prometheus-port: "9090"             # Prometheus UI/API port
  scrape-interval: "15s"              # metrics scrape interval
  retention-time: "15d"               # metrics retention period
  
  # Grafana
  grafana-image-tag: "latest"         # Grafana image tag
  grafana-port: "3000"                # Grafana UI port
  grafana-admin-user: "admin"         # admin username
  grafana-admin-password: "..."       # admin password

Data is persisted under ${monk-volume-path}/prometheus and ${monk-volume-path}/grafana on the host.
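
To get a feel for how scrape-interval and retention-time interact with the persisted Prometheus volume, the arithmetic can be sketched as below. This is a rough estimate only: the helper names are hypothetical, the figure of 2 bytes per sample is an assumption (Prometheus documentation cites roughly 1 to 2 bytes per sample on average), and the 10,000 active series is an illustrative number rather than a template default.

```python
# Sketch: how scrape-interval and retention-time translate into stored samples.
# Only single-unit durations such as "15s" or "15d" are handled here.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}


def duration_seconds(text: str) -> int:
    """Convert a single-unit duration like '15s' or '15d' to seconds."""
    value, unit = text[:-1], text[-1]
    return int(value) * UNIT_SECONDS[unit]


def samples_per_series(scrape_interval: str, retention_time: str) -> int:
    """Samples one time series accumulates over the retention period."""
    return duration_seconds(retention_time) // duration_seconds(scrape_interval)


def estimated_bytes(active_series: int, scrape_interval: str,
                    retention_time: str, bytes_per_sample: float = 2.0) -> float:
    """Rough disk estimate: retention seconds * samples per second * bytes per sample."""
    samples_per_second = active_series / duration_seconds(scrape_interval)
    return duration_seconds(retention_time) * samples_per_second * bytes_per_sample


if __name__ == "__main__":
    # Template defaults: scrape every 15s, keep 15d
    print(samples_per_series("15s", "15d"))
    # 10_000 active series is a hypothetical workload, not a template default
    print(estimated_bytes(10_000, "15s", "15d") / 1e9, "GB")
```

With the defaults, each series keeps 86,400 samples, so halving scrape-interval or doubling retention-time roughly doubles the space needed under ${monk-volume-path}/prometheus.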

Use by inheritance (recommended for monitoring)

Inherit the stack to monitor your applications. Example:

namespace: myapp
monitoring:
  defines: runnable
  inherits: prometheus-grafana/stack
  variables:
    grafana-admin-password: <- secret("grafana-password")
api:
  defines: runnable
  containers:
    api:
      image: myorg/api
      labels:
        prometheus.scrape: "true"
        prometheus.port: "8080"
        prometheus.path: "/metrics"
  connections:
    monitor:
      runnable: monitoring
      service: prometheus

Then add the secret once and run your application:

monk secrets add -g grafana-password="STRONG_PASSWORD"
monk run myapp/api

Ports and connectivity

Service: prometheus on TCP port 9090
Service: grafana on TCP port 3000
Service: alertmanager on TCP port 9093 (if enabled)
From other runnables in the same process group, use connection-hostname("<connection-name>") to resolve service hosts.
Prometheus scrapes metrics from monitored services over HTTP.

Persistence and configuration

Prometheus data: ${monk-volume-path}/prometheus:/prometheus
Grafana data: ${monk-volume-path}/grafana:/var/lib/grafana
Prometheus config: ${monk-volume-path}/prometheus/config
You can customize Prometheus scrape configs and Grafana dashboards via the mounted volumes.

Features

Prometheus

Time-series metrics database
Powerful PromQL query language
Service discovery (Kubernetes, Docker, Consul, etc.)
Pull-based metrics collection
Alerting with AlertManager
High availability and federation

Grafana

Beautiful, customizable dashboards
Multiple data source support
Templating and variables
Alerting and notifications
User management and RBAC
Dashboard sharing and versioning

Metrics Exposition

Expose metrics from your applications:

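# Python example with prometheus_client and Flask
from flask import Flask, Response
from prometheus_client import CONTENT_TYPE_LATEST, Counter, generate_latest

app = Flask(__name__)

# Counter to increment once per handled request: requests_total.inc()
requests_total = Counter('http_requests_total', 'Total HTTP requests')

@app.route('/metrics')
def metrics():
    # Serve every registered metric in the Prometheus text format
    return Response(generate_latest(), content_type=CONTENT_TYPE_LATEST)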
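
For orientation, a scrape of the /metrics endpoint returns plain text in the Prometheus exposition format. The following is a hand-rolled, standard-library-only sketch of what a single unlabeled counter looks like on the wire. The render_counter helper is hypothetical and only illustrates the format; in practice prometheus_client produces this output for you and may add further lines.

```python
# Sketch: render one counter in the Prometheus text exposition format.
def render_counter(name: str, help_text: str, value: float) -> str:
    """Return the HELP, TYPE, and sample lines for a single unlabeled counter."""
    return (
        f"# HELP {name} {help_text}\n"
        f"# TYPE {name} counter\n"
        f"{name} {value}\n"
    )


if __name__ == "__main__":
    # Prints three lines: a HELP comment, a TYPE comment, and the sample itself
    print(render_counter("http_requests_total", "Total HTTP requests", 3.0), end="")
```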

Configure Prometheus to scrape:

scrape_configs:
  - job_name: 'myapp'
    static_configs:
      - targets: ['api:8080']

Alerting

Configure alerts in Prometheus:

groups:
  - name: example
    rules:
      - alert: HighErrorRate
        expr: rate(http_errors_total[5m]) > 0.05
        for: 10m
        annotations:
          summary: "High error rate detected"

Use cases

This stack excels at:

Application performance monitoring
Infrastructure monitoring
Real-time alerting
Capacity planning
SLA monitoring
DevOps observability

Related templates

Use alertmanager/ for advanced alerting and notification routing
Integrate with node-exporter/ for system metrics collection
Combine with loki/ for log aggregation and correlation

Troubleshooting

Access Prometheus targets at http://localhost:9090/targets to verify scrape status
Check Grafana data sources in Settings → Data Sources
Verify metrics are being scraped:

# Query Prometheus API
curl 'http://localhost:9090/api/v1/query?query=up'
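
The JSON returned by that query can also be checked programmatically. The sketch below assumes the standard Prometheus HTTP API response shape for an instant query (a "status" field and a list of series under data.result, each with "metric" labels and a [timestamp, "value"] pair). The function names are hypothetical, and the sample payload is made up for illustration rather than captured from a real server.

```python
import json
from typing import Any, Dict, List
from urllib.request import urlopen


def down_targets(payload: Dict[str, Any]) -> List[str]:
    """Return 'job/instance' for every series in the result whose value is 0."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    down = []
    for series in payload["data"]["result"]:
        _timestamp, value = series["value"]  # the value arrives as a string, e.g. "1"
        if float(value) == 0.0:
            labels = series["metric"]
            down.append(f"{labels.get('job', '?')}/{labels.get('instance', '?')}")
    return down


def fetch_up(base_url: str = "http://localhost:9090") -> Dict[str, Any]:
    """Run the same `up` query as the curl command against a live Prometheus."""
    with urlopen(f"{base_url}/api/v1/query?query=up", timeout=5) as response:
        return json.load(response)


if __name__ == "__main__":
    # Illustrative payload only; call fetch_up() instead to query a running server
    sample = {
        "status": "success",
        "data": {"resultType": "vector", "result": [
            {"metric": {"job": "myapp", "instance": "api:8080"}, "value": [1700000000, "0"]},
            {"metric": {"job": "prometheus", "instance": "localhost:9090"}, "value": [1700000000, "1"]},
        ]},
    }
    print(down_targets(sample))
```

Against a running stack, down_targets(fetch_up()) returns the targets Prometheus currently reports as unreachable; an empty list means every target was up at query time.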

Check logs:

monk logs -l 500 -f prometheus-grafana/prometheus
monk logs -l 500 -f prometheus-grafana/grafana

For missing metrics, verify:
- Service is exposing metrics on the configured port
- Prometheus can reach the target (check firewalls)
- Scrape configuration is correct
For Grafana dashboard issues, check data source configuration and time ranges
