Deploy Prometheus and Grafana on Kubernetes - Complete Monitoring Stack Setup

Setting up a comprehensive monitoring solution is essential for maintaining visibility into your Kubernetes cluster’s health and performance. This guide walks you through deploying Prometheus and Grafana using the kube-prometheus project, which is built around the Prometheus Operator and provides a complete observability stack with metrics collection, visualization, and alerting capabilities.

Prometheus serves as the metrics collection and storage engine, while Grafana provides rich visualization dashboards. Together with Alertmanager, they form a powerful monitoring ecosystem that can help you proactively identify and resolve issues in your Kubernetes environment.

Prerequisites

Before proceeding with this deployment, ensure you have:

  • A running Kubernetes cluster with kubectl configured
  • Sufficient cluster resources (minimum 4GB RAM, 2 CPU cores available)
  • Administrative access to create namespaces and deploy resources
  • Git installed for repository cloning
  • Basic familiarity with Kubernetes concepts and port forwarding
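
The checklist above can be turned into a quick pre-flight check. The following is a minimal sketch rather than anything shipped with kube-prometheus; the function names require_cmds and preflight are made up for illustration, and the script only checks tooling and permissions, not free RAM or CPU.

```shell
# Return 0 if every named command is on PATH; report each missing one.
require_cmds() {
  local missing=0 cmd
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "missing: $cmd" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Check the tools, API server reachability, and permission to create namespaces.
preflight() {
  require_cmds kubectl git || return 1
  kubectl cluster-info >/dev/null || { echo "cluster unreachable" >&2; return 1; }
  kubectl auth can-i create namespaces >/dev/null ||
    { echo "cannot create namespaces" >&2; return 1; }
  echo "preflight ok"
}
```

Source the snippet and run preflight; it stops at the first failed check and prints the reason.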

This guide assumes you’re working with a cluster similar to the setup described in our Complete Kubernetes Cluster Setup Guide with Cilium on Arch Linux.

Understanding the Kube-Prometheus Stack

The kube-prometheus project provides a complete monitoring stack that includes:

  • Prometheus Operator: Manages Prometheus instances and related resources
  • Prometheus: Time-series database for metrics collection
  • Grafana: Visualization and dashboard platform
  • Alertmanager: Handles alerts sent by Prometheus
  • Node Exporter: Collects hardware and OS metrics
  • kube-state-metrics: Generates metrics about Kubernetes API objects

All of these components are deployed and configured through Kubernetes manifests and custom resources, so you manage the stack with the same kubectl workflow as the rest of your cluster.

Step 1: Clone and Prepare the Repository

Start by cloning the official kube-prometheus repository and navigating to the project directory:

git clone https://github.com/prometheus-operator/kube-prometheus.git
cd kube-prometheus

The repository contains all the necessary Kubernetes manifests, organized into two directories:

  • manifests/setup/ - Contains CustomResourceDefinitions (CRDs) and namespace setup
  • manifests/ - Contains the actual deployment manifests for all components
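
Cloning the default branch gives you the latest development state. If you would rather pin to a release branch that matches your Kubernetes version, a sketch like the following may help. The helper name clone_release is hypothetical, and release-0.14 is only an example default; check the compatibility matrix in the project README before choosing a branch.

```shell
# Shallow-clone a single release branch of kube-prometheus.
# $1: branch name (example default only; pick one from the README matrix)
# $2: target directory (defaults to kube-prometheus)
clone_release() {
  local branch="${1:-release-0.14}" dir="${2:-kube-prometheus}"
  git clone --depth 1 --branch "$branch" \
    https://github.com/prometheus-operator/kube-prometheus.git "$dir"
}
```

For example, clone_release release-0.14 followed by cd kube-prometheus leaves you in the same directory layout described above.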

Step 2: Deploy Namespace and Custom Resource Definitions

First, create the monitoring namespace and deploy the required Custom Resource Definitions (CRDs):

kubectl create -f manifests/setup

This command will:

  • Create the monitoring namespace
  • Install Prometheus Operator CRDs
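  • Set up RBAC permissions for the operator, if your checkout keeps them in manifests/setup (recent releases ship them with the main manifests instead)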
  • Establish the foundation for the monitoring stack

Wait for the setup to complete before proceeding to the next step. You can verify the namespace creation with:

kubectl get namespace monitoring
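
Rather than guessing when the setup has finished, you can wait for the CRDs to be registered with the API server. This is a minimal sketch: the function name wait_for_crds and the 120-second timeout are assumptions, while monitoring.coreos.com is the API group used by the Prometheus Operator CRDs.

```shell
# Block until every CRD reports the Established condition, then list the
# Prometheus Operator CRDs (API group monitoring.coreos.com).
wait_for_crds() {
  kubectl wait --for=condition=Established --all CustomResourceDefinition \
    --namespace=monitoring --timeout=120s || return 1
  kubectl get crd -o name | grep 'monitoring.coreos.com'
}
```

If the list comes back empty or the wait times out, re-check the output of the setup command before moving on to Step 3.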

Step 3: Deploy the Prometheus Monitoring Stack

With the foundation in place, deploy the complete monitoring stack:

kubectl create -f manifests

This comprehensive deployment includes:

  • Prometheus server instances
  • Grafana dashboard server
  • Alertmanager for alert routing
  • Various exporters for metrics collection
  • ServiceMonitors for automatic target discovery
  • Default alerting rules and dashboards

Step 4: Monitor Deployment Progress

Track the deployment progress using the watch command to observe all resources as they become available:

watch kubectl get deploy,pod,svc -n monitoring

You’ll see deployments, pods, and services being created and transitioning through various states:

  • Pending → Running for pods
  • 0/1 → 1/1 for deployments
  • Services should show ClusterIP assignments

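The deployment is complete when all pods show Running status and every deployment reports all of its replicas ready (for example, 1/1). Prometheus and Alertmanager run as StatefulSets and Node Exporter runs as a DaemonSet, so they appear in the pod list rather than under deployments.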
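
If you prefer a check that works in scripts instead of watching the output, something along these lines may be useful. It is a sketch under the assumption that the default kubectl get pods column layout (NAME, READY, STATUS, ...) is in use; the function names are hypothetical.

```shell
# Print name and status of every pod that is neither Running nor Completed.
# Prints nothing when the stack is healthy.
pods_not_running() {
  kubectl get pods --namespace "${1:-monitoring}" --no-headers |
    awk '$3 != "Running" && $3 != "Completed" { print $1, $3 }'
}

# Wait up to five minutes for all pods to pass their readiness checks.
wait_for_stack() {
  kubectl wait --for=condition=Ready pods --all \
    --namespace "${1:-monitoring}" --timeout=300s
}
```

Run wait_for_stack first; if it times out, pods_not_running shows which pods to inspect.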

Step 5: Access the Monitoring Interfaces

Once all resources are running, access the web interfaces using port forwarding. Open a separate terminal window for each service:

Grafana Dashboard (Port 3000)

kubectl --namespace monitoring port-forward svc/grafana 3000

Access Grafana at: http://localhost:3000

Default Credentials:

  • Username: admin
  • Password: admin

You’ll be prompted to change the password on first login.

Prometheus Query Interface (Port 9090)

kubectl --namespace monitoring port-forward svc/prometheus-k8s 9090

Access Prometheus at: http://localhost:9090

Alertmanager Interface (Port 9093)

kubectl --namespace monitoring port-forward svc/alertmanager-main 9093

Access Alertmanager at: http://localhost:9093
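
As an alternative to three terminals, the port-forwards can run in the background from one shell. The sketch below uses the same three commands as above; the function names are hypothetical, and the health paths (/api/health for Grafana, /-/ready for Prometheus and Alertmanager) are the standard endpoints of those tools.

```shell
# Start all three port-forwards in the background and wait for them.
# Ctrl-C stops them together.
forward_all() {
  local pids=""
  kubectl --namespace monitoring port-forward svc/grafana 3000 &
  pids="$pids $!"
  kubectl --namespace monitoring port-forward svc/prometheus-k8s 9090 &
  pids="$pids $!"
  kubectl --namespace monitoring port-forward svc/alertmanager-main 9093 &
  pids="$pids $!"
  # Double quotes on purpose: the PID list is expanded now, not at signal time.
  trap "kill $pids 2>/dev/null" INT TERM
  wait
  trap - INT TERM
}

# Probe each UI's health endpoint through the forwarded ports.
check_uis() {
  curl -fsS -o /dev/null http://localhost:3000/api/health && echo "grafana ok"
  curl -fsS -o /dev/null http://localhost:9090/-/ready && echo "prometheus ok"
  curl -fsS -o /dev/null http://localhost:9093/-/ready && echo "alertmanager ok"
}
```

Run forward_all in one terminal (ideally from a script) and check_uis in another to confirm all three interfaces respond.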

Initial Configuration and Exploration

Grafana Setup

  1. Login: Use admin:admin credentials
  2. Change Password: Set a secure password when prompted
  3. Explore Dashboards: Navigate to Dashboards → Browse to see pre-configured dashboards
  4. Data Sources: Prometheus is pre-configured as the default data source

Key Dashboards to Explore

  • Kubernetes / Compute Resources / Cluster: Overall cluster resource utilization
  • Kubernetes / Compute Resources / Namespace (Pods): Pod-level resource metrics
  • Kubernetes / Compute Resources / Node (Pods): Node-level resource utilization
  • Kubernetes / Networking / Cluster: Network traffic and connectivity metrics

Prometheus Targets

  1. Navigate to Status → Targets in Prometheus UI
  2. Verify all targets are in “UP” state
  3. Explore available metrics using the query interface
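
The same target check can be done from the command line through the Prometheus HTTP API. This is a minimal sketch that assumes the port-forward to localhost:9090 from Step 5 is active; the function name down_targets is hypothetical.

```shell
# Ask Prometheus for every target whose `up` metric is currently 0.
# $1: Prometheus base URL (defaults to the Step 5 port-forward)
down_targets() {
  curl -fsS --get "${1:-http://localhost:9090}/api/v1/query" \
    --data-urlencode 'query=up == 0'
}
```

The response is JSON; an empty result array means no scraped target is reporting down. If you have a JSON tool such as jq installed, piping the output through it makes the list easier to read.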

Troubleshooting Common Issues

Pods Stuck in Pending State

  • Check cluster resources: kubectl describe node
  • Verify storage classes if using persistent volumes
  • Check for node selector constraints
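
The checks above can be narrowed to the monitoring namespace. The following sketch lists only Pending pods and then the most recent events, which usually state why scheduling failed; the function name pending_report is an assumption.

```shell
# List Pending pods, then show the 20 most recent events in the namespace
# (typical causes: insufficient CPU or memory, unbound PVCs, taints).
pending_report() {
  local ns="${1:-monitoring}"
  kubectl get pods --namespace "$ns" --field-selector=status.phase=Pending
  kubectl get events --namespace "$ns" --sort-by=.lastTimestamp | tail -n 20
}
```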

Port Forwarding Connection Issues

  • Ensure pods are running before port forwarding
  • Check firewall settings on your local machine
  • Verify kubectl context is correct

Missing Metrics or Dashboards

  • Confirm all CRDs were created properly
  • Check ServiceMonitor resources: kubectl get servicemonitor -n monitoring
  • Verify RBAC permissions are correctly applied
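
To confirm the CRDs in one pass, a sketch like this compares the cluster's CRD list against the Prometheus Operator resource types the stack relies on. The function name is hypothetical, and the list covers only the most commonly used types.

```shell
# Report whether each core Prometheus Operator CRD exists in the cluster.
check_operator_crds() {
  local crds name
  crds="$(kubectl get crd -o name)"
  for name in prometheuses alertmanagers servicemonitors podmonitors prometheusrules; do
    if printf '%s\n' "$crds" | grep -q "/$name.monitoring.coreos.com\$"; then
      echo "ok: $name"
    else
      echo "MISSING: $name"
    fi
  done
}
```

Any MISSING line suggests the Step 2 setup did not complete and should be re-run.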

Security Considerations

  • Change default Grafana credentials immediately
  • Put authentication and authorization in front of Prometheus and Alertmanager, whose web interfaces ship without any
  • Use NetworkPolicies to restrict access between components
  • Regularly update the monitoring stack for security patches

Production Considerations

Persistent Storage

For production deployments, configure persistent volumes:

  • Prometheus data retention and storage
  • Grafana dashboard and configuration persistence
  • Consider backup strategies for monitoring data
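
For Prometheus, retention and storage are fields on the Prometheus custom resource. The sketch below patches them in place. It assumes the resource is named k8s (matching the prometheus-k8s service from Step 5); the retention, size, and StorageClass values are placeholders for your environment. Changes made this way are overwritten if you re-apply the manifests, so treat it as an experiment rather than a durable configuration. Grafana persistence is configured separately and is not covered here.

```shell
# Merge-patch the Prometheus custom resource with a retention period and a
# PVC template. All three arguments are illustrative placeholders.
# $1: retention (e.g. 15d)  $2: volume size (e.g. 50Gi)  $3: StorageClass name
set_prometheus_storage() {
  local retention="${1:-15d}" size="${2:-50Gi}" class="${3:-standard}"
  local patch
  patch=$(printf '{"spec":{"retention":"%s","storage":{"volumeClaimTemplate":{"spec":{"storageClassName":"%s","resources":{"requests":{"storage":"%s"}}}}}}}' \
    "$retention" "$class" "$size")
  kubectl --namespace monitoring patch prometheus k8s --type merge --patch "$patch"
}
```

For example, set_prometheus_storage 30d 100Gi fast requests 100Gi volumes from a StorageClass named fast and keeps 30 days of data.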

Resource Limits

Set appropriate resource requests and limits:

  • Prometheus: CPU and memory scale with metrics volume
  • Grafana: Generally lightweight but consider user load
  • Alertmanager: Minimal resources required

High Availability

Consider deploying multiple replicas for:

  • Prometheus instances for redundancy
  • Grafana for load distribution
  • Alertmanager for alert processing reliability
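
Before changing anything, it helps to see how many replicas each component currently requests. This sketch assumes the custom resources are named k8s (Prometheus) and main (Alertmanager), matching the service names used in Step 5, and that Grafana runs as a deployment named grafana; the function name is hypothetical.

```shell
# Print the replica count requested for each component.
show_replicas() {
  local ns="${1:-monitoring}"
  printf 'prometheus: %s\n' \
    "$(kubectl -n "$ns" get prometheus k8s -o jsonpath='{.spec.replicas}')"
  printf 'alertmanager: %s\n' \
    "$(kubectl -n "$ns" get alertmanager main -o jsonpath='{.spec.replicas}')"
  printf 'grafana: %s\n' \
    "$(kubectl -n "$ns" get deployment grafana -o jsonpath='{.spec.replicas}')"
}
```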

Additional Integration Opportunities

Cilium Network Observability

If you’re using a Cilium-based cluster, enable Hubble for enhanced network monitoring:

cilium hubble enable --ui
kubectl --namespace kube-system port-forward svc/hubble-ui 12000:80

With the port-forward active, the Hubble UI is available at http://localhost:12000. It complements your Grafana dashboards with flow-level network visibility.

Storage Integration

For clusters that use NFS storage provisioning as described in our cluster setup guide, check that any NFS-related ServiceMonitors exist and that their targets appear in Prometheus, so that storage metrics are actually collected.

Next Steps

  1. Custom Dashboards: Create dashboards specific to your applications
  2. Alert Rules: Configure alerting rules for your specific use cases
  3. Integration: Connect with external systems like Slack or PagerDuty
  4. Metrics Export: Add custom metrics from your applications
  5. Backup Strategy: Implement backup for monitoring data and configurations

Questions Answered in This Document

Q: How do I deploy Prometheus and Grafana on Kubernetes? A: Clone the kube-prometheus repository, deploy the setup manifests first (kubectl create -f manifests/setup), then deploy the main stack (kubectl create -f manifests), and access the interfaces via port forwarding.

Q: What are the default login credentials for Grafana? A: The default username and password are both admin. You’ll be prompted to change the password upon first login.

Q: How do I access the Prometheus and Grafana interfaces? A: Use kubectl port forwarding: kubectl --namespace monitoring port-forward svc/grafana 3000 for Grafana and kubectl --namespace monitoring port-forward svc/prometheus-k8s 9090 for Prometheus.

Q: What components are included in the kube-prometheus stack? A: The stack includes Prometheus Operator, Prometheus server, Grafana, Alertmanager, Node Exporter, kube-state-metrics, and various service monitors and alerting rules.

Q: How do I verify the monitoring stack is working correctly? A: Monitor the deployment with watch kubectl get deploy,pod,svc -n monitoring and ensure all pods are in Running state, then access the web interfaces to verify data collection.

Q: What should I do if pods are stuck in pending state? A: Check cluster resources with kubectl describe node, verify storage classes, and ensure there are no node selector constraints preventing pod scheduling.

Q: How do I troubleshoot missing metrics in Grafana? A: Check that all ServiceMonitor resources exist (kubectl get servicemonitor -n monitoring), verify Prometheus targets are UP in the Prometheus UI, and confirm RBAC permissions are correctly applied.

Q: What ports are used for accessing the monitoring interfaces? A: Grafana uses port 3000, Prometheus uses port 9090, and Alertmanager uses port 9093 when accessed via port forwarding.

Q: How do I secure the monitoring stack for production use? A: Change default Grafana credentials, implement authentication and authorization, use NetworkPolicies, configure persistent storage, and set appropriate resource limits.

Q: What are the resource requirements for the monitoring stack? A: Minimum requirements include 4GB RAM and 2 CPU cores available in your cluster, with actual usage scaling based on metrics volume and retention period.