Deploy Prometheus and Grafana on Kubernetes - Complete Monitoring Stack Setup

Setting up a comprehensive monitoring solution is essential for maintaining visibility into your Kubernetes cluster’s health and performance. This guide walks you through deploying Prometheus and Grafana using the kube-prometheus project, which is built around the Prometheus Operator and provides a complete observability stack with metrics collection, visualization, and alerting capabilities.

Prometheus serves as the metrics collection and storage engine, while Grafana provides rich visualization dashboards. Together with Alertmanager, they form a powerful monitoring ecosystem that can help you proactively identify and resolve issues in your Kubernetes environment.

Prerequisites

Before proceeding with this deployment, ensure you have:

A running Kubernetes cluster with kubectl configured
Sufficient cluster resources (at least 4 GB of RAM and 2 CPU cores available)
Administrative access to create namespaces and deploy resources
Git installed for repository cloning
Basic familiarity with Kubernetes concepts and port forwarding
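
The checklist above can be turned into a quick preflight check. The following is a minimal sketch, not an official tool: the function name check_prereqs is made up for this guide, and it only confirms that kubectl and git are on the PATH, that the current user may create namespaces, and what each node reports as allocatable. Whether 4 GB and 2 cores are actually free still needs a human look at the output.

```shell
# Hypothetical helper: verify the prerequisites listed in this guide.
check_prereqs() {
  for tool in kubectl git; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      return 1
    fi
  done
  # "kubectl auth can-i" prints yes or no for the current user.
  if [ "$(kubectl auth can-i create namespaces)" != "yes" ]; then
    echo "current user cannot create namespaces"
    return 1
  fi
  # Print allocatable CPU and memory per node for a manual capacity check.
  kubectl get nodes -o custom-columns=NAME:.metadata.name,CPU:.status.allocatable.cpu,MEMORY:.status.allocatable.memory
  echo "prerequisites look OK"
}
```

Run check_prereqs from the shell you plan to deploy from, so it tests the same kubectl context.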

This guide assumes you’re working with a cluster similar to the setup described in our Complete Kubernetes Cluster Setup Guide with Cilium on Arch Linux.

Understanding the Kube-Prometheus Stack

The kube-prometheus project provides a complete monitoring stack that includes:

Prometheus Operator: Manages Prometheus instances and related resources
Prometheus: Time-series database for metrics collection
Grafana: Visualization and dashboard platform
Alertmanager: Handles alerts sent by Prometheus
Node Exporter: Collects hardware and OS metrics
kube-state-metrics: Generates metrics about Kubernetes API objects

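The components are configured through Kubernetes custom resources (such as Prometheus, Alertmanager, and ServiceMonitor), so the stack is managed with the same kubectl workflow as any other workload in the cluster.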

Step 1: Clone and Prepare the Repository

Start by cloning the official kube-prometheus repository and navigating to the project directory:

git clone https://github.com/prometheus-operator/kube-prometheus.git
cd kube-prometheus

The repository organizes the Kubernetes manifests into two directories:

manifests/setup/ - Contains CustomResourceDefinitions (CRDs) and namespace setup
manifests/ - Contains the actual deployment manifests for all components
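
The default branch tracks current development, so you may prefer to clone a release branch that matches your Kubernetes version. The sketch below assumes a branch name such as release-0.13, which is only an example; check the compatibility table in the repository README before choosing one. The function name clone_kube_prometheus is hypothetical.

```shell
# Hypothetical helper: shallow-clone a chosen branch of kube-prometheus.
# $1 = branch (defaults to main), $2 = target directory.
clone_kube_prometheus() {
  branch="${1:-main}"
  dest="${2:-kube-prometheus}"
  git clone --depth 1 --branch "$branch" \
    https://github.com/prometheus-operator/kube-prometheus.git "$dest"
}

# Example (branch name is an assumption):
#   clone_kube_prometheus release-0.13 && cd kube-prometheus
```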

Step 2: Deploy Namespace and Custom Resource Definitions

First, create the monitoring namespace and deploy the required Custom Resource Definitions (CRDs):

kubectl create -f manifests/setup

This command will:

Create the monitoring namespace
Install Prometheus Operator CRDs
Set up RBAC permissions for the operator (older releases only; newer releases keep these in manifests/)
Establish the foundation for the monitoring stack

Wait for the setup to complete before proceeding to the next step. You can verify the namespace creation with:

kubectl get namespace monitoring
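
Checking the namespace alone does not confirm that the CRDs are usable. A small sketch of how to block until they are established, using kubectl wait; the wrapper name wait_for_crds and the default timeout of 120s are assumptions made for this example.

```shell
# Hypothetical helper: wait until every CRD reports the Established condition.
# $1 = timeout (defaults to 120s).
wait_for_crds() {
  kubectl wait --for condition=Established --all CustomResourceDefinition --namespace=monitoring --timeout="${1:-120s}"
}
```

If this returns a non-zero exit code, the CRDs were not ready in time and Step 3 is likely to fail with "no matches for kind" errors.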

Step 3: Deploy the Prometheus Monitoring Stack

With the foundation in place, deploy the complete monitoring stack:

kubectl create -f manifests

This comprehensive deployment includes:

Prometheus server instances
Grafana dashboard server
Alertmanager for alert routing
Various exporters for metrics collection
ServiceMonitors for automatic target discovery
Default alerting rules and dashboards

Step 4: Monitor Deployment Progress

Track the deployment progress using the watch command to observe all resources as they become available:

watch kubectl get deploy,pod,svc -n monitoring

You’ll see deployments, pods, and services being created and transitioning through various states:

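Pending → Running for pods
0/1 → 1/1 (or 0/2 → 2/2) for deployments
Services should show ClusterIP assignments (headless services show None)

The deployment is complete when all pods show Running status and every deployment reports all of its replicas as ready. Prometheus and Alertmanager run as StatefulSets, so they appear in this output only as pods.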
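
For scripted installs, watching by eye can be replaced with a blocking check. This is a sketch under assumptions: the function name wait_for_monitoring_ready and the 300s default are invented here, and kubectl wait only covers pods that already exist when it is called. The Prometheus and Alertmanager pods are created by the operator a little later, so a second run may be needed.

```shell
# Hypothetical helper: wait for all current pods in the monitoring namespace,
# then list the workloads, including StatefulSets and DaemonSets.
# $1 = timeout (defaults to 300s).
wait_for_monitoring_ready() {
  wait_timeout="${1:-300s}"
  kubectl wait --namespace monitoring --for=condition=Ready pods --all --timeout="$wait_timeout" || return 1
  kubectl get deploy,statefulset,daemonset --namespace monitoring
}
```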

Step 5: Access the Monitoring Interfaces

Once all resources are running, access the web interfaces using port forwarding. Open separate terminal windows for each service:

Grafana Dashboard (Port 3000)

kubectl --namespace monitoring port-forward svc/grafana 3000

Access Grafana at: http://localhost:3000

Default Credentials:

Username: admin
Password: admin

You’ll be prompted to change the password on first login.

Prometheus Query Interface (Port 9090)

kubectl --namespace monitoring port-forward svc/prometheus-k8s 9090

Access Prometheus at: http://localhost:9090

Alertmanager Interface (Port 9093)

kubectl --namespace monitoring port-forward svc/alertmanager-main 9093

Access Alertmanager at: http://localhost:9093
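
With the three port forwards running, the endpoints can also be checked from the command line. The sketch below assumes curl is installed and that the ports match the commands above; it calls the health endpoints each service exposes (/api/health for Grafana, /-/healthy for Prometheus and Alertmanager). The function name check_ui_health is hypothetical.

```shell
# Hypothetical helper: probe the health endpoint of each forwarded service.
# Returns 1 if any endpoint fails to answer successfully.
check_ui_health() {
  status=0
  for url in \
    http://localhost:3000/api/health \
    http://localhost:9090/-/healthy \
    http://localhost:9093/-/healthy
  do
    if curl -fsS --max-time 5 "$url" >/dev/null; then
      echo "ok   $url"
    else
      echo "FAIL $url"
      status=1
    fi
  done
  return "$status"
}
```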

Initial Configuration and Exploration

Grafana Setup

Login: Use admin:admin credentials
Change Password: Set a secure password when prompted
Explore Dashboards: Navigate to Dashboards → Browse to see pre-configured dashboards
Data Sources: Prometheus is pre-configured as the default data source

Key Dashboards to Explore

Kubernetes / Compute Resources / Cluster: Overall cluster resource utilization
Kubernetes / Compute Resources / Namespace (Pods): Pod-level resource metrics
Kubernetes / Compute Resources / Node (Pods): Node-level resource utilization
Kubernetes / Networking / Cluster: Network traffic and connectivity metrics

Prometheus Targets

Navigate to Status → Targets in the Prometheus UI
Verify all targets are in “UP” state
Explore available metrics using the query interface
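
The same check can be done through the Prometheus HTTP API instead of the UI. This sketch assumes the port forward from Step 5 is active and that curl is available; the function name list_down_targets is made up here. It runs the query up == 0, which returns one series per target that Prometheus currently fails to scrape.

```shell
# Hypothetical helper: ask Prometheus for every target whose "up" metric is 0.
# $1 = Prometheus base URL (defaults to the forwarded port).
list_down_targets() {
  prom_url="${1:-http://localhost:9090}"
  curl -fsS --get "$prom_url/api/v1/query" --data-urlencode 'query=up == 0'
}
```

An empty "result" array in the JSON response means no targets are reported as down.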

Troubleshooting Common Issues

Pods Stuck in Pending State

Check cluster resources: kubectl describe node
Verify storage classes if using persistent volumes
Check for node selector constraints
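
To narrow down why pods are Pending, it usually helps to list only those pods together with the scheduler's events. A minimal sketch, with a hypothetical function name and the namespace as an optional argument:

```shell
# Hypothetical helper: show Pending pods and the scheduling failures behind them.
# $1 = namespace (defaults to monitoring).
show_pending_pods() {
  ns="${1:-monitoring}"
  kubectl get pods --namespace "$ns" --field-selector=status.phase=Pending
  kubectl get events --namespace "$ns" --field-selector=reason=FailedScheduling --sort-by=.lastTimestamp
}
```

The event messages typically name the blocking condition, such as insufficient memory or an unbound PersistentVolumeClaim.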

Port Forwarding Connection Issues

Ensure pods are running before port forwarding
Check firewall settings on your local machine
Verify kubectl context is correct

Missing Metrics or Dashboards

Confirm all CRDs were created properly
Check ServiceMonitor resources: kubectl get servicemonitor -n monitoring
Verify RBAC permissions are correctly applied
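
These checks can be collected into one command sequence. The sketch below is an assumption-laden convenience, not part of kube-prometheus: the function name is invented, and it assumes the operator runs as the deployment prometheus-operator with a container of the same name, as in the default manifests.

```shell
# Hypothetical helper: list operator CRDs, scrape and rule objects,
# and the most recent operator log lines.
inspect_monitoring_objects() {
  # CRDs installed by the Prometheus Operator share this API group.
  kubectl get crd -o name | grep -F monitoring.coreos.com
  kubectl get servicemonitor,prometheusrule --namespace monitoring
  kubectl logs --namespace monitoring deploy/prometheus-operator -c prometheus-operator --tail=20
}
```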

Security Considerations

Change default Grafana credentials immediately
Consider implementing authentication and authorization
Use NetworkPolicies to restrict access between components
Regularly update the monitoring stack for security patches

Production Considerations

Persistent Storage

For production deployments, configure persistent volumes:

Prometheus data retention and storage
Grafana dashboard and configuration persistence
Consider backup strategies for monitoring data
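
As an illustration of the Prometheus item, storage and retention can be set on the Prometheus custom resource, which the default manifests name k8s. The sketch below patches that resource directly; the function name, the 50Gi size, and the 15d retention are assumptions, and the storage class must already exist in your cluster. Treat it as an experiment: the patch is overwritten if you re-apply the manifests, adding a volume claim will likely restart Prometheus without its existing data, and a lasting change belongs in the manifests themselves.

```shell
# Hypothetical helper: give the "k8s" Prometheus a persistent volume claim.
# $1 = storage class (required), $2 = size, $3 = retention.
set_prometheus_storage() {
  if [ -z "$1" ]; then
    echo "usage: set_prometheus_storage STORAGE_CLASS [SIZE] [RETENTION]" >&2
    return 2
  fi
  storage_class="$1"
  size="${2:-50Gi}"
  retention="${3:-15d}"
  # Build a JSON merge patch for spec.retention and spec.storage.
  patch=$(printf '{"spec":{"retention":"%s","storage":{"volumeClaimTemplate":{"spec":{"storageClassName":"%s","accessModes":["ReadWriteOnce"],"resources":{"requests":{"storage":"%s"}}}}}}}' \
    "$retention" "$storage_class" "$size")
  kubectl --namespace monitoring patch prometheus k8s --type merge --patch "$patch"
}
```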

Resource Limits

Set appropriate resource requests and limits:

Prometheus: CPU and memory scale with metrics volume
Grafana: Generally lightweight but consider user load
Alertmanager: Minimal resources required
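
Before choosing numbers, it helps to compare actual usage with what is currently requested. A short sketch, assuming the metrics API is being served in your cluster (kubectl top depends on it) and that the Prometheus resource is named k8s as in the default manifests; the function name is hypothetical.

```shell
# Hypothetical helper: show live usage per container, then the resources
# currently set on the Prometheus custom resource.
show_monitoring_usage() {
  kubectl top pods --namespace monitoring --containers --sort-by=memory
  kubectl get prometheus k8s --namespace monitoring -o jsonpath='{.spec.resources}'
  echo
}
```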

High Availability

Consider deploying multiple replicas for:

Prometheus instances for redundancy
Grafana for load distribution
Alertmanager for alert processing reliability
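
To see where you stand before changing anything, the configured replica counts can be read from the resources themselves. This sketch assumes the default object names (prometheus/k8s, alertmanager/main, deployment/grafana); the function name is made up for this guide.

```shell
# Hypothetical helper: print the configured replica count for each component.
show_replica_counts() {
  for target in prometheus/k8s alertmanager/main deployment/grafana; do
    count=$(kubectl --namespace monitoring get "$target" -o jsonpath='{.spec.replicas}')
    echo "$target replicas: ${count:-unset}"
  done
}
```

Note that running more than one Grafana replica generally needs a shared database behind it, so raising that number alone is not enough.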

Additional Integration Opportunities

Cilium Network Observability

If you’re using a Cilium-based cluster, enable Hubble for enhanced network monitoring:

cilium hubble enable --ui
kubectl --namespace kube-system port-forward svc/hubble-ui 12000:80

The Hubble UI runs alongside this stack rather than inside Grafana. To chart Hubble flow data in your Grafana dashboards, Hubble metrics must also be enabled and scraped by Prometheus.

Storage Integration

For clusters using NFS storage provisioning as described in our cluster setup guide, ensure monitoring of storage metrics by checking that the NFS-related ServiceMonitors are properly configured.

Next Steps

Custom Dashboards: Create dashboards specific to your applications
Alert Rules: Configure alerting rules for your specific use cases
Integration: Connect with external systems like Slack or PagerDuty
Metrics Export: Add custom metrics from your applications
Backup Strategy: Implement backup for monitoring data and configurations

Questions Answered in This Document

Q: How do I deploy Prometheus and Grafana on Kubernetes? A: Clone the kube-prometheus repository, deploy the setup manifests first (kubectl create -f manifests/setup), then deploy the main stack (kubectl create -f manifests), and access the interfaces via port forwarding.

Q: What are the default login credentials for Grafana? A: The default username and password are both admin. You’ll be prompted to change the password upon first login.

Q: How do I access the Prometheus and Grafana interfaces? A: Use kubectl port forwarding: kubectl --namespace monitoring port-forward svc/grafana 3000 for Grafana and kubectl --namespace monitoring port-forward svc/prometheus-k8s 9090 for Prometheus.

Q: What components are included in the kube-prometheus stack? A: The stack includes the Prometheus Operator, Prometheus server, Grafana, Alertmanager, Node Exporter, kube-state-metrics, and a set of ServiceMonitors, alerting rules, and dashboards.

Q: How do I verify the monitoring stack is working correctly? A: Monitor the deployment with watch kubectl get deploy,pod,svc -n monitoring and ensure all pods are in Running state, then access the web interfaces to verify data collection.

Q: What should I do if pods are stuck in pending state? A: Check cluster resources with kubectl describe node, verify storage classes, and ensure there are no node selector constraints preventing pod scheduling.

Q: How do I troubleshoot missing metrics in Grafana? A: Check that all ServiceMonitor resources exist (kubectl get servicemonitor -n monitoring), verify Prometheus targets are UP in the Prometheus UI, and confirm RBAC permissions are correctly applied.

Q: What ports are used for accessing the monitoring interfaces? A: Grafana uses port 3000, Prometheus uses port 9090, and Alertmanager uses port 9093 when accessed via port forwarding.

Q: How do I secure the monitoring stack for production use? A: Change default Grafana credentials, implement authentication and authorization, use NetworkPolicies, configure persistent storage, and set appropriate resource limits.

Q: What are the resource requirements for the monitoring stack? A: Minimum requirements include 4GB RAM and 2 CPU cores available in your cluster, with actual usage scaling based on metrics volume and retention period.

Flouda Vault