Deploy Prometheus and Grafana on Kubernetes - Complete Monitoring Stack Setup
Setting up a comprehensive monitoring solution is essential for maintaining visibility into your Kubernetes cluster’s health and performance. This guide walks you through deploying Prometheus and Grafana with the kube-prometheus project, which builds on the Prometheus Operator to provide a complete observability stack with metrics collection, visualization, and alerting capabilities.
Prometheus serves as the metrics collection and storage engine, while Grafana provides rich visualization dashboards. Together with Alertmanager, they form a powerful monitoring ecosystem that can help you proactively identify and resolve issues in your Kubernetes environment.
Prerequisites
Before proceeding with this deployment, ensure you have:
- A running Kubernetes cluster with kubectl configured
- Sufficient cluster resources (minimum 4GB RAM, 2 CPU cores available)
- Administrative access to create namespaces and deploy resources
- Git installed for repository cloning
- Basic familiarity with Kubernetes concepts and port forwarding
This guide assumes you’re working with a cluster similar to the setup described in our Complete Kubernetes Cluster Setup Guide with Cilium on Arch Linux.
Understanding the Kube-Prometheus Stack
The kube-prometheus project provides a complete monitoring stack that includes:
- Prometheus Operator: Manages Prometheus instances and related resources
- Prometheus: Time-series database for metrics collection
- Grafana: Visualization and dashboard platform
- Alertmanager: Handles alerts sent by Prometheus
- Node Exporter: Collects hardware and OS metrics
- kube-state-metrics: Generates metrics about Kubernetes API objects
This stack follows cloud-native best practices and integrates seamlessly with Kubernetes.
Step 1: Clone and Prepare the Repository
Start by cloning the official kube-prometheus repository and navigating to the project directory:
git clone https://github.com/prometheus-operator/kube-prometheus.git
cd kube-prometheus
The repository contains all the necessary Kubernetes manifests organized in a structured way:
- manifests/setup/: Contains CustomResourceDefinitions (CRDs) and namespace setup
- manifests/: Contains the actual deployment manifests for all components
Step 2: Deploy Namespace and Custom Resource Definitions
First, create the monitoring namespace and deploy the required Custom Resource Definitions (CRDs):
kubectl create -f manifests/setup
This command will:
- Create the monitoring namespace
- Install Prometheus Operator CRDs
- Set up necessary RBAC permissions
- Establish the foundation for the monitoring stack
Wait for the setup to complete before proceeding to the next step. You can verify the namespace creation with:
kubectl get namespace monitoring
Step 3: Deploy the Prometheus Monitoring Stack
With the foundation in place, deploy the complete monitoring stack:
kubectl create -f manifests
This comprehensive deployment includes:
- Prometheus server instances
- Grafana dashboard server
- Alertmanager for alert routing
- Various exporters for metrics collection
- Service monitors for automatic target discovery
- Default alerting rules and dashboards
Step 4: Monitor Deployment Progress
Track the deployment progress using the watch command to observe all resources as they become available:
watch kubectl get deploy,pod,svc -n monitoring
You’ll see deployments, pods, and services being created and transitioning through various states:
- Pods move from Pending → Running
- Deployments move from 0/1 to 1/1 ready replicas
- Services receive ClusterIP assignments
The deployment is complete when all pods show Running status and all deployments report 1/1 ready replicas.
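If you prefer a scriptable check over watching the output by eye, the READY column can be parsed with a small helper. This is an illustrative sketch, not part of kube-prometheus: the `check_ready` function and the sample text below are hypothetical, and in practice you would pipe `kubectl get deploy -n monitoring --no-headers` into the function.

```shell
# Hypothetical helper (not part of kube-prometheus): succeeds only when
# every line's READY column (field 2, e.g. "1/1") shows ready == desired.
check_ready() {
  awk '{ split($2, r, "/"); if (r[1] != r[2]) bad++ } END { exit (bad > 0) }'
}

# Illustrative sample of `kubectl get deploy -n monitoring --no-headers` output.
sample="grafana 1/1 1 1 5m
kube-state-metrics 1/1 1 1 5m
prometheus-operator 1/1 1 1 5m"

if printf '%s\n' "$sample" | check_ready; then
  echo "all deployments ready"
else
  echo "still waiting"
fi
```

Alternatively, kubectl can do this natively with `kubectl wait --for=condition=Available deployment --all -n monitoring --timeout=300s`.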
Step 5: Access the Monitoring Interfaces
Once all resources are running, access the web interfaces using port forwarding. Open separate terminal windows for each service:
Grafana Dashboard (Port 3000)
kubectl --namespace monitoring port-forward svc/grafana 3000
Access Grafana at: http://localhost:3000
Default Credentials:
- Username: admin
- Password: admin
You’ll be prompted to change the password on first login.
Prometheus Query Interface (Port 9090)
kubectl --namespace monitoring port-forward svc/prometheus-k8s 9090
Access Prometheus at: http://localhost:9090
Alertmanager Interface (Port 9093)
kubectl --namespace monitoring port-forward svc/alertmanager-main 9093
Access Alertmanager at: http://localhost:9093
Initial Configuration and Exploration
Grafana Setup
- Login: Use admin:admin credentials
- Change Password: Set a secure password when prompted
- Explore Dashboards: Navigate to Dashboards → Browse to see pre-configured dashboards
- Data Sources: Prometheus is pre-configured as the default data source
Key Dashboards to Explore
- Kubernetes / Compute Resources / Cluster: Overall cluster resource utilization
- Kubernetes / Compute Resources / Namespace (Pods): Pod-level resource metrics
- Kubernetes / Compute Resources / Node (Pods): Node-level resource utilization
- Kubernetes / Networking / Cluster: Network traffic and connectivity metrics
Prometheus Targets
- Navigate to Status → Targets in Prometheus UI
- Verify all targets are in “UP” state
- Explore available metrics using the query interface
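A few starter queries can confirm data is flowing once targets are up. These are standard PromQL expressions over metrics that the stack's default exporters (cAdvisor via the kubelet, and Prometheus's built-in up metric) expose; exact label sets may vary by cluster.

```promql
# CPU usage per namespace (cores), from cAdvisor container metrics
sum by (namespace) (rate(container_cpu_usage_seconds_total[5m]))

# Memory working set per pod in the monitoring namespace
sum by (pod) (container_memory_working_set_bytes{namespace="monitoring"})

# Number of scrape targets currently up, per job
sum by (job) (up)
```

If the last query returns zeros for a job, check the corresponding target's error message under Status → Targets.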
Troubleshooting Common Issues
Pods Stuck in Pending State
- Check cluster resources with kubectl describe node
- Verify storage classes if using persistent volumes
- Check for node selector constraints
Port Forwarding Connection Issues
- Ensure pods are running before port forwarding
- Check firewall settings on your local machine
- Verify kubectl context is correct
Missing Metrics or Dashboards
- Confirm all CRDs were created properly
- Check ServiceMonitor resources: kubectl get servicemonitor -n monitoring
- Verify RBAC permissions are correctly applied
Security Considerations
- Change default Grafana credentials immediately
- Consider implementing authentication and authorization
- Use NetworkPolicies to restrict access between components
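As one concrete example of the NetworkPolicy suggestion, a policy along these lines could limit ingress to Grafana to pods within the monitoring namespace. This is a sketch only: the policy name is hypothetical, and you should verify the pod labels against your deployed resources (kube-prometheus labels Grafana pods with app.kubernetes.io/name: grafana).

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: grafana-restrict-ingress   # illustrative name
  namespace: monitoring
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: grafana
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector: {}   # allow only pods in the monitoring namespace
      ports:
        - protocol: TCP
          port: 3000
```

Note that kubectl port-forward bypasses NetworkPolicies, so local access for administrators still works with a policy like this in place.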
- Regularly update the monitoring stack for security patches
Production Considerations
Persistent Storage
For production deployments, configure persistent volumes:
- Prometheus data retention and storage
- Grafana dashboard and configuration persistence
- Consider backup strategies for monitoring data
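With the Prometheus Operator, retention and storage are configured on the Prometheus custom resource rather than on the StatefulSet directly. A minimal sketch, assuming a storage class named standard exists in your cluster (adjust the class name and size to your environment):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: k8s
  namespace: monitoring
spec:
  retention: 15d
  storage:
    volumeClaimTemplate:
      spec:
        storageClassName: standard   # assumption: replace with your storage class
        resources:
          requests:
            storage: 50Gi
```

The operator will recreate the Prometheus pods with persistent volume claims once this spec is applied.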
Resource Limits
Set appropriate resource requests and limits:
- Prometheus: CPU and memory scale with metrics volume
- Grafana: Generally lightweight but consider user load
- Alertmanager: Minimal resources required
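Requests and limits for Prometheus are likewise set on its custom resource. The values below are illustrative starting points only, not sizing recommendations; actual needs scale with series count and scrape frequency:

```yaml
# Fragment of the Prometheus custom resource spec (monitoring.coreos.com/v1).
# Values are illustrative starting points, not sizing recommendations.
spec:
  resources:
    requests:
      cpu: 500m
      memory: 2Gi
    limits:
      memory: 4Gi
```

Leaving the CPU limit unset is a common choice for Prometheus to avoid throttling during rule evaluation spikes.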
High Availability
Consider deploying multiple replicas for:
- Prometheus instances for redundancy
- Grafana for load distribution
- Alertmanager for alert processing reliability
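Both the Prometheus and Alertmanager custom resources expose a replicas field for this. As a sketch, scaling Alertmanager (which kube-prometheus names main) to three instances looks like:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Alertmanager
metadata:
  name: main
  namespace: monitoring
spec:
  replicas: 3   # instances gossip with each other and deduplicate alerts
```

Setting replicas: 2 on the Prometheus resource similarly runs two identical scraping instances for redundancy; note that Prometheus replicas each store their own copy of the data rather than sharing it.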
Additional Integration Opportunities
Cilium Network Observability
If you’re using a Cilium-based cluster, enable Hubble for enhanced network monitoring:
cilium hubble enable --ui
kubectl --namespace kube-system port-forward svc/hubble-ui 12000:80
This integrates seamlessly with your Grafana dashboards for comprehensive cluster observability.
Storage Integration
For clusters using NFS storage provisioning as described in our cluster setup guide, ensure monitoring of storage metrics by checking that the NFS-related ServiceMonitors are properly configured.
Next Steps
- Custom Dashboards: Create dashboards specific to your applications
- Alert Rules: Configure alerting rules for your specific use cases
- Integration: Connect with external systems like Slack or PagerDuty
- Metrics Export: Add custom metrics from your applications
- Backup Strategy: Implement backup for monitoring data and configurations
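As a starting point for the custom alert rules mentioned above, the operator picks up PrometheusRule resources that match its rule selector (in kube-prometheus, the prometheus: k8s and role: alert-rules labels). The rule name and threshold below are hypothetical examples; the metric comes from kube-state-metrics, which the stack deploys by default:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-custom-rules   # hypothetical name
  namespace: monitoring
  labels:
    prometheus: k8s
    role: alert-rules
spec:
  groups:
    - name: example.rules
      rules:
        - alert: HighPodRestartRate   # illustrative alert
          expr: increase(kube_pod_container_status_restarts_total[1h]) > 5
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: "Pod {{ $labels.pod }} is restarting frequently"
```

Once applied, the rule appears in the Prometheus UI under Status → Rules, and firing alerts are routed through Alertmanager.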
References and Resources
- Computing for Geeks: Setup Prometheus and Grafana on Kubernetes
- kube-prometheus GitHub Repository
- Prometheus Documentation
- Grafana Documentation
- Prometheus Operator Documentation
- Kubernetes Monitoring Best Practices
Questions Answered in This Document
Q: How do I deploy Prometheus and Grafana on Kubernetes?
A: Clone the kube-prometheus repository, deploy the setup manifests first (kubectl create -f manifests/setup), then deploy the main stack (kubectl create -f manifests), and access the interfaces via port forwarding.
Q: What are the default login credentials for Grafana?
A: The default username and password are both admin. You’ll be prompted to change the password upon first login.
Q: How do I access the Prometheus and Grafana interfaces?
A: Use kubectl port forwarding: kubectl --namespace monitoring port-forward svc/grafana 3000 for Grafana and kubectl --namespace monitoring port-forward svc/prometheus-k8s 9090 for Prometheus.
Q: What components are included in the kube-prometheus stack?
A: The stack includes Prometheus Operator, Prometheus server, Grafana, Alertmanager, Node Exporter, kube-state-metrics, and various service monitors and alerting rules.
Q: How do I verify the monitoring stack is working correctly?
A: Monitor the deployment with watch kubectl get deploy,pod,svc -n monitoring and ensure all pods are in the Running state, then access the web interfaces to verify data collection.
Q: What should I do if pods are stuck in pending state?
A: Check cluster resources with kubectl describe node, verify storage classes, and ensure there are no node selector constraints preventing pod scheduling.
Q: How do I troubleshoot missing metrics in Grafana?
A: Check that all ServiceMonitor resources exist (kubectl get servicemonitor -n monitoring), verify Prometheus targets are UP in the Prometheus UI, and confirm RBAC permissions are correctly applied.
Q: What ports are used for accessing the monitoring interfaces?
A: Grafana uses port 3000, Prometheus uses port 9090, and Alertmanager uses port 9093 when accessed via port forwarding.
Q: How do I secure the monitoring stack for production use?
A: Change default Grafana credentials, implement authentication and authorization, use NetworkPolicies, configure persistent storage, and set appropriate resource limits.
Q: What are the resource requirements for the monitoring stack?
A: Minimum requirements include 4GB RAM and 2 CPU cores available in your cluster, with actual usage scaling based on metrics volume and retention period.