
local-path-provisioner for Storage

Date: 2026-03-23
Status: Accepted (pragmatic; revisit when options change)
Context: Providing persistent storage for platform services (Prometheus, Grafana, Loki, Harbor, PostgreSQL)

Decision

Use local-path-provisioner for all persistent volumes. Data is stored on the local filesystem of the node where the pod runs.
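To make the decision concrete, here is a minimal sketch of what a claim against this provisioner might look like, built as a plain Python dict and printed as JSON. The storage class name `local-path` is the upstream default and is assumed to be unchanged in this cluster; the claim name `grafana-data` is hypothetical, and the 2Gi size is taken from the Grafana entry in the volume list in this record.

```python
import json


def local_path_pvc(name: str, size: str, storage_class: str = "local-path") -> dict:
    """Build a PersistentVolumeClaim manifest as a plain dict.

    storage_class defaults to "local-path", which is an assumption about
    how the provisioner was installed in this cluster.
    """
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            # Node-local volumes are mounted by one node at a time.
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    # "grafana-data" is a hypothetical claim name used for illustration.
    print(json.dumps(local_path_pvc("grafana-data", "2Gi"), indent=2))
```

The JSON output can be reviewed as-is or piped to `kubectl apply -f -`, since kubectl accepts JSON manifests as well as YAML.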

Rationale

This was the most pragmatic option available at the time. The primary alternative, CloudStack's CSI driver, requires API connectivity from cluster nodes to the CloudStack management server, which is not currently available in our EduCloud environment.

The risk of data loss is undesirable but tolerable:

  • Student data is ephemeral and reset per semester
  • Platform state (Grafana dashboards, Prometheus metrics) can be rebuilt from IaC
  • Harbor images can be rebuilt from source via CI

We have not fully investigated all alternatives. If EduCloud/CloudStack API access becomes available (the FICT infrastructure team may resolve this), storage should be re-evaluated.

Alternatives Considered

CloudStack CSI driver

  • ✅ Replicated storage, survives node failures
  • ❌ Blocked: cluster nodes cannot reach the CloudStack API (145.220.73.10:443)
  • Deferred: Revisit if FICT resolves network connectivity
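When revisiting this option, a quick way to see whether the blocker still applies is a plain TCP reachability check run from a cluster node. The sketch below uses only the Python standard library; the helper name `can_reach` and the 3-second timeout are illustrative choices, not part of any existing tooling. A successful connection shows only that the port is reachable, not that the CSI driver would authenticate or work.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable networks and timeouts.
        return False
```

For example, calling `can_reach("145.220.73.10", 443)` on a node would be expected to return `False` while the connectivity problem described above persists.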

NFS server

  • ✅ Shared storage accessible from any node
  • ❌ Additional infrastructure to deploy and maintain
  • ❌ Single point of failure (the NFS server)
  • Not investigated in depth

Longhorn / Rook-Ceph

  • ✅ Replicated block storage within the cluster
  • ❌ Significant resource overhead for our cluster size
  • ❌ Operational complexity (distributed storage management)
  • Not investigated in depth

Consequences

Positive

  • Zero additional infrastructure — works out of the box
  • Simple to understand and debug

Negative

  • Data loss if a pod is rescheduled to a different node: PVs are node-local, so the data stays behind on the original node
  • No replication or backup built-in
  • Harbor registry data, PostgreSQL databases, and monitoring data are all at risk if a node fails
  • Changing the node set (especially removing or replacing nodes) may strand PVs on nodes that no longer exist
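The node-failure risk can be illustrated with a small sketch that lists which volumes would be lost if one node went away. The placement below is entirely hypothetical: the node names `node-a` and `node-b` and the assignment of services to them are invented for illustration, while the sizes (in GiB) come from the volume list in this record.

```python
# Hypothetical placement: service -> (node holding its data, size in GiB).
# Node names and assignments are invented; only the sizes come from this record.
PLACEMENT = {
    "prometheus": ("node-a", 10),
    "grafana": ("node-a", 2),
    "loki": ("node-b", 10),
    "harbor-registry": ("node-b", 50),
    "harbor-database": ("node-b", 2),
}


def at_risk(placement: dict, failed_node: str) -> tuple[dict, int]:
    """Return the volumes whose data lives on failed_node and their total GiB."""
    lost = {
        name: size
        for name, (node, size) in placement.items()
        if node == failed_node
    }
    return lost, sum(lost.values())


if __name__ == "__main__":
    lost, total = at_risk(PLACEMENT, "node-b")
    print(f"{sorted(lost)} would be lost ({total} GiB)")
```

Under this made-up placement, losing `node-b` would take the Harbor registry with it, which is the largest single volume in the list.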

Monitoring

Current storage consumers and their volumes:

| Service | Volume Size |
|---|---|
| Prometheus | 10Gi |
| Grafana | 2Gi |
| Loki | 10Gi |
| Harbor (registry) | 50Gi |
| Harbor (database) | 2Gi |
| PostgreSQL (prj2) | |
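As a rough capacity check, the listed sizes can be added up. This is a minimal sketch that assumes every size is given in `Gi`; the helper `gib` is hypothetical. PostgreSQL (prj2) is left out because no size is given for it here.

```python
# Sizes as listed in this record; PostgreSQL (prj2) has no size listed and is excluded.
VOLUMES = {
    "Prometheus": "10Gi",
    "Grafana": "2Gi",
    "Loki": "10Gi",
    "Harbor (registry)": "50Gi",
    "Harbor (database)": "2Gi",
}


def gib(quantity: str) -> int:
    """Convert a quantity such as '10Gi' to an integer number of GiB."""
    if not quantity.endswith("Gi"):
        raise ValueError(f"unsupported unit in {quantity!r}")
    return int(quantity[:-2])


total = sum(gib(q) for q in VOLUMES.values())
print(f"{total}Gi requested, excluding PostgreSQL (prj2)")
```

The listed volumes add up to 74Gi of requested local disk, which has to fit on whichever nodes end up hosting them.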