Technology

Enterprise Deployment

One Helm chart. Every service an enterprise needs for AI — from identity to observability. Deploy the complete Zenera Platform on any Kubernetes cluster in minutes.

What Ships Inside the Chart

Zenera ships as a single Helm chart that deploys a fully integrated AI platform onto any Kubernetes cluster. There is nothing to stitch together, no third-party SaaS dependencies to negotiate, no gaps to fill.

Identity, storage, search, orchestration, observability, LLM routing, development environments, and production-grade application services — all packaged, configured, and connected inside one helm install.

This is not a framework you build on top of. It is a finished product that enterprises deploy, configure, and run.

Block diagram showing Core Platform, Storage, Applications, and Infrastructure components

Every rectangle above is a Kubernetes Deployment, Service, or Job managed by the same chart. Every connection between them — networking, environment variables, shared secrets, health-check ordering — is pre-wired.

High-Level Architecture

Identity & Access Management

Every request that reaches any service inside the platform passes through a centralized authentication and authorization layer. There are no anonymous endpoints. There are no services that manage their own credentials.

How Authentication Works

Sequence diagram showing authentication flow between user, Caddy gateway, Zitadel, and internal services

Zitadel is a full OIDC / OAuth 2.0 identity provider deployed inside the cluster. It handles:

User management — Create organizations, invite users, enforce password policies
Multi-factor authentication — TOTP, U2F/FIDO2, passwordless
SSO federation — Connect enterprise IdPs (Okta, Azure AD, Google Workspace) via SAML or OIDC
Service accounts — Machine-to-machine tokens for CI/CD pipelines and external integrations
Audit logging — Every authentication event is recorded

Gateway-Level Authorization

The Caddy gateway with the caddy-security plugin acts as the enforcement point. Every internal service is exposed through a dedicated port on the gateway. Each port maps to an authorization policy that specifies which roles may access it:

Diagram showing role-based access control mapping roles to protected services

Operators assign roles through Zitadel's admin console. A data scientist might receive authp/user and svc:jupyterhub. A platform engineer might receive admin. Business users might receive only authp/user to access the Chat UI. No code changes required — role grants are instant.

Seamless SSO Across All Services

Once a user authenticates through the gateway portal, their JWT session cookie is valid across every service exposed by the platform. JupyterHub reads identity headers injected by Caddy — no second login. Grafana receives the same identity. LakeFS, MinIO Console, Temporal UI — all behind authentication, all seamless.

LLM Abstraction Layer

The platform does not hardcode any LLM provider. Every model call — from every agent, every notebook, every service — routes through LiteLLM, a unified proxy deployed inside the cluster.

Diagram showing model routing through LiteLLM proxy with cost tracking, rate limiting, and fallback chains

Why This Matters for Enterprises

Provider independence — Switch from OpenAI to Anthropic to Google to a self-hosted model by changing one configuration line. No application code changes.
Cost visibility — Every token is tracked by model, by user, by project. Understand your AI spend before the invoice arrives.
Rate limiting and quotas — Prevent any single team or project from consuming the entire model budget.
Fallback chains — If a provider has an outage, requests automatically route to the next configured provider.
Data sovereignty — Route sensitive workloads to self-hosted models while keeping general tasks on cloud APIs. Same interface for both.
Unified API — Every consumer talks OpenAI-compatible API. Internally, LiteLLM translates to whichever provider's native protocol is needed.

Durable Workflow Engine

Agent execution is not a simple request-response cycle. Agents run for minutes, hours, or days. They wait for external events. They retry on failure. They branch into parallel sub-tasks.

Zenera runs all of this on Temporal, a durable workflow engine deployed inside the same chart.

Temporal architecture diagram showing server services, worker pods, and workflow inspector

What Temporal Provides

Durable execution — If a worker pod crashes mid-task, Temporal replays the workflow from history on another worker. No data loss. No manual restart.
Visibility — Every workflow step, every activity execution, every retry is recorded and inspectable through the Temporal UI.
Scaling — Add worker replicas to handle more concurrent workflows. Temporal distributes tasks automatically.
Long-running agents — Workflows that run for days or weeks do not hold open connections. They checkpoint state and resume on demand.
Scheduling — Cron-like schedules for recurring agent tasks (daily reports, periodic data ingestion, compliance scans).

Data Layer

PostgreSQL — The transactional backbone

All application state lives in PostgreSQL 17: project definitions, agent configurations, user data, workflow metadata, LLM spend logs, and audit trails. The chart deploys PostgreSQL with:

Automatic schema initialization via migration jobs
Connection pooling configured for 300+ concurrent connections
pg_stat_statements enabled for query performance analysis
Separate databases for application data (zenera_backend), Temporal (zenera_temporal), Zitadel, and LiteLLM

Redis — Real-time cache and messaging

Redis 7 provides sub-millisecond caching for hot data paths: session state, model response caches, real-time event streams between services, and rate-limiting counters.

MinIO — S3-compatible object storage

MinIO and LakeFS integration diagram showing isolated buckets, versioning, and git-like branches for data

MinIO provides S3-compatible object storage deployed directly inside the cluster. Agent artifacts, training datasets, generated reports, notebook backups, uploaded documents — all stored locally, all encrypted, all under your control.

LakeFS — Git for your data

LakeFS adds version control on top of MinIO. Every dataset change is a commit. Every experiment runs on a branch. If an agent produces incorrect results, roll back the data to the last known good state. This is not file-level versioning — it operates at the scale of entire data lakes with zero data duplication.

OpenSearch — Full-text and vector search

OpenSearch powers the platform’s search capabilities: full-text search across documents, semantic vector search for RAG (Retrieval-Augmented Generation) pipelines, and structured query capabilities for analytics. Agents use OpenSearch to find relevant context, and the platform uses it to index project artifacts for discovery.

Observability — Complete Visibility From Day One

The platform ships with a fully configured observability stack. There is no setup required — dashboards, data sources, and collection pipelines are deployed and connected automatically.

Observability stack diagram showing collection, storage, and visualization through Prometheus, Loki, Tempo, Pyroscope, and Grafana

Pre-built Grafana dashboards

The chart ships with nine pre-configured dashboards, ready to use on first boot:

Dashboard	What it shows
System Metrics	Node CPU, memory, disk, network across the cluster
Kubernetes Metrics	Pod status, restarts, resource requests vs. actual usage
PostgreSQL Metrics	Connections, query throughput, replication lag, cache hit ratios
Redis Metrics	Commands/sec, memory usage, connected clients, key evictions
OpenSearch Metrics	Index size, query latency, cluster health, shard distribution
MinIO Metrics	Storage used, request rates, bucket-level breakdown
Temporal Metrics	Workflow throughput, activity latency, schedule lag, task-queue depth
LiteLLM Metrics	Token usage per model, cost per team, latency percentiles, error rates
LakeFS Metrics	Branch operations, commit frequency, storage utilization

Four pillars of observability

Pillar	Tool	Purpose
Metrics	Prometheus	Time-series data for every service, node, and container
Logs	Loki	Centralized log aggregation with label-based queries
Traces	Tempo	End-to-end distributed traces across service boundaries
Profiles	Pyroscope	Continuous CPU and memory profiling to identify bottlenecks

Every agent execution, every LLM call, every workflow step produces telemetry that flows into these stores. When something goes wrong, you can trace from a user's chat message through the gateway, into the server, across Temporal activities, through LLM calls, and into the data layer — all from a single Grafana pane.

Development Environment

JupyterHub — Multi-user AI development

JupyterHub architecture showing user pods with pre-configured access to platform services

Every user who opens JupyterHub gets their own isolated Kubernetes pod with:

Persistent storage — Notebooks survive pod restarts. Each user has their own PersistentVolumeClaim.
Automatic notebook backup — MinIO sidecar continuously syncs notebooks to object storage.
Pre-configured credentials — LiteLLM, PostgreSQL, OpenSearch, MinIO, Tempo, Loki — all connection strings and API keys are injected automatically. No manual setup.
Zenera SDK pre-installed — Import and use the full platform SDK immediately.
AI code assistance — Jupyter AI integration backed by LiteLLM, so notebook users have in-IDE model access.
Resource isolation — CPU and memory guarantees and limits per user, configurable via Helm values.

Multi-Team Isolation and Collaboration

This is where the architecture becomes strategic. The platform is designed so that different teams can build completely isolated agentic systems — or share agents, artifacts, and skills across projects when collaboration is valuable.

Isolation Model

Multi-team isolation diagram showing separated projects and agents sharing platform resources

What Is Isolated by Default

Resource	Isolation Mechanism
User identity	Zitadel organizations + role-based access
Agent projects	Per-project database scoping
Data artifacts	LakeFS branches + MinIO bucket policies
Workflow execution	Temporal namespaces + task queues
Notebook environments	Per-user Kubernetes pods with dedicated PVCs
LLM spend	LiteLLM per-team budgets and rate limits
Search indices	Per-project OpenSearch indices
Logs and traces	Label-based filtering in Loki and Tempo

What Teams Can Choose to Share

Resource	Sharing Mechanism
Agent skills	Publish skills to the shared skill library; other projects import them
Artifacts	LakeFS branch merging — merge validated datasets across teams
LLM models	Shared model pool via LiteLLM — all teams benefit from negotiated pricing
Observability	Cross-team dashboards in Grafana for platform-wide health
Knowledge bases	Shared OpenSearch indices for cross-team RAG pipelines

Two teams can operate as if they have completely separate AI platforms — different projects, different data pipelines, different agents — while sharing the same infrastructure, the same LLM budget controls, and the same observability stack. When collaboration makes sense, it is opt-in and controlled.

External Integrations

Nango is deployed inside the chart to provide a unified integration layer for external SaaS systems.

Agents that need to read from Salesforce, write to Jira, pull data from HubSpot, or sync with Slack do so through Nango's managed OAuth flows and unified API.

Nango integration layer diagram showing agents connecting to various SaaS external systems

No manual OAuth token management. No per-integration credential rotation headaches. Nango handles the full lifecycle.

Network Architecture

Every service runs behind ClusterIP (internal only). The only externally reachable component is the Caddy gateway, which exposes dedicated ports for each service — all behind authentication.

Internal services communicate directly via Kubernetes DNS. No service mesh required — the chart pre-wires all service discovery through environment variables and Helm template helpers.

Secrets Management

The chart supports two modes:

External Secrets (Default)

Run the provided create_secrets.sh script before helm install. Secrets are created as a Kubernetes Secret object outside of Helm's lifecycle, ensuring they are not stored in Helm release history.

Helm-Managed Secrets

Set secrets.create=true and provide values via --set or a values-secrets.yaml file. Convenient for development environments.

All secrets flow to services through Kubernetes secretKeyRef — database passwords, API keys, OIDC client credentials, JWT signing keys, MinIO credentials, LakeFS access keys. No secrets are ever stored in ConfigMaps or environment variable literals.

Deployment Topology

Resource Defaults

All PersistentVolumeClaims, CPU/memory requests, and replica counts are configurable via Helm values. The chart ships with sensible defaults for production:

Service	Default PVC	Purpose
PostgreSQL	20 Gi	Application + workflow + LLM tracking data
MinIO	50 Gi	Agent artifacts, datasets, notebook backups
OpenSearch	20 Gi	Search indices, vector embeddings
Prometheus	10 Gi	Time-series metrics retention
Loki	10 Gi	Log retention
Tempo	10 Gi	Trace retention
LakeFS	10 Gi	Data versioning metadata
Redis	5 Gi	Cache persistence
Grafana	5 Gi	Dashboard state, alerting rules
Pyroscope	5 Gi	Profiling data
Zitadel	1 Gi	Identity data

Deployment in Three Commands

The chart handles everything else: database initialization, schema migrations, bucket creation, OIDC application provisioning, Grafana dashboard loading, Prometheus scrape configuration, and service health ordering through init containers.

# 1. Create secrets
./scripts/create_secrets.sh

# 2. Install the platform
helm install zenera-platform ./zenera-platform \
  --namespace zenera \
  --create-namespace

# 3. Access the platform
open http://zenera-local.com:4180

Every Component Is Optional

Every service in the chart has an enabled flag. Running in an environment that already has PostgreSQL? Set postgres.enabled: false and point the connection string to your existing instance. Already have Grafana? Disable it. Want to defer JupyterHub until phase two? Turn it off.

# Example: minimal deployment
postgres:
  enabled: true
redis:
  enabled: true
minio:
  enabled: true
opensearch:
  enabled: true
temporal:
  enabled: true
litellm:
  enabled: true
zitadel:
  enabled: true
gateway:
  enabled: true
zenera:
  server:
    enabled: true
  worker:
    enabled: true
  chat:
    enabled: true

# Disable optional services
jupyterhub:
  enabled: false
grafana:
  enabled: false
pyroscope:
  enabled: false
nango:
  enabled: false

This is not vendor lock-in — it is a platform that meets you where your infrastructure already is.

Why This Matters

Most enterprise AI initiatives stall at the infrastructure phase. Teams spend months piecing together identity providers, model gateways, workflow engines, storage layers, and observability stacks — only to discover they still lack the connective tissue that makes them work together.

Zenera eliminates that phase entirely.

Timeline comparison between traditional infrastructure building and Zenera's rapid deployment approach

"One chart. Every service. Any Kubernetes cluster. Deploy the complete AI platform and start building agents — not infrastructure."

See the Platform in Action

Deploy the complete Zenera Platform on your Kubernetes cluster and start building production-grade agentic systems — not infrastructure.

Request a Demo