Zero-trust networking operates on a simple principle: no request is trusted based on its network origin. A request from inside your VPC receives the same scrutiny as a request from the public internet. For APIs, this translates to verifying identity, validating authorization, enforcing rate limits, and inspecting payloads on every request — regardless of whether the caller is an external client, an internal microservice, or a batch job running in the same cluster.
This post covers the specific technologies and configurations required to implement zero-trust API security on Kubernetes: Istio service mesh for automatic mTLS, JWT validation at the ingress and mesh level, rate limiting with both local and global strategies, and OPA for fine-grained authorization decisions.
mTLS: Mutual Authentication Between Services
Standard TLS (what your browser uses for HTTPS) authenticates the server to the client: you verify that api.example.com is who it claims to be. Mutual TLS (mTLS) adds the reverse: the server also authenticates the client. Both parties present certificates, and both verify the other's identity.
In a microservice architecture, mTLS between services means:
- ✓ Service A proves its identity to Service B when making a request
- ✓ Service B proves its identity to Service A during the same TLS handshake
- ✓ The communication channel is encrypted
- ✓ No service can impersonate another without possessing its private key
Istio Service Mesh for Automatic mTLS
Istio injects a sidecar proxy (Envoy) into every pod. These proxies handle mTLS automatically — application code doesn't need to manage certificates or TLS configuration.
Install Istio with strict mTLS by default:
istioctl install --set profile=default \
--set meshConfig.defaultConfig.holdApplicationUntilProxyStarts=true
# Enable sidecar injection for your namespace
kubectl label namespace production istio-injection=enabled
Enforce strict mTLS across the mesh:
# peer-authentication.yaml
apiVersion: security.istio.io/v1
kind: PeerAuthentication
metadata:
name: default
namespace: istio-system # Mesh-wide policy
spec:
mtls:
mode: STRICT
With STRICT mode, any connection attempt without a valid mTLS certificate is rejected. This applies to all service-to-service communication within the mesh.
Per-service exceptions (when necessary):
Some services need to accept non-mTLS traffic — for example, a health check endpoint called by a load balancer that doesn't participate in the mesh:
apiVersion: security.istio.io/v1
kind: PeerAuthentication
metadata:
name: health-check-exception
namespace: production
spec:
selector:
matchLabels:
app: api-gateway
mtls:
mode: STRICT
portLevelMtls:
8080:
mode: PERMISSIVE # Allow non-mTLS on health check port only
Certificate Management with cert-manager
Istio manages its own certificate authority (historically called Citadel, now built into istiod) for mesh-internal mTLS. For external-facing TLS certificates, cert-manager automates certificate issuance and renewal:
# cert-manager ClusterIssuer with Let's Encrypt
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
name: letsencrypt-prod
spec:
acme:
server: https://acme-v02.api.letsencrypt.org/directory
email: [email protected]
privateKeySecretRef:
name: letsencrypt-prod-account-key
solvers:
- http01:
ingress:
class: istio
---
# Certificate for API domain
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
name: api-tls
namespace: istio-system
spec:
secretName: api-tls-cert
issuerRef:
name: letsencrypt-prod
kind: ClusterIssuer
dnsNames:
- api.example.com
- "*.api.example.com"
renewBefore: 720h # Renew 30 days before expiry
SPIFFE/SPIRE for Workload Identity
For environments requiring stronger workload identity than Istio's built-in CA, SPIFFE (Secure Production Identity Framework for Everyone) provides a standardized identity framework. SPIRE is the reference implementation.
Each workload receives a SPIFFE ID (e.g., spiffe://example.com/ns/production/sa/payment-service) and a short-lived X.509 certificate (an X.509-SVID). Unlike Istio's built-in CA, SPIRE supports multi-cluster and hybrid environments, and integrates with external identity providers.
# SPIRE ClusterSPIFFEID for a service
apiVersion: spire.spiffe.io/v1alpha1
kind: ClusterSPIFFEID
metadata:
name: payment-service
spec:
spiffeIDTemplate: "spiffe://{{ .TrustDomain }}/ns/{{ .PodMeta.Namespace }}/sa/{{ .PodSpec.ServiceAccountName }}"
podSelector:
matchLabels:
app: payment-service
namespaceSelector:
matchLabels:
environment: production
JWT Validation at the Mesh Level
JSON Web Tokens (JWTs) carry identity and authorization claims. Validating JWTs at the Istio ingress gateway or sidecar proxy means your application code doesn't need to implement JWT verification — the mesh handles it before the request reaches your service.
Istio RequestAuthentication
RequestAuthentication tells Istio how to validate incoming JWTs:
# request-authentication.yaml
apiVersion: security.istio.io/v1
kind: RequestAuthentication
metadata:
name: jwt-auth
namespace: production
spec:
selector:
matchLabels:
app: api-gateway
jwtRules:
- issuer: "https://auth.example.com/"
jwksUri: "https://auth.example.com/.well-known/jwks.json"
audiences:
- "api.example.com"
forwardOriginalToken: true
outputPayloadToHeader: "x-jwt-payload"
- issuer: "https://accounts.google.com"
jwksUri: "https://www.googleapis.com/oauth2/v3/certs"
audiences:
- "api.example.com"
This configuration:
- ✓ Validates tokens from two issuers (your auth service and Google)
- ✓ Verifies the aud (audience) claim matches your API
- ✓ Forwards the original token to the upstream service
- ✓ Extracts the JWT payload into a header for downstream use
Important: RequestAuthentication only validates tokens that are present. It does not reject requests without tokens. To require authentication, pair it with an AuthorizationPolicy.
Istio AuthorizationPolicy
AuthorizationPolicy controls which requests are allowed based on JWT claims, source identity, or request attributes:
# authorization-policy.yaml
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
name: require-jwt
namespace: production
spec:
selector:
matchLabels:
app: api-gateway
action: DENY
rules:
- from:
- source:
notRequestPrincipals: ["*"]
to:
- operation:
notPaths: ["/health", "/ready", "/metrics"]
---
# Role-based access
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
name: admin-endpoints
namespace: production
spec:
selector:
matchLabels:
app: api-gateway
action: ALLOW
rules:
- from:
- source:
requestPrincipals: ["https://auth.example.com/*"]
to:
- operation:
paths: ["/admin/*"]
when:
- key: request.auth.claims[role]
values: ["admin"]
The first policy denies any request without a valid JWT (except health/ready/metrics endpoints). The second policy restricts /admin/* endpoints to tokens containing "role": "admin".
Token Architecture
A robust token architecture separates short-lived access tokens from longer-lived refresh tokens:
| Token Type | Lifetime | Contains | Storage |
|---|---|---|---|
| Access token | 15–60 minutes | User ID, roles, permissions, tenant ID | Authorization header (Bearer) |
| Refresh token | 7–30 days | User ID, token family ID | HTTP-only secure cookie or secure storage |
| API key | Long-lived (rotate quarterly) | Client ID, tier, rate limit config | Authorization header |
Token scoping: access tokens should contain the minimum claims needed for authorization. A token for the billing API doesn't need permissions for the user management API. Scoped tokens limit the blast radius of a compromised token.
Audience validation: every token should specify its intended audience (aud claim), and every API should verify it. A token issued for billing.api.example.com should be rejected by users.api.example.com.
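As a sketch of what audience enforcement looks like when done in application code, the following decodes a JWT payload and checks the aud claim. It assumes the token's signature has already been verified by a proper JWT library (or by Istio at the ingress); the helper names are illustrative, and Buffer is Node's global:

```javascript
// Illustrative audience check. Signature verification is assumed to have
// happened already — this only shows the aud comparison itself.
function decodePayload(jwt) {
  const payloadB64 = jwt.split('.')[1];
  return JSON.parse(Buffer.from(payloadB64, 'base64url').toString('utf8'));
}

function audienceAllowed(jwt, expectedAudience) {
  const { aud } = decodePayload(jwt);
  // Per RFC 7519 §4.1.3, aud may be a single string or an array of strings.
  const audiences = Array.isArray(aud) ? aud : [aud];
  return audiences.includes(expectedAudience);
}
```

With this check, a token minted for billing.api.example.com fails validation at users.api.example.com even though its signature is valid.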
Rate Limiting
Rate limiting prevents abuse, ensures fair resource allocation, and protects backend services from traffic spikes. In a Kubernetes-native stack, you have two options:
Local Rate Limiting (Envoy)
Local rate limiting runs in each Envoy sidecar independently. It's simple to configure and doesn't require external dependencies, but each pod maintains its own counter — 5 replicas with a 100 req/min limit allows 500 req/min total.
# envoy-filter-local-ratelimit.yaml
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
name: local-ratelimit
namespace: production
spec:
workloadSelector:
labels:
app: api-gateway
configPatches:
- applyTo: HTTP_FILTER
match:
context: SIDECAR_INBOUND
listener:
filterChain:
filter:
name: envoy.filters.network.http_connection_manager
patch:
operation: INSERT_BEFORE
value:
name: envoy.filters.http.local_ratelimit
typed_config:
"@type": type.googleapis.com/udpa.type.v1.TypedStruct
type_url: type.googleapis.com/envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit
value:
stat_prefix: http_local_rate_limiter
token_bucket:
max_tokens: 100
tokens_per_fill: 100
fill_interval: 60s
filter_enabled:
runtime_key: local_rate_limit_enabled
default_value:
numerator: 100
denominator: HUNDRED
filter_enforced:
runtime_key: local_rate_limit_enforced
default_value:
numerator: 100
denominator: HUNDRED
response_headers_to_add:
- append_action: OVERWRITE_IF_EXISTS_OR_ADD
header:
key: x-local-rate-limit
value: "true"
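The token_bucket block above (max_tokens, tokens_per_fill, fill_interval) describes a classic token bucket. A minimal JavaScript sketch of those semantics, purely illustrative rather than Envoy's actual implementation:

```javascript
// Token-bucket sketch mirroring Envoy's max_tokens / tokens_per_fill /
// fill_interval fields. The bucket starts full; each admitted request
// consumes one token, and tokensPerFill tokens are added per interval.
class TokenBucket {
  constructor(maxTokens, tokensPerFill, fillIntervalMs) {
    this.maxTokens = maxTokens;
    this.tokensPerFill = tokensPerFill;
    this.fillIntervalMs = fillIntervalMs;
    this.tokens = maxTokens;
    this.lastFill = 0;
  }

  // Returns true if the request is admitted, false if rate limited.
  tryAcquire(nowMs) {
    const intervals = Math.floor((nowMs - this.lastFill) / this.fillIntervalMs);
    if (intervals > 0) {
      this.tokens = Math.min(this.maxTokens, this.tokens + intervals * this.tokensPerFill);
      this.lastFill += intervals * this.fillIntervalMs;
    }
    if (this.tokens > 0) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

With max_tokens: 100, tokens_per_fill: 100, and fill_interval: 60s, each sidecar admits at most 100 requests per minute, independently of its peers.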
Global Rate Limiting (External Service)
Global rate limiting uses a centralized service (typically Redis-backed) that all instances share. This provides accurate, cluster-wide rate limits.
Rate limit service deployment:
# ratelimit-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: ratelimit
namespace: production
spec:
replicas: 2
selector:
matchLabels:
app: ratelimit
template:
metadata:
labels:
app: ratelimit
spec:
containers:
- name: ratelimit
image: envoyproxy/ratelimit:master
ports:
- containerPort: 8081 # gRPC
env:
- name: REDIS_SOCKET_TYPE
value: "tcp"
- name: REDIS_URL
value: "redis.production.svc.cluster.local:6379"
- name: RUNTIME_ROOT
value: "/data"
- name: RUNTIME_SUBDIRECTORY
value: "ratelimit"
- name: USE_STATSD
value: "false"
volumeMounts:
- name: config
mountPath: /data/ratelimit/config
volumes:
- name: config
configMap:
name: ratelimit-config
Rate limit configuration per API tier:
# ratelimit-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: ratelimit-config
namespace: production
data:
config.yaml: |
domain: api-gateway
descriptors:
# Per-customer rate limits based on API tier
- key: api_tier
value: free
rate_limit:
unit: minute
requests_per_unit: 60
descriptors:
- key: path
value: "/api/v1/search"
rate_limit:
unit: minute
requests_per_unit: 10
- key: api_tier
value: starter
rate_limit:
unit: minute
requests_per_unit: 600
descriptors:
- key: path
value: "/api/v1/search"
rate_limit:
unit: minute
requests_per_unit: 100
- key: api_tier
value: enterprise
rate_limit:
unit: minute
requests_per_unit: 6000
descriptors:
- key: path
value: "/api/v1/search"
rate_limit:
unit: minute
requests_per_unit: 1000
# Global safety limit
- key: generic_key
value: default
rate_limit:
unit: second
requests_per_unit: 5000
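The nested descriptors resolve from most specific to least specific: a request matching both api_tier and path gets the endpoint-specific limit, otherwise the tier default applies. A sketch of that lookup (the structure and names are illustrative, not the envoyproxy/ratelimit implementation):

```javascript
// Illustrative resolution of the config above: endpoint-specific limits
// override the tier default; unknown tiers get no limit entry.
const limits = {
  free:       { default: { unit: 'minute', requests: 60 },
                '/api/v1/search': { unit: 'minute', requests: 10 } },
  starter:    { default: { unit: 'minute', requests: 600 },
                '/api/v1/search': { unit: 'minute', requests: 100 } },
  enterprise: { default: { unit: 'minute', requests: 6000 },
                '/api/v1/search': { unit: 'minute', requests: 1000 } },
};

function resolveLimit(tier, path) {
  const tierLimits = limits[tier];
  if (!tierLimits) return null;
  return tierLimits[path] || tierLimits.default;
}
```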
Istio EnvoyFilter to connect to the rate limit service:
# envoy-filter-global-ratelimit.yaml
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
name: global-ratelimit
namespace: production
spec:
workloadSelector:
labels:
app: api-gateway
configPatches:
- applyTo: HTTP_FILTER
match:
context: SIDECAR_INBOUND
listener:
filterChain:
filter:
name: envoy.filters.network.http_connection_manager
patch:
operation: INSERT_BEFORE
value:
name: envoy.filters.http.ratelimit
typed_config:
"@type": type.googleapis.com/envoy.extensions.filters.http.ratelimit.v3.RateLimit
domain: api-gateway
failure_mode_deny: false
rate_limit_service:
grpc_service:
envoy_grpc:
cluster_name: rate_limit_cluster
transport_api_version: V3
request_type: external
Setting failure_mode_deny: false means that if the rate limit service is unavailable, requests are allowed through. This is a deliberate choice — a rate limiter outage shouldn't cause a complete API outage. Monitor the rate limit service availability separately.
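The fail-open behavior can be pictured as a thin wrapper around the rate limit check. In this sketch, checkFn is a hypothetical stand-in for the call to the rate limit service:

```javascript
// Fail-open wrapper mirroring failure_mode_deny: false — if the rate limit
// service is unreachable, admit the request rather than fail the API call.
// checkFn is a hypothetical stand-in returning { allowed: boolean } or
// throwing on outage.
function rateLimitFailOpen(checkFn, descriptor) {
  try {
    return checkFn(descriptor).allowed;
  } catch (err) {
    // Rate limiter outage: allow the request, but surface the failure
    // so monitoring can alert on it.
    console.error('rate limit service unavailable:', err.message);
    return true;
  }
}
```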
Request-Level Authorization with OPA
JWT claims handle identity and coarse-grained roles. For fine-grained authorization — "can this user access this specific resource?" — OPA provides a policy engine that evaluates complex rules without embedding authorization logic in application code.
OPA sidecar deployment pattern:
# deployment-with-opa.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: api-service
spec:
template:
spec:
containers:
- name: api-service
image: api-service:v1.2.3
ports:
- containerPort: 8080
- name: opa
image: openpolicyagent/opa:latest-envoy
ports:
- containerPort: 9191 # Decision API
- containerPort: 8181 # Management API
args:
- "run"
- "--server"
- "--addr=0.0.0.0:8181"
- "--diagnostic-addr=0.0.0.0:8282"
- "--set=plugins.envoy_ext_authz_grpc.addr=:9191"
- "--set=plugins.envoy_ext_authz_grpc.path=envoy/authz/allow"
- "--set=decision_logs.console=true"
- "/policies"
volumeMounts:
- name: opa-policies
mountPath: /policies
volumes:
- name: opa-policies
configMap:
name: opa-policies
OPA policy for tenant isolation:
# policies/tenant_isolation.rego
package envoy.authz
import rego.v1
default allow := false
allow if {
is_valid_token
is_authorized_for_resource
}
is_valid_token if {
    token := input.attributes.request.http.headers.authorization
    startswith(token, "Bearer ")
    jwt := substring(token, 7, -1)
    # io.jwt.decode only parses the token — it does not verify the
    # signature. Signature verification is assumed to have happened
    # upstream (Istio RequestAuthentication); here we only check expiry.
    [_, payload, _] := io.jwt.decode(jwt)
    payload.exp > time.now_ns() / 1e9
}
# Extract tenant ID from JWT
tenant_id := tid if {
token := input.attributes.request.http.headers.authorization
jwt := substring(token, 7, -1)
[_, payload, _] := io.jwt.decode(jwt)
tid := payload.tenant_id
}
# Extract tenant ID from request path (e.g., /api/v1/tenants/{tenant_id}/resources)
path_tenant_id := ptid if {
path := input.attributes.request.http.path
parts := split(path, "/")
parts[3] == "tenants"
ptid := parts[4]
}
# Tenant can only access their own resources
is_authorized_for_resource if {
path_tenant_id
tenant_id == path_tenant_id
}
# Non-tenant-scoped paths are allowed for any authenticated user
is_authorized_for_resource if {
not path_tenant_id
}
# Admin override — admins can access any tenant's resources
is_authorized_for_resource if {
token := input.attributes.request.http.headers.authorization
jwt := substring(token, 7, -1)
[_, payload, _] := io.jwt.decode(jwt)
"admin" in payload.roles
}
This policy enforces strict tenant isolation: a request to /api/v1/tenants/tenant-123/resources is only allowed if the JWT's tenant_id claim is tenant-123 — or if the caller has the admin role.
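For reference, the Envoy ext_authz input that this policy evaluates looks roughly like the following (abbreviated; the exact CheckRequest shape depends on your Envoy and OPA plugin versions, and the token value is a placeholder):

```
{
  "attributes": {
    "request": {
      "http": {
        "method": "GET",
        "path": "/api/v1/tenants/tenant-123/resources",
        "headers": {
          "authorization": "Bearer <access-token>"
        }
      }
    }
  }
}
```

The field paths here correspond to the input.attributes.request.http references in the Rego rules above.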
API Threat Protection
Beyond authentication and authorization, APIs need protection against malformed and malicious payloads.
Payload Validation with JSON Schema
Validate request bodies against a JSON schema before they reach your application:
# Istio EnvoyFilter for request body validation
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
name: payload-limits
namespace: production
spec:
workloadSelector:
labels:
app: api-gateway
configPatches:
- applyTo: HTTP_FILTER
match:
context: SIDECAR_INBOUND
patch:
operation: INSERT_BEFORE
value:
name: envoy.filters.http.buffer
typed_config:
"@type": type.googleapis.com/envoy.extensions.filters.http.buffer.v3.Buffer
max_request_bytes: 1048576 # 1MB max request size
For JSON schema validation, implement it at the API gateway level (e.g., Kong, Ambassador) or as middleware in your application:
// Express middleware for JSON schema validation
const Ajv = require('ajv');
const ajv = new Ajv({ allErrors: true, removeAdditional: true });
const createUserSchema = {
type: 'object',
required: ['email', 'name'],
properties: {
email: { type: 'string', format: 'email', maxLength: 254 },
name: { type: 'string', minLength: 1, maxLength: 100 },
role: { type: 'string', enum: ['viewer', 'editor', 'admin'] }
},
additionalProperties: false
};
function validateBody(schema) {
const validate = ajv.compile(schema);
return (req, res, next) => {
if (!validate(req.body)) {
return res.status(400).json({
error: 'Validation failed',
details: validate.errors.map(e => ({
field: e.instancePath,
message: e.message
}))
});
}
next();
};
}
app.post('/api/v1/users', validateBody(createUserSchema), createUser);
Observability: Monitoring the Security Stack
A zero-trust stack generates a large volume of security-relevant telemetry. Structured monitoring is essential for detecting anomalies and debugging legitimate access issues.
Key Metrics to Track
# Prometheus rules for API security monitoring
groups:
- name: api-security
rules:
- alert: HighAuthFailureRate
expr: |
sum(rate(istio_requests_total{response_code="401"}[5m])) by (destination_service)
/
sum(rate(istio_requests_total[5m])) by (destination_service)
> 0.1
for: 5m
labels:
severity: warning
annotations:
summary: "Auth failure rate above 10% for {{ $labels.destination_service }}"
- alert: RateLimitExceeded
expr: |
sum(rate(istio_requests_total{response_code="429"}[5m])) by (source_principal)
> 100
for: 2m
labels:
severity: info
annotations:
summary: "Client {{ $labels.source_principal }} exceeding rate limits"
- alert: OPADecisionLatencyHigh
expr: |
histogram_quantile(0.99, rate(opa_decision_duration_seconds_bucket[5m])) > 0.05
for: 5m
labels:
severity: warning
annotations:
summary: "OPA decision latency p99 above 50ms"
- alert: mTLSHandshakeFailures
expr: |
sum(rate(envoy_ssl_connection_error[5m])) by (pod) > 0
for: 2m
labels:
severity: critical
annotations:
summary: "mTLS handshake failures on {{ $labels.pod }}"
Distributed Tracing for Auth Decisions
Include authorization decision metadata in distributed traces:
# Istio telemetry configuration
apiVersion: telemetry.istio.io/v1alpha1
kind: Telemetry
metadata:
name: api-tracing
namespace: production
spec:
tracing:
- providers:
- name: jaeger
randomSamplingPercentage: 10
customTags:
auth.principal:
header:
name: x-jwt-payload
auth.tenant_id:
header:
name: x-tenant-id
rate_limit.remaining:
header:
name: x-ratelimit-remaining
Defense in Depth: The Complete Stack
Each layer addresses specific threat vectors:
┌─────────────────────────────────────────────────────┐
│ Layer 1: Network (WAF / Cloud Load Balancer) │
│ • DDoS protection │
│ • Bot detection │
│ • IP reputation filtering │
│ • Geographic restrictions │
├─────────────────────────────────────────────────────┤
│ Layer 2: Ingress (Istio Gateway) │
│ • TLS termination (external) │
│ • JWT validation │
│ • Global rate limiting │
│ • Request size limits │
├─────────────────────────────────────────────────────┤
│ Layer 3: Mesh (Istio Sidecars) │
│ • mTLS between all services │
│ • Service-level AuthorizationPolicies │
│ • Local rate limiting │
├─────────────────────────────────────────────────────┤
│ Layer 4: Application (OPA + Middleware) │
│ • Tenant isolation │
│ • Fine-grained resource authorization │
│ • JSON schema validation │
│ • Business logic authorization │
├─────────────────────────────────────────────────────┤
│ Layer 5: Data (Encryption + Access Control) │
│ • Encryption at rest (KMS) │
│ • Row-level security (database) │
│ • Field-level encryption for sensitive data │
│ • Audit logging for data access │
└─────────────────────────────────────────────────────┘
A request that passes Layer 1 still faces JWT validation at Layer 2, mTLS verification at Layer 3, tenant isolation at Layer 4, and data-level access controls at Layer 5. Compromising any single layer doesn't grant unrestricted access.
Case Study: Multi-Tenant B2B API Platform
Background
A multi-tenant B2B API platform serving 200+ customers with three pricing tiers (Free, Starter, Enterprise) needed a comprehensive security stack. Customers ranged from individual developers on the free tier to financial institutions on enterprise plans processing thousands of requests per second. The Stripe Systems team designed and implemented the Kubernetes-native security architecture.
Requirements
- ✓ mTLS between all internal services — no plaintext traffic within the cluster
- ✓ JWT validation at ingress — reject unauthenticated requests before they reach application code
- ✓ Per-customer rate limits — different limits per pricing tier, per API endpoint
- ✓ Tenant isolation — tenant A cannot access tenant B's data, enforceable at the infrastructure level
- ✓ Audit trail — every API call logged with caller identity, tenant context, and authorization decision
Implementation
mTLS with Istio:
Strict mTLS across the production namespace. The PeerAuthentication policy rejected any non-mTLS connection:
apiVersion: security.istio.io/v1
kind: PeerAuthentication
metadata:
name: strict-mtls
namespace: production
spec:
mtls:
mode: STRICT
Service-to-service communication verification ensured that only authorized services could call each other:
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
name: payment-service-callers
namespace: production
spec:
selector:
matchLabels:
app: payment-service
action: ALLOW
rules:
- from:
- source:
principals:
- "cluster.local/ns/production/sa/order-service"
- "cluster.local/ns/production/sa/billing-service"
to:
- operation:
methods: ["POST", "GET"]
paths: ["/api/v1/payments/*"]
Only order-service and billing-service could call payment-service. Any other service — even with valid mTLS — received a 403.
JWT Validation at Ingress:
apiVersion: security.istio.io/v1
kind: RequestAuthentication
metadata:
name: api-jwt-auth
namespace: production
spec:
selector:
matchLabels:
istio: ingressgateway
jwtRules:
- issuer: "https://auth.platform.example.com/"
jwksUri: "https://auth.platform.example.com/.well-known/jwks.json"
audiences: ["api.platform.example.com"]
forwardOriginalToken: true
outputClaimsToHeaders:
- header: "x-tenant-id"
claim: "tenant_id"
- header: "x-api-tier"
claim: "tier"
- header: "x-user-roles"
claim: "roles"
---
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
name: require-auth
namespace: production
spec:
selector:
matchLabels:
istio: ingressgateway
action: DENY
rules:
- from:
- source:
notRequestPrincipals: ["*"]
to:
- operation:
notPaths:
- "/health"
- "/ready"
- "/.well-known/*"
- "/docs/*"
The RequestAuthentication extracted tenant_id, tier, and roles claims into headers, making them available to downstream services and the rate limiter without requiring each service to parse the JWT.
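A downstream service can then build its tenant context directly from those headers instead of decoding the token. An Express-style sketch (header names match the outputClaimsToHeaders config above; the middleware name is illustrative):

```javascript
// Express-style middleware that reads the claim headers Istio injected.
// Trust these headers only on traffic that has passed through the mesh —
// the ingress must strip any client-supplied copies.
function tenantContext(req, res, next) {
  req.tenant = {
    id: req.headers['x-tenant-id'],
    tier: req.headers['x-api-tier'],
    roles: (req.headers['x-user-roles'] || '').split(','),
  };
  if (!req.tenant.id) {
    return res.status(403).json({ error: 'Missing tenant context' });
  }
  next();
}
```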
Per-Customer Rate Limiting:
The rate limit configuration provided different limits per tier and per endpoint:
# Rate limit config
domain: platform-api
descriptors:
- key: api_tier
value: free
rate_limit:
unit: minute
requests_per_unit: 60
descriptors:
- key: api_endpoint
value: search
rate_limit:
unit: minute
requests_per_unit: 10
- key: api_endpoint
value: bulk_export
rate_limit:
unit: hour
requests_per_unit: 5
- key: api_endpoint
value: webhook_register
rate_limit:
unit: day
requests_per_unit: 10
- key: api_tier
value: starter
rate_limit:
unit: minute
requests_per_unit: 600
descriptors:
- key: api_endpoint
value: search
rate_limit:
unit: minute
requests_per_unit: 100
- key: api_endpoint
value: bulk_export
rate_limit:
unit: hour
requests_per_unit: 50
- key: api_endpoint
value: webhook_register
rate_limit:
unit: day
requests_per_unit: 100
- key: api_tier
value: enterprise
rate_limit:
unit: minute
requests_per_unit: 6000
descriptors:
- key: api_endpoint
value: search
rate_limit:
unit: minute
requests_per_unit: 1000
- key: api_endpoint
value: bulk_export
rate_limit:
unit: hour
requests_per_unit: 500
- key: api_endpoint
value: webhook_register
rate_limit:
unit: day
requests_per_unit: 1000
Rate limit headers were returned on every response:
X-RateLimit-Limit: 600
X-RateLimit-Remaining: 423
X-RateLimit-Reset: 1694188800
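Clients can use these headers to back off cleanly. A small illustrative helper that computes how long to wait before retrying, assuming X-RateLimit-Reset is a Unix timestamp in seconds as shown above:

```javascript
// Compute the retry delay from rate limit response headers.
// Returns 0 when requests remain in the current window.
function retryDelayMs(headers, nowMs) {
  const remaining = parseInt(headers['x-ratelimit-remaining'], 10);
  if (remaining > 0) return 0;
  const resetMs = parseInt(headers['x-ratelimit-reset'], 10) * 1000;
  return Math.max(0, resetMs - nowMs);
}
```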
OPA Tenant Isolation:
The OPA policy ensured strict tenant isolation — a request authenticated as tenant A could not access resources belonging to tenant B:
package platform.authz
import rego.v1
default allow := false
# Allow request if tenant context matches
allow if {
input.tenant_id != ""
resource_tenant := extract_tenant_from_path(input.path)
resource_tenant != ""
input.tenant_id == resource_tenant
}
# Allow non-tenant-scoped endpoints
allow if {
input.tenant_id != ""
not is_tenant_scoped_path(input.path)
}
# Admin bypass for platform operators
allow if {
"platform_admin" in input.roles
}
extract_tenant_from_path(path) := tenant if {
parts := split(path, "/")
some i
parts[i] == "tenants"
tenant := parts[i + 1]
}
extract_tenant_from_path(path) := "" if {
parts := split(path, "/")
not path_contains_tenants(parts)
}
path_contains_tenants(parts) if {
some i
parts[i] == "tenants"
}
is_tenant_scoped_path(path) if {
contains(path, "/tenants/")
}
This policy ran on every request. OPA decision latency averaged 1.2ms at p50 and 4.8ms at p99 — acceptable overhead for the security guarantee.
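The path convention the policy relies on is easy to state in plain code. An illustrative JavaScript port of extract_tenant_from_path:

```javascript
// Port of the Rego extract_tenant_from_path rule: the segment after
// "tenants" in a path like /api/v1/tenants/{tenant_id}/resources.
// Returns '' for paths that are not tenant-scoped.
function extractTenantFromPath(path) {
  const parts = path.split('/');
  const i = parts.indexOf('tenants');
  if (i === -1 || i + 1 >= parts.length) return '';
  return parts[i + 1];
}
```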
Results
After deployment, the platform operated with the following security posture:
- ✓ 100% mTLS coverage — verified by Istio telemetry showing zero plaintext connections
- ✓ 0 cross-tenant data access incidents in the first 12 months of operation
- ✓ 99.97% rate limiter availability — Redis cluster with automatic failover
- ✓ Average auth overhead: 6.3ms per request (JWT validation + OPA decision + rate limit check)
- ✓ 3 blocked unauthorized access attempts detected in the first quarter via auth failure alerting — all were misconfigured API clients, not attacks, but the detection mechanism proved operational
Architecture Decisions and Trade-offs
Why Istio over Linkerd: the platform needed JWT validation and rate limiting at the mesh level. Linkerd focuses on mTLS and observability; Istio provides the AuthorizationPolicy and EnvoyFilter extension points required for the full security stack.
Why OPA sidecar over centralized OPA: deploying OPA as a sidecar with each service eliminated a network hop for authorization decisions and removed a single point of failure. The trade-off was higher resource consumption (each pod runs an OPA container), but the latency and reliability improvements justified it.
Why Redis-backed global rate limiting over local: the free tier limit of 60 requests/minute needed to be accurate regardless of how many pod replicas served the traffic. Local rate limiting would allow 60 * N requests where N is the replica count.
Why failure_mode_deny: false on rate limiting: a rate limiter outage that blocks all API traffic is worse than temporarily allowing over-limit requests. The team monitored rate limiter availability separately and had alerts for when it was unavailable.
Summary
Zero-trust API security is a layered implementation, not a single technology. mTLS provides transport-level identity and encryption. JWT validation provides caller authentication. Rate limiting provides abuse prevention and fairness. OPA provides fine-grained authorization.
Each layer is independently valuable, and implementing them incrementally is practical: start with mTLS (Istio makes this straightforward), add JWT validation at the ingress, then add rate limiting and OPA as your authorization requirements grow.
The overhead — 5–10ms per request for the full stack — is acceptable for most API workloads and significantly cheaper than the cost of a security incident caused by trusting network boundaries that no longer exist.