Advanced Safe Deployments
Advanced topics for safe deployments with Argo Rollouts
This guide covers advanced topics for safe deployments, including custom analysis options, networking integration, and migration strategies.
Configure custom analysis
Argo Rollouts supports running automated analysis during your deployments. This enables you to run tests, check metrics, and validate the new version before promoting it.
Configure analysis
You can configure analysis templates that run at specific steps of your rollout, validating the new version before the rollout shifts more traffic to it:

apiVersion: apollographql.com/v1alpha2
kind: Supergraph
metadata:
  name: my-supergraph
spec:
  deploymentStrategy:
    rollout:
      steps:
        - setWeight: 20 # Shift 20% of traffic to new version
        - pause:
            duration: 5m # Wait 5 minutes for analysis
        - analysis: # Run analysis at this step
            templates:
              - templateName: success-rate # Reference to AnalysisTemplate
        - setWeight: 50 # Shift 50% of traffic if analysis passes
        - pause:
            duration: 5m # Wait 5 minutes for analysis
        - analysis: # Run analysis again at higher traffic
            templates:
              - templateName: success-rate
        - setWeight: 100 # Promote to 100% if analysis passes
      # Global analysis configuration: runs when the rollout reaches certain
      # lifecycle events, typically at the start, completion, or during stable promotion
      analysis:
        templates:
          - templateName: success-rate # AnalysisTemplate name
        args: # Arguments passed to the template
          - name: service-name
            value: my-supergraph # Supergraph name for metrics query

Define analysis templates
Analysis templates must be defined as separate Kubernetes resources. Here's an example that checks for success rate using Prometheus metrics:

apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: success-rate # Template name referenced in Supergraph
spec:
  metrics:
    - name: success-rate
      interval: 30s # Check metrics every 30 seconds
      count: 5 # Run 5 checks (total 2.5 minutes)
      successCondition: result[0] >= 0.95 # Success if 95%+ requests succeed
      failureCondition: result[0] < 0.90 # Fail if less than 90% succeed
      provider:
        prometheus:
          address: http://prometheus:9090 # Prometheus server address
          query: |
            # Calculate success rate: non-5xx requests / total requests
            sum(rate(http_requests_total{service="{{args.service-name}}",status!~"5.."}[5m]))
            /
            sum(rate(http_requests_total{service="{{args.service-name}}"}[5m]))

For more information on analysis templates, see the Argo Rollouts analysis documentation.
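As a hedged variant of the template above: upstream Argo Rollouts examples declare each argument a template accepts under spec.args, and a metric can set failureLimit to tolerate some failed measurements. The sketch below assumes the operator passes service-name through to the template unchanged. The name success-rate-with-args and the failureLimit value are illustrative and do not come from this guide.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: success-rate-with-args # Hypothetical name, for illustration only
spec:
  args:
    - name: service-name # Declares the argument referenced in the query
  metrics:
    - name: success-rate
      interval: 30s
      count: 5
      failureLimit: 1 # Assumption: tolerate one failed measurement
      successCondition: result[0] >= 0.95
      failureCondition: result[0] < 0.90
      provider:
        prometheus:
          address: http://prometheus:9090
          query: |
            sum(rate(http_requests_total{service="{{args.service-name}}",status!~"5.."}[5m]))
            /
            sum(rate(http_requests_total{service="{{args.service-name}}"}[5m]))
```

With these two conditions, a measurement between 0.90 and 0.95 satisfies neither of them, and Argo Rollouts reports it as inconclusive rather than as a success or a failure.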
Configure networking integration
When using safe deployments with Argo Rollouts, the operator automatically creates the necessary Services and manages traffic distribution. Both stable and canary Services are configured based on your Supergraph's networking configuration.
Service configuration
The operator automatically creates two Services for canary deployments:
Stable Service: Points to the stable version of your Supergraph
Canary Service: Points to the canary version during rollouts
Both services use the networking settings from your Supergraph's spec.networking configuration (ports, service type, etc.).
Traffic splitting
Argo Rollouts uses these Services to split traffic between the stable and canary versions. The operator handles Service creation automatically, but you might need to configure your ingress or service mesh to route traffic appropriately.
For ingress-based traffic splitting, configure your ingress to route traffic to both Services based on your rollout strategy. If you're using a service mesh (Istio, Linkerd, etc.), Argo Rollouts can integrate with your mesh for traffic splitting.
Refer to the Argo Rollouts traffic management documentation for detailed configuration options.
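For ingress-based routing, a minimal sketch of a standard Kubernetes Ingress that sends requests to the stable Service might look like the following. The host, the ingress class, the Service name my-supergraph-stable, and port 4000 are all assumptions for illustration. Run kubectl get svc to find the Service names and ports the operator actually creates.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-supergraph # Hypothetical Ingress name
spec:
  ingressClassName: nginx # Assumption: an NGINX ingress controller is installed
  rules:
    - host: graph.example.com # Placeholder host
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-supergraph-stable # Hypothetical stable Service name
                port:
                  number: 4000 # Assumed router port
```

How the canary share of traffic is applied on top of an Ingress like this depends on the traffic router you pair with Argo Rollouts, which this guide does not specify.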
Migrate from Deployment to Rollout
If you have an existing Supergraph using a standard Kubernetes Deployment, you can migrate to a Rollout strategy with zero downtime.
Perform a zero-downtime migration
To migrate from a Deployment to a Rollout without downtime, set migrate: true in your rollout configuration. The first time a Rollout is applied, it does not run a canary deployment.

apiVersion: apollographql.com/v1alpha2
kind: Supergraph
metadata:
  name: my-supergraph
spec:
  replicas: 3 # Number of replicas
  podTemplate:
    routerVersion: 2.7.0 # GraphOS Router version
  deploymentStrategy:
    rollout:
      migrate: true # Enable zero-downtime migration
      steps:
        - setWeight: 20 # Shift 20% of traffic to Rollout
        - pause:
            duration: 5m # Wait before next step
        - setWeight: 50 # Shift 50% of traffic to Rollout
        - pause:
            duration: 5m # Wait before next step
        - setWeight: 80 # Shift 80% of traffic to Rollout
        - pause:
            duration: 5m # Wait before final step
        - setWeight: 100 # Shift all traffic to Rollout
  schema:
    studio:
      graphRef: my-graph@my-variant # GraphOS graph variant reference

When migrate: true is set:
The operator creates a Rollout with a workloadRef pointing to your existing Deployment.
Once the Rollout is Healthy, the Deployment is scaled down to 0 replicas.
The operator automatically cleans up the Deployment after the rollout completes.
This ensures zero downtime during the migration.
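To make the workloadRef mechanism concrete, this is roughly what an Argo Rollouts Rollout that references an existing Deployment looks like. The operator generates its own Rollout, so the values here (the resource names, the scaleDown policy, the shortened step list) are assumptions rather than the operator's actual output.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: my-supergraph # Assumed to match the Supergraph name
spec:
  replicas: 3
  workloadRef: # Reuse the pod template of the existing Deployment
    apiVersion: apps/v1
    kind: Deployment
    name: my-supergraph # Hypothetical Deployment name
    scaleDown: onsuccess # Assumption: scale the Deployment down once the Rollout is healthy
  strategy:
    canary:
      steps:
        - setWeight: 20
        - pause:
            duration: 5m
        - setWeight: 100
```

Because the Rollout borrows the Deployment's pod template instead of replacing the Deployment outright, the old pods keep serving traffic until the Rollout's own pods are ready.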
Perform a standard migration
If you don't set migrate: true, the operator performs the following actions:
Create a new Rollout.
Delete the existing Deployment without waiting for the Rollout to be Healthy.
This approach might cause a brief service interruption during the transition.
Troubleshoot issues
Handle a Rollout stuck in Progressing
If your Rollout is stuck in the Progressing phase, check for pause conditions or manual approval requirements. The operator automatically creates the necessary Services for traffic splitting, but issues can arise from the rollout configuration or the Argo Rollouts setup.
For general Argo Rollouts troubleshooting, see the Argo Rollouts troubleshooting documentation.
Traffic not shifting
If traffic isn't shifting as expected, verify that Services are created correctly:

kubectl get svc

The operator automatically creates stable and canary Services based on your Supergraph's networking configuration. If Services aren't created, check that your Supergraph spec is valid and the operator is running correctly.
For ingress or service mesh configuration issues, refer to the Argo Rollouts traffic management documentation.
Explore next steps
Learn more about Argo Rollouts
Explore Argo Rollouts traffic management for advanced networking