Introduction
GPA (GeneralPodAutoscaler) provides a webhook-based mechanism for auto-scaling workloads. For example:
apiVersion: autoscaling.ocgi.dev/v1alpha1
kind: GeneralPodAutoscaler
metadata:
  name: pa-squad
  namespace: default
spec:
  maxReplicas: 8
  minReplicas: 1
  scaleTargetRef:
    apiVersion: carrier.ocgi.dev/v1alpha1
    kind: Squad
    name: squad-example
  webhook:
    parameters:
      buffer: "2"
    service:
      name: gpa-webhook
      namespace: kube-system
      path: scale
      port: 8000
The webhook server is implemented by the application, so the application itself controls the number of replicas of the workload.
Implement Webhook Server
This is a Webhook Server example for Squad.
Webhook Request and Response
The webhook API is defined as follows:
import "k8s.io/apimachinery/pkg/types"

// AutoscaleRequest defines the request sent to the webhook autoscaler endpoint.
type AutoscaleRequest struct {
	// UID is used for tracing the request and response.
	UID types.UID `json:"uid"`
	// Name is the name of the workload (Squad, StatefulSet...) being scaled.
	Name string `json:"name"`
	// Namespace is the workload namespace.
	Namespace string `json:"namespace"`
	// Parameters are the parameters required by the webhook.
	Parameters map[string]string `json:"parameters"`
	// CurrentReplicas is the current replica count.
	CurrentReplicas int32 `json:"currentReplicas"`
}

// AutoscaleResponse defines the response of the webhook server.
type AutoscaleResponse struct {
	// UID is used for tracing the request and response.
	// It should be the same as the UID in the request.
	UID types.UID `json:"uid"`
	// Scale is set to false if no scaling should be done.
	Scale bool `json:"scale"`
	// Replicas is the target replica count from the webhook server.
	Replicas int32 `json:"replicas"`
}

// AutoscaleReview is passed to the webhook with a populated Request value,
// and then returned with a populated Response.
type AutoscaleReview struct {
	Request  *AutoscaleRequest  `json:"request"`
	Response *AutoscaleResponse `json:"response"`
}
- The fields received by the webhook server include the workload name, namespace, parameters, and currentReplicas.
- The webhook should return an AutoscaleResponse structure based on the actual auto-scaling situation, including scale and replicas. If scale is set to false, the workload does not currently need to be scaled.
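The request/response flow described above can be sketched as a small Go handler. This is only an illustration under stated assumptions, not the project's reference implementation: the types mirror the definitions above but use a plain string for UID to avoid the Kubernetes dependency, busyReplicas is a hypothetical hook standing in for whatever the application knows about its own load, and the "busy + buffer" policy is an assumed interpretation of the buffer parameter.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strconv"
)

// Local mirrors of the webhook API types. UID is a plain string here
// (the definition above uses types.UID) to keep the sketch dependency-free.
type AutoscaleRequest struct {
	UID             string            `json:"uid"`
	Name            string            `json:"name"`
	Namespace       string            `json:"namespace"`
	Parameters      map[string]string `json:"parameters"`
	CurrentReplicas int32             `json:"currentReplicas"`
}

type AutoscaleResponse struct {
	UID      string `json:"uid"`
	Scale    bool   `json:"scale"`
	Replicas int32  `json:"replicas"`
}

type AutoscaleReview struct {
	Request  *AutoscaleRequest  `json:"request"`
	Response *AutoscaleResponse `json:"response"`
}

// busyReplicas is a hypothetical hook: a real server would ask the
// application how many replicas are currently in use.
var busyReplicas = func(namespace, name string) int32 { return 3 }

// decide applies an assumed policy: desired replicas = busy + buffer.
// The response always echoes the request UID.
func decide(req *AutoscaleRequest, busy int32) *AutoscaleResponse {
	resp := &AutoscaleResponse{UID: req.UID, Replicas: req.CurrentReplicas}
	buffer, err := strconv.ParseInt(req.Parameters["buffer"], 10, 32)
	if err != nil || buffer < 0 {
		return resp // missing or invalid parameter: Scale stays false
	}
	if desired := busy + int32(buffer); desired != req.CurrentReplicas {
		resp.Scale, resp.Replicas = true, desired
	}
	return resp
}

// scaleHandler decodes the AutoscaleReview, fills in Response, and
// returns the whole review as JSON.
func scaleHandler(w http.ResponseWriter, r *http.Request) {
	var review AutoscaleReview
	if err := json.NewDecoder(r.Body).Decode(&review); err != nil || review.Request == nil {
		http.Error(w, "invalid AutoscaleReview", http.StatusBadRequest)
		return
	}
	req := review.Request
	review.Response = decide(req, busyReplicas(req.Namespace, req.Name))
	w.Header().Set("Content-Type", "application/json")
	_ = json.NewEncoder(w).Encode(&review)
}

func main() {
	// Exercise the handler in-process so the sketch terminates.
	body := []byte(`{"request":{"uid":"abc","name":"squad-example","namespace":"default","parameters":{"buffer":"2"},"currentReplicas":4}}`)
	rec := httptest.NewRecorder()
	scaleHandler(rec, httptest.NewRequest(http.MethodPost, "/scale", bytes.NewReader(body)))
	fmt.Println(rec.Code, rec.Body.String())
}
```

To serve real traffic, the same handler could be registered with http.HandleFunc("/scale", scaleHandler) and started with http.ListenAndServe(":8000", nil), matching the path and port used in the example manifest.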
Deploy the Webhook Server
The webhook server can be deployed either inside the K8s cluster or outside it.
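Wherever the server runs, it can help to check that the endpoint answers an AutoscaleReview before pointing GPA at it. The following is a minimal smoke-test sketch, not GPA's actual client code: probe and stubServer are hypothetical helpers, the types are simplified local mirrors of the API above (UID as a plain string), and the stub simply answers "current + 1" so the example can run without a real deployment.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

type AutoscaleRequest struct {
	UID             string            `json:"uid"`
	Name            string            `json:"name"`
	Namespace       string            `json:"namespace"`
	Parameters      map[string]string `json:"parameters"`
	CurrentReplicas int32             `json:"currentReplicas"`
}

type AutoscaleResponse struct {
	UID      string `json:"uid"`
	Scale    bool   `json:"scale"`
	Replicas int32  `json:"replicas"`
}

type AutoscaleReview struct {
	Request  *AutoscaleRequest  `json:"request"`
	Response *AutoscaleResponse `json:"response"`
}

// probe posts a review to the webhook at url and checks that the
// response is present and echoes the request UID.
func probe(url string, req AutoscaleRequest) (*AutoscaleResponse, error) {
	payload, err := json.Marshal(AutoscaleReview{Request: &req})
	if err != nil {
		return nil, err
	}
	httpResp, err := http.Post(url, "application/json", bytes.NewReader(payload))
	if err != nil {
		return nil, err
	}
	defer httpResp.Body.Close()
	if httpResp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("webhook returned %s", httpResp.Status)
	}
	var review AutoscaleReview
	if err := json.NewDecoder(httpResp.Body).Decode(&review); err != nil {
		return nil, err
	}
	if review.Response == nil || review.Response.UID != req.UID {
		return nil, errors.New("response missing or UID mismatch")
	}
	return review.Response, nil
}

// stubServer stands in for a deployed webhook; it always asks for one
// more replica than the current count.
func stubServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		var review AutoscaleReview
		if err := json.NewDecoder(r.Body).Decode(&review); err != nil || review.Request == nil {
			http.Error(w, "bad request", http.StatusBadRequest)
			return
		}
		review.Response = &AutoscaleResponse{
			UID:      review.Request.UID,
			Scale:    true,
			Replicas: review.Request.CurrentReplicas + 1,
		}
		_ = json.NewEncoder(w).Encode(&review)
	}))
}

func main() {
	srv := stubServer()
	defer srv.Close()
	resp, err := probe(srv.URL+"/scale", AutoscaleRequest{
		UID: "probe-1", Name: "squad-example", Namespace: "default", CurrentReplicas: 2,
	})
	fmt.Println(resp, err)
}
```

Against a real deployment, the stub would be dropped and probe called with the webhook's actual address.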
Auto-scaling based on Webhook
Set the webhook field of the GeneralPodAutoscaler to auto-scale in webhook mode.
- If the webhook server is deployed inside the K8s cluster, set the service field.
apiVersion: autoscaling.ocgi.dev/v1alpha1
kind: GeneralPodAutoscaler
metadata:
  name: pa-test1
spec:
  maxReplicas: 8
  minReplicas: 2
  scaleTargetRef:
    apiVersion: carrier.ocgi.dev/v1alpha1
    kind: GameServerSet
    name: example
  webhook:
    service:
      namespace: kube-system
      name: demowebhook
      port: 8000
      path: scale
    parameters:
      buffer: "3"
- If the webhook server is deployed outside the K8s cluster, set the url field.
apiVersion: autoscaling.ocgi.dev/v1alpha1
kind: GeneralPodAutoscaler
metadata:
  name: pa-test1
spec:
  maxReplicas: 8
  minReplicas: 2
  scaleTargetRef:
    apiVersion: carrier.ocgi.dev/v1alpha1
    kind: GameServerSet
    name: example
  webhook:
    url: http://123.test.com:8080/scale
    parameters:
      buffer: "3"
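To make the difference between the two forms concrete, here is a rough Go sketch of how a caller might turn either form into the address it posts to. It rests on assumptions rather than on GPA's source: WebhookService, WebhookConfig, and endpoint are hypothetical names, the service form is mapped to the conventional in-cluster DNS name name.namespace.svc, the scheme is assumed to be plain http, and url is assumed to take precedence when both are set.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// WebhookService and WebhookConfig are simplified, hypothetical mirrors
// of the webhook section of the manifests above.
type WebhookService struct {
	Namespace string
	Name      string
	Port      int32
	Path      string
}

type WebhookConfig struct {
	Service *WebhookService
	URL     string
}

// endpoint returns the address to call for a webhook configuration.
func endpoint(c WebhookConfig) (string, error) {
	if c.URL != "" {
		// Out-of-cluster form: use the URL as given, after a sanity check.
		if _, err := url.ParseRequestURI(c.URL); err != nil {
			return "", fmt.Errorf("invalid webhook url: %w", err)
		}
		return c.URL, nil
	}
	if c.Service == nil {
		return "", errors.New("webhook needs either service or url")
	}
	s := c.Service
	u := url.URL{
		Scheme: "http", // assumption: no TLS configured
		Host:   fmt.Sprintf("%s.%s.svc:%d", s.Name, s.Namespace, s.Port),
		Path:   "/" + strings.TrimPrefix(s.Path, "/"),
	}
	return u.String(), nil
}

func main() {
	inCluster := WebhookConfig{Service: &WebhookService{
		Namespace: "kube-system", Name: "demowebhook", Port: 8000, Path: "scale",
	}}
	external := WebhookConfig{URL: "http://123.test.com:8080/scale"}
	for _, c := range []WebhookConfig{inCluster, external} {
		addr, err := endpoint(c)
		fmt.Println(addr, err)
	}
}
```

Under these assumptions the service example resolves to http://demowebhook.kube-system.svc:8000/scale, which is only reachable from inside the cluster, while the url example is used unchanged.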