Introduction
GPA (GeneralPodAutoscaler) provides a webhook-based mechanism for auto-scaling workloads. For example:
apiVersion: autoscaling.ocgi.dev/v1alpha1
kind: GeneralPodAutoscaler
metadata:
  name: pa-squad
  namespace: default
spec:
  maxReplicas: 8
  minReplicas: 1
  scaleTargetRef:
    apiVersion: carrier.ocgi.dev/v1alpha1
    kind: Squad
    name: squad-example
  webhook:
    parameters:
      buffer: "2"
    service:
      name: gpa-webhook
      namespace: kube-system
      path: scale
      port: 8000
The Webhook Server is implemented by the application, so the application itself controls the number of replicas of the workload.
Implement Webhook Server
This is a Webhook Server example for Squad.
Webhook Request and Response
The Webhook API is defined as follows:
// AutoscaleRequest defines the request sent to the webhook autoscaler endpoint.
type AutoscaleRequest struct {
	// UID is used for tracing the request and response.
	UID types.UID `json:"uid"`
	// Name is the name of the workload (Squad, StatefulSet...) being scaled.
	Name string `json:"name"`
	// Namespace is the workload namespace.
	Namespace string `json:"namespace"`
	// Parameters are the parameters required by the webhook.
	Parameters map[string]string `json:"parameters"`
	// CurrentReplicas is the current replica count.
	CurrentReplicas int32 `json:"currentReplicas"`
}

// AutoscaleResponse defines the response of the webhook server.
type AutoscaleResponse struct {
	// UID is used for tracing the request and response.
	// It should be the same as the UID in the request.
	UID types.UID `json:"uid"`
	// Scale is set to false if no scaling should be done.
	Scale bool `json:"scale"`
	// Replicas is the target replica count from the webhook server.
	Replicas int32 `json:"replicas"`
}

// AutoscaleReview is passed to the webhook with a populated Request value,
// and then returned with a populated Response.
type AutoscaleReview struct {
	Request  *AutoscaleRequest  `json:"request"`
	Response *AutoscaleResponse `json:"response"`
}
- The fields received by the Webhook server include the workload name, namespace, parameters, and currentReplicas.
- The webhook should return an AutoscaleResponse that reflects the actual auto-scaling situation, including scale and replicas. If scale is set to false, the workload does not currently need to be scaled.
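The request and response flow described above can be sketched as a small HTTP handler. This is only an illustration, not the project's own example server: it assumes the review is exchanged as JSON over HTTP, it declares UID as a plain string so that no Kubernetes packages are needed, and the names busyReplicas, decide and scaleHandler, as well as the "busy replicas plus buffer" policy, are hypothetical. The source does not define what the buffer parameter means.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strconv"
	"strings"
)

// The three types mirror the Webhook API above. UID is a plain string here
// (an assumption made to keep the sketch free of Kubernetes imports).
type AutoscaleRequest struct {
	UID             string            `json:"uid"`
	Name            string            `json:"name"`
	Namespace       string            `json:"namespace"`
	Parameters      map[string]string `json:"parameters"`
	CurrentReplicas int32             `json:"currentReplicas"`
}

type AutoscaleResponse struct {
	UID      string `json:"uid"`
	Scale    bool   `json:"scale"`
	Replicas int32  `json:"replicas"`
}

type AutoscaleReview struct {
	Request  *AutoscaleRequest  `json:"request"`
	Response *AutoscaleResponse `json:"response"`
}

// busyReplicas is a hypothetical stand-in for application state, for example
// the number of replicas currently serving players.
func busyReplicas(name, namespace string) int32 { return 3 }

// decide applies an assumed policy: keep "buffer" spare replicas on top of
// the busy ones. It echoes the request UID and leaves Scale false when the
// buffer parameter is missing or the replica count is already right.
func decide(req *AutoscaleRequest, busy int32) *AutoscaleResponse {
	resp := &AutoscaleResponse{UID: req.UID, Replicas: req.CurrentReplicas}
	buffer, err := strconv.ParseInt(req.Parameters["buffer"], 10, 32)
	if err != nil {
		return resp
	}
	if desired := busy + int32(buffer); desired != req.CurrentReplicas {
		resp.Scale = true
		resp.Replicas = desired
	}
	return resp
}

// scaleHandler decodes an AutoscaleReview, fills in Response and returns it.
func scaleHandler(w http.ResponseWriter, r *http.Request) {
	var review AutoscaleReview
	if err := json.NewDecoder(r.Body).Decode(&review); err != nil || review.Request == nil {
		http.Error(w, "invalid AutoscaleReview", http.StatusBadRequest)
		return
	}
	req := review.Request
	review.Response = decide(req, busyReplicas(req.Name, req.Namespace))
	w.Header().Set("Content-Type", "application/json")
	_ = json.NewEncoder(w).Encode(&review)
}

func main() {
	// Exercise the handler once in-process. A real server would instead call
	// http.HandleFunc("/scale", scaleHandler) and http.ListenAndServe(":8000", nil).
	body := `{"request":{"uid":"abc","name":"squad-example","namespace":"default",` +
		`"parameters":{"buffer":"2"},"currentReplicas":3}}`
	rec := httptest.NewRecorder()
	scaleHandler(rec, httptest.NewRequest(http.MethodPost, "/scale", strings.NewReader(body)))
	fmt.Print(rec.Body.String())
}
```

With 3 busy replicas and a buffer of 2, the sample request above gets a response with scale set to true and replicas set to 5. Keeping the policy in a separate function from the HTTP handling makes it easy to unit-test the scaling decision without a running server.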
Deploy the Webhook Server
The Webhook server can be deployed either inside the K8s cluster or outside it.
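Wherever the server runs, it can help to check it by hand before pointing a GeneralPodAutoscaler at it. The following is a rough sketch of such a check: it posts a hand-built AutoscaleReview and verifies that the reply carries a response with the same UID. It assumes the review is sent as a JSON HTTP POST; checkWebhook and startStubServer are hypothetical names, UID is declared as a plain string, and the stub server exists only so the sketch runs on its own.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Types mirror the Webhook API; UID is a plain string in this sketch.
type AutoscaleRequest struct {
	UID             string            `json:"uid"`
	Name            string            `json:"name"`
	Namespace       string            `json:"namespace"`
	Parameters      map[string]string `json:"parameters"`
	CurrentReplicas int32             `json:"currentReplicas"`
}

type AutoscaleResponse struct {
	UID      string `json:"uid"`
	Scale    bool   `json:"scale"`
	Replicas int32  `json:"replicas"`
}

type AutoscaleReview struct {
	Request  *AutoscaleRequest  `json:"request"`
	Response *AutoscaleResponse `json:"response"`
}

// checkWebhook posts a review containing req to url and returns the response
// part of the reply, or an error if the reply is malformed.
func checkWebhook(url string, req AutoscaleRequest) (*AutoscaleResponse, error) {
	payload, err := json.Marshal(AutoscaleReview{Request: &req})
	if err != nil {
		return nil, err
	}
	httpResp, err := http.Post(url, "application/json", bytes.NewReader(payload))
	if err != nil {
		return nil, err
	}
	defer httpResp.Body.Close()
	if httpResp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("webhook returned status %d", httpResp.StatusCode)
	}
	var review AutoscaleReview
	if err := json.NewDecoder(httpResp.Body).Decode(&review); err != nil {
		return nil, err
	}
	if review.Response == nil {
		return nil, fmt.Errorf("webhook reply has no response")
	}
	if review.Response.UID != req.UID {
		return nil, fmt.Errorf("uid mismatch: sent %q, got %q", req.UID, review.Response.UID)
	}
	return review.Response, nil
}

// startStubServer returns a local stand-in webhook that never asks to scale.
func startStubServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		var review AutoscaleReview
		if err := json.NewDecoder(r.Body).Decode(&review); err != nil || review.Request == nil {
			http.Error(w, "invalid AutoscaleReview", http.StatusBadRequest)
			return
		}
		review.Response = &AutoscaleResponse{
			UID:      review.Request.UID,
			Replicas: review.Request.CurrentReplicas,
		}
		_ = json.NewEncoder(w).Encode(&review)
	}))
}

func main() {
	srv := startStubServer() // replace srv.URL with the real server address
	defer srv.Close()

	resp, err := checkWebhook(srv.URL+"/scale", AutoscaleRequest{
		UID:             "check-1",
		Name:            "example",
		Namespace:       "default",
		Parameters:      map[string]string{"buffer": "3"},
		CurrentReplicas: 2,
	})
	if err != nil {
		fmt.Println("check failed:", err)
		return
	}
	fmt.Printf("scale=%v replicas=%d\n", resp.Scale, resp.Replicas)
}
```

Against a real deployment, the stub would be dropped and checkWebhook called with the server's own address, such as the service address inside the cluster or the external URL.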
Auto-scaling based on Webhook
To auto-scale in Webhook mode, set the webhook field of the GeneralPodAutoscaler.
- If the webhook server is deployed inside the K8s cluster, set the service field.
apiVersion: autoscaling.ocgi.dev/v1alpha1
kind: GeneralPodAutoscaler
metadata:
  name: pa-test1
spec:
  maxReplicas: 8
  minReplicas: 2
  scaleTargetRef:
    apiVersion: carrier.ocgi.dev/v1alpha1
    kind: GameServerSet
    name: example
  webhook:
    service:
      namespace: kube-system
      name: demowebhook
      port: 8000
      path: scale
    parameters:
      buffer: "3"
- If the webhook server is deployed outside the K8s cluster, set the url field.
apiVersion: autoscaling.ocgi.dev/v1alpha1
kind: GeneralPodAutoscaler
metadata:
  name: pa-test1
spec:
  maxReplicas: 8
  minReplicas: 2
  scaleTargetRef:
    apiVersion: carrier.ocgi.dev/v1alpha1
    kind: GameServerSet
    name: example
  webhook:
    url: http://123.test.com:8080/scale
    parameters:
      buffer: "3"