GPA Webhook example

Introduction

GPA(GeneralPodAutoscaler) provides a mechanism based on Webhook to auto-scaling workload. E.g:

apiVersion: autoscaling.ocgi.dev/v1alpha1
kind: GeneralPodAutoscaler
metadata:
  name: pa-squad
  namespace: default
spec:
  maxReplicas: 8
  minReplicas: 1
  scaleTargetRef:
    apiVersion: carrier.ocgi.dev/v1alpha1
    kind: Squad
    name: squad-example
  webhook:
    parameters:
      buffer: "2"
    service:
      name: gpa-webhook
      namespace: kube-system
      path: scale
      port: 8000

Webhook Server is implemented by application, so that application can control the number of replicas of workload.

Implement Webhook Server

This is a Webhook Server example for Squad.

Webhook Request and Response

Webhook API defined as follows:


// AutoscaleRequest defines the request to webhook autoscaler endpoint
type AutoscaleRequest struct {
	// UID is used for tracing the request and response.
	UID types.UID `json:"uid"`
	// Name is the name of the workload(Squad, Statefulset...) being scaled
	Name string `json:"name"`
	// Namespace is the workload namespace
	Namespace string `json:"namespace"`
	// Parameters are the parameter that required by webhook
	Parameters map[string]string `json:"parameters"`
	// CurrentReplicas is the current replicas
	CurrentReplicas int32 `json:"currentReplicas"`
}

// AutoscaleResponse defines the response of webhook server
type AutoscaleResponse struct {
	// UID is used for tracing the request and response.
	// It should be same as it in the request.
	UID types.UID `json:"uid"`
	// Set to false if should not do scaling
	Scale bool `json:"scale"`
	// Replicas is targeted replica count from the webhookServer
	Replicas int32 `json:"replicas"`
}

// AutoscaleReview is passed to the webhook with a populated Request value,
// and then returned with a populated Response.
type AutoscaleReview struct {
	Request  *AutoscaleRequest  `json:"request"`
	Response *AutoscaleResponse `json:"response"`
}

The fields received by the Webhook server includes workload name, namespace, parameters, currentReplicas .
The webhook should return the AutoscaleResponse structure based on the actual situation of the auto-scaling, including scale and replicas. If the scale is set to false, it means that the current does not need to be scaled.

Deploy the Webhook Server

We can deploy the Webhook server in K8s cluster), or out of the K8s cluster.

Auto-scaling based on Webhook

We shoud set the webhook field of GeneralPodAutoscaler when auto-scaling based on the Webhook mode.

If webhook server deployed in K8s cluster, we set the service field.

    apiVersion: autoscaling.ocgi.dev/v1alpha1
    kind: GeneralPodAutoscaler
    metadata:
      name: pa-test1
    spec:
      maxReplicas: 8
      minReplicas: 2
      scaleTargetRef:
        apiVersion: carrier.ocgi.dev/v1alpha1
        kind: GameServerSet
        name: example
      webhook:
        service:
          namespace: kube-system
          name: demowebhook
          port: 8000
          path: scale
        parameters:
          buffer: "3"

If webhook server deployed out of K8s cluster, we set the url field.

    apiVersion: autoscaling.ocgi.dev/v1alpha1
    kind: GeneralPodAutoscaler
    metadata:
      name: pa-test1
    spec:
      maxReplicas: 8
      minReplicas: 2
      scaleTargetRef:
        apiVersion: carrier.ocgi.dev/v1alpha1
        kind: GameServerSet
        name: example
      webhook:
        url: http://123.test.com:8080/scale
        parameters:
          buffer: "3"