StarlingX/Containers/Applications/app-vault

Application: vault-armada-app

Source

Building

  • From the Debian Build environment:
VAULT_PKGS="python3-k8sapp-vault,vault-helm,vault-manager-helm,stx-vault-helm"
build-pkgs -c -p "$VAULT_PKGS"

The packages contain:

  • python3-k8sapp-vault - sysinv integrations; helm and lifecycle
  • vault-helm - the build of upstream vault helm chart
  • vault-manager-helm - the build of StarlingX vault-manager helm chart
  • stx-vault-helm - the StarlingX application; metadata, fluxcd yaml

The final package, stx-vault-helm, contains the output of the others and is installed on the ISO. The installed application tarball is '/usr/local/share/applications/helm/vault-*.tgz'.
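
To confirm what was packaged, the installed tarball can be listed. The following is a minimal sketch: the helper name list_vault_tarball is hypothetical, and it assumes the tarball is a gzip-compressed tar archive, as the .tgz suffix suggests.

```shell
# Hypothetical helper: list the contents of the vault application tarball.
# The default directory is the install location of the application tarball.
list_vault_tarball() {
    dir="${1:-/usr/local/share/applications/helm}"
    # If several versions are present, take the last match in sorted order.
    tarball="$(ls -1 "$dir"/vault-*.tgz 2>/dev/null | sort | tail -n 1)"
    if [ -z "$tarball" ]; then
        echo "no vault tarball found in $dir" >&2
        return 1
    fi
    tar -tzf "$tarball"
}
```

Calling list_vault_tarball with no arguments inspects the default install location; a different directory can be passed as the first argument, for example a build output directory.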

Testing

Vault sanity should typically include:

  • application lifecycle: upload, apply, remove, abort, delete, update, and helm overrides
  • Vault with replicas=1 (AIO-SX) and replicas=3 (AIO-DX + worker or standard controller)
  • Configure vault with CLI and REST API
  • Workflow for applications that are Vault aware (REST API) and unaware (vault injector)
  • Pod recovery
  • Vault backup and restore
  • Vault-manager rekey feature
  • Download images from private registry; network isolation test

Application Lifecycle

Refer to Application Commands and Helm Overrides

Changes to the following should typically be accompanied by application lifecycle testing:

  • Sysinv integrations, package python3-k8sapp-vault
  • New and updated helm charts or chart overrides (values.yaml)

The application lifecycle sanity should also be asserted at the end of each release cycle.
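
Lifecycle steps are asynchronous, so a test usually has to wait for the application to reach the expected status. The following is a minimal sketch under stated assumptions: the helper names and the POLL_INTERVAL variable are hypothetical, the application name is assumed to be vault, and the status is parsed from the table output of "system application-show". Status names such as uploaded, applied and apply-failed are assumed here and should be checked against the platform.

```shell
# Hypothetical helpers for lifecycle testing. They assume the table output
# of "system application-show", with rows like: | status | applied |

# Print the current status of an application, e.g. "applied".
app_status() {
    system application-show "$1" | awk '$2 == "status" { print $4 }'
}

# Poll until the application reaches the wanted status.
# Usage: wait_for_app_status APP STATUS [TRIES]
wait_for_app_status() {
    app="$1"; want="$2"; tries="${3:-60}"
    while [ "$tries" -gt 0 ]; do
        status="$(app_status "$app")"
        if [ "$status" = "$want" ]; then
            return 0
        fi
        case "$status" in
            *-failed)  # assumed failure states, e.g. apply-failed
                echo "$app entered status $status" >&2
                return 1 ;;
        esac
        tries=$((tries - 1))
        sleep "${POLL_INTERVAL:-10}"
    done
    echo "timed out waiting for $app to become $want" >&2
    return 1
}

# Example:
#   system application-upload /usr/local/share/applications/helm/vault-*.tgz
#   wait_for_app_status vault uploaded
#   system application-apply vault
#   wait_for_app_status vault applied
```

Parsing the table with awk avoids depending on any particular output-format option of the CLI; the trade-off is that the helper breaks if the table layout changes.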

Vault with replicas 1 and 3

When the vault app is applied on AIO-SX, the vault server statefulset is configured automatically with replicas=1. When three or more controller and worker nodes are provisioned in the cluster, the vault server statefulset is configured automatically with replicas=3.

Sanity for replicas=3 should include vault HA: observe that a standby vault server becomes active when the active server is restarted gracefully.
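
The HA observation can be turned into a simple assertion that exactly one server is active at a time. This is a sketch, not a definitive test: the helper names are hypothetical, and it relies on the vault-active pod label used by the example in the "Pod Recovery" section.

```shell
# Hypothetical helpers; they rely on the vault-active pod label.

# Print the names of the pods currently labelled as the active server.
vault_active_pods() {
    kubectl get pods -n vault -l vault-active=true \
        -o jsonpath='{.items[*].metadata.name}'
}

# Succeed, printing the pod name, only if exactly one pod is active.
assert_single_active() {
    # Intentionally unquoted: split the space-separated names into arguments.
    set -- $(vault_active_pods)
    if [ "$#" -ne 1 ]; then
        echo "expected 1 active vault server, found $#" >&2
        return 1
    fi
    echo "$1"
}
```

Recording the output of assert_single_active before and after restarting the active server shows whether a different pod has taken over the active role.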

Refer also to the "Pod Recovery" section.

Configure using CLI and REST API

Refer to the following documents for examples of configuring the vault using CLI and REST API:

Follow each configuration with a sanity test of the workflows in the "Vault Aware and Unaware" section.

Vault Aware and Unaware

Refer to the following documents for examples of these workflows:


Refer also to Configure Vault Using the Vault REST API, which includes examples relevant to a vault aware application: "Create a secret" and "Verify the secret".

Pod Recovery

The application pods should recover automatically when they are terminated gracefully or killed.

Kubernetes automatically restarts pods, while vault-manager will automatically unseal vault servers.

For example, delete the active vault server pod and watch the vault manager unseal it:

active="$( kubectl get pods -n vault \
    -o jsonpath='{.items[?(@.metadata.labels.vault-active=="true")].metadata.name}' )"
manager="$( kubectl get pods -n vault \
    -o jsonpath='{.items[?(@.metadata.labels.app\.kubernetes\.io/name=="vault-manager")].metadata.name}' )"
kubectl delete pods -n vault "$active"
kubectl logs -f -n vault "$manager"

Also delete the vault manager pod to observe its recovery, and reassert that it can recover vault server pods that are deleted.
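
Rather than reading the vault manager log by eye, recovery can be asserted by waiting until the expected number of vault servers report themselves unsealed. This sketch assumes that the vault server pods carry a vault-sealed label maintained in the same way as vault-active; if that label is not present in this deployment, the selector needs to be adapted. The helper names and the POLL_INTERVAL variable are hypothetical.

```shell
# Hypothetical helpers; they assume vault server pods carry the label
# vault-sealed=false once unsealed (set alongside vault-active).

# Count the vault server pods that report themselves unsealed.
unsealed_count() {
    kubectl get pods -n vault -l vault-sealed=false -o name | wc -l | tr -d ' '
}

# Poll until at least EXPECTED pods are unsealed.
# Usage: wait_for_unsealed EXPECTED [TRIES]
wait_for_unsealed() {
    expected="$1"; tries="${2:-30}"
    while [ "$tries" -gt 0 ]; do
        if [ "$(unsealed_count)" -ge "$expected" ]; then
            return 0
        fi
        tries=$((tries - 1))
        sleep "${POLL_INTERVAL:-10}"
    done
    echo "fewer than $expected vault servers unsealed" >&2
    return 1
}

# Example, for a replicas=3 deployment after deleting a server pod:
#   wait_for_unsealed 3
```

Counting unsealed pods, instead of checking that no pod is sealed, avoids a false pass while a replacement pod has not yet been labelled.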

Vault Backup and Restore

TBD - the backup and restore feature of vault is new and not yet documented

Vault Rekey

The rekey feature of vault manager regenerates the unseal shards of the vault. Test the feature by creating a request, observing the vault manager rekey the vault and finally reasserting that vault server pods can be unsealed.

Create a rekey request and watch vault manager complete the request:

manager="$( kubectl get pods -n vault \
    -o jsonpath='{.items[?(@.metadata.labels.app\.kubernetes\.io/name=="vault-manager")].metadata.name}' )"

uuidgen | kubectl create secret generic -n vault cluster-rekey-request --from-file=strdata=/dev/stdin
kubectl logs -f -n vault "$manager"

Reassert that the vault server pods can still be unsealed by performing the "Pod Recovery" procedure.

Network Isolation Test

A network isolation test asserts that the application works correctly with configured registry overrides. The test asserts that the image/repository/tag references in the application's helm chart overrides (values.yaml) are compatible with, and recognized by, the platform. The platform integration automatically updates these image references to refer to the platform's internal registry. The platform itself downloads the images from the configured registry overrides, whereas application pods running on the kubernetes nodes download them from the platform's registry instead of the external registries.

This example isolation test is sufficient for the vault application because the application requires the platform's deployment and storage backend configuration before it can be applied.

Assumptions:

  • A private docker registry is available which contains the application's images
  • The application's images should not be pre-pulled before this test (start with a newly provisioned cluster)
  • The example of gnp-oam-overrides.yaml permits traffic from example infrastructure networks for services such as DNS and the private registry, and denies all other network traffic

See also Docker Registry Overrides in Ansible Bootstrap Configurations

See also: System configuration service-parameter

See also this patch example for correcting a helm chart to align with the platform's expectations for image/repository/tag references: Patch for agent injector image override

See also: Modify Firewall Options

Configure network isolation after platform deployment

Specify the networks for which traffic should be permitted:

# adjust these values for your network
export OAM_NET=10.10.26.0/24
export OAM_GATEWAY="10.10.26.1"
export REGISTRY_NET=10.10.27.0/24
export REGISTRY_SERVER_IP=10.10.27.123
export REGISTRY_SERVER=myregistry.example.com
export DNS1_NET=10.10.28.0/24
export DNS2_NET=10.10.29.0/24
export POLICY_NAME="stxtest-gnp-oam-overrides"

Create the yaml for GlobalNetworkPolicy:

cat <<EOF > gnp-oam-overrides.yaml
apiVersion: crd.projectcalico.org/v1
kind: GlobalNetworkPolicy
metadata:
  name: $POLICY_NAME
spec:
  egress:
  - action: Allow
    destination:
      nets:
      - $OAM_NET
      - $REGISTRY_NET
      - $DNS1_NET
      - $DNS2_NET
  - action: Deny
    destination:
      nets:
      - 0.0.0.0/0  # block all other traffic
  order: 50  # override controller-oam-if-gnp 100
  selector: has(iftype) && iftype == 'oam'
  # namespaceSelector: default is all()
  types:
  - Egress
EOF

Apply the policy:

kubectl apply -f gnp-oam-overrides.yaml

Confirm the network restrictions and permissions. These examples match the example above.

Tests that should fail:

ping -c1 -w5 8.8.8.8  # google
sudo docker pull debian:latest

Tests that should pass:

ping -c1 -w5 $OAM_GATEWAY
nslookup $REGISTRY_SERVER
ping -c1 -w5 $REGISTRY_SERVER
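
The manual checks above can be wrapped as pass/fail assertions so that an unexpected result stands out. This is a minimal sketch; the helper names expect_blocked and expect_allowed are hypothetical.

```shell
# Hypothetical helpers that turn the manual checks into pass/fail assertions.

# Succeed only if the given command fails (traffic is blocked).
expect_blocked() {
    if "$@" >/dev/null 2>&1; then
        echo "FAIL (unexpectedly reachable): $*"
        return 1
    fi
    echo "ok (blocked): $*"
}

# Succeed only if the given command succeeds (traffic is permitted).
expect_allowed() {
    if "$@" >/dev/null 2>&1; then
        echo "ok (allowed): $*"
        return 0
    fi
    echo "FAIL (unexpectedly blocked): $*"
    return 1
}

# Example, using the variables exported earlier:
#   expect_blocked ping -c1 -w5 8.8.8.8
#   expect_allowed ping -c1 -w5 "$OAM_GATEWAY"
#   expect_allowed nslookup "$REGISTRY_SERVER"
```

Note that expect_blocked treats any failure as "blocked", including a missing command, so it is worth running the same command once before the policy is applied to confirm that it can succeed.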

Confirm that the platform is pulling the application's images

Apply the vault application and confirm that the apply succeeds. Confirm that all of the application's pods are running.
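
One way to check that the pods use images served by the platform is to list the image references of the running pods and flag any that do not point at the platform's internal registry. This is a sketch under assumptions: the helper names are hypothetical, and the registry name registry.local:9001 is an assumed default that can be overridden with the hypothetical LOCAL_REGISTRY variable. Only regular containers are inspected, not init containers.

```shell
# Hypothetical helpers; LOCAL_REGISTRY is an assumed name for the
# platform's internal registry and should be adjusted if it differs.

# Print the unique container images referenced by pods in the vault namespace.
vault_pod_images() {
    kubectl get pods -n vault \
        -o jsonpath='{.items[*].spec.containers[*].image}' | tr ' ' '\n' | sort -u
}

# Read image names on stdin; print those not under the given prefix.
filter_non_local() {
    while read -r image; do
        case "$image" in
            ""|"$1"*) ;;            # empty line or platform registry: fine
            *) echo "$image" ;;     # anything else is reported
        esac
    done
}

# Print any image not served by the platform's registry.
# Succeeds (and prints nothing) when every image comes from LOCAL_REGISTRY.
non_local_images() {
    prefix="${LOCAL_REGISTRY:-registry.local:9001}/"
    offenders="$(vault_pod_images | filter_non_local "$prefix")"
    if [ -n "$offenders" ]; then
        echo "$offenders"
        return 1
    fi
}
```

An empty result from non_local_images, together with all pods running while external registries are blocked, suggests that the image overrides were recognized by the platform.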