StarlingX/Containers/StarlingXAppsInternals

This page provides insight into the configuration, features, and general guidelines of StarlingX Apps and their interaction with the App Framework.

For the build perspective of StarlingX Apps, this tutorial covers the FluxCD apps: https://wiki.openstack.org/wiki/StarlingX/Containers/HowToAddNewFluxCDAppInSTX

StarlingX management commands: https://docs.starlingx.io/cli_ref/system.html#application-management

General directory structure of a StarlingX App at build time: https://wiki.openstack.org/wiki/StarlingX/Containers/HowToAddNewFluxCDAppInSTX#Step_7:_Develop_your_application_FluxCD_packaging

This page is still under construction. The plan is to finish the metadata.yaml options and app_lifecycle_actions, and possibly to add a diagram of the app state transitions.

metadata.yaml
Referencing the general directory structure (https://wiki.openstack.org/wiki/StarlingX/Containers/HowToAddNewFluxCDAppInSTX#Step_7:_Develop_your_application_FluxCD_packaging), enabling features can be done by modifying a yaml file located at: stx-APPNAME-helm/stx-APPNAME-helm/files/metadata.yaml

The function find_metadata_file in https://opendev.org/starlingx/config/src/branch/master/sysinv/sysinv/sysinv/sysinv/common/utils.py lists the entries used by the framework.

Note: Currently there are a few entries missing from utils.py, this will be updated when code there will be updated. The guide in later steps may still explain and show examples of entries not listed here. For example this snapshot of https://opendev.org/starlingx/config/src/commit/e9705f5bc61f29618dd34b408a4608797422a7ad/sysinv/sysinv/sysinv/sysinv/common/utils.py#L2235, doesn't have `maintain_attributes` entry:

    app_name:
    app_version:
    upgrades:
      auto_update:
      update_failure_no_rollback:
      from_versions:
      -
      -
    supported_k8s_version:
      minimum:
      maximum:
    supported_releases:
      :
      -
      -
      ...
    repo: - optional: defaults to HELM_REPO_FOR_APPS
    disabled_charts: - optional: charts default to enabled
    -
    -
    ...
    maintain_user_overrides:
      - optional: defaults to false. Over an app update any user overrides are
        preserved for the new version of the application
    ...
    behavior: - optional: describes the app behavior
      platform_managed_app: - optional: when absent behaves as false
      desired_state: - optional: state the app should reach
      evaluate_reapply: - optional: describe the reapply evaluation behaviour
        after: - optional: list of apps that should be evaluated before the current one
        -
        -
        triggers: - optional: list of what triggers the reapply evaluation
        - type:
          filters: - optional: list of field:value, that aid filtering
              of the trigger events. All pairs in this list must be
              present in trigger dictionary that is passed in
              the calls (eg. trigger[field_name1]==value_name1 and
              trigger[field_name2]==value_name2).
              Function evaluate_apps_reapply takes a dictionary called
              'trigger' as parameter. Depending on trigger type this
              may contain custom information used by apps, for example
              a field 'personality' corresponding to node personality.
              It is the duty of the app developer to enhance existing
              triggers with the required information.
              Hard to obtain information should be passed in the trigger.
              To use existing information it is as simple as defining
              the metadata.
          - :
          - :
          filter_field: <field_name> - optional: field name in trigger
              dictionary. If specified the filters are applied
              to trigger[filter_field] sub-dictionary instead
              of the root trigger dictionary.
    apply_progress_adjust: - optional: Positive integer value by which to adjust the
        percentage calculations for the progress of
        a monitoring task. Default value is zero (no adjustment)
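To make the optional entries and their defaults more concrete, here is a minimal Python sketch that takes an already-parsed metadata.yaml dictionary and applies the defaults stated in the listing above. The function name `effective_settings` and the layout of the returned dictionary are hypothetical; this is not the code in utils.py.

```python
def effective_settings(metadata):
    """Return selected metadata.yaml options with the documented defaults.

    `metadata` is the dictionary obtained by parsing metadata.yaml.
    The function name and return layout are illustrative only.
    """
    behavior = metadata.get("behavior") or {}
    return {
        # optional: defaults to false
        "maintain_user_overrides": bool(
            metadata.get("maintain_user_overrides", False)),
        # optional: when absent behaves as false
        "platform_managed_app": bool(
            behavior.get("platform_managed_app", False)),
        # optional: charts default to enabled, so nothing is disabled
        "disabled_charts": list(metadata.get("disabled_charts") or []),
        # optional: default value is zero (no adjustment)
        "apply_progress_adjust": int(
            metadata.get("apply_progress_adjust", 0)),
    }


# Hypothetical app metadata that only sets a few entries.
example = {
    "app_name": "my-app",
    "app_version": "1.0-1",
    "behavior": {"platform_managed_app": True},
}
settings = effective_settings(example)
print(settings)
```

Every entry that the sketch reads is optional, so an empty dictionary is also a valid input and yields all defaults.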

maintain_user_overrides
Currently, if you create overrides for a helm chart (with `system helm-override-update`), the overrides are lost when you update the app.

The overrides themselves are stored in sysinv database table helm_overrides in a column called 'user_overrides'.

If you want to keep the overrides during app update you can update the metadata.yaml ([1] example location for one app), adding at root level the following `maintain_user_overrides: true`

There is more. You can override this behavior by adding a special flag to 'system application-update'. You can force the behavior either way: reuse (keep the overrides) or do not reuse (reset the overrides).

    [sysadmin@controller-0 ~(keystone_admin)]$ system application-update
    usage: system application-update [-n ] [-v ] [--reuse-user-overrides <true/false>] [--reuse-attributes <true/false>]

    [sysadmin@controller-0 ~(keystone_admin)]$ system application-update -n MY_APP -v MY_VERSION --reuse-user-overrides true /path/to/tar.gz

[1]: https://opendev.org/starlingx/platform-armada-app/src/branch/master/stx-platform-helm/stx-platform-helm/files/metadata.yaml
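The precedence described in this section (the command-line flag forces the behavior either way, otherwise the metadata decides) can be sketched as follows. This is an illustration under stated assumptions, not the framework's code: the function name `reuse_user_overrides` is hypothetical, and `cli_flag` is assumed to be None when `--reuse-user-overrides` was not passed.

```python
from typing import Optional


def reuse_user_overrides(metadata: dict, cli_flag: Optional[bool]) -> bool:
    """Decide whether user overrides are kept across an app update.

    metadata: parsed metadata.yaml of the app.
    cli_flag: value of --reuse-user-overrides, or None when the flag
        was not given (assumed representation).
    """
    if cli_flag is not None:
        # The flag forces the behavior either way.
        return cli_flag
    # Without the flag, metadata decides; the entry defaults to false.
    return bool(metadata.get("maintain_user_overrides", False))


print(reuse_user_overrides({"maintain_user_overrides": True}, None))
print(reuse_user_overrides({"maintain_user_overrides": True}, False))
print(reuse_user_overrides({}, True))
```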

maintain_attributes
Currently if you disable a helm chart, when you update an app it will be re-enabled by default on the newer version.

    [sysadmin@controller-0 ~(keystone_admin)]$ system helm-chart-attribute-modify
    usage: system helm-chart-attribute-modify [--enabled <true/false>]

    system helm-chart-attribute-modify --enabled false MY_APP MY_CHART MY_NAMESPACE

The chart attribute (enabled/disabled) itself is stored in sysinv database table helm_overrides in a column called 'system_overrides' (bad naming, will be aligned later).

If you want to keep the disabled status during app update you can update the metadata.yaml ([1] example location for one app), adding at root level the following `maintain_attributes: true`

There is more. You can override this behavior by adding a special flag to 'system application-update'. You can force the behavior either way: reuse (keep disabled the charts that were disabled) or do not reuse (reset all the charts to be enabled).

    [sysadmin@controller-0 ~(keystone_admin)]$ system application-update
    usage: system application-update [-n ] [-v ] [--reuse-user-overrides <true/false>] [--reuse-attributes <true/false>]

    [sysadmin@controller-0 ~(keystone_admin)]$ system application-update -n MY_APP -v MY_VERSION --reuse-attributes true /path/to/tar.gz

[1]: https://opendev.org/starlingx/platform-armada-app/src/branch/master/stx-platform-helm/stx-platform-helm/files/metadata.yaml

This was introduced in stx.8.0 by: https://review.opendev.org/c/starlingx/config/+/865327

upgrades/auto_update
There is a mechanism that allows apps to be updated automatically. It is used both 1) when an updated app is delivered as part of a platform patch and 2) after the `system upgrade-complete` step of a platform upgrade. The key lives under 'upgrades' because it was designed in an upgrades context; since it also applies to patching, the name is somewhat misleading. For an explanation of platform upgrades see Upgrade considerations.

The auto update will be triggered only when the app is in applied state.

If you want to enable the auto update feature you need to update the metadata.yaml ([1] example location for one app), adding at root level the following

    upgrades:
      auto_update: true

[1]: https://opendev.org/starlingx/platform-armada-app/src/branch/master/stx-platform-helm/stx-platform-helm/files/metadata.yaml

When a patch is applied to a live system and delivers a new version of an app:
 * if the new version has this mechanism enabled, the app is automatically updated (up-versioned) to the new version;
 * if the new version does not have this mechanism enabled, the app is not automatically updated.

When a patch is removed from a live system and the removal delivers an old version of an app:
 * if the old version has this mechanism enabled, the app is automatically updated (down-versioned) to the old version;
 * if the old version does not have this mechanism enabled, the app is not automatically updated.
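The rules above reduce to two conditions: the installed app must be in applied state, and the delivered version (newer or older) must have auto_update enabled in its own metadata. A minimal sketch, assuming the applied state is spelled 'applied' and that a missing `auto_update` key means disabled; the function name `will_auto_update` is hypothetical:

```python
def will_auto_update(app_status, delivered_metadata):
    """Return True when the delivered app version would be auto updated.

    app_status: current state of the installed app; the string
        'applied' is an assumed spelling of the applied state.
    delivered_metadata: parsed metadata.yaml of the version delivered
        by the patch (or by the patch removal).
    """
    upgrades = delivered_metadata.get("upgrades") or {}
    # Assumption: a missing auto_update key is treated as disabled.
    enabled = bool(upgrades.get("auto_update", False))
    return app_status == "applied" and enabled


print(will_auto_update("applied", {"upgrades": {"auto_update": True}}))
print(will_auto_update("uploaded", {"upgrades": {"auto_update": True}}))
print(will_auto_update("applied", {}))
```

Note that the decision uses the metadata of the version being delivered, not of the version currently installed, which is why the same function covers both up-versioning and down-versioning.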

This was introduced in stx.6.0 by https://review.opendev.org/c/starlingx/config/+/800821

behavior/platform_managed_app
There is a mechanism to tell the Framework the app should be managed by the Framework. We call this a platform managed app. The advantages of a platform managed app are:
 * 1) the Framework can perform some automated tasks such as auto reapply of the app based on specific triggers.
 * 2) the Framework can achieve a desired state after unlocking the first controller.

The functionalities are described later in their specific sections.

If you want to enable the platform managed feature you need to update the metadata.yaml ([1] example location for one app), adding at root level the following

    behavior:
      platform_managed_app: yes

[1]: https://opendev.org/starlingx/platform-armada-app/src/branch/master/stx-platform-helm/stx-platform-helm/files/metadata.yaml

This was introduced in stx.5.0 by https://review.opendev.org/c/starlingx/config/+/773451

behavior/desired_state
When an app is declared a platform managed app, the desired state to be achieved by the Framework can be controlled using a variable called desired_state. Only the uploaded and applied states are currently supported.

Currently the Framework can achieve uploaded state by setting this metadata:

    behavior:
      platform_managed_app: yes
      desired_state: uploaded

Currently the Framework can achieve applied state by setting this metadata:

    behavior:
      platform_managed_app: yes
      desired_state: applied

This was introduced in stx.5.0 by https://review.opendev.org/c/starlingx/config/+/773451
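As an illustration of how these two keys relate, the following sketch reads desired_state from parsed metadata and only honors it for platform managed apps. The function name `desired_state_for` and the choice to raise ValueError on an unsupported value are assumptions made for this example, not the framework's behavior.

```python
SUPPORTED_DESIRED_STATES = ("uploaded", "applied")


def desired_state_for(metadata):
    """Return the desired state of a platform managed app, or None.

    metadata: parsed metadata.yaml of the app.
    """
    behavior = metadata.get("behavior") or {}
    if not behavior.get("platform_managed_app", False):
        # desired_state only has meaning for platform managed apps.
        return None
    state = behavior.get("desired_state")
    if state is not None and state not in SUPPORTED_DESIRED_STATES:
        # Hypothetical handling: reject anything but uploaded/applied.
        raise ValueError("unsupported desired_state: %s" % state)
    return state


managed = {"behavior": {"platform_managed_app": True,
                        "desired_state": "applied"}}
print(desired_state_for(managed))
print(desired_state_for({"behavior": {"desired_state": "applied"}}))
```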

behavior/evaluate_reapply
When an app is declared a platform managed app, automatic re-applies by the Framework, based on specific conditions, can be controlled using a variable called evaluate_reapply. The Framework will determine (evaluate) whether there is a change that requires an app apply.

This was introduced in stx.5.0 by https://review.opendev.org/c/starlingx/config/+/773451

evaluate_reapply/after
We can control the order in which the Framework evaluates/re-applies the apps using a list variable called after. This ensures that the current app is only re-evaluated/re-applied after the apps listed there. For example, the current app will be re-evaluated/re-applied after another-app1 and another-app2:

    behavior:
      platform_managed_app: yes
      evaluate_reapply:
        after:
        - another-app1
        - another-app2

A concrete example based on [1]: oidc-auth-apps will always be evaluated after platform-integ-apps, such that if there is a change that requires platform-integ-apps to be applied, then platform-integ-apps will be applied before oidc-auth-apps.

[1]: https://opendev.org/starlingx/oidc-auth-armada-app/src/branch/master/stx-oidc-auth-helm/stx-oidc-auth-helm/files/metadata.yaml#L12-L14
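The ordering constraint expressed by `after` is a dependency ordering, so it can be illustrated with a topological sort. The sketch below uses Python's standard `graphlib` module (Python 3.9+); it is not the algorithm used by the Framework, and the function name `evaluation_order` is hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def evaluation_order(after_by_app):
    """Order apps so each one comes after the apps in its 'after' list.

    after_by_app: maps an app name to the list found under
        behavior/evaluate_reapply/after in that app's metadata.yaml.
    """
    sorter = TopologicalSorter()
    for app, predecessors in after_by_app.items():
        # Every app in 'after' must be evaluated before 'app'.
        sorter.add(app, *predecessors)
    # static_order() raises graphlib.CycleError on circular 'after' lists.
    return list(sorter.static_order())


order = evaluation_order({
    "platform-integ-apps": [],
    "oidc-auth-apps": ["platform-integ-apps"],
})
print(order)
```

A circular declaration (two apps listing each other in `after`) has no valid order, which the sketch surfaces as an exception rather than silently picking one.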

evaluate_reapply/triggers
We can control what triggers the Framework to evaluate the re-apply for apps using a list variable called triggers. An app can be instructed to subscribe to some events by creating the necessary configuration in metadata.yaml for it.

    behavior:
      platform_managed_app: yes
      evaluate_reapply:
        triggers:
        - trigger1
        - trigger2

The definition is explained in [1]. A snapshot of trigger types can be found at [2].

Triggers are created and passed to evaluate_apps_reapply function inside conductor [3]. A real life example of such call is:

    self.evaluate_apps_reapply(context, trigger={'type': constants.APP_EVALUATE_REAPPLY_HOST_AVAILABILITY,
                                                 'availability': availability})

The most basic trigger is

    evaluate_apps_reapply(context, trigger={'type': 'some-type'})

which can be subscribed to by this metadata:

    behavior:
      platform_managed_app: yes
      evaluate_reapply:
        triggers:
        - type: some-type

A list of filters can be applied to a trigger (all key:value pairs must match):

    evaluate_apps_reapply(context, trigger={'type': 'some-type', 'key1': 'value1', 'key2': 'value2'})

which can be subscribed to by this metadata:

    behavior:
      platform_managed_app: yes
      evaluate_reapply:
        triggers:
        - type: some-type
          filters:
          - key1: value1
          - key2: value2

A list of filters can be applied to a trigger (all key:value pairs must match) on a subdictionary:

    evaluate_apps_reapply(context, trigger={'type': 'some-type', 'subdict1': {'key1': 'value1', 'key2': 'value2'}})

which can be subscribed to by this metadata:

    behavior:
      platform_managed_app: yes
      evaluate_reapply:
        triggers:
        - type: some-type
          filter_field: subdict1
          filters:
          - key1: value1
          - key2: value2

[1]: https://opendev.org/starlingx/config/src/commit/6e832b47070ec980d3e31d564862beeb5dd0432d/sysinv/sysinv/sysinv/sysinv/common/utils.py#L2248
[2]: https://opendev.org/starlingx/config/src/commit/6e832b47070ec980d3e31d564862beeb5dd0432d/sysinv/sysinv/sysinv/sysinv/common/constants.py#L1922
[3]: https://opendev.org/starlingx/config/src/commit/e948a02f29fbd377c4663a91dbe33a688ff3f3c0/sysinv/sysinv/sysinv/sysinv/conductor/manager.py#L14019
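The matching rules described in this section (same type, every filter pair present and equal, optionally looked up in the filter_field sub-dictionary) can be sketched in a few lines. This is an illustration, not the code in sysinv: the function name `subscription_matches` is hypothetical, and the `filters` key is used as listed in the metadata.yaml entries earlier on this page.

```python
def subscription_matches(trigger, subscription):
    """Return True when a trigger satisfies one metadata subscription.

    trigger: dictionary passed to evaluate_apps_reapply.
    subscription: one item of behavior/evaluate_reapply/triggers.
    """
    if trigger.get("type") != subscription.get("type"):
        return False
    # With filter_field set, filters apply to that sub-dictionary
    # instead of the root trigger dictionary.
    filter_field = subscription.get("filter_field")
    target = trigger.get(filter_field) if filter_field else trigger
    if not isinstance(target, dict):
        return False
    # All field:value pairs must be present and equal.
    for pair in subscription.get("filters", []):
        for field, value in pair.items():
            if field not in target or target[field] != value:
                return False
    return True


sub = {"type": "some-type", "filter_field": "subdict1",
       "filters": [{"key1": "value1"}, {"key2": "value2"}]}
hit = {"type": "some-type",
       "subdict1": {"key1": "value1", "key2": "value2"}}
miss = {"type": "some-type", "subdict1": {"key1": "value1"}}
print(subscription_matches(hit, sub), subscription_matches(miss, sub))
```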

This was introduced in stx.5.0 by https://review.opendev.org/c/starlingx/config/+/773451

app_lifecycle_actions
The intent is to allow apps to run custom code to perform specific operations. One such mechanism is the lifecycle plugin. The interface between the App Framework and apps is the app_lifecycle_actions function [1][2].

This code in [2] is where the transition from AppFramework to App lifecycle plugin happens.

    lifecycle_op = self._helm.get_app_lifecycle_operator(app.name)
    lifecycle_op.app_lifecycle_actions(context, conductor_obj, self, app, hook_info)

For an app to implement its own lifecycle actions it needs to:
 * create an entry point in setup.cfg with this format
    systemconfig.app_lifecycle =
        <app name> = <module path>:<new class that extends AppLifecycleOperator>
 * extend the abstract interface AppLifecycleOperator[3]

For example this will create an entry point https://opendev.org/starlingx/platform-armada-app/src/commit/33d690327f6d3d199e143d280b5e9c3b21f8834c/python3-k8sapp-platform/k8sapp_platform/setup.cfg#L41-L42 and this will create the lifecycle https://opendev.org/starlingx/platform-armada-app/src/commit/33d690327f6d3d199e143d280b5e9c3b21f8834c/python3-k8sapp-platform/k8sapp_platform/k8sapp_platform/lifecycle/lifecycle_platform.py#L25

If an app doesn't provide its own lifecycle actions it will inherit the default class[4]. As we can see the default extends the abstract interface[3] without changing anything.

Ideally each app's concrete implementation of AppLifecycleOperator.app_lifecycle_actions takes care of all the resources, but in some cases apps override the default functionality and then call super.app_lifecycle_actions to leverage the default behavior implemented in [3]. See for example [5].

This has pros and cons. The pro is that code common to all apps lives in one place; the con is that if some behavior needs to change, the change affects all apps. The current common code between apps is placed in [6].
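The override-then-call-super pattern discussed above looks roughly like the sketch below. The `AppLifecycleOperator` class here is a local stand-in so the example runs on its own; the real base class lives in sysinv [3]. The class name `MyAppLifecycleOperator`, the parameter name `app_op`, and the `calls` list are hypothetical; the argument order follows the call shown earlier in this section.

```python
calls = []  # records the order of execution, for illustration only


class AppLifecycleOperator:
    """Stand-in for the sysinv base class; not the real implementation."""

    def app_lifecycle_actions(self, context, conductor_obj, app_op, app,
                              hook_info):
        # Placeholder for the default behavior shared by all apps.
        calls.append("default")


class MyAppLifecycleOperator(AppLifecycleOperator):
    """Hypothetical app plugin that adds a step before the defaults."""

    def app_lifecycle_actions(self, context, conductor_obj, app_op, app,
                              hook_info):
        # App-specific handling goes first ...
        calls.append("my-app specific")
        # ... then the common behavior is reused instead of duplicated.
        super().app_lifecycle_actions(context, conductor_obj, app_op, app,
                                      hook_info)


MyAppLifecycleOperator().app_lifecycle_actions(None, None, None, None, None)
print(calls)
```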

Going back to where the transition from AppFramework to App lifecycle plugin happens, the intent is to assemble and pass the necessary information in an object called LifecycleHookInfo [7]. This is used like a bridge, so when writing information to or reading information from the object, it is best to use the constants defined in LifecycleConstants [8].

The intent is for the app_lifecycle_hook to raise specific exceptions to inform the framework of some event, this will be described here: Lifecycle exceptions

[1]: https://opendev.org/starlingx/config/src/commit/e948a02f29fbd377c4663a91dbe33a688ff3f3c0/sysinv/sysinv/sysinv/sysinv/conductor/manager.py#L14177
[2]: https://opendev.org/starlingx/config/src/commit/e948a02f29fbd377c4663a91dbe33a688ff3f3c0/sysinv/sysinv/sysinv/sysinv/conductor/kube_app.py#L2521
[3]: https://opendev.org/starlingx/config/src/commit/e948a02f29fbd377c4663a91dbe33a688ff3f3c0/sysinv/sysinv/sysinv/sysinv/helm/lifecycle_base.py#L22-24
[4]: https://opendev.org/starlingx/config/src/commit/e948a02f29fbd377c4663a91dbe33a688ff3f3c0/sysinv/sysinv/sysinv/sysinv/helm/lifecycle_generic.py#L17
[5]: https://opendev.org/starlingx/platform-armada-app/src/commit/33d690327f6d3d199e143d280b5e9c3b21f8834c/python3-k8sapp-platform/k8sapp_platform/k8sapp_platform/lifecycle/lifecycle_platform.py#L62
[6]: https://opendev.org/starlingx/config/src/commit/e948a02f29fbd377c4663a91dbe33a688ff3f3c0/sysinv/sysinv/sysinv/sysinv/helm/lifecycle_utils.py
[7]: https://opendev.org/starlingx/config/src/commit/e948a02f29fbd377c4663a91dbe33a688ff3f3c0/sysinv/sysinv/sysinv/sysinv/helm/lifecycle_hook.py
[8]: https://opendev.org/starlingx/config/src/commit/e948a02f29fbd377c4663a91dbe33a688ff3f3c0/sysinv/sysinv/sysinv/sysinv/helm/lifecycle_constants.py

LifecycleHookInfo
Searching for all occurrences of app_lifecycle_actions will reveal how the LifecycleHookInfo object is used. The gist is something like this real example:

    hook_info = LifecycleHookInfo()
    hook_info.init(constants.APP_LIFECYCLE_MODE_AUTO,
                   constants.APP_LIFECYCLE_TYPE_SEMANTIC_CHECK,
                   constants.APP_LIFECYCLE_TIMING_PRE,
                   constants.APP_UPDATE_OP)
    hook_info[LifecycleConstants.EXTRA][LifecycleConstants.FROM_APP] = True
    self.app_lifecycle_actions(context, app, hook_info)

TBD constants link. TBD lifecycle_utils TBD exceptions

LifecycleHookInfo/mode
Possible values:
 * APP_LIFECYCLE_MODE_MANUAL = 'manual'
 * APP_LIFECYCLE_MODE_AUTO = 'auto'

APP_LIFECYCLE_MODE_MANUAL
Manual means an action triggered through CLI/REST API (application upload/apply/remove/delete/update + lock/unlock actions).

APP_LIFECYCLE_MODE_AUTO
Auto means an action triggered by the framework internally.

LifecycleHookInfo/lifecycle_type
Possible values:
 * APP_LIFECYCLE_TYPE_SEMANTIC_CHECK = 'check'
 * APP_LIFECYCLE_TYPE_OPERATION = 'operation'
 * APP_LIFECYCLE_TYPE_RBD = 'rbd'
 * APP_LIFECYCLE_TYPE_RESOURCE = 'resource'
 * APP_LIFECYCLE_TYPE_MANIFEST = 'manifest'
 * APP_LIFECYCLE_TYPE_FLUXCD_REQUEST = 'fluxcd-request'

APP_LIFECYCLE_TYPE_SEMANTIC_CHECK
Before application operations and lock/unlock actions, the Framework allows a semantic check to happen. An app can check specific conditions and reject the operation by raising Lifecycle exceptions.

TBD the following:

LifecycleHookInfo/relative_timing
Possible values:
 * APP_LIFECYCLE_TIMING_PRE = 'pre'
 * APP_LIFECYCLE_TIMING_POST = 'post'

With a few exceptions (semantic check, and what else?) hooks are called around specific portions, meaning they are called before (pre) and after (post). Think of a hook before doing an app apply, and a hook after doing the app apply.

LifecycleHookInfo/operation
Possible values:
 * APP_UPLOAD_OP = 'upload'
 * APP_APPLY_OP = 'apply'
 * APP_REMOVE_OP = 'remove'
 * APP_DELETE_OP = 'delete'
 * APP_UPDATE_OP = 'update'
 * APP_ABORT_OP = 'abort'
 * APP_EVALUATE_REAPPLY_OP = 'evaluate-reapply'
 * APP_BACKUP = 'backup'
 * APP_ETCD_BACKUP = 'etcd-backup'
 * APP_RESTORE = 'restore'
 * APP_LIFECYCLE_OPERATION_MTC_ACTION = 'mtc-action'

TBD info about them:

LifecycleHookInfo/extra
This field was intended to pass information to and from the lifecycle hook itself.

Lifecycle exceptions
Searching for all occurrences of the regex class Lifecycle.*Exception will reveal the existing exceptions. An in-depth explanation of why there are 3 types of LifecycleSemanticCheck exceptions is still needed. At first glance, the intent seems to be to suppress some logging or to add special logging.

The intent is to throw LifecycleSemanticCheckException if you want to reject the operation (deny the operation to happen). Searching for all occurrences of 'APP_LIFECYCLE_TYPE_SEMANTIC_CHECK' will show what semantic checks(for what operations) will be called by the App Framework.
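A semantic check that rejects an operation could look like the sketch below. Everything here is a stand-in so the example runs on its own: the exception class is redefined locally instead of imported from sysinv, `hook_info` is modelled as a plain object carrying the fields described in the LifecycleHookInfo sections, and the function `check_remove_allowed` with its `in_use` argument is hypothetical. The constant values are the ones listed on this page.

```python
from types import SimpleNamespace

# Values as listed in the LifecycleHookInfo sections of this page.
APP_LIFECYCLE_TYPE_SEMANTIC_CHECK = "check"
APP_LIFECYCLE_TIMING_PRE = "pre"
APP_REMOVE_OP = "remove"


class LifecycleSemanticCheckException(Exception):
    """Local stand-in for the sysinv exception of the same name."""


def check_remove_allowed(hook_info, in_use):
    """Reject an app remove while the app is still in use (hypothetical)."""
    is_pre_remove_check = (
        hook_info.lifecycle_type == APP_LIFECYCLE_TYPE_SEMANTIC_CHECK
        and hook_info.relative_timing == APP_LIFECYCLE_TIMING_PRE
        and hook_info.operation == APP_REMOVE_OP
    )
    if is_pre_remove_check and in_use:
        # Raising tells the framework to deny the operation.
        raise LifecycleSemanticCheckException(
            "my-app is still in use; remove rejected")


hook = SimpleNamespace(lifecycle_type="check", relative_timing="pre",
                       operation="remove")
try:
    check_remove_allowed(hook, in_use=True)
    rejected = False
except LifecycleSemanticCheckException as exc:
    rejected = True
    print("rejected:", exc)
```

In a real plugin this check would sit inside the app's app_lifecycle_actions implementation, dispatched on the hook_info fields.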

Helm Chart Overrides
Apps have overrides per namespace per chart. There are sysinv commands that can be used to query the overrides.

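Here is an example query for platform-integ-apps.

    [sysadmin@controller-0 ~(keystone_admin)]$ system helm-override-list platform-integ-apps
    +--------------------+----------------------+
    | chart name         | overrides namespaces |
    +--------------------+----------------------+
    | ceph-pools-audit   | ['kube-system']      |
    | cephfs-provisioner | ['kube-system']      |
    | rbd-provisioner    | ['kube-system']      |
    +--------------------+----------------------+
    [sysadmin@controller-0 ~(keystone_admin)]$ system helm-override-show platform-integ-apps rbd-provisioner kube-system
    +--------------------+------------------------------------------------------+
    | Property           | Value                                                |
    +--------------------+------------------------------------------------------+
    | attributes         | enabled: true                                        |
    |                    |                                                      |
    | combined_overrides | classdefaults:                                       |
    | EDITED
    | name               | rbd-provisioner                                      |
    | namespace          | kube-system                                          |
    | system_overrides   | classdefaults:                                       |
    | EDITED
    | user_overrides     | None                                                 |
    +--------------------+------------------------------------------------------+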

You can define behaviors by implementing the BaseHelm class ([4]). An example of such an implementation is RbdProvisionerHelm ([5]) for the platform-integ-armada app, registered through [6] + [7]. While inspecting the code we observe that [1] is called in [2], [2] is called in [3], and [3] is further used throughout some apps and sysinv.

[1]: https://opendev.org/starlingx/config/src/commit/c937f46ecee2802473d786ab8c0addddb9039abc/sysinv/sysinv/sysinv/sysinv/helm/base.py#L378-L385
[2]: https://opendev.org/starlingx/config/src/commit/c937f46ecee2802473d786ab8c0addddb9039abc/sysinv/sysinv/sysinv/sysinv/helm/helm.py#L379-L396
[3]: https://opendev.org/starlingx/config/src/commit/c937f46ecee2802473d786ab8c0addddb9039abc/sysinv/sysinv/sysinv/sysinv/helm/helm.py#L464-L495
[4]: https://opendev.org/starlingx/config/src/commit/c937f46ecee2802473d786ab8c0addddb9039abc/sysinv/sysinv/sysinv/sysinv/helm/base.py#L23
[5]: https://opendev.org/starlingx/platform-armada-app/src/commit/ef33b99009adb398486a75ee7198076bc1ba059e/python3-k8sapp-platform/k8sapp_platform/k8sapp_platform/helm/rbd_provisioner.py#L19
[6]: https://opendev.org/starlingx/platform-armada-app/src/commit/ef33b99009adb398486a75ee7198076bc1ba059e/python3-k8sapp-platform/k8sapp_platform/setup.cfg#L33
[7]: https://opendev.org/starlingx/platform-armada-app/src/commit/ef33b99009adb398486a75ee7198076bc1ba059e/python3-k8sapp-platform/k8sapp_platform/setup.cfg#L35
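The helm-override-show output lists system_overrides, user_overrides and combined_overrides side by side. One plausible way to think about the combined value is a recursive merge in which user overrides take precedence over system overrides. That precedence is an assumption made for this sketch, and the function name `combine_overrides` and the sample keys are hypothetical; check the sysinv code referenced above for the actual behavior.

```python
import copy


def combine_overrides(system_overrides, user_overrides):
    """Deep-merge two override dictionaries; user values win (assumed)."""
    combined = copy.deepcopy(system_overrides or {})
    for key, value in (user_overrides or {}).items():
        if isinstance(value, dict) and isinstance(combined.get(key), dict):
            # Merge nested sections key by key.
            combined[key] = combine_overrides(combined[key], value)
        else:
            # Scalars and lists from the user replace the system value.
            combined[key] = copy.deepcopy(value)
    return combined


system = {"classdefaults": {"replicas": 2, "monitors": ["mon-a"]}}
user = {"classdefaults": {"replicas": 1}}
print(combine_overrides(system, user))
print(combine_overrides(system, None))
```

With user_overrides set to None, as in the query output above, the combined value is simply a copy of the system overrides.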

Guidelines
We strongly encourage you to enable the auto update feature unless there are special extra steps needed for an application update to happen. See `upgrades/auto_update` above.

We strongly encourage you to enable maintaining user overrides feature unless there are special extra steps needed for an application update to happen. See `maintain_user_overrides` above.

The rationale for enabling auto update and maintaining user overrides comes down to these questions:
 * Are there any steps required before `system application-update` when applying a patch on a live system?
 * Are there any steps required before `system application-update` at `system upgrade-activate` time?
 * Was there an override format change that requires a transformation between version N and N+1 of the format?
 * Can I update the helm charts to accept both formats so that a transformation can be skipped?

We strongly encourage you to enable maintaining disabled helm-charts feature. See `maintain_attributes` above.

Resource Accounting Caveat
Most platform applications should be affined to the platform cores, but all other containerized workloads would use the application or application-isolated cores. For legacy applications this is controlled by namespace, and is managed by a customization to kubelet. Going forward (as of June 2023) we are modifying the system to use the "app.starlingx.io/component=platform" label on the application pod or namespace to signify that it should be run on the platform cores.

We also need to ensure the resources consumed by platform pods (cpu/memory requests) are not counted against the application node resources since they should be accounted against the platform resources. This includes both the Pod requests and the platform accounting [2]. In particular, containers running on platform cores must not request cpu resources from Kubernetes. Memory resources are less of a concern as we usually are not memory-constrained the way that we are CPU-constrained.

If your application should run on platform cores, it is important to minimize the amount of CPU time it uses. We have a limited amount of platform CPU and it needs to be carefully managed.

[2] Resource accounting: monitoring/collectd-extensions/src/plugin_common.py (K8S_NAMESPACE_SYSTEM)

Upgrade considerations
To update your app during platform upgrades, an App Developer currently has to add the app to this list [1]. One of two things will then happen:
 * 1) If the app is in uploaded state, a 'system application-delete' will happen for the N version, followed by a 'system application-upload' for the N+1 version.
 * 2) If the app is in applied state, a 'system application-delete' will happen for the N version, followed by a 'system application-update' to the N+1 version.

In case the app is not added to [1], after the platform upgrade completes (system upgrade-complete was executed), the app may still be auto updated if metadata for such operation was added: upgrades/auto_update

[1]: https://opendev.org/starlingx/config/src/commit/67966ea4eb3ac5f5ee5ccc4941d5e41f9fe9ae1c/controllerconfig/controllerconfig/upgrade-scripts/65-k8s-app-upgrade.sh#L93

Related work
Related material continues on a wiki page aimed at people working in the App Framework area. That page is not currently designed for App Developers, and probably never will be: App Framework page