Packaging format - Understanding Operators | Operators

Bundle format
File-based catalogs

This guide outlines the packaging format for Operators supported by Operator Lifecycle Manager (OLM) in OpenShift Container Platform.

Support for the legacy package manifest format for Operators is removed in OpenShift Container Platform 4.8 and later. Existing Operator projects in the package manifest format can be migrated to the bundle format by using the Operator SDK pkgman-to-bundle command. See Migrating package manifest projects to bundle format for more details.

Bundle format

The bundle format for Operators is a packaging format introduced by the Operator Framework. To improve scalability and to better enable upstream users hosting their own catalogs, the bundle format specification simplifies the distribution of Operator metadata.

An Operator bundle represents a single version of an Operator. On-disk bundle manifests are containerized and shipped as a bundle image, which is a non-runnable container image that stores the Kubernetes manifests and Operator metadata. Storage and distribution of the bundle image is then managed using existing container tools like podman and docker and container registries such as Quay.

Operator metadata can include:

Information that identifies the Operator, for example its name and version.
Additional information that drives the UI, for example its icon and some example custom resources (CRs).
Required and provided APIs.
Related images.

When loading manifests into the Operator Registry database, the following requirements are validated:

The bundle must have at least one channel defined in the annotations.
Every bundle has exactly one cluster service version (CSV).
If a CSV owns a custom resource definition (CRD), that CRD must exist in the bundle.

Manifests

Bundle manifests refer to a set of Kubernetes manifests that define the deployment and RBAC model of the Operator.

A bundle includes one CSV per directory and typically the CRDs that define the owned APIs of the CSV in its /manifests directory.

Example bundle format layout

etcd
├── manifests
│   ├── etcdcluster.crd.yaml
│   └── etcdoperator.clusterserviceversion.yaml
│   └── secret.yaml
│   └── configmap.yaml
└── metadata
    └── annotations.yaml
    └── dependencies.yaml

Additionally supported objects

The following object types can also be optionally included in the /manifests directory of a bundle:

Supported optional object types

ClusterRole
ClusterRoleBinding
configmap
ConsoleCLIDownload
ConsoleLink
ConsoleQuickStart
ConsoleYamlSample
PodDisruptionBudget
PriorityClass
PrometheusRule
Role
RoleBinding
Secret
Service
ServiceAccount
ServiceMonitor
VerticalPodAutoscaler

When these optional objects are included in a bundle, Operator Lifecycle Manager (OLM) can create them from the bundle and manage their lifecycle along with the CSV:

Lifecycle for optional objects

When the CSV is deleted, OLM deletes the optional object.
When the CSV is upgraded:
- If the name of the optional object is the same, OLM updates it in place.
- If the name of the optional object has changed between versions, OLM deletes and recreates it.

Annotations

A bundle also includes an annotations.yaml file in its /metadata directory. This file defines higher level aggregate data that helps describe the format and package information about how the bundle should be added into an index of bundles:

Example annotations.yaml

annotations:
  operators.operatorframework.io.bundle.mediatype.v1: "registry+v1" (1)
  operators.operatorframework.io.bundle.manifests.v1: "manifests/" (2)
  operators.operatorframework.io.bundle.metadata.v1: "metadata/" (3)
  operators.operatorframework.io.bundle.package.v1: "test-operator" (4)
  operators.operatorframework.io.bundle.channels.v1: "beta,stable" (5)
  operators.operatorframework.io.bundle.channel.default.v1: "stable" (6)

1	The media type or format of the Operator bundle. The `registry+v1` format means it contains a CSV and its associated Kubernetes objects.
2	The path in the image to the directory that contains the Operator manifests. This label is reserved for future use and currently defaults to `manifests/`. The value `manifests.v1` implies that the bundle contains Operator manifests.
3	The path in the image to the directory that contains metadata files about the bundle. This label is reserved for future use and currently defaults to `metadata/`. The value `metadata.v1` implies that this bundle has Operator metadata.
4	The package name of the bundle.
5	The list of channels the bundle is subscribing to when added into an Operator Registry.
6	The default channel an Operator should be subscribed to when installed from a registry.

In case of a mismatch, the annotations.yaml file is authoritative because the on-cluster Operator Registry that relies on these annotations only has access to this file.

Dependencies

The dependencies of an Operator are listed in a dependencies.yaml file in the metadata/ folder of a bundle. This file is optional and currently only used to specify explicit Operator-version dependencies.

The dependency list contains a type field for each item to specify what kind of dependency this is. The following types of Operator dependencies are supported:

olm.package: This type indicates a dependency for a specific Operator version. The dependency information must include the package name and the version of the package in semver format. For example, you can specify an exact version such as 0.5.2 or a range of versions such as >0.5.1.
olm.gvk: With this type, the author can specify a dependency with group/version/kind (GVK) information, similar to existing CRD and API-based usage in a CSV. This is a path to enable Operator authors to consolidate all dependencies, API or explicit versions, to be in the same place.
olm.constraint: This type declares generic constraints on arbitrary Operator properties.

In the following example, dependencies are specified for a Prometheus Operator and etcd CRDs:

Example dependencies.yaml file

dependencies:
  - type: olm.package
    value:
      packageName: prometheus
      version: ">0.27.0"
  - type: olm.gvk
    value:
      group: etcd.database.coreos.com
      kind: EtcdCluster
      version: v1beta2

Additional resources

Operator Lifecycle Manager dependency resolution

About the opm CLI

The opm CLI tool is provided by the Operator Framework for use with the Operator bundle format. This tool allows you to create and maintain catalogs of Operators from a list of Operator bundles that are similar to software repositories. The result is a container image which can be stored in a container registry and then installed on a cluster.

A catalog contains a database of pointers to Operator manifest content that can be queried through an included API that is served when the container image is run. On OpenShift Container Platform, Operator Lifecycle Manager (OLM) can reference the image in a catalog source, defined by a CatalogSource object, which polls the image at regular intervals to enable frequent updates to installed Operators on the cluster.

See CLI tools for steps on installing the opm CLI.

File-based catalogs

File-based catalogs are the latest iteration of the catalog format in Operator Lifecycle Manager (OLM). It is a plain text-based (JSON or YAML) and declarative config evolution of the earlier SQLite database format, and it is fully backwards compatible. The goal of this format is to enable Operator catalog editing, composability, and extensibility.

The default Red Hat-provided Operator catalogs for OpenShift Container Platform 4.6 and later are currently still shipped in the SQLite database format.

Editing

With file-based catalogs, users interacting with the contents of a catalog are able to make direct changes to the format and verify that their changes are valid. Because this format is plain text JSON or YAML, catalog maintainers can easily manipulate catalog metadata by hand or with widely known and supported JSON or YAML tooling, such as the jq CLI.

This editability enables the following features and user-defined extensions:

Promoting an existing bundle to a new channel
Changing the default channel of a package
Custom algorithms for adding, updating, and removing upgrade edges

Composability

File-based catalogs are stored in an arbitrary directory hierarchy, which enables catalog composition. For example, consider two separate file-based catalog directories: catalogA and catalogB. A catalog maintainer can create a new combined catalog by making a new directory catalogC and copying catalogA and catalogB into it.

This composability enables decentralized catalogs. The format permits Operator authors to maintain Operator-specific catalogs, and it permits maintainers to trivially build a catalog composed of individual Operator catalogs. File-based catalogs can be composed by combining multiple other catalogs, by extracting subsets of one catalog, or a combination of both of these.

Duplicate packages and duplicate bundles within a package are not permitted. The opm validate command returns an error if any duplicates are found.

Because Operator authors are most familiar with their Operator, its dependencies, and its upgrade compatibility, they are able to maintain their own Operator-specific catalog and have direct control over its contents. With file-based catalogs, Operator authors own the task of building and maintaining their packages in a catalog. Composite catalog maintainers, however, only own the task of curating the packages in their catalog and publishing the catalog to users.

Extensibility

The file-based catalog specification is a low-level representation of a catalog. While it can be maintained directly in its low-level form, catalog maintainers can build interesting extensions on top that can be used by their own custom tooling to make any number of mutations.

For example, a tool could translate a high-level API, such as (mode=semver), down to the low-level, file-based catalog format for upgrade edges. Or a catalog maintainer might need to customize all of the bundle metadata by adding a new property to bundles that meet a certain criteria.

While this extensibility allows for additional official tooling to be developed on top of the low-level APIs for future OpenShift Container Platform releases, the major benefit is that catalog maintainers have this capability as well.

Directory structure

File-based catalogs can be stored and loaded from directory-based file systems. The opm CLI loads the catalog by walking the root directory and recursing into subdirectories. The CLI attempts to load every file it finds and fails if any errors occur.

Non-catalog files can be ignored using .indexignore files, which have the same rules for patterns and precedence as .gitignore files.

Example .indexignore file

# Ignore everything except non-object .json and .yaml files
**/*
!*.json
!*.yaml
**/objects/*.json
**/objects/*.yaml

Catalog maintainers have the flexibility to choose their desired layout, but it is recommended to store each package’s file-based catalog blobs in separate subdirectories. Each individual file can be either JSON or YAML; it is not necessary for every file in a catalog to use the same format.

Basic recommended structure

catalog
├── packageA
│   └── index.yaml
├── packageB
│   ├── .indexignore
│   ├── index.yaml
│   └── objects
│       └── packageB.v0.1.0.clusterserviceversion.yaml
└── packageC
    └── index.json

This recommended structure has the property that each subdirectory in the directory hierarchy is a self-contained catalog, which makes catalog composition, discovery, and navigation trivial file system operations. The catalog could also be included in a parent catalog by copying it into the parent catalog’s root directory.

Schemas

File-based catalogs use a format, based on the CUE language specification, that can be extended with arbitrary schemas. The following _Meta CUE schema defines the format that all file-based catalog blobs must adhere to:

_Meta schema

_Meta: {
  // schema is required and must be a non-empty string
  schema: string & !=""

  // package is optional, but if it's defined, it must be a non-empty string
  package?: string & !=""

  // properties is optional, but if it's defined, it must be a list of 0 or more properties
  properties?: [... #Property]
}

#Property: {
  // type is required
  type: string & !=""

  // value is required, and it must not be null
  value: !=null
}

No CUE schemas listed in this specification should be considered exhaustive. The opm validate command has additional validations that are difficult or impossible to express concisely in CUE.

An Operator Lifecycle Manager (OLM) catalog currently uses three schemas (olm.package, olm.channel, and olm.bundle), which correspond to OLM’s existing package and bundle concepts.

Each Operator package in a catalog requires exactly one olm.package blob, at least one olm.channel blob, and one or more olm.bundle blobs.

All olm.* schemas are reserved for OLM-defined schemas. Custom schemas must use a unique prefix, such as a domain that you own.

olm.package schema

The olm.package schema defines package-level metadata for an Operator. This includes its name, description, default channel, and icon.

olm.package schema

#Package: {
  schema: "olm.package"

  // Package name
  name: string & !=""

  // A description of the package
  description?: string

  // The package's default channel
  defaultChannel: string & !=""

  // An optional icon
  icon?: {
    base64data: string
    mediatype:  string
  }
}

olm.channel schema

The olm.channel schema defines a channel within a package, the bundle entries that are members of the channel, and the upgrade edges for those bundles.

A bundle can included as an entry in multiple olm.channel blobs, but it can have only one entry per channel.

It is valid for an entry’s replaces value to reference another bundle name that cannot be found in this catalog or another catalog. However, all other channel invariants must hold true, such as a channel not having multiple heads.

olm.channel schema

#Channel: {
  schema: "olm.channel"
  package: string & !=""
  name: string & !=""
  entries: [...#ChannelEntry]
}

#ChannelEntry: {
  // name is required. It is the name of an `olm.bundle` that
  // is present in the channel.
  name: string & !=""

  // replaces is optional. It is the name of bundle that is replaced
  // by this entry. It does not have to be present in the entry list.
  replaces?: string & !=""

  // skips is optional. It is a list of bundle names that are skipped by
  // this entry. The skipped bundles do not have to be present in the
  // entry list.
  skips?: [...string & !=""]

  // skipRange is optional. It is the semver range of bundle versions
  // that are skipped by this entry.
  skipRange?: string & !=""
}

olm.bundle schema

olm.bundle schema

#Bundle: {
  schema: "olm.bundle"
  package: string & !=""
  name: string & !=""
  image: string & !=""
  properties: [...#Property]
  relatedImages?: [...#RelatedImage]
}

#Property: {
  // type is required
  type: string & !=""

  // value is required, and it must not be null
  value: !=null
}

#RelatedImage: {
  // image is the image reference
  image: string & !=""

  // name is an optional descriptive name for an image that
  // helps identify its purpose in the context of the bundle
  name?: string & !=""
}

Properties

Properties are arbitrary pieces of metadata that can be attached to file-based catalog schemas. The type field is a string that effectively specifies the semantic and syntactic meaning of the value field. The value can be any arbitrary JSON or YAML.

OLM defines a handful of property types, again using the reserved olm.* prefix.

olm.package property

The olm.package property defines the package name and version. This is a required property on bundles, and there must be exactly one of these properties. The packageName field must match the bundle’s first-class package field, and the version field must be a valid semantic version.

olm.package property

#PropertyPackage: {
  type: "olm.package"
  value: {
    packageName: string & !=""
    version: string & !=""
  }
}

olm.gvk property

The olm.gvk property defines the group/version/kind (GVK) of a Kubernetes API that is provided by this bundle. This property is used by OLM to resolve a bundle with this property as a dependency for other bundles that list the same GVK as a required API. The GVK must adhere to Kubernetes GVK validations.

olm.gvk property

#PropertyGVK: {
  type: "olm.gvk"
  value: {
    group: string & !=""
    version: string & !=""
    kind: string & !=""
  }
}

olm.package.required

The olm.package.required property defines the package name and version range of another package that this bundle requires. For every required package property a bundle lists, OLM ensures there is an Operator installed on the cluster for the listed package and in the required version range. The versionRange field must be a valid semantic version (semver) range.

olm.package.required property

#PropertyPackageRequired: {
  type: "olm.package.required"
  value: {
    packageName: string & !=""
    versionRange: string & !=""
  }
}

olm.gvk.required

The olm.gvk.required property defines the group/version/kind (GVK) of a Kubernetes API that this bundle requires. For every required GVK property a bundle lists, OLM ensures there is an Operator installed on the cluster that provides it. The GVK must adhere to Kubernetes GVK validations.

olm.gvk.required property

#PropertyGVKRequired: {
  type: "olm.gvk.required"
  value: {
    group: string & !=""
    version: string & !=""
    kind: string & !=""
  }
}

Example catalog

With file-based catalogs, catalog maintainers can focus on Operator curation and compatibility. Because Operator authors have already produced Operator-specific catalogs for their Operators, catalog maintainers can build their catalog by rendering each Operator catalog into a subdirectory of the catalog’s root directory.

There are many possible ways to build a file-based catalog; the following steps outline a simple approach:

Maintain a single configuration file for the catalog, containing image references for each Operator in the catalog:

Example catalog configuration file

name: community-operators
repo: quay.io/community-operators/catalog
tag: latest
references:
- name: etcd-operator
  image: quay.io/etcd-operator/index@sha256:5891b5b522d5df086d0ff0b110fbd9d21bb4fc7163af34d08286a2e846f6be03
- name: prometheus-operator
  image: quay.io/prometheus-operator/index@sha256:e258d248fda94c63753607f7c4494ee0fcbe92f1a76bfdac795c9d84101eb317

Run a script that parses the configuration file and creates a new catalog from its references:

Example script

name=$(yq eval '.name' catalog.yaml)
mkdir "$name"
yq eval '.name + "/" + .references[].name' catalog.yaml | xargs mkdir
for l in $(yq e '.name as $catalog | .references[] | .image + "|" + $catalog + "/" + .name + "/index.yaml"' catalog.yaml); do
  image=$(echo $l | cut -d'|' -f1)
  file=$(echo $l | cut -d'|' -f2)
  opm render "$image" > "$file"
done
opm alpha generate dockerfile "$name"
indexImage=$(yq eval '.repo + ":" + .tag' catalog.yaml)
docker build -t "$indexImage" -f "$name.Dockerfile" .
docker push "$indexImage"

Guidelines

Consider the following guidelines when maintaining file-based catalogs.

Immutable bundles

The general advice with Operator Lifecycle Manager (OLM) is that bundle images and their metadata should be treated as immutable.

If a broken bundle has been pushed to a catalog, you must assume that at least one of your users has upgraded to that bundle. Based on that assumption, you must release another bundle with an upgrade edge from the broken bundle to ensure users with the broken bundle installed receive an upgrade. OLM will not reinstall an installed bundle if the contents of that bundle are updated in the catalog.

However, there are some cases where a change in the catalog metadata is preferred:

Channel promotion: If you already released a bundle and later decide that you would like to add it to another channel, you can add an entry for your bundle in another olm.channel blob.
New upgrade edges: If you release a new 1.2.z bundle version, for example 1.2.4, but 1.3.0 is already released, you can update the catalog metadata for 1.3.0 to skip 1.2.4.

Source control

Catalog metadata should be stored in source control and treated as the source of truth. Updates to catalog images should include the following steps:

Update the source-controlled catalog directory with a new commit.
Build and push the catalog image. Use a consistent tagging taxonomy, such as :latest or :<target_cluster_version>, so that users can receive updates to a catalog as they become available.

CLI usage

For instructions about creating file-based catalogs by using the opm CLI, see Managing custom catalogs.

For reference documentation about the opm CLI commands related to managing file-based catalogs, see CLI tools.

Automation

Operator authors and catalog maintainers are encouraged to automate their catalog maintenance with CI/CD workflows. Catalog maintainers can further improve on this by building GitOps automation to accomplish the following tasks:

Check that pull request (PR) authors are permitted to make the requested changes, for example by updating their package’s image reference.
Check that the catalog updates pass the opm validate command.
Check that the updated bundle or catalog image references exist, the catalog images run successfully in a cluster, and Operators from that package can be successfully installed.
Automatically merge PRs that pass the previous checks.
Automatically rebuild and republish the catalog image.

Operator Framework packaging format