Following an OpenShift Enterprise upgrade, it may be desirable in extreme cases to downgrade your cluster to a previous version. The following sections outline the required steps for each system in a cluster to perform such a downgrade, currently supported for the OpenShift Enterprise 3.1 to 3.0 downgrade path.
The Ansible playbook used during the upgrade process should have created a backup of the master-config.yaml file and the etcd data directory. Ensure these exist on your masters and etcd members:
/etc/openshift/master/master-config.yaml.<timestamp>
/var/lib/openshift/etcd-backup-<timestamp>
If you use a separate etcd cluster instead of a single embedded etcd instance, the backup is likely created on all etcd members, though only one is required for the recovery process. You can run a separate etcd instance that is co-located with your master nodes.
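The existence check can be scripted. The following is a minimal sketch, not part of the upgrade playbook: `has_backup` and `check_upgrade_backups` are hypothetical helper names, and the two default prefixes are the locations listed above with the timestamp left off.

```shell
# Hypothetical helper: succeed if at least one path starts with the given prefix.
has_backup() {
    for candidate in "$1"*; do
        # An unmatched glob stays literal, so test that the path really exists.
        if [ -e "$candidate" ]; then
            return 0
        fi
    done
    return 1
}

# Hypothetical helper: report which of the two expected backups is missing.
# Both prefixes can be overridden, which is useful for a dry run.
check_upgrade_backups() {
    config_prefix="${1:-/etc/openshift/master/master-config.yaml.}"
    etcd_prefix="${2:-/var/lib/openshift/etcd-backup-}"
    rc=0
    if ! has_backup "$config_prefix"; then
        echo "missing: ${config_prefix}<timestamp>"
        rc=1
    fi
    if ! has_backup "$etcd_prefix"; then
        echo "missing: ${etcd_prefix}<timestamp>"
        rc=1
    fi
    return $rc
}
```

Running `check_upgrade_backups` with no arguments on a master checks the default locations; a non-zero exit status means at least one backup was not found.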
The RPM downgrade process in a later step should create .rpmsave backups of the following files, but it may be a good idea to keep a separate copy regardless:
/etc/sysconfig/openshift-master
/etc/etcd/etcd.conf (1)

(1) Only required if using a separate etcd cluster.
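One way to keep those separate copies is sketched below. The helper name `save_copies` and the destination directory in the usage comment are assumptions chosen for illustration; any location outside the package-managed paths would do.

```shell
# Hypothetical helper: copy each existing file into a destination directory,
# preserving mode, ownership, and timestamps. Missing files are skipped with
# a warning so the same call works on hosts without external etcd.
save_copies() {
    dest="$1"
    shift
    mkdir -p "$dest" || return 1
    for f in "$@"; do
        if [ -f "$f" ]; then
            cp -p "$f" "$dest/" || return 1
        else
            echo "skipping missing file: $f" >&2
        fi
    done
    return 0
}

# Example call (the destination path is hypothetical):
# save_copies /root/pre-downgrade-copies \
#     /etc/sysconfig/openshift-master /etc/etcd/etcd.conf
```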
On all masters, nodes, and etcd members (if using an external etcd cluster), ensure the relevant services are stopped:
# systemctl stop atomic-openshift-master
# systemctl stop atomic-openshift-node
# systemctl stop etcd (1)

(1) Only required if using external etcd.
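To confirm the services really are stopped before removing packages, something like the following could be used. It is a sketch only: `services_stopped` is a hypothetical helper name, and it relies on `systemctl is-active --quiet`, which exits with status 0 only when the unit is active.

```shell
# Hypothetical helper: print every listed unit that is still active and
# return non-zero if any of them is.
services_stopped() {
    rc=0
    for unit in "$@"; do
        if systemctl is-active --quiet "$unit"; then
            echo "still running: $unit"
            rc=1
        fi
    done
    return $rc
}

# Example call (include etcd only when using external etcd):
# services_stopped atomic-openshift-master atomic-openshift-node etcd
```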
On all masters, nodes, and etcd members (if using an external etcd cluster), remove the following packages:
# yum remove atomic-openshift \
    atomic-openshift-clients \
    atomic-openshift-node \
    atomic-openshift-master \
    openvswitch \
    atomic-openshift-sdn-ovs \
    tuned-profiles-atomic-openshift-node
If you are using external etcd, also remove the etcd package:
# yum remove etcd
For embedded etcd, you can leave the etcd package installed, as the package is only required so that the etcdctl command is available to issue operations in later steps.
Disable the OpenShift Enterprise 3.1 repositories, and re-enable the 3.0 repositories:
# subscription-manager repos \
    --disable=rhel-7-server-ose-3.1-rpms \
    --enable=rhel-7-server-ose-3.0-rpms
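The repository switch can be double-checked before installing anything. The sketch below assumes the listing printed by `subscription-manager repos --list-enabled` contains one `Repo ID:` line per enabled repository; `repo_enabled` is a hypothetical helper that reads that listing on standard input.

```shell
# Hypothetical helper: succeed if the listing on stdin contains a
# "Repo ID:" line whose value is exactly the given repository ID.
repo_enabled() {
    awk -v id="$1" '
        $1 == "Repo" && $2 == "ID:" && $3 == id { found = 1 }
        END { exit !found }
    '
}

# Example calls:
# subscription-manager repos --list-enabled | repo_enabled rhel-7-server-ose-3.0-rpms
# subscription-manager repos --list-enabled | repo_enabled rhel-7-server-ose-3.1-rpms
```

After the switch, the first example call should succeed and the second should fail.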
On each master, install the following packages:
# yum install openshift \
    openshift-master \
    openshift-node \
    openshift-sdn-ovs
On each node, install the following packages:
# yum install openshift \
    openshift-node \
    openshift-sdn-ovs
If using a separate etcd cluster, install the following package on each etcd member:
# yum install etcd
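A quick way to confirm that the 3.0 packages landed on a host is to query the RPM database. This is a sketch under the assumption that `rpm -q <name>` exits non-zero for a package that is not installed; `missing_packages` is a hypothetical helper name.

```shell
# Hypothetical helper: print the name of every listed package that
# `rpm -q` does not report as installed.
missing_packages() {
    for pkg in "$@"; do
        rpm -q "$pkg" >/dev/null 2>&1 || echo "$pkg"
    done
}

# Example call on a master:
# missing_packages openshift openshift-master openshift-node openshift-sdn-ovs
```

Empty output means every package in the list is installed.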
Whether using embedded or external etcd, you must first restore the etcd backup by creating a new, single node etcd cluster. If using external etcd with multiple members, you must then also add any additional etcd members to the cluster one by one.
However, the details of the restoration process differ between embedded and external etcd. See the following section that matches your etcd configuration and follow the relevant steps before continuing to Bringing OpenShift Services Back Online.
Restore your etcd backup and configuration:
Run the following on the master with the embedded etcd:
# ETCD_DIR=/var/lib/openshift/openshift.local.etcd
# mv $ETCD_DIR /var/lib/etcd.orig
# cp -Rp /var/lib/openshift/etcd-backup-<timestamp>/ $ETCD_DIR
# chcon -R --reference /var/lib/etcd.orig/ $ETCD_DIR
# chown -R etcd:etcd $ETCD_DIR
Create the new, single node etcd cluster:
# etcd -data-dir=/var/lib/openshift/openshift.local.etcd \
    -force-new-cluster
Verify etcd has started successfully by checking the output from the above command, which should look similar to the following at the end:
[...]
2016/01/8 13:24:21 etcdserver: starting server... [version: 2.1.1, cluster version: 2.1.0]
2016/01/8 13:24:22 raft: 5168c093630001 is starting a new election at term 13
2016/01/8 13:24:22 raft: 5168c093630001 became candidate at term 14
2016/01/8 13:24:22 raft: 5168c093630001 received vote from 5168c093630001 at term 14
2016/01/8 13:24:22 raft: 5168c093630001 became leader at term 14
2016/01/8 13:24:22 raft: raft.node: 5168c093630001 elected leader 5168c093630001 at term 14
2016/01/8 13:24:22 etcdserver: published {Name:default ClientURLs:[http://localhost:2379 http://localhost:4001]} to cluster 5168c093630002
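If the output was saved to a file, the same check can be done with `grep`. The sketch below looks only for the two phrases visible in the sample output above (`became leader at term` and `etcdserver: published`); `etcd_log_shows_startup` is a hypothetical helper name, and the log path in the usage comment is an assumption.

```shell
# Hypothetical helper: read etcd output on stdin and succeed only if it
# shows both a leader election and the "published" line.
etcd_log_shows_startup() {
    log=$(cat)
    printf '%s\n' "$log" | grep -q 'became leader at term' &&
        printf '%s\n' "$log" | grep -q 'etcdserver: published'
}

# Example call (the file name is hypothetical):
# etcd_log_shows_startup < /tmp/etcd-restore.log
```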
Shut down the process by running the following from a separate terminal:
# pkill etcd
Continue to Bringing OpenShift Services Back Online.
Choose a system to be the initial etcd member, and restore its etcd backup and configuration:
Run the following on the etcd host:
# ETCD_DIR=/var/lib/etcd/
# mv $ETCD_DIR /var/lib/etcd.orig
# cp -Rp /var/lib/openshift/etcd-backup-<timestamp>/ $ETCD_DIR
# chcon -R --reference /var/lib/etcd.orig/ $ETCD_DIR
# chown -R etcd:etcd $ETCD_DIR
Restore your /etc/etcd/etcd.conf file from backup or .rpmsave.
Create the new single node cluster using etcd’s --force-new-cluster option. You can do this with a long, complex command using the values from /etc/etcd/etcd.conf, or you can temporarily modify the systemd unit file and start the service normally.
To do so, edit the /usr/lib/systemd/system/etcd.service file and add --force-new-cluster:
# sed -i '/ExecStart/s/"$/ --force-new-cluster"/' /usr/lib/systemd/system/etcd.service
# cat /usr/lib/systemd/system/etcd.service | grep ExecStart
ExecStart=/bin/bash -c "GOMAXPROCS=$(nproc) /usr/bin/etcd --force-new-cluster"
Then reload the systemd configuration and start the etcd service:
# systemctl daemon-reload
# systemctl start etcd
Verify the etcd service started correctly, then re-edit the /usr/lib/systemd/system/etcd.service file and remove the --force-new-cluster option:
# sed -i '/ExecStart/s/ --force-new-cluster//' /usr/lib/systemd/system/etcd.service
# cat /usr/lib/systemd/system/etcd.service | grep ExecStart
ExecStart=/bin/bash -c "GOMAXPROCS=$(nproc) /usr/bin/etcd"
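Because these edits touch a packaged unit file in place, it can be reassuring to preview them first. The sketch below wraps the same two `sed` expressions without `-i`, so the result goes to standard output and the file is left alone; the function names are hypothetical.

```shell
# Hypothetical helper: show the unit file with --force-new-cluster appended
# inside the closing quote of the ExecStart line. Nothing is written back.
preview_add_force_new_cluster() {
    sed '/ExecStart/s/"$/ --force-new-cluster"/' "$1"
}

# Hypothetical helper: show the unit file with the option removed again.
preview_remove_force_new_cluster() {
    sed '/ExecStart/s/ --force-new-cluster//' "$1"
}

# Example call:
# preview_add_force_new_cluster /usr/lib/systemd/system/etcd.service | grep ExecStart
```

The first expression only matches an ExecStart line that ends in a double quote, so a unit file with a different ExecStart layout would pass through unchanged; the preview makes that visible before any change is made.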
Restart the etcd service, then verify the etcd cluster is running correctly and displays OpenShift’s configuration:
# systemctl daemon-reload
# systemctl restart etcd
# etcdctl --cert-file=/etc/etcd/peer.crt \
    --key-file=/etc/etcd/peer.key \
    --ca-file=/etc/etcd/ca.crt \
    --peers="https://172.16.4.18:2379,https://172.16.4.27:2379" \
    ls /
If you have additional etcd members to add to your cluster, continue to Adding Additional etcd Members. Otherwise, if you only want a single node external etcd, continue to Bringing OpenShift Services Back Online.
To add additional etcd members to the cluster, you must first adjust the default localhost peerURLs for the first member:
Get the member ID for the first member using the member list command:
# etcdctl --cert-file=/etc/etcd/peer.crt \
    --key-file=/etc/etcd/peer.key \
    --ca-file=/etc/etcd/ca.crt \
    --peers="https://172.18.1.18:2379,https://172.18.9.202:2379,https://172.18.0.75:2379" \
    member list
Update the peerURLs. In etcd 2.2 and beyond, this can be done with the etcdctl member update command. However, OpenShift Enterprise 3.1 uses etcd 2.1, so you must use curl:
# curl --cacert /etc/etcd/ca.crt \
    --cert /etc/etcd/peer.crt \
    --key /etc/etcd/peer.key \
    https://172.18.1.18:2379/v2/members/511b7fb6cc0001 \
    -XPUT -H "Content-Type: application/json" \
    -d '{"peerURLs":["https://172.18.1.18:2380"]}'
Re-run the member list command and ensure the peerURLs no longer point to localhost.
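That last check can be automated by piping the `member list` output through a filter. The sketch below assumes the one-line-per-member format with a `peerURLs=` field, as etcdctl prints it; `peer_urls_use_localhost` is a hypothetical helper name.

```shell
# Hypothetical helper: read `etcdctl member list` output on stdin and succeed
# if any member still advertises localhost in its peerURLs field.
# [^ ]* stops at the first space, so a localhost clientURLs entry is ignored.
peer_urls_use_localhost() {
    grep -Eq 'peerURLs=[^ ]*localhost'
}

# Example call, with the etcdctl options from the member list command above:
# etcdctl ... member list | peer_urls_use_localhost && echo "peerURLs still on localhost"
```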
Now add each additional member to the cluster. Each member must be fully added and brought online before the next one is added.
For each member, add it to the cluster using the values that can be found in that system’s etcd.conf file:
# etcdctl --cert-file=/etc/etcd/peer.crt \
    --key-file=/etc/etcd/peer.key \
    --ca-file=/etc/etcd/ca.crt \
    --peers="https://172.16.4.18:2379,https://172.16.4.27:2379" \
    member add 10.3.9.222 https://172.16.4.27:2380
Added member named 10.3.9.222 with ID 4e1db163a21d7651 to cluster

ETCD_NAME="10.3.9.222"
ETCD_INITIAL_CLUSTER="10.3.9.221=https://172.16.4.18:2380,10.3.9.222=https://172.16.4.27:2380"
ETCD_INITIAL_CLUSTER_STATE="existing"
Using the environment variables provided in the output of the above etcdctl member add command, edit the /etc/etcd/etcd.conf file on the member system itself and ensure these settings match.
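Comparing the file against the printed values by eye is error-prone, so a small check can help. This is a sketch: `conf_value` and `conf_matches` are hypothetical helpers, and they assume each setting sits on its own `KEY=value` line, optionally wrapped in double quotes. The keys are the ones etcd itself prints: ETCD_NAME, ETCD_INITIAL_CLUSTER, and ETCD_INITIAL_CLUSTER_STATE.

```shell
# Hypothetical helper: print the value assigned to key $2 in file $1,
# taking the last assignment if the key appears more than once and
# stripping double quotes.
conf_value() {
    sed -n "s/^$2=//p" "$1" | tail -n 1 | tr -d '"'
}

# Hypothetical helper: succeed if key $2 in file $1 has exactly value $3.
conf_matches() {
    [ "$(conf_value "$1" "$2")" = "$3" ]
}

# Example call, using the values from the member add output:
# conf_matches /etc/etcd/etcd.conf ETCD_INITIAL_CLUSTER_STATE existing
```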
Now start etcd on the new member:
# rm -rf /var/lib/etcd/member
# systemctl enable etcd
# systemctl start etcd
Ensure the service starts correctly and the etcd cluster is now healthy:
# etcdctl --cert-file=/etc/etcd/peer.crt \
    --key-file=/etc/etcd/peer.key \
    --ca-file=/etc/etcd/ca.crt \
    --peers="https://172.16.4.18:2379,https://172.16.4.27:2379" \
    member list
51251b34b80001: name=10.3.9.221 peerURLs=https://172.16.4.18:2380 clientURLs=https://172.16.4.18:2379
d266df286a41a8a4: name=10.3.9.222 peerURLs=https://172.16.4.27:2380 clientURLs=https://172.16.4.27:2379
# etcdctl --cert-file=/etc/etcd/peer.crt \
    --key-file=/etc/etcd/peer.key \
    --ca-file=/etc/etcd/ca.crt \
    --peers="https://172.16.4.18:2379,https://172.16.4.27:2379" \
    cluster-health
cluster is healthy
member 51251b34b80001 is healthy
member d266df286a41a8a4 is healthy
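When adding several members in a row, the health check can be turned into a gate that must pass before moving on. The sketch below matches only the two line shapes shown in the sample output above (`cluster is healthy` and `member <id> is healthy`); `cluster_health_ok` is a hypothetical helper name.

```shell
# Hypothetical helper: read `etcdctl cluster-health` output on stdin and
# succeed only if the cluster is reported healthy and exactly $1 members
# are reported healthy.
cluster_health_ok() {
    expected="$1"
    report=$(cat)
    printf '%s\n' "$report" | grep -qx 'cluster is healthy' || return 1
    healthy=$(printf '%s\n' "$report" | grep -c '^member .* is healthy$')
    [ "$healthy" -eq "$expected" ]
}

# Example call after adding the second member, with the etcdctl options
# from the cluster-health command above:
# etcdctl ... cluster-health | cluster_health_ok 2
```

Passing the expected member count catches the case where the cluster reports healthy but the new member has not actually joined.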
Now repeat this process for the next member to add to the cluster.
After all additional etcd members have been added, continue to Bringing OpenShift Services Back Online.
On each OpenShift master, restore your openshift-master configuration from backup and restart relevant services:
# cp /etc/sysconfig/openshift-master.rpmsave /etc/sysconfig/openshift-master
# cp /etc/openshift/master/master-config.yaml.2015-11-20\@08\:36\:51~ /etc/openshift/master/master-config.yaml
# systemctl enable openshift-master
# systemctl enable openshift-node
# systemctl start openshift-master
# systemctl start openshift-node
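The timestamp in the master-config.yaml backup name differs on every system, and a master that was upgraded more than once may hold several backups. The sketch below picks the newest one, assuming the timestamp suffix sorts chronologically as in the example file name above; `latest_config_backup` is a hypothetical helper name.

```shell
# Hypothetical helper: print the last regular file, in glob sort order,
# whose name starts with the given prefix. Returns non-zero if none exists.
latest_config_backup() {
    prefix="${1:-/etc/openshift/master/master-config.yaml.}"
    latest=""
    for candidate in "$prefix"*; do
        if [ -f "$candidate" ]; then
            latest="$candidate"
        fi
    done
    [ -n "$latest" ] && printf '%s\n' "$latest"
}

# Example call:
# cp "$(latest_config_backup)" /etc/openshift/master/master-config.yaml
```

Check the printed path before copying, since the newest backup is not necessarily the one taken by the upgrade being rolled back.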
On each OpenShift node, enable and restart the openshift-node service:
# systemctl enable openshift-node
# systemctl start openshift-node
Your OpenShift cluster should now be back online.