A router can be assigned to a node to control traffic in an OKD cluster. OKD uses HAProxy as the default router, but other options are available.
The HAProxy template router implementation is the reference implementation for a template router plug-in. It uses the openshift/origin-haproxy-router repository to run an HAProxy instance alongside the template router plug-in.
The template router has two components:
A wrapper that watches endpoints and routes and causes an HAProxy reload based on changes
A controller that builds the HAProxy configuration file based on routes and endpoints
The HAProxy router uses HAProxy version 1.8.1.
The controller and HAProxy are housed inside a pod, which is managed by a deployment configuration. The process of setting up the router is automated by the oc adm router command.
The controller watches the routes and endpoints for changes, as well as HAProxy’s health. When a change is detected, it builds a new haproxy-config file and restarts HAProxy. The haproxy-config file is constructed based on the router’s template file and information from OKD.
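The build-and-compare cycle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the template text, the `render_config` and `needs_reload` helpers, and the data shapes are hypothetical and far simpler than the router's real template file.

```python
from string import Template

# Hypothetical, heavily simplified stand-in for the router's template file.
BACKEND_TEMPLATE = Template(
    "# route ${name} for host ${host}\n"
    "backend be_http_${name}\n"
    "  mode http\n"
    "${servers}"
)


def render_config(routes, endpoints):
    """Build a configuration string from routes and their endpoints.

    routes: dict mapping route name -> host name
    endpoints: dict mapping route name -> list of "ip:port" strings
    """
    sections = []
    for name in sorted(routes):
        servers = "".join(
            f"  server srv{i} {addr} check\n"
            for i, addr in enumerate(sorted(endpoints.get(name, [])))
        )
        sections.append(
            BACKEND_TEMPLATE.substitute(name=name, host=routes[name], servers=servers)
        )
    return "\n".join(sections)


def needs_reload(old_config, routes, endpoints):
    """Return the freshly rendered config and whether it differs from the old one."""
    new_config = render_config(routes, endpoints)
    return new_config, new_config != old_config
```

Rendering the routes in sorted order keeps the output stable, so an unchanged set of routes and endpoints produces an identical file and no reload is signalled.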
The HAProxy template file can be customized as needed to support features that are not currently supported by OKD. The HAProxy manual describes all of the features supported by HAProxy.
The following diagram illustrates how data flows from the master through the plug-in and finally into an HAProxy configuration:
HAProxy Template router Metrics
The HAProxy router exposes or publishes metrics in Prometheus format for consumption by external metrics collection and aggregation systems (e.g. Prometheus, statsd). The router can be configured to provide HAProxy CSV format metrics, or provide no router metrics at all.
The metrics are collected from both the router controller and from HAProxy every five seconds. The router metrics counters start at zero when the router is deployed and increase over time. The HAProxy metrics counters are reset to zero every time HAProxy is reloaded. The router collects HAProxy statistics for each front end, back end, and server. To reduce resource usage when there are more than 500 servers, the back ends are reported instead of the servers because a back end can have multiple servers.
The statistics are a subset of the available HAProxy statistics.
The following HAProxy metrics are collected on a periodic basis and converted to Prometheus format. For every front end, the "F" counters are collected. When the number of servers is within the server threshold, counters are collected for each back end and the "S" server counters are collected for each server. Otherwise, the "B" counters are collected for each back end and no server counters are collected.
See router environment variables for more information.
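The server threshold decision can be illustrated with a short Python sketch. The function name and data shapes are hypothetical; only the limit of 500 servers comes from the text above, and the exact comparison the router uses is an assumption.

```python
SERVER_THRESHOLD = 500  # limit stated in the text above


def select_report_targets(backends, threshold=SERVER_THRESHOLD):
    """Decide which objects get statistics rows.

    backends: dict mapping back end name -> list of server names
    Returns (report_servers, rows), where rows is a list of (kind, name) tuples.
    """
    total_servers = sum(len(servers) for servers in backends.values())
    # Assumption: servers are reported while the total does not exceed the threshold.
    report_servers = total_servers <= threshold
    rows = []
    for backend, servers in sorted(backends.items()):
        rows.append(("backend", backend))
        if report_servers:
            rows.extend(("server", f"{backend}/{server}") for server in servers)
    return report_servers, rows
```

With two back ends of three servers each, every server gets its own row. With 501 servers in total, only the back end rows remain.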
In the following table:
Column 1 - Index from HAProxy CSV statistics
Column 2 - Usage, one or more of:
  F - Front end metrics
  b - Back end metrics when not showing Server metrics due to the Server Threshold
  B - Back end metrics when showing Server metrics
  S - Server metrics
Column 3 - The counter
Column 4 - Counter description
| Index | Usage | Counter | Description |
|-------|-------|---------|-------------|
| 2 | bBS | current_queue | Current number of queued requests not assigned to any server. |
| 4 | FbS | current_sessions | Current number of active sessions. |
| 5 | FbS | max_sessions | Maximum observed number of active sessions. |
| 7 | FbBS | connections_total | Total number of connections. |
| 8 | FbS | bytes_in_total | Current total of incoming bytes. |
| 9 | FbS | bytes_out_total | Current total of outgoing bytes. |
| 13 | bS | connection_errors_total | Total of connection errors. |
| 14 | bS | response_errors_total | Total of response errors. |
| 17 | bBS | up | Current health status of the back end (1 = UP, 0 = DOWN). |
| 21 | S | check_failures_total | Total number of failed health checks. |
| 24 | S | downtime_seconds_total | Total downtime in seconds. |
| 33 | FbS | current_session_rate | Current number of sessions per second over last elapsed second. |
| 35 | FbS | max_session_rate | Maximum observed number of sessions per second. |
| 40 | FbS | http_responses_total | Total of HTTP responses, code 2xx |
| 43 | FbS | http_responses_total | Total of HTTP responses, code 5xx |
| 60 | bS | http_average_response_latency_milliseconds | Average response latency of the last 1024 requests in milliseconds. |
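To show how the index column maps onto HAProxy CSV statistics, here is a minimal Python sketch that reads CSV text and picks out a subset of the counters by position. The `parse_stats` helper and the sample data are hypothetical, and the sketch assumes that the first two CSV fields hold the proxy name and the server name and that the header line starts with `#`.

```python
import csv
import io

# Index -> counter name, a numeric subset copied from the table above.
COUNTERS = {
    2: "current_queue",
    4: "current_sessions",
    5: "max_sessions",
    7: "connections_total",
    8: "bytes_in_total",
    9: "bytes_out_total",
}


def parse_stats(csv_text):
    """Return a list of (proxy, server, counter, value) tuples."""
    results = []
    for row in csv.reader(io.StringIO(csv_text)):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and the header line
        proxy, server = row[0], row[1]
        for index, name in COUNTERS.items():
            # Empty fields mean the counter does not apply to this row.
            if index < len(row) and row[index] != "":
                results.append((proxy, server, name, float(row[index])))
    return results
```

The health status counter at index 17 is left out on purpose: HAProxy reports it as text, so it would need its own conversion to the 1 or 0 value described in the table.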
The router controller scrapes the following items. These are only available with Prometheus format metrics.
| Name | Description |
|------|-------------|
| template_router_reload_seconds | Measures the time spent reloading the router in seconds. |
| template_router_write_config_seconds | Measures the time spent writing out the router configuration to disk in seconds. |
| haproxy_exporter_up | Was the last scrape of haproxy successful. |
| haproxy_exporter_csv_parse_failures | Number of errors while parsing CSV. |
| haproxy_exporter_scrape_interval | The time in seconds before another scrape is allowed, proportional to size of data. |
| haproxy_exporter_server_threshold | Number of servers tracked and the current threshold value. |
| haproxy_exporter_total_scrapes | Current total HAProxy scrapes. |
| http_request_duration_microseconds | The HTTP request latencies in microseconds. |
| http_request_size_bytes | The HTTP request sizes in bytes. |
| http_response_size_bytes | The HTTP response sizes in bytes. |
| openshift_build_info | A metric with a constant '1' value labeled by major, minor, git commit & git version from which OpenShift was built. |
| ssh_tunnel_open_count | Counter of SSH tunnel total open attempts |
| ssh_tunnel_open_fail_count | Counter of SSH tunnel failed open attempts |
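As a rough illustration of consuming these items, the following Python sketch filters Prometheus text-format output for selected metric names. The `parse_samples` helper and the sample values are hypothetical, and the sketch assumes that sample lines carry no trailing timestamp.

```python
def parse_samples(text, wanted_prefixes):
    """Return a dict mapping series name -> value for the wanted metrics.

    text: metrics in Prometheus text format
    wanted_prefixes: iterable of metric name prefixes to keep
    """
    prefixes = tuple(wanted_prefixes)
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # Assumption: the value is the last space-separated token on the line.
        series, _, value = line.rpartition(" ")
        if series.startswith(prefixes):
            samples[series] = float(value)
    return samples


# Hypothetical scrape output, for illustration only.
scrape = """\
# HELP haproxy_exporter_up Was the last scrape of haproxy successful.
haproxy_exporter_up 1
haproxy_exporter_total_scrapes 12
http_request_size_bytes_count 4
"""
print(parse_samples(scrape, ["haproxy_exporter_"]))
```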
The F5 BIG-IP router plug-in is one of the available router plug-ins.
The F5 router plug-in integrates with an existing F5 BIG-IP system in your environment. F5 BIG-IP version 11.4 or newer is required in order to have the F5 iControl REST API. The F5 router supports unsecured, edge terminated, re-encryption terminated, and passthrough terminated routes matching on HTTP vhost and request path.
The F5 router plug-in has feature parity with the HAProxy template router. The F5 router plug-in additionally supports:
path-based routing (using policy rules),
re-encryption (implemented using client and server SSL profiles), and
passthrough of encrypted connections (implemented using an iRule that parses the SNI protocol and uses a data group that is maintained by the F5 router for the servername lookup).
Passthrough routes are a special case: path-based routing is technically impossible with passthrough routes because F5 BIG-IP itself does not see the HTTP request, so it cannot examine the path. The same restriction applies to the template router; it is a technical limitation of passthrough encryption, not a technical limitation of OKD.
Because F5 BIG-IP is external to the OpenShift SDN, a cluster administrator must create a peer-to-peer tunnel between F5 BIG-IP and a host that is on the SDN, typically an OKD node host. This ramp node can be configured as unschedulable for pods so that it will not be doing anything except act as a gateway for the F5 BIG-IP host. You can also configure multiple such hosts and use the OKD ipfailover feature for redundancy; the F5 BIG-IP host would then need to be configured to use the ipfailover VIP for its tunnel’s remote endpoint.
The operation of the F5 router plug-in is similar to that of the OKD routing-daemon used in earlier versions. Both use REST API calls to:
create and delete pools,
add endpoints to and delete them from those pools, and
configure policy rules to route to pools based on vhost.
Both also use scp and ssh commands to upload custom TLS/SSL certificates to F5 BIG-IP.
The F5 router plug-in configures pools and policy rules on virtual servers as follows:
When a user creates or deletes a route on OKD, the router creates a pool to F5 BIG-IP for the route (if no pool already exists) and adds a rule to, or deletes a rule from, the policy of the appropriate vserver: the HTTP vserver for non-TLS routes, or the HTTPS vserver for edge or re-encrypt routes. In the case of edge and re-encrypt routes, the router also uploads and configures the TLS certificate and key. The router supports host- and path-based routes.
Passthrough routes are a special case: to support those, it is necessary to write an iRule that parses the SNI ClientHello handshake record and looks up the servername in an F5 data-group. The router creates this iRule, associates the iRule with the vserver, and updates the F5 data-group as passthrough routes are created and deleted. Other than this implementation detail, passthrough routes work the same way as other routes.
When a user creates a service on OKD, the router adds a pool to F5 BIG-IP (if no pool already exists). As endpoints on that service are created and deleted, the router adds and removes corresponding pool members.
When a user deletes the route and all endpoints associated with a particular pool, the router deletes that pool.
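The pool lifecycle described above can be modelled with a small in-memory Python sketch. The `PoolTracker` class and its method names are hypothetical; it makes no calls to F5 BIG-IP and only mirrors the bookkeeping: pools appear when a route or endpoint needs them and disappear once the route and all endpoints are gone.

```python
class PoolTracker:
    """In-memory model of pool bookkeeping; not the plug-in's actual code."""

    def __init__(self):
        self.members = {}  # pool name -> set of endpoint addresses
        self.routes = {}   # pool name -> set of route names (policy rules)

    def add_route(self, pool, route):
        self.members.setdefault(pool, set())  # create the pool if missing
        self.routes.setdefault(pool, set()).add(route)

    def delete_route(self, pool, route):
        self.routes.get(pool, set()).discard(route)
        self._cleanup(pool)

    def add_endpoint(self, pool, address):
        self.members.setdefault(pool, set()).add(address)

    def delete_endpoint(self, pool, address):
        self.members.get(pool, set()).discard(address)
        self._cleanup(pool)

    def _cleanup(self, pool):
        # Delete the pool only when no routes and no endpoints remain.
        if pool in self.members and not self.members[pool] and not self.routes.get(pool):
            del self.members[pool]
            self.routes.pop(pool, None)
```

Deleting a route while endpoints still exist leaves the pool in place; the pool is removed only after the last endpoint is deleted as well.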
With native integration of the F5 BIG-IP with OKD, you do not need to configure a ramp node for the F5 BIG-IP to be able to reach the pods on the overlay network as created by OpenShift SDN.
Also, only F5 BIG-IP appliance version 12.x and above works with the F5 router plug-in presented in this section. You also need the sdn-services add-on license for the integration to work properly. For version 11.x, set up a ramp node.
The F5 appliance can connect to the OKD cluster via an L3 connection. An L2 switch connectivity is not required between OKD nodes. On the appliance, you can use multiple interfaces to manage the integration:
Management interface - Reaches the web console of the F5 appliance.
External interface - Configures the virtual servers for inbound web traffic.
Internal interface - Programs the appliance and reaches out to the pods.
An F5 controller pod has admin access to the appliance. The F5 image is launched within the OKD cluster (scheduled on any node) and uses iControl REST APIs to program the virtual servers with policies and to configure the VxLAN device.
This section explains how the packets reach the pods, and vice versa. These actions are performed by the F5 router plug-in pod and the F5 appliance, not the user.
When natively integrated, the F5 appliance reaches out to the pods directly using VxLAN encapsulation. This integration works only when OKD is using openshift-sdn as the network plug-in. The openshift-sdn plug-in employs VxLAN encapsulation for the overlay network that it creates.
To make a successful data path between a pod and the F5 appliance:
F5 needs to encapsulate the VxLAN packet meant for the pods. This requires the sdn-services license add-on. A VxLAN device needs to be created and the pod overlay network needs to be routed through this device.
F5 needs to know the VTEP IP address of the pod, which is the IP address of the node where the pod is located.
F5 needs to know which source-ip to use for the overlay network when encapsulating the packets meant for the pods. This is known as the gateway address.
OKD nodes need to know where the F5 gateway address is (the VTEP address for the return traffic). This needs to be the internal interface’s address. All nodes of the cluster must learn this automatically.
Since the overlay network is multi-tenant aware, F5 must use a VxLAN ID that is representative of an admin domain, ensuring that all tenants are reachable by the F5. Ensure that F5 encapsulates all packets with a vnid of 0 (the default vnid for the admin namespace in OKD) by putting an annotation on the manually created hostsubnet - pod.network.openshift.io/fixed-vnid-host: 0.
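For illustration, a hostsubnet definition carrying this annotation could be assembled as follows. This is a sketch under assumptions: the `ghost_hostsubnet` helper, the object name, the IP address, the subnet, and the `apiVersion` value are hypothetical placeholders; only the annotation key and its value of 0 come from the text above.

```python
import json


def ghost_hostsubnet(name, host_ip, subnet):
    """Return a HostSubnet definition as a plain dict (illustrative only)."""
    return {
        "kind": "HostSubnet",
        "apiVersion": "v1",  # assumption; the API group differs between releases
        "metadata": {
            "name": name,
            "annotations": {
                # Annotation from the text above; annotation values are strings.
                "pod.network.openshift.io/fixed-vnid-host": "0",
            },
        },
        "host": name,
        "hostIP": host_ip,  # placeholder for the F5 internal interface address
        "subnet": subnet,   # placeholder overlay subnet
    }


# All values below are hypothetical placeholders.
manifest = ghost_hostsubnet("f5-ghost", "192.0.2.10", "10.131.0.0/23")
print(json.dumps(manifest, indent=2))
```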
A ghost hostsubnet is manually created as part of the setup, which fulfills the third and fourth listed requirements. When the F5 router plug-in pod is launched, this new ghost hostsubnet is provided so that the F5 appliance can be programmed suitably.
The term ghost |
The first requirement is fulfilled by the F5 router plug-in pod once it is launched.
The second requirement is also fulfilled by the F5 plug-in pod, but it is an ongoing process. For each new node that is added to the cluster, the controller pod creates an entry in the VxLAN device’s VTEP FDB. The controller pod needs access to the nodes resource in the cluster, which you can accomplish by giving the service account appropriate privileges. Use the following command:
$ oc adm policy add-cluster-role-to-user system:sdn-reader system:serviceaccount:default:router
These actions are performed by the F5 router plug-in pod and the F5 appliance, not the user.
1. The destination pod is identified by the F5 virtual server for a packet.
2. The VxLAN dynamic FDB is looked up with the pod’s IP address. If a MAC address is found, go to step 5.
3. Flood all entries in the VTEP FDB with ARP requests seeking the pod’s MAC address. An entry is made into the VxLAN dynamic FDB with the pod’s MAC address and the VTEP to be used as the value.
4. Encapsulate the IP packet with VxLAN headers, where the MAC of the pod and the VTEP of the node are given as values from the VxLAN dynamic FDB.
5. Calculate the VTEP’s MAC address by sending out an ARP or checking the host’s neighbor cache.
6. Deliver the packet through the F5 host’s internal address.
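The lookup, flood, and encapsulation steps above can be sketched as a small Python simulation. All names and data structures here are hypothetical stand-ins: dictionaries replace the appliance's forwarding tables, and a lookup table replaces real ARP traffic.

```python
def resolve_pod(pod_ip, dynamic_fdb, vtep_fdb, arp_responders):
    """Return (pod_mac, vtep_ip) for pod_ip, learning the entry if needed.

    dynamic_fdb: dict pod IP -> (pod MAC, VTEP IP), updated in place
    vtep_fdb: list of node VTEP IP addresses
    arp_responders: dict VTEP IP -> {pod IP: MAC}; stands in for the network
    """
    if pod_ip in dynamic_fdb:
        return dynamic_fdb[pod_ip]  # entry already learned, no flood needed
    for vtep in vtep_fdb:  # flood: ask every known VTEP for the pod's MAC
        mac = arp_responders.get(vtep, {}).get(pod_ip)
        if mac is not None:
            dynamic_fdb[pod_ip] = (mac, vtep)  # learn the entry
            return dynamic_fdb[pod_ip]
    return None  # no node answered for this pod


def encapsulate(payload, pod_ip, dynamic_fdb, vnid=0):
    """Wrap a payload in a dict that models the VxLAN headers."""
    mac, vtep = dynamic_fdb[pod_ip]
    return {"vnid": vnid, "outer_dst": vtep, "inner_dst_mac": mac, "payload": payload}
```

The default `vnid` of 0 follows the admin VNID described earlier in this section; the first packet to a pod triggers the flood, and later packets reuse the learned entry.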
These actions are performed by the F5 router plug-in pod and the F5 appliance, not the user.
The pod sends back a packet with the destination as the F5 host’s VxLAN gateway address.
The openvswitch at the node determines that the VTEP for this packet is the F5 host’s internal interface address. This is learned from the ghost hostsubnet creation.
A VxLAN packet is sent out to the internal interface of the F5 host.
During the entire data flow, the VNID is pre-fixed to be |