OpenShift 3.6 Installation
External
Internal
Overview
This document covers the OpenShift Container Platform 3.6 advanced installation procedure. The procedure is largely based on https://docs.openshift.com/container-platform/3.6/install_config/index.html and adds details observed during an actual installation on KVM guests.
Location
noper430:/root/environments/ocp36
RackStation:/base/NovaOrdis/Archive/yyyy.mm.dd-ocp36
Diagram
Hardware Requirements
System and Environment Prerequisites
The prerequisites and all preparatory steps presented in the following links are also addressed in this procedure. The links are provided for reference only:
Virtualization Host Preparation
Guest Template Preparation
Create VM templates, in this order:
Cloud Provider-Specific Configuration
- https://docs.openshift.com/container-platform/3.6/install_config/configuring_aws.html#install-config-configuring-aws
- https://docs.openshift.com/container-platform/3.6/install_config/configuring_openstack.html#install-config-configuring-openstack
- https://docs.openshift.com/container-platform/3.6/install_config/configuring_gce.html#install-config-configuring-gce
Guest Configuration
External DNS Setup
An external DNS server is required. Controlling a registrar (such as http://godaddy.com) DNS zone provides all capabilities needed to set up external DNS address resolution. A valid alternative, though more complex, is to install a dedicated public DNS server. The DNS server must be available to all OpenShift environment nodes, and also to external clients that need to resolve public names such as the master public web console and API URL, the public wildcard name of the environment, the application router public DNS name, etc. A combination of a public DNS server that resolves public addresses and an internal DNS server that resolves internal addresses, deployed as part of the environment (usually on the support node), is also a workable solution. This installation procedure includes installation of the bind DNS server on the support node - see bind DNS server installation section below.
Wildcard Domain for Application Traffic
The DNS server that resolves the public addresses must support wildcard sub-domains and must resolve the public wildcard DNS entry to the public IP address of the node that executes the default router, or of the public ingress node that proxies to the default router. If the environment has multiple routers, an external load balancer is required, and the wildcard record must contain the public IP address of the host that runs the load balancer. The name of the wildcard domain is specified later, during the advanced installation procedure, in the Ansible inventory file as openshift_master_default_subdomain.
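One way to confirm the wildcard record before installation is to resolve a few arbitrary names under it. The following is a minimal sketch, assuming nslookup is installed and that the wildcard domain is apps.openshift.novaordis.io (the domain used by the pre-flight check later in this procedure). The function names are hypothetical helpers, not part of any tool.

```shell
#!/usr/bin/env bash
# Assumption: wildcard domain taken from the pre-flight nslookup check.
WILDCARD_DOMAIN="${WILDCARD_DOMAIN:-apps.openshift.novaordis.io}"

# Builds an arbitrary host name under the wildcard domain.
wildcard_probe_name() {
    echo "probe-${1:-$$}.${WILDCARD_DOMAIN}"
}

# Resolves three arbitrary names; inspect the output by eye: every
# answer should carry the same public IP address.
check_wildcard() {
    local i
    for i in 1 2 3; do
        nslookup "$(wildcard_probe_name "${i}")"
    done
}
```

Running check_wildcard from a node and from an external client exercises both resolution paths described above.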
Configuration procedures:
OpenShift Advanced Installation
The Support Node
Execute the installation as "ansible" from the support node.
As "root" on "support":
cd /usr/share/ansible
chgrp -R ansible /usr/share/ansible
chmod -R g+w /usr/share/ansible
chgrp ansible /etc/ansible/
chmod g+w /etc/ansible/
The support node needs at least 1 GB of RAM to run the installation process.
Create an ansible key pair on "support" and disseminate it to all other nodes. As "ansible":
cd
ssh-keygen -q -b 2048 -f ~/.ssh/id_rsa -t rsa
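The dissemination step is not spelled out above. A minimal sketch follows, assuming ssh-copy-id is available on "support" and that password login for "ansible" still works on each target. The host list is an assumption based on this environment's node names, and disseminate_key is a hypothetical helper. With DRY_RUN=1 (the default) the commands are only printed.

```shell
#!/usr/bin/env bash
# Assumption: hosts of this environment; adjust to match the inventory.
HOSTS="master.ocp36.local infranode.ocp36.local node1.ocp36.local node2.ocp36.local api-lb.ocp36.local"

# disseminate_key [public-key-file]
# Copies the public key to the "ansible" account on every host in HOSTS.
disseminate_key() {
    local key="${1:-$HOME/.ssh/id_rsa.pub}"
    local host
    for host in ${HOSTS}; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            # Dry run: show what would be executed.
            echo "ssh-copy-id -i ${key} ansible@${host}"
        else
            ssh-copy-id -i "${key}" "ansible@${host}"
        fi
    done
}
```

Run disseminate_key once to review the commands, then DRY_RUN=0 disseminate_key to execute them.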
Configure Ansible Inventory File
Pre-Flight
On the support node, as "root":
cd /tmp
rm -r tmp* yum*
rm /usr/share/ansible/openshift-ansible/playbooks/byo/config.retry
As "ansible":
cd /etc/ansible
ansible all -m ping
ansible nodes -m shell -a "systemctl status docker"
ansible nodes -m shell -a "docker version"
ansible nodes -m shell -a "pvs"
ansible nodes -m shell -a "vgs"
ansible nodes -m shell -a "lvs"
ansible nodes -m shell -a "nslookup something.apps.openshift.novaordis.io"
ansible nodes -m shell -a "mkdir /mnt/tmp"
ansible nodes -m shell -a "mount -t nfs support.ocp36.local:/nfs /mnt/tmp"
ansible nodes -m shell -a "hostname > /mnt/tmp/\$(hostname).txt"
ansible nodes -m shell -a "umount /mnt/tmp"
for i in /nfs/*.ocp36.local.txt; do echo $i; cat $i; done
Snapshot the Guest Images for the Entire Environment
On the virtualization host:
virsh vol-list main-storage-pool --details
./stop-ose36
virsh list --all
for i in \
  /main-storage-pool/ocp36.ingress.qcow2 \
  /main-storage-pool/ocp36.support.qcow2 \
  /main-storage-pool/ocp36.master.qcow2 \
  /main-storage-pool/ocp36.infranode.qcow2 \
  /main-storage-pool/ocp36.node1.qcow2 \
  /main-storage-pool/ocp36.node2.qcow2; \
do \
  echo "$i snapshots:"; \
  qemu-img snapshot -l $i; \
done
for i in \
  /main-storage-pool/ocp36.ingress.qcow2 \
  /main-storage-pool/ocp36.support.qcow2 \
  /main-storage-pool/ocp36.master.qcow2 \
  /main-storage-pool/ocp36.infranode.qcow2 \
  /main-storage-pool/ocp36.node1.qcow2 \
  /main-storage-pool/ocp36.node2.qcow2; \
do \
  echo "taking snapshot of $i"; \
  qemu-img snapshot -c before_ansible_installation $i; \
done
./start-ose36
Reverting the Environment to a Snapshot State
./stop-ose36
for i in \
  /main-storage-pool/ocp36.ingress.qcow2 \
  /main-storage-pool/ocp36.support.qcow2 \
  /main-storage-pool/ocp36.master.qcow2 \
  /main-storage-pool/ocp36.infranode.qcow2 \
  /main-storage-pool/ocp36.node1.qcow2 \
  /main-storage-pool/ocp36.node2.qcow2; \
do \
  echo "reverting $i to snapshot"; \
  qemu-img snapshot -a before_ansible_installation $i; \
done
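The list, create and revert loops in this procedure differ only in the qemu-img flag (-l, -c, -a) and the snapshot name, so they can be folded into one helper. This is a minimal sketch: snapshot_all and DRY_RUN are hypothetical names, the image paths are the ones used above, and the guests are assumed to be stopped before anything other than a dry run.

```shell
#!/usr/bin/env bash
# Image files of the environment, as used in the loops of this procedure.
IMAGES="/main-storage-pool/ocp36.ingress.qcow2
/main-storage-pool/ocp36.support.qcow2
/main-storage-pool/ocp36.master.qcow2
/main-storage-pool/ocp36.infranode.qcow2
/main-storage-pool/ocp36.node1.qcow2
/main-storage-pool/ocp36.node2.qcow2"

# snapshot_all <list|create|apply> [snapshot-name]
# With DRY_RUN=1 (default) the qemu-img commands are only printed.
snapshot_all() {
    local action="$1"
    local name="$2"
    local flag img
    case "${action}" in
        list)   flag="-l"; name="" ;;
        create) flag="-c" ;;
        apply)  flag="-a" ;;
        *) echo "usage: snapshot_all <list|create|apply> [name]" >&2; return 2 ;;
    esac
    if [ "${action}" != "list" ] && [ -z "${name}" ]; then
        echo "snapshot name required for ${action}" >&2
        return 2
    fi
    for img in ${IMAGES}; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "qemu-img snapshot ${flag}${name:+ ${name}} ${img}"
        elif [ -n "${name}" ]; then
            qemu-img snapshot "${flag}" "${name}" "${img}"
        else
            qemu-img snapshot "${flag}" "${img}"
        fi
    done
}
```

For example, DRY_RUN=0 snapshot_all apply before_ansible_installation would perform the revert shown above.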
Run the Advanced Installation
Consider running the pre-flight checks once more after the guests have been restarted.
As "ansible" on support:
ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/byo/config.yml
For verbose installation use -vvv or -vvvv.
ansible-playbook -vvvv /usr/share/ansible/openshift-ansible/playbooks/byo/config.yml > ansible.out
To use a different inventory file than /etc/ansible/hosts, run:
ansible-playbook -vvvv -i /custom/path/to/inventory/file /usr/share/ansible/openshift-ansible/playbooks/byo/config.yml > ansible.out
Output of a Successful Run
PLAY RECAP *********************************************************************
api-lb.ocp36.local         : ok=97   changed=2    unreachable=0    failed=0
infranode.ocp36.local      : ok=257  changed=58   unreachable=0    failed=0
localhost                  : ok=14   changed=0    unreachable=0    failed=0
master.ocp36.local         : ok=1039 changed=290  unreachable=0    failed=0
node1.ocp36.local          : ok=256  changed=58   unreachable=0    failed=0
node2.ocp36.local          : ok=256  changed=58   unreachable=0    failed=0
support.ocp36.local        : ok=100  changed=3    unreachable=0    failed=0
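When the output is redirected to a file such as ansible.out, the recap can also be checked mechanically. The sketch below is a hypothetical helper (check_recap), assuming the recap lines keep the "host : ok=N changed=N unreachable=N failed=N" layout shown above. It prints every host with a non-zero unreachable or failed count and returns non-zero if there is one.

```shell
#!/usr/bin/env bash
# check_recap [file]
# Reads playbook output from the file (or stdin) and flags hosts whose
# recap line reports unreachable > 0 or failed > 0.
check_recap() {
    awk '
        /unreachable=[0-9]+/ && /failed=[0-9]+/ {
            seen = 1
            # Extract the numeric value following each counter name.
            u = $0; sub(/.*unreachable=/, "", u); sub(/[^0-9].*/, "", u)
            f = $0; sub(/.*failed=/, "", f); sub(/[^0-9].*/, "", f)
            if (u + 0 > 0 || f + 0 > 0) {
                print "PROBLEM: " $1
                bad = 1
            }
        }
        END {
            if (!seen) {
                print "no PLAY RECAP host lines found"
                exit 2
            }
            exit bad + 0
        }
    ' "$@"
}
```

A typical call would be check_recap ansible.out && echo "installation looks clean".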
Uninstallation
If the installation procedure runs into problems, troubleshoot, then uninstall before restarting the installation procedure:
ansible-playbook [-i /custom/path/to/inventory/file] /usr/share/ansible/openshift-ansible/playbooks/adhoc/uninstall.yml
This is a relatively quick way to iterate over the installation configuration and arrive at a stable one. However, a production deployment must start from a clean operating system installation. If you are using virtual machines, start from a fresh image. If you are using bare metal machines, run the following on all hosts:
yum -y remove openshift openshift-* etcd docker docker-common
rm -rf /etc/origin /var/lib/openshift /etc/etcd \
  /var/lib/etcd /etc/sysconfig/atomic-openshift* /etc/sysconfig/docker* \
  /root/.kube/config /etc/ansible/facts.d /usr/share/openshift
Install oc on A Remote Client
Configure HTTP Proxying
Configure API Proxying
The installation procedure should have already configured routing on the API ingress node load balancer, which is referred to as "api-lb.ocp36.local".
However, the HAProxy instance logging configuration can be improved, and the defaults can be changed to help with troubleshooting, if necessary. It is recommended to reconfigure the HAProxy running on "api-lb" to log to /var/log/haproxy.log. The procedure is described here:
Configure Application Proxying
Typically, the infrastructure nodes that the router pods are deployed on do not expose a public address. To allow external application traffic to reach the router pods, which re-route it to application pods, we need to stand up a new application HTTP proxy, or reconfigure the API HAProxy already deployed on api-lb.ocp36.local to also proxy application traffic. The public application traffic is directed to the wildcard domain configured on the external DNS. We need to map that name to the public IP address exposed by the application HTTP proxy.
The current noper430 installation stands up two interfaces /dev/ens8 and /dev/ens9 configured with two distinct public IP addresses, and one of the network interfaces is used for API/master traffic, while the other is used for application traffic. This solution was chosen because no obvious solution was immediately available to configure HAProxy to route HTTP traffic based on its Host header in HAProxy tcp mode. It does not mean such a solution does not exist.
Another ingress node (i2) was configured with the sole purpose of running an application traffic HAProxy.
HAProxy Configuration
This is the /etc/haproxy/haproxy.cfg configuration file. Note that the file contains configuration allowing both HTTP and HTTPS traffic. HTTP/HTTPS traffic was enabled by creating two separate frontend/backend pairs, one for HTTP and one for HTTPS. I am not sure this is the most efficient way of doing it.
#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
    maxconn 20000
    log 127.0.0.1:514 local2
    chroot /var/lib/haproxy
    pidfile /var/run/haproxy.pid
    user haproxy
    group haproxy
    daemon

    # turn on stats unix socket
    stats socket /var/lib/haproxy/stats

#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
    mode http
    log global
    option httplog
    option dontlognull
    option forwardfor except 127.0.0.0/8
    option redispatch
    retries 3
    timeout http-request 10s
    timeout queue 1m
    timeout connect 10s
    timeout client 300s
    timeout server 300s
    timeout http-keep-alive 10s
    timeout check 10s
    maxconn 20000

listen stats :9000
    mode http
    stats enable
    stats uri /

frontend app_frontend_https
    bind <public-application-ip-address>:443
    default_backend app_backend_https
    option tcplog
    mode tcp

frontend app_frontend_http
    bind <public-application-ip-address>:80
    default_backend app_backend_http
    option tcplog
    mode tcp

backend app_backend_https
    balance source
    mode tcp
    server infranode 192.168.122.25:443 check

backend app_backend_http
    balance source
    mode tcp
    server infranode 192.168.122.25:80 check
iptables Configuration
If both HTTP and HTTPS traffic is to be allowed on i2, iptables must be configured accordingly.
...
-A OS_FIREWALL_ALLOW -p tcp -m state --state NEW -m tcp --dport 80 -j ACCEPT
-A OS_FIREWALL_ALLOW -p tcp -m state --state NEW -m tcp --dport 443 -j ACCEPT
...
Configure the Default Project
Configure the default project's pods to deploy on infrastructure nodes:
oc edit namespace default
metadata:
  annotations:
    ...
    openshift.io/node-selector: env=infra
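The same annotation can be applied without an interactive editor by using oc patch. This is a sketch under the assumption that the logged-in user may modify the "default" namespace; node_selector_patch and set_default_node_selector are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Prints the JSON patch that sets the node selector annotation.
node_selector_patch() {
    printf '{"metadata":{"annotations":{"openshift.io/node-selector":"%s"}}}\n' "$1"
}

# set_default_node_selector [selector]
# Applies the annotation to the "default" namespace (env=infra if omitted).
set_default_node_selector() {
    oc patch namespace default -p "$(node_selector_patch "${1:-env=infra}")"
}
```

Afterwards, oc get namespace default -o yaml should show the annotation under metadata.annotations.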
Reboot the Whole Environment
./stop-ose36
./start-ose36
Base Installation Validation
Logging Installation
Metrics Installation
Post-Install
Configure the Number of Cores used by the Master and Node OpenShift Processes
Test oadm On Master
oadm top pod
oadm top node
Test oc on an External Node
oc status
oc get pods
Snapshot the Working Environment
./stop-ose36
for i in \
  /main-storage-pool/ocp36.ingress.qcow2 \
  /main-storage-pool/ocp36.support.qcow2 \
  /main-storage-pool/ocp36.master.qcow2 \
  /main-storage-pool/ocp36.infranode.qcow2 \
  /main-storage-pool/ocp36.node1.qcow2 \
  /main-storage-pool/ocp36.node2.qcow2; \
do \
  echo "taking snapshot of $i"; \
  qemu-img snapshot -c ocp36_installed $i; \
done
Load Default Image Streams and Templates
The image definitions are loaded as part of the "openshift" project.