OpenShift 3.5 Installation
External
Internal
Overview
There are two installation methods: the quick install, which uses a CLI tool available in the "atomic-openshift-utils" package and relies on Ansible in the background, and the advanced install, which assumes familiarity with Ansible. This document covers the advanced install.
Prerequisites
External DNS Setup
An external DNS server is required. If you control a registrar DNS zone, such as http://godaddy.com, that will work. A valid alternative is to install a dedicated DNS server. The DNS server must be available to all OpenShift environment nodes, and also to external clients that need to resolve public names such as the master public web console and API URL, the application router public DNS name, etc.
If external clients can resolve public names by other means, a DNS server deployed as part of the environment will work. It can be deployed on the support node.
bind DNS Server
Procedure to configure a bind server:
Wildcard Domain for Application Traffic
The DNS server must be capable of supporting wildcard sub-domains and must resolve the public wildcard DNS entry to the public IP address of the node that executes the default router. If the environment has multiple routers, an external load balancer is required, and the wildcard record must contain the public IP address of the host that runs the load balancer. The name of the wildcard domain will be specified later, during the advanced installation procedure, in the Ansible inventory file as 'openshift_master_default_subdomain'.
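For example, in a bind zone file, the wildcard record for this environment's application domain would look similar to the following (the IP address is hypothetical):

*.apps.openshift35.external. IN A 192.168.1.10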
For the configuration procedure see:
Minimum Hardware Requirements
A full RHEL7.3 master installation requires 121 MB in /boot and an average of 2.4 GB in /. A full RHEL7.3 node installation requires 121 MB in /boot and an average of 2.2 GB in /.
O/S Requirements and Configuration
Basic OS
Install RHEL 7.3 in "minimal" installation mode. This document describes the installation on top of a VirtualBox or VMware Fusion virtual machine.
- Provision the VM. The procedure to build VirtualBox VMs is described here.
- OpenShift requires NetworkManager on all nodes (see https://docs.openshift.com/container-platform/3.5/install_config/install/prerequisites.html#prereq-networkmanager). Make sure it works:
nmcli g
- Assign a static IP address to the interface to be used by the OpenShift cluster, as described here: adding a Static Ethernet Connection with NetworkManager.
- Attach the node to the subscription, using subscription-manager, as described here: registering a RHEL System with subscription manager. The support node(s) need only a Red Hat Enterprise Linux subscription. The OpenShift nodes need an OpenShift subscription. For OpenShift, follow these steps: https://docs.openshift.com/container-platform/3.5/install_config/install/host_preparation.html#host-registration. The following is a summary of the sequence of steps; their goal is to configure the following supported repositories on the system: "rhel-7-server-rpms", "rhel-7-server-extras-rpms", "rhel-7-server-ose-3.5-rpms", "rhel-7-fast-datapath-rpms":
subscription-manager register
subscription-manager list --available --matches '*OpenShift*'
subscription-manager attach --pool=<pool-id> --quantity=1
subscription-manager repos --disable="*"
subscription-manager repos --list-enabled
yum repolist
yum-config-manager --disable <repo_id>
subscription-manager repos --enable="rhel-7-server-rpms" --enable="rhel-7-server-extras-rpms" --enable="rhel-7-server-ose-3.5-rpms" --enable="rhel-7-fast-datapath-rpms"
subscription-manager repos --list-enabled
yum repolist
- Install base packages (https://docs.openshift.com/container-platform/3.5/install_config/install/host_preparation.html#installing-base-packages):
yum install wget git net-tools bind-utils iptables-services bridge-utils bash-completion
yum update -y
yum install atomic-openshift-utils
- Prevent accidental upgrades of OpenShift and Docker by installing the "excluder" packages. When installed, the *-excluder packages add entries to the "exclude" directive in the host's /etc/yum.conf file. Those entries can be removed later, when we explicitly want to upgrade OpenShift or Docker. More details in yum Exclusion.
yum install atomic-openshift-excluder atomic-openshift-docker-excluder
- If we later need to upgrade, we must first run the following command:
atomic-openshift-excluder unexclude
- Reboot the system to make sure it starts correctly after package installation:
systemctl reboot
Installation User
Create an installation user that can log in remotely from support.openshift35.local, which is the host that will drive the installation. Conventionally, we name that user "ansible". It must be able to passwordlessly ssh into itself and all other environment nodes, and to perform passwordless sudo. The user will be part of the shared image.
groupadd -g 1200 ansible
useradd -m -g ansible -u 1200 ansible
Configure public/private key authentication and install the public key into its own authorized_keys file.
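A sketch of the sequence, executed as "ansible" on the support node (the target host name is one of this environment's nodes; repeat the last command for all of them):

ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
ssh-copy-id ansible@master1.openshift35.local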
Allow passwordless sudo to "ansible".
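As "root", a sudoers drop-in file such as the following accomplishes this:

echo 'ansible ALL=(ALL) NOPASSWD: ALL' > /etc/sudoers.d/ansible
chmod 440 /etc/sudoers.d/ansible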
Firewall Configuration
Turn off firewalld and configure the iptables service.
systemctl stop firewalld
systemctl disable firewalld
systemctl is-enabled firewalld
OpenShift needs iptables running:
systemctl enable iptables
systemctl start iptables
NFS Client
Install NFS client dependencies.
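On RHEL 7, the NFS client is provided by the "nfs-utils" package:

yum install nfs-utils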
Support Host Configuration
OpenShift Node Configuration
Configure the DNS client to use the DNS server that was installed as part of the procedure. See Manual /etc/resolv.conf Configuration and https://docs.openshift.com/container-platform/3.5/install_config/install/prerequisites.html#prereq-dns
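For example, /etc/resolv.conf on each node might look similar to the following (the nameserver address is an assumption; use the actual address of the DNS server deployed as part of this procedure):

search openshift35.local
nameserver 172.23.0.2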
Make sure SELinux is enabled on all hosts. If it is not, enable SELinux and make sure SELINUXTYPE is "targeted" in /etc/selinux/config.
sestatus
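The relevant lines in /etc/selinux/config should read:

SELINUX=enforcing
SELINUXTYPE=targeted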
Docker Installation
Install Docker. Docker is technically not required on masters, but it is easier to create a uniform image and only disable Docker on the masters. The binaries must be installed from the rhel-7-server-ose-3.*-rpms repository, and Docker must be running before OpenShift is installed.
OpenShift 3.5 requires Docker 1.12.
yum install docker
docker version
The advanced installation procedure will update /etc/sysconfig/docker on nodes with OpenShift-specific configuration.
After a full 3.5 HA installation, no "--insecure-registry 172.30.0.0/16" was present in the Docker startup parameters, so until further notice, add it by hand in /etc/sysconfig/docker, as shown below:
OPTIONS='... --insecure-registry 172.30.0.0/16'
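Restart Docker for the change to take effect:

systemctl restart docker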
Provision storage for the Docker server. The default loopback storage is not appropriate for production; it should be replaced by a thin-pool logical volume. Follow https://docs.openshift.com/container-platform/3.5/install_config/install/host_preparation.html#configuring-docker-storage. Use Option A) "an additional block device". On VirtualBox, provision a new virtual disk of appropriate size and configure it as the Docker storage backend.
The procedure consists in executing /usr/bin/docker-storage-setup with the base configuration read from /usr/lib/docker-storage-setup/docker-storage-setup and custom configuration specified in /etc/sysconfig/docker-storage-setup, similarly to:
STORAGE_DRIVER=devicemapper
DEVS=/dev/sdb
VG=docker_vg
DATA_SIZE=500M
MIN_DATA_SIZE=1M
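With the custom configuration in place, run the script:

/usr/bin/docker-storage-setup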
Under some circumstances, /usr/bin/docker-storage-setup fails with:
[...]
end of partition 1 has impossible value for cylinders: 65 (should be in 0-64)
sfdisk: I don't like these partitions - nothing changed.
(If you really want this, use the --force option.)
If that happens, follow the manual procedure of provisioning Docker storage on a dedicated block device:
After the script completes successfully, it creates a logical volume with an XFS filesystem mounted on the Docker root directory /var/lib/docker, and writes the Docker storage configuration file /etc/sysconfig/docker-storage. The thin pool to be used by Docker should be visible in lvs:
# lvs
  LV          VG        Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  docker-pool docker_vg twi-a-t--- 500.00m             0.00   0.88
  root        main_vg   -wi-ao----   7.00g
Disable docker-storage-setup; it is not needed anymore, since the storage is already set up.
systemctl disable docker-storage-setup
Enable Docker at boot and start it.
systemctl enable docker
systemctl start docker
Reboot the system and then check Docker Server Runtime.
TODO: parse and NOKB this: https://docs.openshift.com/container-platform/3.5/scaling_performance/optimizing_storage.html#optimizing-storage
Generic Docker installation instructions are available in Docker Installation.
Miscellaneous
Cloud-Provider Specific Configuration
- https://docs.openshift.com/container-platform/3.5/install_config/configuring_aws.html#install-config-configuring-aws
- https://docs.openshift.com/container-platform/3.5/install_config/configuring_openstack.html#install-config-configuring-openstack
- https://docs.openshift.com/container-platform/3.5/install_config/configuring_gce.html#install-config-configuring-gce
Common Image Post-Processing
- Adjust memory and CPU.
- Reconfigure Linux VM Guest Image.
- Tests:
docker version
lvs
From support.openshift35.local, try:
ssh -i ~ansible/.ssh/id_rsa ansible@master2
OpenShift Advanced Installation
The Support Node
Execute the installation as "ansible" from the support node.
cd /usr/share/ansible
chgrp -R ansible /usr/share/ansible
chmod -R g+w /usr/share/ansible
The support node needs at least 1 GB of RAM to run the installation process.
Configure Ansible Inventory File
The default Ansible inventory file is /etc/ansible/hosts. It is used by the Ansible playbook to install the OpenShift environment. The inventory file describes the configuration and the topology of the OpenShift cluster. Start from a template like https://github.com/NovaOrdis/playground/blob/master/openshift/3.5/hosts and customize it to match the environment.
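A minimal single-master sketch of such an inventory file follows; all values shown are assumptions to be adjusted to the environment (the environment described in this document is an HA deployment, so the actual file, based on the template above, will be larger):

[OSEv3:children]
masters
nodes
etcd

[OSEv3:vars]
ansible_ssh_user=ansible
ansible_become=true
openshift_deployment_type=openshift-enterprise
openshift_master_default_subdomain=apps.openshift35.external

[masters]
master1.openshift35.local

[etcd]
master1.openshift35.local

[nodes]
master1.openshift35.local
node1.openshift35.local openshift_node_labels="{'region': 'primary'}"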
If the target nodes have multiple network interfaces, and the network interface used to cluster OpenShift is NOT associated with the default route, modify the inventory file as follows:
...
openshift_set_node_ip=true
...
[nodes]
master1.openshift35.local openshift_ip=172.23.0.4
...
Patch Ansible Logic
External DNS Server Support
The OpenShift 3.5 installer did not handle 'openshift_dns_ip' properly; the dnsmasq/NetworkManager runtime ignored it. In order to fix this, the following files had to be modified:
- /usr/share/ansible/openshift-ansible/roles/openshift_node_dnsmasq/tasks/main.yml
- /usr/share/ansible/openshift-ansible/roles/openshift_node_dnsmasq/files/networkmanager/99-origin-dns.sh
Pre-Flight
On the support node:
As "root":
cd /tmp
rm -r tmp* yum*
rm /usr/share/ansible/openshift-ansible/playbooks/byo/config.retry
As "ansible":
ansible all -m ping
ansible nodes -m shell -a "docker version"
ansible nodes -m shell -a "nslookup something.apps.openshift35.external"
Running the Advanced Installation
ansible-playbook -vvv /usr/share/ansible/openshift-ansible/playbooks/byo/config.yml
For verbose installation use -vvv or -vvvv.
To use a different inventory file than /etc/ansible/hosts, run:
ansible-playbook -vvv -i /custom/path/to/inventory/file /usr/share/ansible/openshift-ansible/playbooks/byo/config.yml
Output of a Successful Run
PLAY RECAP *********************************************************************
infranode1.openshift35.local : ok=222  changed=60   unreachable=0  failed=0
infranode2.openshift35.local : ok=222  changed=60   unreachable=0  failed=0
lb.openshift35.local         : ok=75   changed=14   unreachable=0  failed=0
localhost                    : ok=12   changed=0    unreachable=0  failed=0
master1.openshift35.local    : ok=1033 changed=274  unreachable=0  failed=0
master2.openshift35.local    : ok=445  changed=132  unreachable=0  failed=0
master3.openshift35.local    : ok=445  changed=132  unreachable=0  failed=0
node1.openshift35.local      : ok=222  changed=60   unreachable=0  failed=0
node2.openshift35.local      : ok=222  changed=60   unreachable=0  failed=0
support.openshift35.local    : ok=77   changed=5    unreachable=0  failed=0
Verifying the Installation
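A minimal check: on one of the masters, as "root", verify that all nodes have registered and report Ready:

oc get nodes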
Uninstallation
If the installation procedure runs into problems, troubleshoot, then uninstall before re-starting the installation procedure:
ansible-playbook [-i /custom/path/to/inventory/file] /usr/share/ansible/openshift-ansible/playbooks/adhoc/uninstall.yml
This is a relatively quick method to iterate over the installation configuration and come up with a stable configuration. However, in order to install a production deployment, you must start from a clean operating system installation. If you are using virtual machines, start from a fresh image. If you are using bare metal machines, run the following on all hosts:
# yum -y remove openshift openshift-* etcd docker docker-common
# rm -rf /etc/origin /var/lib/openshift /etc/etcd \
    /var/lib/etcd /etc/sysconfig/atomic-openshift* /etc/sysconfig/docker* \
    /root/.kube/config /etc/ansible/facts.d /usr/share/openshift
Post-Install
Deploy a HAProxy Router
If there is more than one router pod, the public application traffic directed to the wildcard domain configured on the external DNS server must be handled by a proxy that load balances between the router pods.
This load balancer must be deployed.
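A minimal /etc/haproxy/haproxy.cfg fragment for the HTTP traffic might look similar to the following sketch (the backend hosts are assumed to be this environment's infranodes, where the router pods run; port 443 would need a similar tcp-mode frontend/backend pair):

frontend apps_http
    bind *:80
    mode tcp
    default_backend routers_http

backend routers_http
    mode tcp
    balance source
    server infranode1 infranode1.openshift35.local:80 check
    server infranode2 infranode2.openshift35.local:80 check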
TODO
Provision Persistent Storage
TODO
Deploy the Integrated Docker Registry
TODO: isn't this automatically deployed? See "Internal Registry Configuration" section of the inventory file. How do I check?
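One way to check, as a cluster administrator on a master: if the registry was deployed, it shows up as the "docker-registry" deployment configuration in the "default" project:

oc get dc docker-registry -n default
oc get pods -n default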
Load Image Streams
TODO
Load Templates
TODO