OpenShift 3.6 Installation
Revision as of 09:43, 11 November 2017
External
Internal
Overview
This document covers the OpenShift Container Platform 3.6 advanced installation procedure. The procedure is largely based on https://docs.openshift.com/container-platform/3.6/install_config/index.html and contains additional details observed during an actual installation on KVM guests.
Location
noper430:/root/environments/ocp36 RackStation:/base/NovaOrdis/Archive/yyyy.mm.dd-ocp36
Diagram
Hardware Requirements
System and Environment Prerequisites
The prerequisites and all preparatory steps presented in the following links are also addressed in this procedure. The links are provided for reference only:
Virtualization Host Preparation
Guest Template Preparation
Create VM templates, in this order:
Cloud Provider-Specific Configuration
- https://docs.openshift.com/container-platform/3.6/install_config/configuring_aws.html#install-config-configuring-aws
- https://docs.openshift.com/container-platform/3.6/install_config/configuring_openstack.html#install-config-configuring-openstack
- https://docs.openshift.com/container-platform/3.6/install_config/configuring_gce.html#install-config-configuring-gce
Guest Configuration
External DNS Setup
An external DNS server is required. Controlling a registrar (such as http://godaddy.com) DNS zone provides all capabilities needed to set up external DNS address resolution. A valid alternative, though more complex, is to install a dedicated public DNS server. The DNS server must be available to all OpenShift environment nodes, and also to external clients that need to resolve public names such as the master public web console and API URL, the public wildcard name of the environment, the application router public DNS name, etc. A combination of a public DNS server that resolves public addresses and an internal DNS server that resolves internal addresses, deployed as part of the environment, usually on the support node, is also a workable solution. This installation procedure includes installation of the bind DNS server on the support node - see the bind DNS server installation section below.
Wildcard Domain for Application Traffic
The DNS server that resolves the public addresses must support wildcard sub-domains and resolve the public wildcard DNS entry to the public IP address of the node that executes the default router, or of the public ingress node that proxies to the default router. If the environment has multiple routers, an external load balancer is required, and the wildcard record must contain the public IP address of the host that runs the load balancer. The name of the wildcard domain will be specified later, during the advanced installation procedure, in the Ansible inventory file as openshift_master_default_subdomain.
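Once the wildcard record is in place, it can be checked by resolving a couple of arbitrary names under the wildcard domain and comparing the answers with the expected public IP address. The following is a minimal sketch, not part of the original procedure: check_wildcard and resolve_name are hypothetical helper names, the probe labels are arbitrary, and the lookup assumes the dig utility is installed.

```shell
# Sketch: verify that arbitrary names under a wildcard domain resolve to
# the expected IP. resolve_name and check_wildcard are illustrative names.

# Thin wrapper so the lookup tool can be swapped (assumes dig is installed).
resolve_name() { dig +short "$1" A | tail -n 1; }

check_wildcard() (
  domain="$1"
  expected_ip="$2"
  for label in probe-one probe-two; do
    ip=$(resolve_name "$label.$domain")
    if [ "$ip" != "$expected_ip" ]; then
      echo "$label.$domain resolved to '${ip:-nothing}', expected $expected_ip"
      exit 1
    fi
  done
)

# Example, using the wildcard domain and application address that appear
# elsewhere in this document:
#   check_wildcard apps.openshift.novaordis.io 104.50.201.85
```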
Configuration procedures:
OpenShift Advanced Installation
The Support Node
Execute the installation as "ansible" from the support node.
As "root" on "support":
cd /usr/share/ansible
chgrp -R ansible /usr/share/ansible
chmod -R g+w /usr/share/ansible
chgrp ansible /etc/ansible/
chmod g+w /etc/ansible/
The support node needs at least 1 GB of RAM to run the installation process.
Create an ansible key pair on "support" and disseminate it to all other nodes. As "ansible":
cd
ssh-keygen -q -b 2048 -f ~/.ssh/id_rsa -t rsa
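One possible way to disseminate the public key is a loop around ssh-copy-id. This is a sketch under assumptions: distribute_key and DRY_RUN are hypothetical names, the "ansible" user is assumed to already exist on every node, and ssh-copy-id will prompt for that user's password on each host. The host names in the example are the ones that appear in the play recap later in this document.

```shell
# Sketch: copy the "ansible" public key to every other node.
# distribute_key and DRY_RUN are illustrative names, not part of any tool.
distribute_key() (
  key="$1"
  shift
  for host in "$@"; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      # only print the command that would be executed
      echo "ssh-copy-id -i $key ansible@$host"
    else
      ssh-copy-id -i "$key" "ansible@$host" || exit 1
    fi
  done
)

# Example (as "ansible" on support):
#   distribute_key ~/.ssh/id_rsa.pub master.ocp36.local infranode.ocp36.local \
#     node1.ocp36.local node2.ocp36.local api-lb.ocp36.local
```

Setting DRY_RUN=1 prints the commands without contacting any host, which is useful for reviewing the host list first.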
Configure Ansible Inventory File
Pre-Flight
On the support node, as "root":
cd /tmp
rm -r tmp* yum*
rm /usr/share/ansible/openshift-ansible/playbooks/byo/config.retry
As "ansible":
cd /etc/ansible
ansible all -m ping
ansible nodes -m shell -a "systemctl status docker"
ansible nodes -m shell -a "docker version"
ansible nodes -m shell -a "pvs"
ansible nodes -m shell -a "vgs"
ansible nodes -m shell -a "lvs"
ansible nodes -m shell -a "nslookup something.apps.openshift.novaordis.io"

ansible nodes -m shell -a "mkdir /mnt/tmp"
ansible nodes -m shell -a "mount -t nfs support.ocp36.local:/nfs /mnt/tmp"
ansible nodes -m shell -a "hostname > /mnt/tmp/\$(hostname).txt"
ansible nodes -m shell -a "umount /mnt/tmp"
for i in /nfs/*.ocp36.local.txt; do echo $i; cat $i; done
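The final loop above only prints the marker files that exist, so a node that failed to mount the NFS export is easy to overlook. The sketch below reports the nodes whose marker file is missing or empty. missing_markers is a hypothetical helper name, and the sketch assumes that each node's hostname command returns the fully qualified name, as the /nfs/*.ocp36.local.txt pattern suggests.

```shell
# Sketch: report nodes that did not leave a non-empty marker file on the
# NFS export. missing_markers is an illustrative helper name.
missing_markers() (
  dir="$1"
  shift
  status=0
  for host in "$@"; do
    if [ ! -s "$dir/$host.txt" ]; then
      echo "missing or empty: $dir/$host.txt"
      status=1
    fi
  done
  exit "$status"
)

# Example (on support):
#   missing_markers /nfs master.ocp36.local infranode.ocp36.local \
#     node1.ocp36.local node2.ocp36.local
```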
Snapshot the Guest Images for the Entire Environment
On the virtualization host:
virsh vol-list main-storage-pool --details
./stop-ose36
virsh list --all

for i in \
  /main-storage-pool/ocp36.ingress.qcow2 \
  /main-storage-pool/ocp36.support.qcow2 \
  /main-storage-pool/ocp36.master.qcow2 \
  /main-storage-pool/ocp36.infranode.qcow2 \
  /main-storage-pool/ocp36.node1.qcow2 \
  /main-storage-pool/ocp36.node2.qcow2; \
do \
  echo "$i snapshots:"; \
  qemu-img snapshot -l $i; \
done

for i in \
  /main-storage-pool/ocp36.ingress.qcow2 \
  /main-storage-pool/ocp36.support.qcow2 \
  /main-storage-pool/ocp36.master.qcow2 \
  /main-storage-pool/ocp36.infranode.qcow2 \
  /main-storage-pool/ocp36.node1.qcow2 \
  /main-storage-pool/ocp36.node2.qcow2; \
do \
  echo "taking snapshot of $i"; \
  qemu-img snapshot -c before_ansible_installation $i; \
done
./start-ose36
Reverting the Environment to a Snapshot State
./stop-ose36
for i in \
  /main-storage-pool/ocp36.ingress.qcow2 \
  /main-storage-pool/ocp36.support.qcow2 \
  /main-storage-pool/ocp36.master.qcow2 \
  /main-storage-pool/ocp36.infranode.qcow2 \
  /main-storage-pool/ocp36.node1.qcow2 \
  /main-storage-pool/ocp36.node2.qcow2; \
do \
  echo "reverting $i to snapshot"; \
  qemu-img snapshot -a before_ansible_installation $i; \
done
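The list, create and revert loops above differ only in the qemu-img snapshot flag (-l, -c, -a) and the snapshot name, so they can be folded into one helper. The following is a sketch, not the script used on noper430: snapshot_images, IMAGES and DRY_RUN are hypothetical names, and the image paths are the ones from the loops above. As in the original procedure, the guests should be stopped with ./stop-ose36 before creating or applying snapshots.

```shell
# Sketch: one helper for listing, creating and applying qcow2 snapshots.
# snapshot_images, IMAGES and DRY_RUN are illustrative names.
IMAGES="/main-storage-pool/ocp36.ingress.qcow2
/main-storage-pool/ocp36.support.qcow2
/main-storage-pool/ocp36.master.qcow2
/main-storage-pool/ocp36.infranode.qcow2
/main-storage-pool/ocp36.node1.qcow2
/main-storage-pool/ocp36.node2.qcow2"

snapshot_images() (
  action="$1"
  name="${2:-}"
  case "$action" in
    list)   flag="-l" ;;
    create) flag="-c" ;;
    apply)  flag="-a" ;;
    *) echo "usage: snapshot_images list|create|apply [name]" >&2; exit 2 ;;
  esac
  if [ "$action" != "list" ] && [ -z "$name" ]; then
    echo "a snapshot name is required for $action" >&2
    exit 2
  fi
  # build the qemu-img argument list once: flag, plus name when needed
  if [ "$action" = "list" ]; then set -- "$flag"; else set -- "$flag" "$name"; fi
  for image in $IMAGES; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "qemu-img snapshot $* $image"
    else
      qemu-img snapshot "$@" "$image" || exit 1
    fi
  done
)

# Examples (on the virtualization host):
#   snapshot_images list
#   snapshot_images create before_ansible_installation
#   snapshot_images apply before_ansible_installation
```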
Run the Advanced Installation
You may want to run the pre-flight checks one more time after the reboot.
As "ansible" on support:
ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/byo/config.yml
For verbose installation use -vvv or -vvvv.
ansible-playbook -vvvv /usr/share/ansible/openshift-ansible/playbooks/byo/config.yml > ansible.out
To use a different inventory file than /etc/ansible/hosts, run:
ansible-playbook -vvvv -i /custom/path/to/inventory/file /usr/share/ansible/openshift-ansible/playbooks/byo/config.yml > ansible.out
Output of a Successful Run
PLAY RECAP *********************************************************************
api-lb.ocp36.local : ok=97 changed=2 unreachable=0 failed=0
infranode.ocp36.local : ok=257 changed=58 unreachable=0 failed=0
localhost : ok=14 changed=0 unreachable=0 failed=0
master.ocp36.local : ok=1039 changed=290 unreachable=0 failed=0
node1.ocp36.local : ok=256 changed=58 unreachable=0 failed=0
node2.ocp36.local : ok=256 changed=58 unreachable=0 failed=0
support.ocp36.local : ok=100 changed=3 unreachable=0 failed=0
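When the verbose output is redirected to ansible.out, the recap is buried at the end of a very large file. The sketch below scans the recap lines and fails if any host reports a non-zero unreachable or failed count. recap_ok is a hypothetical helper name, and the sketch assumes the recap lines have the host name as the first field, as in the output above.

```shell
# Sketch: exit non-zero if any host in a PLAY RECAP reports unreachable
# or failed tasks, or if no recap lines are found at all.
# recap_ok is an illustrative name.
recap_ok() {
  awk '
    /unreachable=[0-9]+/ && /failed=[0-9]+/ {
      hosts++
      if ($0 !~ /unreachable=0([^0-9]|$)/ || $0 !~ /failed=0([^0-9]|$)/) {
        print "problem: " $1
        bad++
      }
    }
    END { exit (hosts == 0 || bad > 0) ? 1 : 0 }
  '
}

# Example:
#   recap_ok < ansible.out && echo "installation looks clean"
```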
Uninstallation
If the installation procedure runs into problems, troubleshoot and then, before restarting the installation procedure, uninstall:
ansible-playbook [-i /custom/path/to/inventory/file] /usr/share/ansible/openshift-ansible/playbooks/adhoc/uninstall.yml
This is a relatively quick method to iterate over the installation configuration and arrive at a stable configuration. However, in order to install a production deployment, you must start from a clean operating system installation. If you are using virtual machines, start from a fresh image. If you are using bare metal machines, run the following on all hosts:
yum -y remove openshift openshift-* etcd docker docker-common
rm -rf /etc/origin /var/lib/openshift /etc/etcd \
  /var/lib/etcd /etc/sysconfig/atomic-openshift* /etc/sysconfig/docker* \
  /root/.kube/config /etc/ansible/facts.d /usr/share/openshift
Configure HTTP Proxying
The installation procedure should have already configured the ingress node load balancer, which is also aliased as "api-lb.ocp36.local".
Configure the HAProxy load balancer on the "api-lb" node to log to /var/log/haproxy.log. The procedure is described here:
Configure Application Proxying
If the infrastructure node(s) on which the router pod(s) have been deployed do not expose a public address, we need to stand up an application HTTP proxy, or reconfigure the existing HAProxy already deployed on api-lb.ocp36.local to also proxy application traffic. The public application traffic is directed to the wildcard domain configured on the external DNS.
The current noper430 installation stands up two interfaces /dev/ens8 and /dev/ens9 configured with two distinct public IP addresses, and one of the network interfaces is used for API/master traffic, while the other is used for application traffic. This solution was chosen because no obvious solution was immediately available to configure HAProxy to route HTTP traffic based on its Host header in HAProxy tcp mode. It does not mean such a solution does not exist.
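One approach that is commonly used in this situation is to route on the TLS SNI value instead of the HTTP Host header: in tcp mode HAProxy cannot read the encrypted header, but it can inspect the server name the client sends in the TLS ClientHello. The sketch below only prints a candidate frontend fragment; it was not tested on the noper430 installation. sni_frontend and the frontend name shared_https_frontend are hypothetical, the single shared bind address is an assumption, and master_backend and app_backend are the backend names from the configuration in this section. Clients that do not send SNI fall through to the default backend.

```shell
# Sketch: print an HAProxy tcp-mode frontend that picks a backend from the
# TLS SNI value. sni_frontend and shared_https_frontend are illustrative
# names; the backends are the ones defined in this section's configuration.
sni_frontend() (
  bind_addr="$1"
  wildcard_domain="$2"
  cat <<EOF
frontend shared_https_frontend
    bind ${bind_addr}:443
    mode tcp
    # wait for the TLS ClientHello so the SNI value can be inspected
    tcp-request inspect-delay 5s
    tcp-request content accept if { req_ssl_hello_type 1 }
    # names under the wildcard domain go to the router, the rest to the master
    use_backend app_backend if { req_ssl_sni -m end .${wildcard_domain} }
    default_backend master_backend
EOF
)

# Example:
#   sni_frontend 104.50.201.84 apps.openshift.novaordis.io
```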
#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
    maxconn 20000
    log 127.0.0.1:514 local2
    chroot /var/lib/haproxy
    pidfile /var/run/haproxy.pid
    user haproxy
    group haproxy
    daemon
    # turn on stats unix socket
    stats socket /var/lib/haproxy/stats

#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
    mode http
    log global
    option httplog
    option dontlognull
    option forwardfor except 127.0.0.0/8
    option redispatch
    retries 3
    timeout http-request 10s
    timeout queue 1m
    timeout connect 10s
    timeout client 300s
    timeout server 300s
    timeout http-keep-alive 10s
    timeout check 10s
    maxconn 20000

listen stats :9000
    mode http
    stats enable
    stats uri /

frontend master_external_frontend
    bind 104.50.201.84:443
    default_backend master_backend
    mode tcp

frontend master_internal_frontend
    bind 192.168.122.22:443
    default_backend master_backend
    mode tcp

frontend app_frontend
    bind 104.50.201.85:443
    default_backend app_backend
    mode tcp

backend master_backend
    balance source
    mode tcp
    server master0 192.168.122.24:443 check

backend app_backend
    balance source
    mode tcp
    server infranode 192.168.122.25:443 check
Reboot the Whole Environment
./stop-ose36
./start-ose36
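After the restart the guests need some time before they accept connections, so validation commands run immediately may fail. The sketch below is a generic retry helper that can wrap the ansible ping used in the pre-flight section. retry is a hypothetical name, and the attempt count and delay in the example are arbitrary values, not measured ones.

```shell
# Sketch: retry a command until it succeeds or the attempts run out.
# retry is an illustrative helper name.
# usage: retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry() (
  attempts="$1"
  delay="$2"
  shift 2
  n=1
  while ! "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "giving up after $n attempts: $*" >&2
      exit 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
)

# Example (as "ansible" on support, from /etc/ansible):
#   retry 30 10 ansible all -m ping
```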
Base Installation Validation
Logging Installation
Metrics Installation
Post-Install
Configure the Number of Cores used by the Master and Node OpenShift Processes
Test oadm On Master
oadm top pod
oadm top node
Test oc on an External Node
oc get pods
Snapshot the Working Environment
./stop-ose36
for i in \
  /main-storage-pool/ocp36.ingress.qcow2 \
  /main-storage-pool/ocp36.support.qcow2 \
  /main-storage-pool/ocp36.master.qcow2 \
  /main-storage-pool/ocp36.infranode.qcow2 \
  /main-storage-pool/ocp36.node1.qcow2 \
  /main-storage-pool/ocp36.node2.qcow2; \
do \
  echo "taking snapshot of $i"; \
  qemu-img snapshot -c ocp36_installed $i; \
done