OpenShift 3.6 Installation: Difference between revisions

From NovaOrdis Knowledge Base
Latest revision as of 00:48, 24 January 2018

External

Internal

Overview

This document covers the OpenShift Container Platform 3.6 advanced installation procedure. The procedure is largely based on https://docs.openshift.com/container-platform/3.6/install_config/index.html and contains additional details observed during an actual installation on KVM guests.

Location

noper430:/root/environments/ocp36
RackStation:/base/NovaOrdis/Archive/yyyy.mm.dd-ocp36

Diagram

OpenShift 3.6 Installation Diagram

Hardware Requirements

https://docs.openshift.com/container-platform/3.6/install_config/install/prerequisites.html#hardware

System and Environment Prerequisites

The prerequisites and preparatory steps presented in the following links are also addressed in this procedure. The links are provided for reference only:

https://docs.openshift.com/container-platform/3.6/install_config/install/prerequisites.html
https://docs.openshift.com/container-platform/3.6/install_config/install/host_preparation.html

Virtualization Host Preparation

Guest Template Preparation

Create VM templates, in this order:

Cloud Provider-Specific Configuration

Guest Configuration

External DNS Setup

An external DNS server is required. Controlling a DNS zone at a registrar (such as http://godaddy.com) provides all the capabilities needed to set up external DNS address resolution. A valid alternative, though more complex, is to install a dedicated public DNS server. The DNS server must be available to all OpenShift environment nodes, and also to external clients that need to resolve public names such as the master public web console and API URL, the public wildcard name of the environment, the application router public DNS name, etc. A combination of a public DNS server that resolves public addresses and an internal DNS server that resolves internal addresses, deployed as part of the environment (usually on the support node), is also a workable solution. This installation procedure includes installation of the bind DNS server on the support node - see bind DNS server installation section below.

Wildcard Domain for Application Traffic

The DNS server that resolves the public addresses must support wildcard sub-domains and resolve the public wildcard DNS entry to the public IP address of the node that executes the default router, or of the public ingress node that proxies to the default router. If the environment has multiple routers, an external load balancer is required, and the wildcard record must contain the public IP address of the host that runs the load balancer. The name of the wildcard domain is specified later, during the advanced installation procedure, in the Ansible inventory file as openshift_master_default_subdomain.

Configuration procedures:

OpenShift Advanced Installation

https://docs.openshift.com/container-platform/3.6/install_config/install/advanced_install.html#install-config-install-advanced-install

The Support Node

Execute the installation as "ansible" from the support node.

As "root" on "support":

cd /usr/share/ansible
chgrp -R ansible /usr/share/ansible
chmod -R g+w /usr/share/ansible
chgrp ansible /etc/ansible/
chmod g+w /etc/ansible/

The support node needs at least 1 GB of RAM to run the installation process.

Create an ansible key pair on "support" and disseminate it to all other nodes. As "ansible":

cd
ssh-keygen -q -b 2048 -f ~/.ssh/id_rsa -t rsa
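
The dissemination step is not spelled out above. A minimal sketch of one way to do it, assuming ssh-copy-id is installed on "support" and that the node names match the ones that appear later in this procedure. The helper name copy_key_to_hosts and the RUN variable are hypothetical, introduced only for illustration. By default the helper only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch: copy the "ansible" user's public key to every other node.
# The host list is an assumption based on the node names used in this document.
hosts="api-lb.ocp36.local master.ocp36.local infranode.ocp36.local node1.ocp36.local node2.ocp36.local"

# copy_key_to_hosts is a hypothetical helper. With RUN unset it prefixes each
# command with "echo" (dry run); export RUN="" to actually execute ssh-copy-id.
copy_key_to_hosts() {
  local run="${RUN-echo}"
  local h
  for h in "$@"; do
    $run ssh-copy-id -i "$HOME/.ssh/id_rsa.pub" "ansible@$h"
  done
}

copy_key_to_hosts $hosts
```

Run it once as a dry run to review the commands, then again with RUN="" to perform the copy. Each host will prompt for the "ansible" password the first time.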

Configure Ansible Inventory File

OpenShift Ansible hosts Inventory File

Pre-Flight

On the support node, as "root":

cd /tmp
rm -r tmp* yum*
rm /usr/share/ansible/openshift-ansible/playbooks/byo/config.retry

As "ansible":

cd /etc/ansible
ansible all -m ping
ansible nodes -m shell -a "systemctl status docker"
ansible nodes -m shell -a "docker version"
ansible nodes -m shell -a "pvs"
ansible nodes -m shell -a "vgs"
ansible nodes -m shell -a "lvs"
ansible nodes -m shell -a "nslookup something.apps.openshift.novaordis.io"
ansible nodes -m shell -a "mkdir /mnt/tmp"
ansible nodes -m shell -a "mount -t nfs support.ocp36.local:/nfs /mnt/tmp"
ansible nodes -m shell -a "hostname > /mnt/tmp/\$(hostname).txt"
ansible nodes -m shell -a "umount /mnt/tmp"
for i in /nfs/*.ocp36.local.txt; do echo $i; cat $i; done

Snapshot the Guest Images for the Entire Environment

On the virtualization host:

virsh vol-list main-storage-pool --details
./stop-ose36
virsh list --all
qemu-img snapshot
for i in \
 /main-storage-pool/ocp36.ingress.qcow2 \
 /main-storage-pool/ocp36.support.qcow2 \
 /main-storage-pool/ocp36.master.qcow2 \
 /main-storage-pool/ocp36.infranode.qcow2 \
 /main-storage-pool/ocp36.node1.qcow2 \
 /main-storage-pool/ocp36.node2.qcow2; \
 do \
   echo "$i snapshots:"; \
   qemu-img snapshot -l $i; \
 done
for i in \
 /main-storage-pool/ocp36.ingress.qcow2 \
 /main-storage-pool/ocp36.support.qcow2 \
 /main-storage-pool/ocp36.master.qcow2 \
 /main-storage-pool/ocp36.infranode.qcow2 \
 /main-storage-pool/ocp36.node1.qcow2 \
 /main-storage-pool/ocp36.node2.qcow2; \
 do \
   echo "taking snapshot of $i"; \
   qemu-img snapshot -c before_ansible_installation $i; \
 done
./start-ose36

Reverting the Environment to a Snapshot State

./stop-ose36
for i in \
 /main-storage-pool/ocp36.ingress.qcow2 \
 /main-storage-pool/ocp36.support.qcow2 \
 /main-storage-pool/ocp36.master.qcow2 \
 /main-storage-pool/ocp36.infranode.qcow2 \
 /main-storage-pool/ocp36.node1.qcow2 \
 /main-storage-pool/ocp36.node2.qcow2; \
 do \
   echo "reverting $i to snapshot"; \
   qemu-img snapshot -a before_ansible_installation $i; \
 done
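
The list, create and revert loops above differ only in the qemu-img snapshot option. A minimal sketch that factors them into one helper, assuming the same image paths. The names for_each_image, IMAGE_DIR, GUESTS and QEMU_IMG are hypothetical and not part of the original scripts.

```shell
#!/usr/bin/env bash
# Sketch: one helper for listing, creating and reverting guest snapshots.
# IMAGE_DIR and GUESTS mirror the image paths used in the loops above.
IMAGE_DIR="${IMAGE_DIR:-/main-storage-pool}"
GUESTS="ingress support master infranode node1 node2"

# for_each_image is a hypothetical helper. It passes its arguments to
# "qemu-img snapshot" for every guest image and stops at the first failure.
# Set QEMU_IMG="echo qemu-img" to preview the commands without running them.
for_each_image() {
  local qemu_img="${QEMU_IMG:-qemu-img}"
  local g
  for g in $GUESTS; do
    $qemu_img snapshot "$@" "$IMAGE_DIR/ocp36.$g.qcow2" || return 1
  done
}

# Usage, with the guests stopped (./stop-ose36):
#   for_each_image -l                              # list snapshots
#   for_each_image -c before_ansible_installation  # create a snapshot
#   for_each_image -a before_ansible_installation  # revert to a snapshot
```

Stopping at the first failure is a deliberate choice in this sketch: a partially reverted environment is easier to diagnose when the loop does not continue past the failing image.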

Run the Advanced Installation

You may want to run the pre-flight checks one more time after the reboot.

As "ansible" on support:

ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/byo/config.yml

For verbose installation use -vvv or -vvvv.

ansible-playbook -vvvv /usr/share/ansible/openshift-ansible/playbooks/byo/config.yml > ansible.out

To use a different inventory file than /etc/ansible/hosts, run:

ansible-playbook -vvvv -i /custom/path/to/inventory/file /usr/share/ansible/openshift-ansible/playbooks/byo/config.yml > ansible.out

Output of a Successful Run

PLAY RECAP *********************************************************************
api-lb.ocp36.local         : ok=97   changed=2    unreachable=0    failed=0
infranode.ocp36.local      : ok=257  changed=58   unreachable=0    failed=0
localhost                  : ok=14   changed=0    unreachable=0    failed=0
master.ocp36.local         : ok=1039 changed=290  unreachable=0    failed=0
node1.ocp36.local          : ok=256  changed=58   unreachable=0    failed=0
node2.ocp36.local          : ok=256  changed=58   unreachable=0    failed=0
support.ocp36.local        : ok=100  changed=3    unreachable=0    failed=0
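
When the output is redirected to ansible.out, the recap can be checked mechanically rather than by eye. A minimal sketch, assuming the recap lines have the format shown above. The helper name recap_ok is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: fail if any host in a PLAY RECAP reports failed or unreachable tasks.
# recap_ok is a hypothetical helper that reads ansible-playbook output on stdin.
# It also fails if no recap line is found at all.
recap_ok() {
  awk '
    /unreachable=[0-9]+/ && /failed=[0-9]+/ {
      seen = 1
      for (i = 1; i <= NF; i++) {
        split($i, kv, "=")
        if ((kv[1] == "unreachable" || kv[1] == "failed") && kv[2] + 0 > 0) {
          print "problem on " $1 ": " $i
          bad = 1
        }
      }
    }
    END { exit (bad || !seen) ? 1 : 0 }
  '
}

# Usage:
#   recap_ok < ansible.out && echo "installation completed cleanly"
```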

Uninstallation

If the installation runs into problems, troubleshoot, then uninstall before restarting the installation procedure:

ansible-playbook [-i /custom/path/to/inventory/file] /usr/share/ansible/openshift-ansible/playbooks/adhoc/uninstall.yml

This is a relatively quick way to iterate over the installation configuration and arrive at a stable configuration. However, in order to install a production deployment, you must start from a clean operating system installation. If you are using virtual machines, start from a fresh image. If you are using bare metal machines, run the following on all hosts:

yum -y remove openshift openshift-* etcd docker docker-common
rm -rf /etc/origin /var/lib/openshift /etc/etcd \
   /var/lib/etcd /etc/sysconfig/atomic-openshift* /etc/sysconfig/docker* \
   /root/.kube/config /etc/ansible/facts.d /usr/share/openshift

Install oc on A Remote Client

oc Installation

Configure HTTP Proxying

Configure API Proxying

The installation procedure should have already configured routing on the API ingress node load balancer, which is referred to as "api-lb.ocp36.local".

However, the HAProxy instance logging configuration can be improved, and the defaults can be changed to help with troubleshooting, if necessary. It is recommended to reconfigure the HAProxy running on "api-lb" to log to /var/log/haproxy.log. The procedure is described here:

HAProxy Logging Configuration

Configure Application Proxying

Typically, the infrastructure nodes that the router pods are deployed on do not expose a public address. To allow external application traffic to reach the router pods, which then route it to application pods, we need to either stand up a new application HTTP proxy or reconfigure the API HAProxy already deployed on api-lb.ocp36.local to also proxy application traffic. The public application traffic is directed to the wildcard domain configured on the external DNS. We need to map that name to the public IP address exposed by the application HTTP proxy.

The current noper430 installation stands up two interfaces, /dev/ens8 and /dev/ens9, configured with two distinct public IP addresses: one is used for API/master traffic and the other for application traffic. This approach was chosen because no obvious way was immediately available to configure HAProxy to route HTTP traffic based on its Host header in tcp mode. That does not mean such a solution does not exist.

Another ingress node (i2) was configured with the sole purpose of running an application traffic HAProxy.

HAProxy Configuration

This is the /etc/haproxy/haproxy.cfg configuration file. Note that the file contains configuration allowing both HTTP and HTTPS traffic. HTTP/HTTPS traffic was enabled by creating two separate frontend/backend pairs, one for HTTP and one for HTTPS. I am not sure this is the most efficient way of doing it.

#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
   maxconn     20000
   log         127.0.0.1:514 local2
   chroot      /var/lib/haproxy
   pidfile     /var/run/haproxy.pid
   user        haproxy
   group       haproxy
   daemon

   # turn on stats unix socket
   stats socket /var/lib/haproxy/stats

#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
   mode                    http
   log                     global
   option                  httplog
   option                  dontlognull
   option forwardfor       except 127.0.0.0/8
   option                  redispatch
   retries                 3
   timeout http-request    10s
   timeout queue           1m
   timeout connect         10s
   timeout client          300s
   timeout server          300s
   timeout http-keep-alive 10s
   timeout check           10s
   maxconn                 20000

listen stats :9000
   mode http
   stats enable
   stats uri /

frontend app_frontend_https
   bind <public-application-ip-address>:443
   default_backend app_backend_https
   option tcplog
   mode tcp

frontend app_frontend_http
   bind <public-application-ip-address>:80
   default_backend  app_backend_http
   option tcplog
   mode tcp
 
backend app_backend_https
   balance source
   mode tcp
   server infranode 192.168.122.25:443 check

backend app_backend_http
   balance source
   mode tcp
   server infranode 192.168.122.25:80 check

iptables Configuration

If both HTTP and HTTPS traffic are to be allowed on i2, iptables must be configured accordingly.

...
-A OS_FIREWALL_ALLOW -p tcp -m state --state NEW -m tcp --dport 80 -j ACCEPT
-A OS_FIREWALL_ALLOW -p tcp -m state --state NEW -m tcp --dport 443 -j ACCEPT
...
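
After adding the rules, it can be useful to confirm that they are actually present. A minimal sketch, assuming the rules are listed in the same "-A CHAIN ..." form shown above, as printed by iptables -S. The helper name ports_open is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: verify that ACCEPT rules exist for the given TCP destination ports.
# ports_open is a hypothetical helper. It reads rule lines on stdin and
# returns non-zero if any of the ports passed as arguments has no ACCEPT rule.
ports_open() {
  local rules port missing=0
  rules="$(cat)"
  for port in "$@"; do
    # Match "--dport <port>" as a whole word, followed later by "-j ACCEPT".
    if ! printf '%s\n' "$rules" | grep -Eq -- "--dport $port( .*)? -j ACCEPT"; then
      echo "no ACCEPT rule for tcp port $port"
      missing=1
    fi
  done
  return "$missing"
}

# Usage on i2, as root:
#   iptables -S OS_FIREWALL_ALLOW | ports_open 80 443
```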

Configure the Default Project

Configure the default project's pods to deploy on infrastructure nodes:

oc edit namespace default

metadata:
  annotations:
    ...
    openshift.io/node-selector: env=infra
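
The annotation above can probably also be set without opening an editor. A minimal sketch, assuming oc annotate accepts a value that itself contains "=" (this was not verified on this environment). The function name set_default_node_selector and the OC override are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: set the default project's node selector non-interactively.
# set_default_node_selector is a hypothetical helper. OC can be overridden,
# for example with OC=echo, to preview the command instead of running it.
set_default_node_selector() {
  local oc="${OC:-oc}"
  $oc annotate namespace default "openshift.io/node-selector=$1" --overwrite
}

# Usage, as a cluster administrator:
#   set_default_node_selector env=infra
```

If the annotate form is rejected, oc edit as described above remains the reference procedure.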

Reboot the Whole Environment

./stop-ose36
./start-ose36

Base Installation Validation

OpenShift Installation Validation

Logging Installation

Logging Installation

Metrics Installation

Metrics Installation

Post-Install

Configure the Number of Cores used by the Master and Node OpenShift Processes

configure the number of cores used by the master and node OpenShift processes

Test oadm On Master

oadm top pod
oadm top node

Test oc on an External Node

oc status
oc get pods

Snapshot the Working Environment

./stop-ose36

for i in \
 /main-storage-pool/ocp36.ingress.qcow2 \
 /main-storage-pool/ocp36.support.qcow2 \
 /main-storage-pool/ocp36.master.qcow2 \
 /main-storage-pool/ocp36.infranode.qcow2 \
 /main-storage-pool/ocp36.node1.qcow2 \
 /main-storage-pool/ocp36.node2.qcow2; \
 do \
   echo "taking snapshot of $i"; \
   qemu-img snapshot -c ocp36_installed $i; \
 done

Load Default Image Streams and Templates

TODO https://docs.openshift.com/container-platform/latest/install_config/imagestreams_templates.html#install-config-imagestreams-templates

The image definitions are loaded as part of the "openshift" project.