OpenShift Installation Validation

From NovaOrdis Knowledge Base


* [[OpenShift Operations#Subjects|OpenShift Operations]]
* [[OpenShift_3.5_Installation#Verifying_the_Installation|OpenShift 3.5 Installation]]
* [[OpenShift_3.6_Installation#Base_Installation_Validation|OpenShift 3.6 Installation]]
 
=Validation Resources=
* Day Two Operations Guide - Health Checks: https://access.redhat.com/documentation/en-us/openshift_container_platform/3.7/html/day_two_operations_guide/day_two_environment_health_checks

Latest revision as of 19:04, 7 February 2018

External

Internal

Connect to the Support Node

As "ansible":

On All Nodes

OpenShift Packages

ansible nodes -m shell -a "yum list installed | grep openshift"

The desired OpenShift version must be installed.

OpenShift Version

ansible nodes -m shell -a "/usr/bin/openshift version"

master1.local | SUCCESS | ...
openshift v3.5.5.26
kubernetes v1.5.2+43a9be4
etcd 3.1.0
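
A quick way to confirm that every node reports the same version is to reduce the ansible output to the distinct "openshift" lines. The following is a minimal sketch, not part of the original procedure; the function name openshift_versions is hypothetical, and it assumes the output format shown above.

```shell
# openshift_versions: reads the ansible output on stdin and prints each
# distinct OpenShift version once. More than one line means the nodes differ.
openshift_versions() {
    awk '$1 == "openshift" { print $2 }' | sort -u
}

# Usage (on the support node, as "ansible"):
#   ansible nodes -m shell -a "/usr/bin/openshift version" | openshift_versions
```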

Exported Filesystems

On the support node run exportfs and make sure the following filesystems are exported:

exportfs
/nfs                192.168.122.0/255.255.255.0
/nfs/registry       <world>
/nfs/metrics        <world>
/nfs/logging        <world>
/nfs/logging-es-ops <world>
/nfs/etcd           <world>
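
The comparison against the expected list can be scripted. This is a sketch under the assumption that the five /nfs/... paths listed above are the required exports; the function name check_exports is hypothetical.

```shell
# check_exports: reads `exportfs` output on stdin and reports every required
# path that is not exported. Returns non-zero if any path is missing.
check_exports() {
    exports=$(cat)
    missing=0
    for path in /nfs/registry /nfs/metrics /nfs/logging /nfs/logging-es-ops /nfs/etcd; do
        # The exported path is the first whitespace-delimited field of a line.
        if ! printf '%s\n' "$exports" | awk -v p="$path" '$1 == p { found = 1 } END { exit !found }'; then
            echo "missing export: $path"
            missing=1
        fi
    done
    return "$missing"
}

# Usage (on the support node):
#   exportfs | check_exports
```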

On Masters

On each master node, run as root:

oc get nodes --show-labels

Output example:

NAME                           STATUS                     AGE       LABELS
infranode1.openshift35.local   Ready                      17m       beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,cluster=hadron,env=infra,kubernetes.io/hostname=infranode1.openshift35.local,logging-infra-fluentd=true,logging=true
infranode2.openshift35.local   Ready                      17m       beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,cluster=hadron,env=infra,kubernetes.io/hostname=infranode2.openshift35.local,logging-infra-fluentd=true,logging=true
master1.openshift35.local      Ready,SchedulingDisabled   17m       beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,cluster=hadron,kubernetes.io/hostname=master1.openshift35.local,logging-infra-fluentd=true,logging=true,openshift_schedulable=False
master2.openshift35.local      Ready,SchedulingDisabled   17m       beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,cluster=hadron,kubernetes.io/hostname=master2.openshift35.local,logging-infra-fluentd=true,logging=true,openshift_schedulable=False
master3.openshift35.local      Ready,SchedulingDisabled   17m       beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,cluster=hadron,kubernetes.io/hostname=master3.openshift35.local,logging-infra-fluentd=true,logging=true,openshift_schedulable=False
node1.openshift35.local        Ready                      17m       beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,cluster=hadron,env=app,kubernetes.io/hostname=node1.openshift35.local,logging-infra-fluentd=true,logging=true
node2.openshift35.local        Ready                      17m       beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,cluster=hadron,env=app,kubernetes.io/hostname=node2.openshift35.local,logging-infra-fluentd=true,logging=true
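
With many nodes it is easier to list only those that are not Ready. A minimal sketch, assuming the column layout of the example output above (NAME first, STATUS second); the function name not_ready_nodes is hypothetical.

```shell
# not_ready_nodes: reads `oc get nodes` output on stdin and prints the name of
# every node whose STATUS is not "Ready" or "Ready,<something>".
# An empty result means all nodes are Ready.
not_ready_nodes() {
    awk 'NR > 1 && $2 !~ /^Ready(,|$)/ { print $1 }'
}

# Usage (on a master, as root):
#   oc get nodes --show-labels | not_ready_nodes
```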

Verify etcd

On nodes that run etcd, as root:

etcdctl cluster-health
etcdctl member list

Note that etcdctl2 should be used on OCP 3.7 onward.

Docker Logs

Log into a few nodes and take a look at the docker logs:

journalctl -f -u docker

Docker Startup Parameters

From the support/installation server, execute as "ansible":

ansible nodes -m shell -a "ps -ef | grep dockerd | grep -v grep"

Make sure "--selinux-enabled" and "--insecure-registry 172.30.0.0/16" are present.

--insecure-registry does not seem to propagate. If it is missing, update /etc/sysconfig/docker manually on all docker nodes to include '--insecure-registry 172.30.0.0/16'.
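
The check for the two options can be automated per node. The following is a sketch, not the procedure's own tooling; the function name has_docker_flags is hypothetical, and it accepts the option value separated by either a space or "=". It inspects one node's dockerd command line at a time, so run it on each node rather than on the combined ansible output.

```shell
# has_docker_flags: succeeds if the dockerd command line read from stdin
# contains both --selinux-enabled and --insecure-registry 172.30.0.0/16.
has_docker_flags() {
    cmdline=$(cat)
    printf '%s\n' "$cmdline" | grep -q -e '--selinux-enabled' &&
    printf '%s\n' "$cmdline" | grep -q -e '--insecure-registry[ =]172\.30\.0\.0/16'
}

# Usage (on a node):
#   ps -ef | grep dockerd | grep -v grep | has_docker_flags || echo "docker flags missing"
```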

Master Web Console

At this point the web console should be exposed on the external interface.

https://master.openshift.novaordis.io/

Use the administrative user defined as part of your "identity provider" declaration.

The API server should respond to curl:

curl -k https://master.openshift.novaordis.io/version
{
  "major": "1",
  "minor": "6",
  "gitVersion": "v1.6.1+5115d708d7",
  "gitCommit": "fff65cf",
  "gitTreeState": "clean",
  "buildDate": "2017-10-11T22:44:25Z",
  "goVersion": "go1.7.6",
  "compiler": "gc",
  "platform": "linux/amd64"
}
curl -k https://master.openshift.novaordis.io/healthz
ok

DNS

Verify name resolution from masters, infrastructure nodes and nodes:

dig +short docker-registry.default.svc.cluster.local
172.30.53.178

The answer must match the CLUSTER-IP value in the output of:

oc get -n default svc/docker-registry
NAME              CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
docker-registry   172.30.53.178   <none>        5000/TCP   88d
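
The comparison between the resolved address and the service's cluster IP can be scripted as follows. This is a sketch assuming the column layout shown above (CLUSTER-IP in the second column); the function name cluster_ip_matches is hypothetical.

```shell
# cluster_ip_matches: $1 is the address returned by dig; stdin is the output
# of `oc get -n default svc/docker-registry`. Succeeds when the CLUSTER-IP
# column of the docker-registry row equals $1.
cluster_ip_matches() {
    resolved=$1
    cluster_ip=$(awk '$1 == "docker-registry" { print $2 }')
    [ -n "$cluster_ip" ] && [ "$resolved" = "$cluster_ip" ]
}

# Usage:
#   ip=$(dig +short docker-registry.default.svc.cluster.local)
#   oc get -n default svc/docker-registry | cluster_ip_matches "$ip" || echo "DNS mismatch"
```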

MTU Size Verification

TODO: https://access.redhat.com/documentation/en-us/openshift_container_platform/3.7/html/day_two_operations_guide/day_two_environment_health_checks#day-two-guide-verifying_mtu

Router Status

oc -n default get deploymentconfigs/router
NAME      REVISION   DESIRED   CURRENT   TRIGGERED BY
router    5          1         1         config

The values in the DESIRED and CURRENT columns should be equal and should match the number of node hosts.
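
The DESIRED/CURRENT comparison can be checked mechanically. A minimal sketch with a hypothetical function name, dc_settled; it only verifies that the two columns are equal and does not know how many hosts are expected, so that part still needs a manual look.

```shell
# dc_settled: reads `oc get deploymentconfigs/<name>` output on stdin and
# succeeds when DESIRED (column 3) equals CURRENT (column 4) on the first
# data row. Fails if there is no data row.
dc_settled() {
    awk 'NR == 2 { seen = 1; ok = ($3 == $4) } END { exit !(seen && ok) }'
}

# Usage:
#   oc -n default get deploymentconfigs/router | dc_settled || echo "router not settled"
```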

Internal connectivity (both from master and a node):

curl -kv https://docker-registry.default.svc.cluster.local:5000/healthz

Registry Status

oc -n default get deploymentconfigs/docker-registry
NAME              REVISION   DESIRED   CURRENT   TRIGGERED BY
docker-registry   1          1         1         config

Registry Console

https://registry-console-default.apps.openshift.novaordis.io/

oadm Diagnostics

oadm diagnostics

Per-project Validation

Logging Installation Validation

Must be performed after logging installation and post-install configuration:

Logging Installation Validation

Metrics Installation Validation

Must be performed after metrics installation and post-install configuration:

Metrics Installation Validation

Validation Resources