OpenShift Installation Validation

From NovaOrdis Knowledge Base
Latest revision as of 19:04, 7 February 2018

External

Internal

* OpenShift Operations
* OpenShift 3.5 Installation
* OpenShift 3.6 Installation

Connect to the Support Node

As "ansible":

On All Nodes

OpenShift Packages

ansible nodes -m shell -a "yum list installed | grep openshift"

The desired OpenShift version must be installed.

OpenShift Version

ansible nodes -m shell -a "/usr/bin/openshift version"

master1.local | SUCCESS | ...
openshift v3.5.5.26
kubernetes v1.5.2+43a9be4
etcd 3.1.0
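
With many nodes, comparing versions by eye is error-prone. The following is a minimal sketch, not part of the original procedure, that reduces the ansible output to the distinct OpenShift versions it contains. The function names are hypothetical, and it assumes each node reports a line of the form "openshift vX.Y.Z" as in the sample above.

```shell
# Print each distinct OpenShift version found on stdin, one per line.
# Assumes version lines look like "openshift v3.5.5.26".
distinct_openshift_versions() {
    awk '$1 == "openshift" { print $2 }' | sort -u
}

# Succeed only if exactly one distinct version is reported on stdin.
versions_consistent() {
    [ "$(distinct_openshift_versions | wc -l | tr -d ' ')" = "1" ]
}

# Example, run as "ansible" on the support node:
#   ansible nodes -m shell -a "/usr/bin/openshift version" | distinct_openshift_versions
```

More than one line of output means the nodes are not all running the same version.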

Exported Filesystems

On the support node run exportfs and make sure the following filesystems are exported:

exportfs
/nfs                192.168.122.0/255.255.255.0
/nfs/registry       <world>
/nfs/metrics        <world>
/nfs/logging        <world>
/nfs/logging-es-ops <world>
/nfs/etcd           <world>
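
The comparison against the expected list can be scripted. This is a sketch under the assumption that the first column of each exportfs line is the exported path. The function name missing_exports is hypothetical.

```shell
# Read `exportfs` output on stdin and print every required path (given as
# arguments) that is not exported. Prints nothing when all are present.
missing_exports() {
    exported=$(awk '{ print $1 }')
    for path in "$@"; do
        # -x: match the whole line, -F: treat the path as a fixed string
        printf '%s\n' "$exported" | grep -qxF -- "$path" || printf '%s\n' "$path"
    done
}

# Example, on the support node:
#   exportfs | missing_exports /nfs /nfs/registry /nfs/metrics \
#       /nfs/logging /nfs/logging-es-ops /nfs/etcd
```

An empty result means every listed filesystem is exported.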

On Masters

On each master node, run as root:

oc get nodes --show-labels

Output example:

NAME                           STATUS                     AGE       LABELS
infranode1.openshift35.local   Ready                      17m       beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,cluster=hadron,env=infra,kubernetes.io/hostname=infranode1.openshift35.local,logging-infra-fluentd=true,logging=true
infranode2.openshift35.local   Ready                      17m       beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,cluster=hadron,env=infra,kubernetes.io/hostname=infranode2.openshift35.local,logging-infra-fluentd=true,logging=true
master1.openshift35.local      Ready,SchedulingDisabled   17m       beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,cluster=hadron,kubernetes.io/hostname=master1.openshift35.local,logging-infra-fluentd=true,logging=true,openshift_schedulable=False
master2.openshift35.local      Ready,SchedulingDisabled   17m       beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,cluster=hadron,kubernetes.io/hostname=master2.openshift35.local,logging-infra-fluentd=true,logging=true,openshift_schedulable=False
master3.openshift35.local      Ready,SchedulingDisabled   17m       beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,cluster=hadron,kubernetes.io/hostname=master3.openshift35.local,logging-infra-fluentd=true,logging=true,openshift_schedulable=False
node1.openshift35.local        Ready                      17m       beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,cluster=hadron,env=app,kubernetes.io/hostname=node1.openshift35.local,logging-infra-fluentd=true,logging=true
node2.openshift35.local        Ready                      17m       beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,cluster=hadron,env=app,kubernetes.io/hostname=node2.openshift35.local,logging-infra-fluentd=true,logging=true
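
All nodes are expected to report Ready, as in the example above, with masters additionally marked SchedulingDisabled. As a sketch, assuming the column layout shown in the example (STATUS is the second column), nodes in any other state can be filtered out like this. The function name is hypothetical.

```shell
# Read `oc get nodes` output (header included) on stdin and print the names
# of nodes whose STATUS is neither "Ready" nor "Ready,<something>".
not_ready_nodes() {
    awk 'NR > 1 && $2 !~ /^Ready(,|$)/ { print $1 }'
}

# Example, as root on a master:
#   oc get nodes --show-labels | not_ready_nodes
```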

Verify etcd

On nodes that run etcd, as root:

etcdctl cluster-health
etcdctl member list

Note that etcdctl2 should be used on OCP 3.7 onward.
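
For scripted checks, the health summary can be tested directly. This sketch assumes the v2-style cluster-health output, which prints a summary line reading "cluster is healthy" when all is well. The function name is hypothetical.

```shell
# Succeed if the `etcdctl cluster-health` output on stdin contains the
# summary line "cluster is healthy" (assumed v2-style output format).
etcd_cluster_healthy() {
    grep -qx 'cluster is healthy'
}

# Example, as root on a node that runs etcd:
#   etcdctl cluster-health | etcd_cluster_healthy && echo "etcd OK"
```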

Docker Logs

Log into a few nodes and take a look at the docker logs:

journalctl -f -u docker

Docker Startup Parameters

From the support/installation server, execute as "ansible":

ansible nodes -m shell -a "ps -ef | grep dockerd | grep -v grep"

Make sure "--selinux-enabled" and "--insecure-registry 172.30.0.0/16" are present.

The --insecure-registry option does not seem to propagate. If it is missing, update /etc/sysconfig/docker manually on all docker nodes with '--insecure-registry 172.30.0.0/16'.
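
To find the nodes that need the manual fix, the ansible output can be filtered. This is a sketch that assumes each host's result starts with a header line of the form "host | SUCCESS | ..." followed by the ps output. The function name is hypothetical.

```shell
# Read the ansible output on stdin and print the hosts whose dockerd command
# line lacks --selinux-enabled or --insecure-registry 172.30.0.0/16.
hosts_missing_docker_flags() {
    awk '
        # ansible result header, assumed to look like "host | SUCCESS | ..."
        $2 == "|" { host = $1; next }
        /dockerd/ && !(/--selinux-enabled/ && /--insecure-registry[ =]172\.30\.0\.0\/16/) {
            print host
        }'
}

# Example, as "ansible" on the support node:
#   ansible nodes -m shell -a "ps -ef | grep dockerd | grep -v grep" \
#       | hosts_missing_docker_flags
```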

Master Web Console

At this point the web console should be exposed on the external interface.

https://master.openshift.novaordis.io/

Use the administrative user defined as part of your "identity provider" declaration.

The API server should respond to curl:

curl -k https://master.openshift.novaordis.io/version
{
  "major": "1",
  "minor": "6",
  "gitVersion": "v1.6.1+5115d708d7",
  "gitCommit": "fff65cf",
  "gitTreeState": "clean",
  "buildDate": "2017-10-11T22:44:25Z",
  "goVersion": "go1.7.6",
  "compiler": "gc",
  "platform": "linux/amd64"
}

curl -k https://master.openshift.novaordis.io/healthz
ok
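
Both responses can be checked in a script. The sketch below assumes the response formats shown above. The function names are hypothetical, and the sed expression is a simple text match rather than a full JSON parser.

```shell
# Print the value of "gitVersion" from the /version JSON on stdin.
git_version() {
    sed -n 's/.*"gitVersion": *"\([^"]*\)".*/\1/p'
}

# Succeed if the /healthz response body on stdin is exactly "ok".
api_healthy() {
    [ "$(cat)" = "ok" ]
}

# Example (-s silences the curl progress output):
#   curl -ks https://master.openshift.novaordis.io/version | git_version
#   curl -ks https://master.openshift.novaordis.io/healthz | api_healthy && echo "API OK"
```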

DNS

Verify name resolution from masters, infrastructure nodes and application nodes:

dig +short docker-registry.default.svc.cluster.local
172.30.53.178

The answer must match the CLUSTER-IP reported by:

oc get -n default svc/docker-registry
NAME              CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
docker-registry   172.30.53.178   <none>        5000/TCP   88d
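
The comparison can be automated. This is a sketch that assumes the column layout shown above, with CLUSTER-IP as the second column. The function name is hypothetical.

```shell
# Read `oc get svc/<name>` output (header included) on stdin and print the
# CLUSTER-IP of the first service row.
cluster_ip() {
    awk 'NR > 1 { print $2; exit }'
}

# Example, on a master:
#   resolved=$(dig +short docker-registry.default.svc.cluster.local)
#   expected=$(oc get -n default svc/docker-registry | cluster_ip)
#   [ "$resolved" = "$expected" ] && echo "DNS matches the service IP"
```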

MTU Size Verification

TODO: https://access.redhat.com/documentation/en-us/openshift_container_platform/3.7/html/day_two_operations_guide/day_two_environment_health_checks#day-two-guide-verifying_mtu

Router Status

oc -n default get deploymentconfigs/router
NAME      REVISION   DESIRED   CURRENT   TRIGGERED BY
router    5          1         1         config

The values in the DESIRED and CURRENT columns should match the number of node hosts.
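
A quick way to spot a deployment configuration that has not reached its desired replica count is to compare the two columns. This sketch assumes the column layout shown above (DESIRED third, CURRENT fourth). The function name is hypothetical. The same filter applies to the docker-registry deployment configuration.

```shell
# Read `oc get deploymentconfigs/...` output (header included) on stdin and
# report every row whose DESIRED and CURRENT values differ.
dc_mismatches() {
    awk 'NR > 1 && $3 != $4 { print $1 ": desired=" $3 " current=" $4 }'
}

# Example, on a master:
#   oc -n default get deploymentconfigs/router | dc_mismatches
```

No output means DESIRED and CURRENT agree.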

Check internal connectivity to the registry service, from both a master and a node:

curl -kv https://docker-registry.default.svc.cluster.local:5000/healthz

Registry Status

oc -n default get deploymentconfigs/docker-registry
NAME              REVISION   DESIRED   CURRENT   TRIGGERED BY
docker-registry   1          1         1         config

Registry Console

https://registry-console-default.apps.openshift.novaordis.io/

oadm Diagnostics

oadm diagnostics

Per-project Validation

Logging Installation Validation

Must be performed after logging installation and post-install configuration:

Logging Installation Validation

Metrics Installation Validation

Must be performed after metrics installation and post-install configuration:

Metrics Installation Validation

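Validation Resources

* Day Two Operations Guide - Health Checks: https://access.redhat.com/documentation/en-us/openshift_container_platform/3.7/html/day_two_operations_guide/day_two_environment_health_checks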