OpenShift Installation Validation
Latest revision as of 19:04, 7 February 2018
External
Internal
- OpenShift Operations
- OpenShift 3.5 Installation
- OpenShift 3.6 Installation
Connect to the Support Node
As "ansible":
On All Nodes
OpenShift Packages
ansible nodes -m shell -a "yum list installed | grep openshift"
The desired OpenShift version must be installed.
OpenShift Version
ansible nodes -m shell -a "/usr/bin/openshift version"
master1.local | SUCCESS | ...
openshift v3.5.5.26
kubernetes v1.5.2+43a9be4
etcd 3.1.0
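The version check above could be automated by piping the ad-hoc command's output through a small filter. This is only a sketch: the function name check_openshift_version and its expected-version argument are hypothetical and not part of the original procedure.

```shell
# Hypothetical helper: reads "openshift version" output on stdin and
# succeeds only if every "openshift vX.Y.Z" line matches the expected version.
check_openshift_version() {
    expected="$1"
    awk -v want="$expected" '
        $1 == "openshift" {
            seen = 1
            if ($2 != want) { print "mismatch: " $2; bad = 1 }
        }
        END {
            if (!seen) { print "no openshift version line found"; exit 2 }
            exit bad + 0
        }
    '
}

# Usage (assumes the "nodes" inventory group used on this page):
# ansible nodes -m shell -a "/usr/bin/openshift version" | check_openshift_version v3.5.5.26
```

The filter ignores the kubernetes and etcd lines and only compares the second field of lines whose first field is "openshift".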
Exported Filesystems
On the support node run exportfs and make sure the following filesystems are exported:
exportfs
/nfs 192.168.122.0/255.255.255.0
/nfs/registry <world>
/nfs/metrics <world>
/nfs/logging <world>
/nfs/logging-es-ops <world>
/nfs/etcd <world>
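One possible way to check the export list without reading it by eye is to compare the exportfs output against the paths that are expected. The helper name check_exports is hypothetical, and the sketch assumes the exported path is the first field of each output line.

```shell
# Hypothetical helper: reads `exportfs` output on stdin and reports every
# required export path (given as arguments) that is not listed.
# Returns non-zero if at least one path is missing.
check_exports() {
    listing=$(cat)
    missing=0
    for path in "$@"; do
        # Assumption: the exported path is the first whitespace-separated field.
        if ! printf '%s\n' "$listing" |
            awk -v p="$path" '$1 == p { found = 1 } END { exit !found }'; then
            echo "missing export: $path"
            missing=1
        fi
    done
    return "$missing"
}

# Usage on the support node:
# exportfs | check_exports /nfs /nfs/registry /nfs/metrics /nfs/logging /nfs/logging-es-ops /nfs/etcd
```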
On Masters
On each master node, run as root:
oc get nodes --show-labels
Output example:
NAME STATUS AGE LABELS
infranode1.openshift35.local Ready 17m beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,cluster=hadron,env=infra,kubernetes.io/hostname=infranode1.openshift35.local,logging-infra-fluentd=true,logging=true
infranode2.openshift35.local Ready 17m beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,cluster=hadron,env=infra,kubernetes.io/hostname=infranode2.openshift35.local,logging-infra-fluentd=true,logging=true
master1.openshift35.local Ready,SchedulingDisabled 17m beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,cluster=hadron,kubernetes.io/hostname=master1.openshift35.local,logging-infra-fluentd=true,logging=true,openshift_schedulable=False
master2.openshift35.local Ready,SchedulingDisabled 17m beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,cluster=hadron,kubernetes.io/hostname=master2.openshift35.local,logging-infra-fluentd=true,logging=true,openshift_schedulable=False
master3.openshift35.local Ready,SchedulingDisabled 17m beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,cluster=hadron,kubernetes.io/hostname=master3.openshift35.local,logging-infra-fluentd=true,logging=true,openshift_schedulable=False
node1.openshift35.local Ready 17m beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,cluster=hadron,env=app,kubernetes.io/hostname=node1.openshift35.local,logging-infra-fluentd=true,logging=true
node2.openshift35.local Ready 17m beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,cluster=hadron,env=app,kubernetes.io/hostname=node2.openshift35.local,logging-infra-fluentd=true,logging=true
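With many nodes, the STATUS column is easier to scan with a filter. The sketch below assumes the column layout shown in the output example (NAME first, STATUS second); the helper name not_ready_nodes is hypothetical.

```shell
# Hypothetical helper: reads `oc get nodes` output on stdin and prints the
# name of every node whose STATUS is neither "Ready" nor "Ready,<something>".
not_ready_nodes() {
    # NR > 1 skips the header row; $2 is the STATUS column.
    awk 'NR > 1 && $2 !~ /^Ready(,|$)/ { print $1 }'
}

# Usage on a master:
# oc get nodes | not_ready_nodes
```

Masters reported as "Ready,SchedulingDisabled" are treated as ready, which matches the example output above; an empty result means no node needs attention.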
Verify etcd
On nodes that run etcd, as root:
etcdctl cluster-health
etcdctl member list
Note that etcdctl2 should be used on OCP 3.7 onward.
Docker Logs
Log into a few nodes and take a look at the docker logs:
journalctl -f -u docker
Docker Startup Parameters
From the support/installation server, execute as "ansible":
ansible nodes -m shell -a "ps -ef | grep dockerd | grep -v grep"
Make sure "--selinux-enabled" and "--insecure-registry 172.30.0.0/16" are present.
Note: --insecure-registry does not seem to propagate. If it is missing, update /etc/sysconfig/docker manually on all docker nodes to include '--insecure-registry 172.30.0.0/16'.
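The two required options can also be checked mechanically. The following is a minimal sketch, assuming the dockerd process lines are fed on stdin; the helper name check_dockerd_flags is hypothetical, and the "=" form of the option is accepted as an equivalent spelling.

```shell
# Hypothetical helper: reads `ps -ef` output on stdin, inspects only the
# lines that mention dockerd, and reports each one that lacks an expected
# option. Returns non-zero if any option is missing.
check_dockerd_flags() {
    bad=0
    while IFS= read -r line; do
        case "$line" in
            *dockerd*) ;;        # inspect dockerd process lines only
            *) continue ;;
        esac
        case "$line" in
            *--selinux-enabled*) ;;
            *) echo "missing --selinux-enabled: $line"; bad=1 ;;
        esac
        case "$line" in
            *"--insecure-registry 172.30.0.0/16"*|*"--insecure-registry=172.30.0.0/16"*) ;;
            *) echo "missing --insecure-registry 172.30.0.0/16: $line"; bad=1 ;;
        esac
    done
    return "$bad"
}

# Usage from the support/installation server:
# ansible nodes -m shell -a "ps -ef | grep dockerd | grep -v grep" | check_dockerd_flags
```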
Master Web Console
At this point the web console should be exposed on the external interface:
https://master.openshift.novaordis.io/
Use the administrative user defined as part of your "identity provider" declaration.
The API server should respond to curl:
curl -k https://master.openshift.novaordis.io/version
{
  "major": "1",
  "minor": "6",
  "gitVersion": "v1.6.1+5115d708d7",
  "gitCommit": "fff65cf",
  "gitTreeState": "clean",
  "buildDate": "2017-10-11T22:44:25Z",
  "goVersion": "go1.7.6",
  "compiler": "gc",
  "platform": "linux/amd64"
}
curl -k https://master.openshift.novaordis.io/healthz
ok
DNS
Verify name resolution from masters, infrastructure nodes and nodes:
dig +short docker-registry.default.svc.cluster.local
172.30.53.178
The answer must match the CLUSTER-IP reported by:
oc get -n default svc/docker-registry
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
docker-registry 172.30.53.178 <none> 5000/TCP 88d
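The comparison between the resolved address and the service's cluster IP can be scripted. This sketch assumes the column layout shown above (NAME first, CLUSTER-IP second); the helper names cluster_ip_of and dns_matches_service are hypothetical.

```shell
# Hypothetical helper: extracts the CLUSTER-IP column for the named service
# from `oc get svc` output read on stdin.
cluster_ip_of() {
    awk -v svc="$1" 'NR > 1 && $1 == svc { print $2 }'
}

# Hypothetical helper: succeeds only when the resolved address is non-empty
# and equal to the cluster IP.
dns_matches_service() {
    resolved="$1"
    cluster_ip="$2"
    [ -n "$resolved" ] && [ "$resolved" = "$cluster_ip" ]
}

# Usage (requires dig and a logged-in oc client):
# ip=$(oc get -n default svc/docker-registry | cluster_ip_of docker-registry)
# dns_matches_service "$(dig +short docker-registry.default.svc.cluster.local)" "$ip" \
#     || echo "DNS answer does not match the service cluster IP"
```

An empty dig answer is treated as a failure, so a resolver that returns nothing is not mistaken for a match.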
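MTU Size Verification
TODO: https://access.redhat.com/documentation/en-us/openshift_container_platform/3.7/html/day_two_operations_guide/day_two_environment_health_checks#day-two-guide-verifying_mtu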
Router Status
oc -n default get deploymentconfigs/router
NAME REVISION DESIRED CURRENT TRIGGERED BY
router 5 1 1 config
The values in the DESIRED and CURRENT columns should match the number of node hosts.
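A first mechanical check is that DESIRED and CURRENT agree with each other. The sketch below assumes the column order shown in the router output (DESIRED third, CURRENT fourth); the helper name replicas_settled is hypothetical, and it does not verify the count against the number of node hosts.

```shell
# Hypothetical helper: reads `oc get deploymentconfigs/<name>` output on stdin
# and fails when DESIRED ($3) and CURRENT ($4) differ on any data row.
replicas_settled() {
    awk '
        NR > 1 && $3 != $4 { print $1 ": desired=" $3 " current=" $4; bad = 1 }
        END { exit bad + 0 }
    '
}

# Usage on a master:
# oc -n default get deploymentconfigs/router | replicas_settled
```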
Internal connectivity (both from master and a node):
curl -kv https://docker-registry.default.svc.cluster.local:5000/healthz
Registry Status
oc -n default get deploymentconfigs/docker-registry
NAME REVISION DESIRED CURRENT TRIGGERED BY
docker-registry 1 1 1 config
Registry Console
https://registry-console-default.apps.openshift.novaordis.io/
oadm Diagnostics
Per-project Validation
Logging Installation Validation
Must be performed after logging installation and post-install configuration; see Installation Validation in OpenShift Logging Installation.
Metrics Installation Validation
Must be performed after metrics installation and post-install configuration; see Installation Validation in OpenShift Metrics Installation.
Validation Resources
- Day Two Operations Guide - Health Checks: https://access.redhat.com/documentation/en-us/openshift_container_platform/3.7/html/day_two_operations_guide/day_two_environment_health_checks