Modified 14-FEB-2011 Type BULLETIN Status PUBLISHED
In this Document
Purpose
Scope and Application
11gR2 Clusterware and Grid Home - What You Need to Know
11gR2 Clusterware Key Facts
Clusterware Startup Sequence
Important Log Locations
Clusterware Resource Status Check
Clusterware Resource Administration
OCRCONFIG Options:
OLSNODES Options
Cluster Verification Options
References
Applies to:
Oracle Server - Enterprise Edition - Version: 11.2.0.1 to 11.2.0.1 - Release: 11.2 to 11.2
Information in this document applies to any platform.
Purpose
The 11gR2 Clusterware has undergone numerous changes since the previous release. For information on the previous release(s), see Note: 259301.1 "CRS and 10g Real Application Clusters". This document gives an overview of the 11.2 Clusterware, which has some similarities to and some differences from the previous version(s).
Scope and Application
This document is intended for RAC Database Administrators and Oracle support engineers.
11gR2 Clusterware and Grid Home - What You Need to Know
11gR2 Clusterware Key Facts
- 11gR2 Clusterware is required to be up and running prior to installing an 11gR2 Real Application Clusters database.
- The GRID home consists of the Oracle Clusterware and ASM. ASM should not be in a separate home.
- The 11gR2 Clusterware can be installed in "Standalone" mode for ASM and/or "Oracle Restart" single node support. This clusterware is a subset of the full clusterware described in this document.
- The 11gR2 Clusterware can be run by itself or on top of vendor clusterware. See the certification matrix for certified combinations. Ref: Note: 184875.1 "How To Check The Certification Matrix for Real Application Clusters"
- The GRID Home and the RAC/DB Home must be installed in different locations.
- The 11gR2 Clusterware requires shared OCR files and voting files. These can be stored on ASM or a cluster filesystem.
- The OCR is backed up automatically every 4 hours to /cdata/ / and can be restored via ocrconfig.
- The voting file is backed up into the OCR at every configuration change and can be restored via crsctl.
- The 11gR2 Clusterware requires at least one private network for inter-node communication and at least one public network for external communication. Several virtual IPs need to be registered with DNS. These include the node VIPs (one per node) and the SCAN VIPs (three). This can be done manually via your network administrator, or you can optionally configure the "GNS" (Grid Naming Service) in the Oracle clusterware to handle this for you (note that GNS requires its own VIP).
- A SCAN (Single Client Access Name) is provided to clients to connect to. For more info on SCAN see Note: 887522.1
- The root.sh script at the end of the clusterware installation starts the clusterware stack. For information on troubleshooting root.sh issues see Note: 1053970.1
- Only one set of clusterware daemons can be running per node.
- On Unix, the clusterware stack is started via the init.ohasd script referenced in /etc/inittab with "respawn".
- A node can be evicted (rebooted) if it is deemed to be unhealthy. This is done so that the health of the entire cluster can be maintained. For more information on this see: Note: 1050693.1 "Troubleshooting 11.2 Clusterware Node Evictions (Reboots)"
- Either have vendor time synchronization software (like NTP) fully configured and running or have it not configured at all and let CTSS handle time synchronization. See Note: 1054006.1 for more information.
- If installing DB homes for a lower version, you will need to pin the nodes in the clusterware or you will see ORA-29702 errors. See Note: 946332.1 and Note: 948456.1 for more info.
- The clusterware stack can be started by either booting the machine, running "crsctl start crs" to start the clusterware stack, or by running "crsctl start cluster" to start the clusterware on all nodes. Note that crsctl is in the /bin directory. Note that "crsctl start cluster" will only work if ohasd is running.
- The clusterware stack can be stopped by either shutting down the machine, running "crsctl stop crs" to stop the clusterware stack, or by running "crsctl stop cluster" to stop the clusterware on all nodes. Note that crsctl is in the /bin directory.
- Killing clusterware daemons is not supported.
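As a rough planning aid for the networking fact above, the following Python sketch lists the virtual IPs a cluster will need: one per node, three for SCAN, and one more if GNS is configured. The function name and the label strings are invented for illustration and are not Oracle naming conventions.

```python
def planned_vips(node_names, use_gns=False, scan_vip_count=3):
    """Return a descriptive label for every VIP the cluster will need.

    Illustrative helper only: the labels are hypothetical, and the default
    of three SCAN VIPs is taken from the key facts above.
    """
    # One node VIP per cluster node.
    vips = [f"node VIP for {node}" for node in node_names]
    # SCAN VIPs (three according to the key facts).
    vips += [f"SCAN VIP {i}" for i in range(1, scan_vip_count + 1)]
    # GNS is optional and needs a VIP of its own.
    if use_gns:
        vips.append("GNS VIP")
    return vips


if __name__ == "__main__":
    for label in planned_vips(["racbde1", "racbde2"], use_gns=True):
        print(label)
```

For the two-node cluster used later in this document, this yields five VIPs without GNS and six with it.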
Clusterware Startup Sequence
The following is the Clusterware startup sequence (image from the "Oracle Clusterware Administration and Deployment Guide"):
Don't let this picture scare you too much. You aren't responsible for managing all of these processes; that is the Clusterware's job!
Short summary of the startup sequence: INIT spawns init.ohasd (with respawn) which in turn starts the OHASD process (Oracle High Availability Services Daemon). This daemon spawns 4 processes.
Level 1: OHASD Spawns:
- cssdagent - Agent responsible for spawning CSSD.
- orarootagent - Agent responsible for managing all root owned ohasd resources.
- oraagent - Agent responsible for managing all oracle owned ohasd resources.
- cssdmonitor - Monitors CSSD and node health (along with the cssdagent).
Level 2: OHASD rootagent spawns:
- CRSD - Primary daemon responsible for managing cluster resources.
- CTSSD - Cluster Time Synchronization Services Daemon
- Diskmon
- ACFS (ASM Cluster File System) Drivers
Level 2: OHASD oraagent spawns:
- MDNSD - Used for DNS lookup
- GIPCD - Used for inter-process and inter-node communication
- GPNPD - Grid Plug & Play Profile Daemon
- EVMD - Event Monitor Daemon
- ASM - Resource for monitoring ASM instances
Level 3: CRSD spawns:
- orarootagent - Agent responsible for managing all root owned crsd resources.
- oraagent - Agent responsible for managing all oracle owned crsd resources.
Level 4: CRSD rootagent spawns:
- Network resource - To monitor the public network
- SCAN VIP(s) - Single Client Access Name Virtual IPs
- Node VIPs - One per node
- ACFS Registry - For mounting ASM Cluster File System
- GNS VIP (optional) - VIP for GNS
Level 4: CRSD oraagent spawns:
- ASM Resource - ASM Instance(s) resource
- Diskgroup - Used for managing/monitoring ASM diskgroups.
- DB Resource - Used for monitoring and managing the DB and instances
- SCAN Listener - Listener for single client access name, listening on SCAN VIP
- Listener - Node listener listening on the Node VIP
- Services - Used for monitoring and managing services
- ONS - Oracle Notification Service
- eONS - Enhanced Oracle Notification Service
- GSD - For 9i backward compatibility
- GNS (optional) - Grid Naming Service - Performs name resolution
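The level lists above form a tree, which can be written down as a small lookup table. The Python sketch below transcribes them into a dictionary and walks it to show which chain of daemons and agents leads to a given component. The dictionary keys are shortened labels chosen for this example (for instance "ohasd oraagent" versus "crsd oraagent" to tell the two agent pairs apart), not process names as they appear on a running system.

```python
# Parent -> children, transcribed from the level lists above.
# Labels are illustrative shorthand, not exact process names.
SPAWNS = {
    "OHASD": ["cssdagent", "ohasd orarootagent", "ohasd oraagent", "cssdmonitor"],
    "cssdagent": ["CSSD"],
    "ohasd orarootagent": ["CRSD", "CTSSD", "Diskmon", "ACFS drivers"],
    "ohasd oraagent": ["MDNSD", "GIPCD", "GPNPD", "EVMD", "ASM"],
    "CRSD": ["crsd orarootagent", "crsd oraagent"],
    "crsd orarootagent": [
        "Network resource", "SCAN VIPs", "Node VIPs", "ACFS Registry", "GNS VIP",
    ],
    "crsd oraagent": [
        "ASM Resource", "Diskgroup", "DB Resource", "SCAN Listener", "Listener",
        "Services", "ONS", "eONS", "GSD", "GNS",
    ],
}


def spawn_path(component, root="OHASD"):
    """Return the chain from root down to component, or [] if not found."""
    if root == component:
        return [root]
    for child in SPAWNS.get(root, []):
        path = spawn_path(component, child)
        if path:
            return [root] + path
    return []


def level(component):
    """Depth below OHASD; matches the 'Level N' headings above."""
    path = spawn_path(component)
    if not path:
        raise KeyError(component)
    return len(path) - 1


if __name__ == "__main__":
    print(" -> ".join(spawn_path("SCAN Listener")))
```

A walk like this is mainly useful when troubleshooting: if a level 4 resource is missing, every component on its path is a candidate to check first.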
This image shows the various levels more clearly:
Important Log Locations
Clusterware daemon logs are all under
alert
./admin:
./agent:
./agent/crsd:
./agent/crsd/oraagent_oracle:
./agent/crsd/ora_oc4j_type_oracle:
./agent/crsd/orarootagent_root:
./agent/ohasd:
./agent/ohasd/oraagent_oracle:
./agent/ohasd/oracssdagent_root:
./agent/ohasd/oracssdmonitor_root:
./agent/ohasd/orarootagent_root:
./client:
./crsd:
./cssd:
./ctssd:
./diskmon:
./evmd:
./gipcd:
./gnsd:
./gpnpd:
./mdnsd:
./ohasd:
./racg:
./racg/racgeut:
./racg/racgevtf:
./racg/racgmain:
./srvm:
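The listing above can be turned into a small path helper for scripts that collect logs. The Python sketch below rebuilds a log directory from a log root and a daemon or agent name. The log root is an input the caller must supply (the path used in the demo is a made-up placeholder), and the agent directory names are copied from the listing, so suffixes such as "_oracle" may differ on systems with another software owner.

```python
from pathlib import PurePosixPath

# Daemon directories that sit directly under the log root in the listing.
DAEMON_DIRS = {
    "crsd", "cssd", "ctssd", "diskmon", "evmd",
    "gipcd", "gnsd", "gpnpd", "mdnsd", "ohasd",
}

# Agent directories, grouped by the daemon that owns them.
AGENT_DIRS = {
    "crsd": {"oraagent_oracle", "ora_oc4j_type_oracle", "orarootagent_root"},
    "ohasd": {
        "oraagent_oracle", "oracssdagent_root",
        "oracssdmonitor_root", "orarootagent_root",
    },
}


def daemon_log_dir(log_root, daemon):
    """Directory holding the logs of one clusterware daemon."""
    if daemon not in DAEMON_DIRS:
        raise ValueError(f"unknown daemon: {daemon}")
    return PurePosixPath(log_root) / daemon


def agent_log_dir(log_root, owner, agent):
    """Directory holding the logs of one agent started by crsd or ohasd."""
    if agent not in AGENT_DIRS.get(owner, set()):
        raise ValueError(f"unknown agent: {owner}/{agent}")
    return PurePosixPath(log_root) / "agent" / owner / agent


if __name__ == "__main__":
    root = "/example/log/root"  # hypothetical placeholder, not a real location
    print(daemon_log_dir(root, "cssd"))
    print(agent_log_dir(root, "ohasd", "orarootagent_root"))
```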
The cfgtoollogs dir under
ASM logs live under $ORACLE_BASE/diag/asm/+asm/
The diagcollection.pl script under
Clusterware Resource Status Check
The following command will display the status of all cluster resources:
$ ./crsctl status resource -t
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATADG.dg
ONLINE ONLINE racbde1
ONLINE ONLINE racbde2
ora.LISTENER.lsnr
ONLINE ONLINE racbde1
ONLINE ONLINE racbde2
ora.SYSTEMDG.dg
ONLINE ONLINE racbde1
ONLINE ONLINE racbde2
ora.asm
ONLINE ONLINE racbde1 Started
ONLINE ONLINE racbde2 Started
ora.eons
ONLINE ONLINE racbde1
ONLINE ONLINE racbde2
ora.gsd
OFFLINE OFFLINE racbde1
OFFLINE OFFLINE racbde2
ora.net1.network
ONLINE ONLINE racbde1
ONLINE ONLINE racbde2
ora.ons
ONLINE ONLINE racbde1
ONLINE ONLINE racbde2
ora.registry.acfs
ONLINE ONLINE racbde1
ONLINE ONLINE racbde2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE racbde1
ora.LISTENER_SCAN2.lsnr
1 ONLINE ONLINE racbde2
ora.LISTENER_SCAN3.lsnr
1 ONLINE ONLINE racbde2
ora.oc4j
1 OFFLINE OFFLINE
ora.rac.db
1 ONLINE ONLINE racbde1 Open
2 ONLINE ONLINE racbde2 Open
ora.racbde1.vip
1 ONLINE ONLINE racbde1
ora.racbde2.vip
1 ONLINE ONLINE racbde2
ora.scan1.vip
1 ONLINE ONLINE racbde1
ora.scan2.vip
1 ONLINE ONLINE racbde2
ora.scan3.vip
1 ONLINE ONLINE racbde2
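For scripted health checks, output like the listing above can be parsed into rows. The Python sketch below is one way to do that, assuming the layout shown here: a resource name on its own line, followed by one status line per server, with cluster resources carrying a leading instance number. The function names are invented for this example, and the INTERMEDIATE and UNKNOWN state names are included as an assumption since only ONLINE and OFFLINE appear in the sample.

```python
# State keywords expected in the TARGET and STATE columns.
# ONLINE/OFFLINE come from the sample; the other two are assumed.
STATES = {"ONLINE", "OFFLINE", "INTERMEDIATE", "UNKNOWN"}


def parse_crsctl_status(text):
    """Parse 'crsctl status resource -t' style output into a list of dicts."""
    rows = []
    resource = None
    for raw in text.splitlines():
        tokens = raw.split()
        # Skip blank lines and the dashed separator lines.
        if not tokens or set(raw.strip()) == {"-"}:
            continue
        # Cluster resources prefix each status line with an instance number.
        fields = tokens[1:] if tokens[0].isdigit() else tokens
        if len(fields) >= 2 and fields[0] in STATES and fields[1] in STATES:
            rows.append({
                "name": resource,
                "target": fields[0],
                "state": fields[1],
                "server": fields[2] if len(fields) > 2 else None,
                "details": " ".join(fields[3:]),
            })
        elif len(tokens) == 1:
            # A single token on its own line is taken as a resource name.
            resource = tokens[0]
        # Anything else (column header, section titles) is ignored.
    return rows


def not_online(rows):
    """Names of resources with at least one instance that is not ONLINE."""
    return sorted({r["name"] for r in rows if r["state"] != "ONLINE"})


SAMPLE = """\
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.asm
ONLINE ONLINE racbde1 Started
ONLINE ONLINE racbde2 Started
ora.gsd
OFFLINE OFFLINE racbde1
OFFLINE OFFLINE racbde2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.oc4j
1 OFFLINE OFFLINE
ora.rac.db
1 ONLINE ONLINE racbde1 Open
2 ONLINE ONLINE racbde2 Open
"""

if __name__ == "__main__":
    print(not_online(parse_crsctl_status(SAMPLE)))
```

Note that a resource whose TARGET is also OFFLINE, such as ora.gsd in the listing, is offline on purpose; a check that compares the target and state fields of each row would flag only unexpected outages.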
Clusterware Resource Administration
Srvctl and crsctl are used to manage clusterware resources. The general rule is to use srvctl for whatever resource management you can. Crsctl should only be used for things that you cannot do with srvctl (like start the cluster). Both have a help feature to see the available syntax.
Srvctl syntax:
Usage: srvctl [-V]
Usage: srvctl add database -d
Usage: srvctl config database [-d
Usage: srvctl start database -d
Usage: srvctl stop database -d
Usage: srvctl status database -d
Usage: srvctl enable database -d
Usage: srvctl disable database -d
Usage: srvctl modify database -d
Usage: srvctl remove database -d
Usage: srvctl getenv database -d
Usage: srvctl setenv database -d
Usage: srvctl unsetenv database -d
Usage: srvctl add instance -d
Usage: srvctl start instance -d
Usage: srvctl stop instance -d
Usage: srvctl status instance -d
Usage: srvctl enable instance -d
Usage: srvctl disable instance -d
Usage: srvctl modify instance -d
Usage: srvctl remove instance -d
Usage: srvctl add service -d
Usage: srvctl add service -d
Usage: srvctl config service -d
Usage: srvctl enable service -d
Usage: srvctl disable service -d
Usage: srvctl status service -d
Usage: srvctl modify service -d
Usage: srvctl modify service -d
Usage: srvctl modify service -d
Usage: srvctl modify service -d
Usage: srvctl relocate service -d
Specify instances for an administrator-managed database, or nodes for a policy managed database
Usage: srvctl remove service -d
Usage: srvctl start service -d
Usage: srvctl stop service -d
Usage: srvctl add nodeapps { { -n
Usage: srvctl config nodeapps [-a] [-g] [-s] [-e]
Usage: srvctl modify nodeapps {[-n
Usage: srvctl start nodeapps [-n
Usage: srvctl stop nodeapps [-n
Usage: srvctl status nodeapps
Usage: srvctl enable nodeapps [-v]
Usage: srvctl disable nodeapps [-v]
Usage: srvctl remove nodeapps [-f] [-y] [-v]
Usage: srvctl getenv nodeapps [-a] [-g] [-s] [-e] [-t "
Usage: srvctl setenv nodeapps {-t "
Usage: srvctl unsetenv nodeapps -t "
Usage: srvctl add vip -n
Usage: srvctl config vip { -n
Usage: srvctl disable vip -i
Usage: srvctl enable vip -i
Usage: srvctl remove vip -i "
Usage: srvctl getenv vip -i
Usage: srvctl start vip { -n
Usage: srvctl stop vip { -n
Usage: srvctl status vip { -n
Usage: srvctl setenv vip -i
Usage: srvctl unsetenv vip -i
Usage: srvctl add asm [-l
Usage: srvctl start asm [-n
Usage: srvctl stop asm [-n
Usage: srvctl config asm [-a]
Usage: srvctl status asm [-n
Usage: srvctl enable asm [-n
Usage: srvctl disable asm [-n
Usage: srvctl modify asm [-l
Usage: srvctl remove asm [-f]
Usage: srvctl getenv asm [-t
Usage: srvctl setenv asm -t "
Usage: srvctl unsetenv asm -t "
Usage: srvctl start diskgroup -g
Usage: srvctl stop diskgroup -g
Usage: srvctl status diskgroup -g
Usage: srvctl enable diskgroup -g
Usage: srvctl disable diskgroup -g
Usage: srvctl remove diskgroup -g
Usage: srvctl add listener [-l
Usage: srvctl config listener [-l
Usage: srvctl start listener [-l
Usage: srvctl stop listener [-l
Usage: srvctl status listener [-l
Usage: srvctl enable listener [-l
Usage: srvctl disable listener [-l
Usage: srvctl modify listener [-l