Friday, January 7, 2011

Oracle Clusterware processes for 10g on Unix and Linux

What are the Oracle Clusterware processes for 10g on Unix and Linux?



Cluster Synchronization Services (ocssd) - Manages cluster node membership and runs as the oracle user; failure of this process results in a restart of the node.

Cluster Synchronization Services (OCSSD):

OCSSD is part of RAC and of single-instance configurations with ASM
Provides access to node membership, group services, and basic cluster locking
Integrates with vendor clusterware, when present
Can also run without integration with vendor clusterware
Runs as the oracle user
Failure exit causes a machine reboot
Prevents data corruption in the event of a split brain


Shared storage is also required for a voting (or quorum) disk, which is used to determine the nodes that are currently available within the cluster. The voting disk is used by the OCSSD to detect when nodes join and leave the cluster and is therefore also known as the Cluster Synchronization Services (CSS) voting disk.

==> Log file locations:

Cluster Synchronization Services (CSS) Log Files: you can find the CSS information that OCSSD generates in log files in the following locations:

CRS Home/css/log/ocssd<number>.log
CRS Home/css/init/<node_name>.log



==>>>>

Cluster Ready Services (crsd) - The crsd process manages cluster resources (which could be a database, an instance, a service, a listener, a virtual IP (VIP) address, an application process, and so on) based on the resource's configuration information that is stored in the OCR. This includes start, stop, monitor, and failover operations. This process runs as the root user.


Functions =>>

CRSD: Engine for HA operation
Manages (start/stop/respawn) application resources
Maintains configuration profiles in the OCR (Oracle Cluster Registry)
Stores current known state in the OCR
Runs as root
Is restarted automatically on failure.


Log file locations: ==>>

Cluster Ready Services Log Files: Cluster Ready Services (CRS) has daemon processes that generate log information. Log files for the CRS daemon (crsd) can be found in the following locations:

CRS Home/crs/init
CRS Home/crs/<node_name>.log



Event Manager daemon (evmd) - A background process that publishes the events that crsd creates.

Process Monitor Daemon (OPROCD) - This process monitors the cluster and provides I/O fencing. OPROCD performs its check and then stops running (sleeps); if it wakes up later than the expected time, OPROCD resets the processor and reboots the node. An OPROCD failure results in Oracle Clusterware restarting the node. OPROCD uses the hangcheck timer on Linux platforms.
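The wake-up check described above can be pictured with a small Python sketch. This is only an illustration of the idea: the function name should_reset_node and the interval and margin values are assumptions made up for the example, not OPROCD's real parameters or code.

```python
def should_reset_node(expected_wake: float, actual_wake: float, margin: float) -> bool:
    """Return True when the wake-up happened later than the allowed margin."""
    delay = actual_wake - expected_wake
    return delay > margin


# Hypothetical numbers: the monitor sleeps for 1.0 s and tolerates 0.5 s of delay.
sleep_interval = 1.0
margin = 0.5
went_to_sleep_at = 100.0
expected_wake = went_to_sleep_at + sleep_interval

# Woke up 0.2 s late: within the margin, so the node is considered healthy.
on_time = should_reset_node(expected_wake, actual_wake=101.2, margin=margin)

# Woke up 1.0 s late: the node was probably hung, so it would be fenced.
too_late = should_reset_node(expected_wake, actual_wake=102.0, margin=margin)
```

The point of the design is that a hung node cannot be trusted to stop its own I/O, so a late wake-up is treated as evidence of a hang and the node is reset.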

RACG (racgmain, racgimon) - Extends clusterware to support Oracle-specific requirements and complex resources. Runs server callout scripts when FAN events occur.


==>>> RAC Specific background processes

What are the Oracle database background processes specific to RAC?

•LMS—Global Cache Service Process

•LMD—Global Enqueue Service Daemon

•LMON—Global Enqueue Service Monitor

•LCK0—Instance Enqueue Process

To ensure that each Oracle RAC database instance obtains the block that it needs to satisfy a query or transaction, Oracle RAC instances use two processes, the Global Cache Service (GCS) and the Global Enqueue Service (GES). The GCS and GES maintain records of the statuses of each data file and each cached block using a Global Resource Directory (GRD). The GRD contents are distributed across all of the active instances.


Global Cache Service (GCS) :

In a RAC database each instance has its own database buffer cache, which is located in the SGA on the local node. However, all instances share the same set of datafiles.

It is therefore possible that one or more instances might attempt to read and/or update the same block at the same time.

So access to the data blocks across the cluster must be managed in order to guarantee only one instance can modify the block at a time. In addition, any changes must be made visible to all other instances immediately once the transaction is committed. This is managed by the GCS, which coordinates requests for data access between the instances of the cluster.


LMSn ====>

The Global Cache Service background processes (LMSn) manage requests for data access between the nodes of the cluster.

Each block is assigned to a specific instance using the same hash algorithm that is used for global resources.

The instance managing the block is known as the resource master. When an instance requires access to a specific block, a request is sent to an LMS process on the resource master requesting access to the block.
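As a rough illustration of how a hash can map every block to one resource master, here is a minimal Python sketch. The real hash algorithm is internal to Oracle and is not described here, so the multiply-and-modulo formula, the function name resource_master, and the instance names are all assumptions used only to show the idea.

```python
from typing import Sequence


def resource_master(file_no: int, block_no: int, instances: Sequence[str]) -> str:
    """Pick the instance that masters a block.

    The multiply-and-modulo step is a stand-in for Oracle's internal hash.
    """
    if not instances:
        raise ValueError("at least one active instance is required")
    key = file_no * 1_000_003 + block_no
    return instances[key % len(instances)]


# Hypothetical three-instance cluster.
active = ["RAC1", "RAC2", "RAC3"]
master = resource_master(4, 10, active)
```

Because every instance evaluates the same function over the same list of active instances, each one arrives at the same master for a given block without having to ask the others.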

The LMS process can build a read-consistent image of the block and return it to the requesting instance, or it can forward the request to the instance currently holding the block.


The LMS processes coordinate block updates, allowing only one instance at a time to make changes to a block and ensuring that those changes are made to the most recent version of the block.

The LMS process on the resource master is responsible for maintaining a record of the current status of the block, including whether it has been updated.
In Oracle 9.0.1 and Oracle 9.2 there can be up to 10 LMSn background processes (LMS0 to LMS9) per instance; in Oracle 10.1 there can be up to 20 LMSn background processes (LMS0 to LMS9, LMSa to LMSj) per instance; in Oracle 10.2 there can be up to 36 LMSn background processes (LMS0 to LMS9, LMSa to LMSz).
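The naming scheme is easy to reproduce. The following Python sketch simply enumerates the possible LMSn process names from the per-version limits quoted above; the names MAX_LMS and lms_process_names are made up for this example.

```python
import string

# Maximum number of LMSn processes per instance, as listed in the text above.
MAX_LMS = {"9.0.1": 10, "9.2": 10, "10.1": 20, "10.2": 36}


def lms_process_names(version: str) -> list:
    """Return every possible LMSn process name for the given Oracle version."""
    suffixes = string.digits + string.ascii_lowercase  # "0".."9" then "a".."z"
    return ["LMS" + suffix for suffix in suffixes[:MAX_LMS[version]]]


names_10_1 = lms_process_names("10.1")  # LMS0 .. LMS9, LMSa .. LMSj
names_10_2 = lms_process_names("10.2")  # LMS0 .. LMS9, LMSa .. LMSz
```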

The number of required LMSn processes varies depending on the amount of messaging between the nodes in the cluster.






Global Enqueue Service (GES) :

In a RAC database, the GES is responsible for inter-instance resource coordination. The GES manages all non-Cache Fusion inter-instance resource operations.

It tracks the status of all Oracle enqueue mechanisms for resources that are accessed by more than one instance.

Oracle uses GES to manage concurrency for resources operating on transactions, tables, and other structures within a RAC environment.



LMON ==>>>

In a single-instance database, access to database resources is controlled using enqueues that ensure that only one session has access to a resource at a time and that other sessions wait on a first in, first out (FIFO) queue until the resource becomes free. In a single-instance database, all locks are local to the instance.

In a RAC database there are global resources, including locks and enqueues that need to be visible to all instances. For example, the database mount lock that is used to control which instances can concurrently mount the database is a global enqueue, as are library cache locks, which are used to signal changes in object definitions that might invalidate objects currently in the library cache.

The Global Enqueue Service Monitor (LMON) background process is responsible for managing global enqueues and resources. It also manages the Global Enqueue Service Daemon (LMD) processes and their associated memory areas. LMON is similar to PMON in that it also manages instance and process expirations and performs recovery processing on global enqueues.

In Oracle 10.1 and below there is only one lock monitor background process.


LMDn ==>>>

The current status of each global enqueue is maintained in a memory structure in the SGA of one of the instances.

For each global resource, three lists of locks are held, indicating which instances are granted, converting, and waiting for the lock.

The LMD background process is responsible for managing requests for global enqueues and updating the status of the enqueues as requests are granted.

Each global resource is assigned to a specific instance using a hash algorithm. When an instance requests a lock, the LMD process of the local instance sends a request to the LMD process of the remote instance managing the resource. If the resource is available, then the remote LMD process updates the enqueue status and notifies the local LMD process.


If the enqueue is currently in use by another instance, the remote LMD process will queue the request until the resource becomes available. It will then update the enqueue status and inform the local LMD process that the lock is available.
The LMD processes also detect and resolve deadlocks that may occur if two or more instances attempt to access two or more enqueues concurrently.
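The grant-or-queue behaviour described above can be modelled with a toy Python class. This is a simplified sketch, not Oracle's implementation: it models a single exclusive enqueue, leaves out the converting list and deadlock detection, and the class name GlobalEnqueue, the resource name, and the instance names are all hypothetical.

```python
from collections import deque


class GlobalEnqueue:
    """Toy model of one exclusive enqueue with a FIFO wait list."""

    def __init__(self, name):
        self.name = name
        self.holder = None      # instance currently granted the lock
        self.waiting = deque()  # instances queued in arrival order

    def request(self, instance):
        """Grant immediately if the resource is free, otherwise queue the request."""
        if self.holder is None:
            self.holder = instance
            return "granted"
        self.waiting.append(instance)
        return "queued"

    def release(self, instance):
        """Free the lock and grant it to the longest-waiting instance, if any."""
        if self.holder != instance:
            raise ValueError(f"{instance} does not hold {self.name}")
        self.holder = self.waiting.popleft() if self.waiting else None
        return self.holder


enq = GlobalEnqueue("EXAMPLE-RESOURCE")
first = enq.request("RAC1")          # resource is free, so RAC1 is granted
second = enq.request("RAC2")         # resource is busy, so RAC2 is queued
third = enq.request("RAC3")          # RAC3 queues behind RAC2
next_holder = enq.release("RAC1")    # RAC2 waited longest and gets the lock
```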

In Oracle 10.1 and below there is only one lock monitor daemon background process named LMD0.


LCK0 ==>>

The instance enqueue background process (LCK0) is part of GES. It manages requests for resources other than data blocks—for example, library and row cache objects. LCK processes handle all resource transfers not requiring Cache Fusion. It also handles cross-instance call operations. In Oracle 9.0.1 there could be up to ten LCK processes (LCK0 to LCK9). In Oracle 9.2 and Oracle 10.1 and 10.2 there is only one LCK process (LCK0).



DIAG ===>>

The DIAG background process captures diagnostic information when either a process or the entire instance fails. This information is written to a subdirectory within the directory specified by the BACKGROUND_DUMP_DEST initialization parameter. The files generated by this process can be forwarded to Oracle Support for further analysis.
There is one DIAG background process per instance. It should not be disabled or removed. In the event that the DIAG background process itself fails, it can be automatically restarted by other background processes.









What are the Oracle Clusterware components?

Voting Disk - Oracle RAC uses the voting disk to manage cluster membership by way of a health check, and to arbitrate cluster ownership among the instances in case of network failures. The voting disk must reside on shared disk.

Oracle Cluster Registry (OCR) - Maintains cluster configuration information as well as configuration information about any cluster database within the cluster. The OCR must reside on shared disk that is accessible by all of the nodes in your cluster.

How do you troubleshoot a node reboot?

Please check the following MetaLink notes:

Note 265769.1 Troubleshooting CRS Reboots
Note 559365.1 Using Diagwait as a diagnostic to get more information for diagnosing Oracle Clusterware Node evictions

How do you back up the OCR?

There is an automatic backup mechanism for the OCR. The default location is: $ORA_CRS_HOME/cdata/<clustername>/

To display backups:
#ocrconfig -showbackup
To restore a backup:
#ocrconfig -restore <backup_file_name>

With Oracle RAC 10g Release 2 or later, you can also use the export command:
#ocrconfig -export <file_name> -s online
and use the -import option to restore the contents.
With Oracle RAC 11g Release 1, you can take a manual backup of the OCR with the command:
#ocrconfig -manualbackup

How do you back up the voting disk?

#dd if=voting_disk_name of=backup_file_name

How do I identify the voting disk location?

#crsctl query css votedisk

How do I identify the OCR file location?

Check /var/opt/oracle/ocr.loc or /etc/oracle/ocr.loc (depending on the platform)
or
#ocrcheck
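If you want to read the location from a script, ocr.loc is a small text file of key=value lines. The Python sketch below parses such text. The function name parse_ocr_loc is made up for this example, and the key names and device path in the sample are assumptions about what a typical file looks like; check the contents of the file on your own system.

```python
def parse_ocr_loc(text: str) -> dict:
    """Parse key=value lines, ignoring blank lines and comment lines."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


# Made-up sample content; the key names and path are assumptions.
sample = """ocrconfig_loc=/dev/raw/raw1
local_only=FALSE
"""
ocr_settings = parse_ocr_loc(sample)
```

On a real node you would pass in the contents of whichever ocr.loc file exists on your platform instead of the sample string.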

Is ssh required for normal Oracle RAC operation ?

ssh is not required for normal Oracle RAC operation. However, ssh should be enabled for Oracle RAC and patch set installation.

What is SCAN?

Single Client Access Name (SCAN) is a new Oracle Real Application Clusters (RAC) 11g Release 2 feature that provides a single name for clients to access an Oracle Database running in a cluster. The benefit is that clients using SCAN do not need to change their configuration if you add or remove nodes in the cluster.


What is the purpose of Private Interconnect ?

Clusterware uses the private interconnect for cluster synchronization (network heartbeat) and daemon communication between the clustered nodes. This communication is based on the TCP protocol.
RAC uses the interconnect for cache fusion (UDP) and inter-process communication (TCP). Cache Fusion is the remote memory mapping of Oracle buffers, shared between the caches of participating nodes in the cluster.

Why do we have a Virtual IP (VIP) in Oracle RAC?

Without using VIPs or FAN, clients connected to a node that died will often wait for a TCP timeout period (which can be up to 10 min) before getting an error. As a result, you don't really have a good HA solution without using VIPs.
When a node fails, the VIP associated with it is automatically failed over to another node, and the new node re-ARPs the network, announcing a new MAC address for the IP. Subsequent packets sent to the VIP go to the new node, which sends error RST packets back to the clients. As a result, the clients get errors immediately.

What do you do if you see GC CR BLOCK LOST in top 5 Timed Events in AWR Report?

This is most likely due to a fault in the interconnect network.
Check the output of netstat -s.
If you see "fragments dropped" or "packet reassemblies failed", work with your system administrator to find the fault in the network.
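A short Python sketch of that check is shown below. It scans netstat -s text for non-zero counters on the two kinds of line mentioned above. The exact wording of these lines differs between platforms, so the pattern matches loosely on "packet reassembl"; the function name and the sample output are made up for this example.

```python
import re

# Matches lines such as "    7 fragments dropped after timeout".
PATTERN = re.compile(
    r"^\s*(\d+)\s+(.*(?:fragments dropped|packet reassembl).*)$",
    re.IGNORECASE,
)


def find_reassembly_problems(netstat_output: str) -> dict:
    """Return {counter description: count} for non-zero suspicious counters."""
    problems = {}
    for line in netstat_output.splitlines():
        match = PATTERN.match(line)
        if match and int(match.group(1)) > 0:
            problems[match.group(2).strip()] = int(match.group(1))
    return problems


# Made-up sample in the general shape of netstat -s output.
sample = """Ip:
    1200 total packets received
    7 fragments dropped after timeout
    3 packet reassembles failed
    0 fragments failed
"""
problems = find_reassembly_problems(sample)
```

To use it against a live system, capture the real output first, for example with subprocess.run(["netstat", "-s"], capture_output=True, text=True).stdout, and compare two samples taken a few minutes apart to see whether the counters are still growing.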

How many nodes are supported in a RAC Database?

Oracle 10g Release 2 supports 100 nodes in a cluster using Oracle Clusterware, and 100 instances in a RAC database.

srvctl cannot start an instance and reports errors PRKP-1001 and CRS-0215, but sqlplus can start it on both nodes. How do you identify the problem?

Set the environment variable SRVM_TRACE to true and start the instance with srvctl again. You will now get a detailed error stack.
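If you drive srvctl from a script, the same idea looks roughly like this in Python. It is a sketch under assumptions: the database name ORCL and instance name ORCL1 are placeholders, the helper names are made up, and srvctl must be on the PATH of the user running the script.

```python
import os
import subprocess


def traced_srvctl_start(db_name: str, instance_name: str, base_env=None):
    """Build the srvctl command and an environment with SRVM_TRACE enabled."""
    env = dict(os.environ if base_env is None else base_env)
    env["SRVM_TRACE"] = "true"
    cmd = ["srvctl", "start", "instance", "-d", db_name, "-i", instance_name]
    return cmd, env


def run_with_trace(db_name: str, instance_name: str):
    """Run srvctl with tracing on and return the completed process."""
    cmd, env = traced_srvctl_start(db_name, instance_name)
    return subprocess.run(cmd, env=env, capture_output=True, text=True)


# Building the command does not need a cluster; running it does.
cmd, env = traced_srvctl_start("ORCL", "ORCL1", base_env={"PATH": "/usr/bin"})
```

Setting the variable only in the child process environment keeps the tracing from leaking into other srvctl calls made from the same session.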



What is the purpose of the ONS daemon? ==>>

Oracle Notification Service (ONS) is used by Oracle Clusterware to propagate messages both within the RAC cluster and to clients and application-tier systems. ONS uses a publish-and-subscribe method to generate and deliver event messages to both local and remote consumers.


ONS is automatically installed as a node application on each node in the cluster. In Oracle 10.1 and above, it is configured as part of the Oracle Clusterware installation process. ONS daemons run locally, sending and receiving messages from ONS daemons on other nodes in the cluster. The daemons are started automatically by Oracle Clusterware during the reboot process.


ONS provides the foundation for Fast Application Notification (FAN), which in turn provides the basis for Fast Connection Failover (FCF).


The Oracle Notification Service (ONS) daemon is a daemon started by the CRS clusterware as part of the nodeapps. There is one ONS daemon started per clustered node.
The ONS daemon receives a subset of the published clusterware events via the local evmd and racgimon clusterware daemons and forwards those events to application subscribers and to the local listeners.


Fast Connection Failover (FCF):

Fast Connection Failover (FCF) was introduced in Oracle 10.1 and relies on the ONS infrastructure.

It works with integrated connection pools in application servers and clients and is used to prevent new connections being directed to failed nodes or instances.

When a failure occurs, the application is immediately notified of the change in cluster configuration by ONS, and the connection pool can react by directing new connections to surviving instances. This behavior is performed internally by the connection pool and is transparent to both the developer and the application.
Oracle clients that provide FCF include Java Database Connectivity (JDBC), Oracle Call Interface (OCI), and the ODP.NET CLI.
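To make the pool behaviour concrete, here is a toy Python model of a connection pool reacting to an instance-down notification. It is a sketch only: real FCF-enabled pools do this internally, and the class name SimplePool, the event dictionary layout, and the instance names are assumptions for the example, not the ONS or FAN message format.

```python
class SimplePool:
    """Toy connection pool that routes new connections to surviving instances."""

    def __init__(self, instances):
        self.available = list(instances)
        self._next = 0

    def handle_event(self, event: dict) -> None:
        """Apply an up/down notification for one instance."""
        instance, status = event["instance"], event["status"]
        if status == "down" and instance in self.available:
            self.available.remove(instance)
        elif status == "up" and instance not in self.available:
            self.available.append(instance)

    def next_instance(self) -> str:
        """Round-robin over the instances that are still available."""
        if not self.available:
            raise RuntimeError("no surviving instances")
        choice = self.available[self._next % len(self.available)]
        self._next += 1
        return choice


pool = SimplePool(["RAC1", "RAC2", "RAC3"])
pool.handle_event({"instance": "RAC2", "status": "down"})
picks = [pool.next_instance() for _ in range(4)]  # RAC2 is never chosen
```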





FAN ==> Fast Application Notification (FAN), which was introduced in Oracle 10.1, allows databases, listeners, application servers, and clients to receive rapid notification of database events, such as the starting and stopping of the database, instances, or services.

This allows the application to respond in a timely fashion to the event. It may be possible for a well-written application to reconnect to another instance without the end user ever being aware that the event has occurred.

a. The Fast Application Notification (FAN) feature, which allows applications to respond to database state changes.

b. The 10gR2 Load Balancing Advisory, the feature that permits load balancing across the different RAC nodes depending on the load on each node. The RDBMS MMON process creates an advisory for the distribution of work every 30 seconds and forwards it via racgimon and ONS to listeners and applications.
