Pre-Installation Tasks for Installing Oracle 10g Real Application Cluster (RAC) (10.2.0.1) 32-bit on Redhat AS 3 x86 (RHEL3) / CentOS 3 x86
By Bhavin Hingu
Pre-Installation Tasks:
Task List:
Minimum Hardware Required
Technical Architecture of 2-node RAC
Download Oracle 10g RDBMS Software from OTN
Required Redhat Software Packages
Memory and Swap Space
Setting up Kernel Parameters
Configuring the Public, Private and Virtual Hosts / Network
Creating the oracle User Account
Creating the Required Directories for Oracle 10g R2 RAC Software and Setting up Correct Permissions
Setting up Shell Limits for the oracle User
Enabling SSH oracle User Equivalency on all the Cluster Nodes
Configuring the System for the Firewire Shared Disk
Partitioning the Shared Disk
Installing and Configuring OCFS (Oracle Cluster File System)
Creating ASM Disks using oracleasm (ASMLib IO) for the Clustered Database
Checking the Configuration of the hangcheck-timer Module
Required Hardware:
To create a 2-node RAC, you need two machines with the following hardware installed on them.
Per Node:
1 GB RAM, at least 8 GB of hard drive space, 1 GHz CPU
1 Firewire controller, 1 Firewire cable
2 NIC Ethernet cards (one for the public and another for the private / interconnect network)
Per Cluster:
1 shared hard drive
1 Firewire hub + 1 Firewire cable (for a cluster with more than 2 nodes)
1 network hub + 1 network cable (for a cluster with more than 2 nodes)
1 crossover network cable (for a cluster with 2 nodes)
n network cables for the private network for internode communication (for a cluster with n nodes, where n >= 3)
n network cables for the public network (for a cluster with n nodes, where n >= 3)
I used the below hardware to build my 2-node RAC.
Server 1                  | Dell Intel PIII 1.3 GHz, 256 MB RAM, 20 GB HD                            | $200 - Used one
Server 2                  | Dell Intel PIII 1.3 GHz, 256 MB RAM, 20 GB HD                            | $200 - Used one
Upgrade Memory to 512MB   | 256 MB x 2 for Both the Server                                           | $110
Firewire Hard Drive       | LaCie Firewire Hard Drive 120 GB                                         | $160
Firewire Controllers      | Adaptec AFW-4300 x 2 (for both the server) - Texas Instrument chipset   | $98
Firewire HUB              | Belkin's Firewire 6-Port Hub                                             | $55
Firewire Cables           | 1 extra Firewire cable for other node                                    | $15
NICs                      | D-Link Ethernet card x 2                                                 | $30
Network Hub               | "NETWORK Everywhere" 10/100 5-Port Hub                                   | $30
Crossover cable           | -------                                                                  | $15
Total Cost: $913.00
Technical Architecture of 2-node RAC:
Clustered Database Name: RacDB
Node1:
SID: RacDB1
Public Network name (hostname): node1-pub, eth0
Private network Name (for Interconnect): node1-prv, eth1
ORACLE_BASE: /u01/app/oracle
DB file location: +ASM/{DB_NAME}/
CRS file Location: /u02/oradata/ocr mounted on /dev/sda1 (ocfs)
Node2:
SID: RacDB2
Public Network name (hostname): node2-pub, eth0
Private network Name (for Interconnect): node2-prv, eth1
ORACLE_BASE: /u01/app/oracle
DB file location: +ASM/{DB_NAME}/
CRS file Location: /u02/oradata/ocr mounted on /dev/sda1 (ocfs)
Download Oracle 10g RDBMS Software from OTN:
Go to otn.oracle.com and download the appropriate Oracle 10g software into /tmp. Make sure you have enough space under this mount point; you can check it using the df command. I downloaded the 10201_database_linux32.zip (668,734,007 bytes) (cksum - 2737423041) file for my 32-bit Linux box. As I am going to create a multi-instance database (RAC), I also needed to download the 10g clusterware, 10201_clusterware_linux32.zip (228,239,016 bytes) (cksum - 2639036338). These files come with a .zip extension and need to be unzipped using the unzip utility, which is installed as part of CentOS. In case you do not have it, you can get it from here. After unzipping the files, you can optionally write them to CD; I generally prefer the cdrecord command. I used a CD of 700MB capacity to copy 10g (10.2.0.1) onto it.
[root@localhost root]# unzip /tmp/10201_database_linux32.zip
[root@localhost root]# mkisofs -r /tmp/databases | cdrecord -v dev=1,1,0 speed=20 -
If you are installing the software from disc,
mount the first disc if it is not already mounted. Some platforms
automatically mount the disc when you insert the disc into the drive.
After you install the Linux system and before you start installing the Oracle 10g software, please make sure that you have the below packages installed on your Linux box; otherwise you will get errors during the installation process.
make-3.79.1
gcc-3.2.3-34
glibc-2.3.2-95.20
compat-db-4.0.14-5
compat-gcc-7.3-2.96.128
compat-gcc-c++-7.3-2.96.128
compat-libstdc++-7.3-2.96.128
compat-libstdc++-devel-7.3-2.96.128
openmotif21-2.1.30-8
setarch-1.3-1
libaio-0.3.103-3
Please execute the below command as root to make sure that you have these RPMs installed. If any are missing, download them from the appropriate Linux site.
rpm -q make gcc glibc compat-db compat-gcc compat-gcc-c++ compat-libstdc++ \
compat-libstdc++-devel openmotif21 setarch libaio libaio-devel
Perform this step on all the nodes.
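If the list is long, one way to see only the missing packages is to filter the rpm output (a small convenience, not part of the original steps; rpm prints "package ... is not installed" for anything missing):
[root@node1-pub root]# rpm -q make gcc glibc compat-db compat-gcc compat-gcc-c++ compat-libstdc++ \
compat-libstdc++-devel openmotif21 setarch libaio libaio-devel | grep "is not installed"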
Memory and Swap Space:
Oracle 10g RAC requires 1 GB of RAM on each node to successfully install 10g RAC. That said, I somehow managed to install it with 512 MB of RAM; you will get a warning during the prerequisite-check step of the installation, which you can ignore. Please go to Adding an Extra Swapspace if you want to add extra swap space.
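The linked page describes the exact steps used here; as a rough sketch, an extra 1 GB swap file can usually be added like this (the file name /swapfile01 and the size are just examples):
[root@node1-pub root]# dd if=/dev/zero of=/swapfile01 bs=1M count=1024
[root@node1-pub root]# chmod 600 /swapfile01
[root@node1-pub root]# mkswap /swapfile01
[root@node1-pub root]# swapon /swapfile01
[root@node1-pub root]# swapon -s      -- verify that the new swap space is active
To make it permanent, add a line like the below to /etc/fstab:
/swapfile01             swap                    swap    defaults        0 0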
Kernel Parameters:
Please go to Setting Up Kernel Parameters to set the kernel parameters.
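The linked page has the exact values used for this install; for reference only, the generic 10g Release 2 requirements suggest entries along these lines in /etc/sysctl.conf on every node (treat them as a sketch, not this setup's exact values), loaded with sysctl -p:
kernel.shmall = 2097152
kernel.shmmax = 2147483648
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
fs.file-max = 65536
net.ipv4.ip_local_port_range = 1024 65000
net.core.rmem_default = 262144
net.core.rmem_max = 262144
net.core.wmem_default = 262144
net.core.wmem_max = 262144
[root@node1-pub root]# sysctl -p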
Configuring the Public, Private and Virtual Hosts / Network:
Each node in the cluster must have 2 network adapters (eth0, eth1): one for the public network interface and another for the private network interface (internode communication, interconnect). Make sure that if you configure eth1 as the private interface on node1, then eth1 is also configured as the private interface on node2.
Follow the below steps to configure these networks:
(1) Change the hostname value by executing the below command:
For Node node1-pub:
[root@localhost root]# hostname node1-pub
For Node node2-pub:
[root@localhost root]# hostname node2-pub
(2) Edit the /etc/hosts file as shown below:
[root@localhost root]# cat /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost
# Hostname for the Public Nodes in the RAC (eth0)
216.160.37.154 node1-pub.oracledba.org node1-pub
216.160.37.156 node2-pub.oracledba.org node2-pub
# Hostname for the Private Nodes in the RAC (eth1)
192.168.203.1 node1-prv.oracledba.org node1-prv
192.168.203.2 node2-prv.oracledba.org node2-prv
# Hostname for the Virtual IP in the RAC (eth1)
192.168.203.11 node1-vip.oracledba.org node1-vip
192.168.203.22 node2-vip.oracledba.org node2-vip
[root@node2-pub root]#
(3) Edit or create the /etc/sysconfig/network-scripts/ifcfg-eth0 file as shown below:
If you DO NOT have static IPs (DHCP):
Create the same file on both the nodes as shown below.
[root@localhost root]# cat /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
BOOTPROTO=dhcp
ONBOOT=yes
TYPE=Ethernet
If you have static IPs:
Add entries like the below into /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
BOOTPROTO=none
ONBOOT=yes
TYPE=Ethernet
USERCTL=no
IPADDR=192.168.10.1 -- For node1-pub Node
IPADDR=192.168.10.2 -- For node2-pub Node
(4) Edit OR create the /etc/sysconfig/network-scripts/ifcfg-eth1 file as shown below:
For Node node1-pub:
[root@localhost root]# cat /etc/sysconfig/network-scripts/ifcfg-eth1
DEVICE=eth1
BOOTPROTO=none
ONBOOT=yes
TYPE=Ethernet
USERCTL=no
PEERDNS=no
IPADDR=192.168.203.1
For Node node2-pub:
[root@localhost root]# cat /etc/sysconfig/network-scripts/ifcfg-eth1
DEVICE=eth1
BOOTPROTO=none
ONBOOT=yes
TYPE=Ethernet
USERCTL=no
PEERDNS=no
IPADDR=192.168.203.2
(5) Edit the /etc/sysconfig/network file with the below contents:
For Node node1-pub:
[root@localhost root]# cat /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=node1-pub
For Node node2-pub:
[root@localhost root]# cat /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=node2-pub
(6) Restart the network service OR reboot the nodes.
After I rebooted both the nodes, I verified the network interface configurations by running the ifconfig command as shown below.
[root@node2-pub root]# ifconfig
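If you prefer not to reboot, restarting the network service on each node should also pick up the new settings (standard RHEL3/CentOS 3 service command):
[root@node1-pub root]# service network restart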
Creating the oracle User Account:
You need an OS "oracle" user account created, which owns the Oracle software; the Oracle software installation must be performed by this account. The Oracle software installation (without the Companion CD) requires 6 GB of free space under the ORACLE_BASE directory. Please make sure that the mount point where you plan to install the software has the required free space available. You can use "df -k" to check this.
[root@node2-pub root]# df -k
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/hda2             18113556   3923072  13270364  23% /
/dev/hda1               101089     14036     81834  15% /boot
none                    126080         0    126080   0% /dev/shm
I had about 13 GB of free space available on the "/" mount point, so I decided to install Oracle under this mount point. RAC requires the oracle user account to be created on all the nodes with the same user id and group id. So, create the oracle user account with this property by executing the below series of commands on all the RAC nodes.
groupadd -g 900 dba
groupadd -g 901 oinstall
useradd -u 900 -g oinstall -G dba oracle
passwd oracle
Please verify that the oracle user has the same gid and uid on all the RAC nodes by executing this command:
[oracle@node2-pub oracle]$ id
uid=900(oracle) gid=901(oinstall) groups=901(oinstall),900(dba)
[oracle@node1-pub oracle]$ id
uid=900(oracle) gid=901(oinstall) groups=901(oinstall),900(dba)
Creating Oracle Software Directories:
Perform the below steps on all the nodes in cluster.
[root@node2-pub root]# mkdir -p /u01/app/oracle
[root@node2-pub root]# mkdir -p /u02/oradata/ocr   -- mountpoint for the OCR files
[root@node2-pub root]# chown -R oracle:oinstall /u01
[root@node2-pub root]# chown -R oracle:oinstall /u02
[root@node2-pub root]# chmod -R 775 /u01/app/oracle
[root@node2-pub root]# chmod -R 775 /u02
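A quick way to confirm the ownership and permissions on each node (a simple check, not part of the original steps):
[root@node2-pub root]# ls -ld /u01/app/oracle /u02/oradata/ocr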
Setting Shell Limits for the Oracle User:
Please go to Setting up shell limits for the oracle user to set the shell limits for the oracle user.
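The linked page has the values used for this install; as a generic sketch, the 10g documentation suggests limits along these lines in /etc/security/limits.conf (with pam_limits enabled for login on RHEL3):
oracle              soft    nproc   2047
oracle              hard    nproc   16384
oracle              soft    nofile  1024
oracle              hard    nofile  65536
And in /etc/pam.d/login:
session    required     /lib/security/pam_limits.so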
Enabling SSH oracle User Equivalency on all the Cluster Nodes:
To configure SSH user equivalency, you must create RSA and DSA keys on each cluster node and copy the keys from all the cluster node members into an authorized keys file on each node. Follow the below steps to achieve this task.
su - oracle
mkdir ~/.ssh
chmod 700 ~/.ssh
(A) Generate the RSA and DSA keys on all the RAC Nodes:
/usr/bin/ssh-keygen -t rsa
/usr/bin/ssh-keygen -t dsa
(B) Add the keys to the authorized keys file and then send the same file to every node in the cluster:
touch ~/.ssh/authorized_keys
cd ~/.ssh
(1)
ssh node1-pub cat /home/oracle/.ssh/id_rsa.pub >> authorized_keys
ssh node1-pub cat /home/oracle/.ssh/id_dsa.pub >> authorized_keys
ssh node2-pub cat /home/oracle/.ssh/id_rsa.pub >> authorized_keys
ssh node2-pub cat /home/oracle/.ssh/id_dsa.pub >> authorized_keys
(2)
Copy the authorized_keys file to every node. For example, from node2-pub I used the below command to copy node2-pub's authorized_keys file to node1-pub.
[oracle@node2-pub .ssh]$ scp authorized_keys node1-pub:/home/oracle/.ssh/
(C) Repeat Step B on each node, one by one.
(D) Change the permission of the authorized_keys file (on each node):
[oracle@node2-pub .ssh]$ chmod 600 ~/.ssh/authorized_keys
While executing step B-(1), you may be prompted as shown below. Enter "yes" and continue.
[oracle@node2-pub .ssh]$ ssh node1-pub cat /home/oracle/.ssh/id_rsa.pub >> authorized_keys
The authenticity of host 'node1-pub (216.160.37.154)' can't be established.
RSA key fingerprint is <**********>.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'node1-pub,216.160.37.154' (RSA) to the list of
known hosts.
Warning: No xauth data; using fake authentication data for X11 forwarding.
Now, try executing the date command (or any other command) on all the nodes to make sure that oracle is not asked for a password. You should not receive any error messages while executing these commands on all the nodes. If you get any errors, fix them before you go further.
ssh node1-prv date
ssh node2-prv date
ssh node1-pub date
ssh node2-pub date
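To run all four checks in one go from each node, a small convenience loop (not part of the original steps) can be used as the oracle user:
[oracle@node2-pub oracle]$ for host in node1-pub node1-prv node2-pub node2-prv; do ssh $host date; done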
Errors / Warnings during the network configuration:
I got the below warning when I tried the below command.
[oracle@node2-pub .ssh]$ ssh node1-pub date
Warning: No xauth data; using fake authentication data for X11 forwarding.
Sun Dec 18 02:04:52 CST 2005
To fix the above warning, create the /home/oracle/.ssh/config file (logged in as the oracle user) and add the below entry to it. Then run the same command again and the warning will no longer show up.
[oracle@node2-pub oracle]$ cat .ssh/config
Host *
ForwardX11 no
Note that the first time you execute the below command, you will be prompted to enter 'yes' or 'no'. Simply enter yes and continue. Afterwards, when oracle connects to the remote node, it is not asked for the password. This is shown below, where oracle received the message when it ran the date command on the remote node over ssh for the very first time; afterwards, it does not get any message like this.
[oracle@node2-pub oracle]$ ssh node1-prv date
The authenticity of host 'node1-prv (192.168.203.1)' can't be established.
RSA key fingerprint is <********************************************>
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'node1-prv,192.168.203.1' (RSA) to the list of known
hosts.
Sun Dec 18 20:01:09 CST 2005
[oracle@node2-pub oracle]$ ssh node1-prv date
Sun Dec 18 20:01:13 CST 2005
[oracle@node2-pub oracle]$
[oracle@node2-pub oracle]$ ssh node2-prv date
Warning: Permanently added the RSA host key for IP address '192.168.203.2' to
the list of known hosts.
Sun Dec 18 20:14:16 CST 2005
[oracle@node2-pub oracle]$ ssh node2-pub date
Sun Dec 18 20:15:05 CST 2005
If you get the below error message when you try to connect to a remote node, please make sure that the firewall is disabled on the remote node.
[root@node2-pub root]# telnet node1-prv
Trying 192.168.203.1...
telnet: Unable to connect to remote host: No route to host
Configuring the System for the Shared Disk Storage Device (Firewire):
Every node in the cluster must have access to the shared disk, so the shared disk must support concurrent access from all nodes in the cluster in order to successfully build 10g RAC. I chose a Firewire disk as the shared storage medium because it is a cost-effective solution if you just want hands-on practice with 10g RAC without investing more money. After you install the Redhat Linux AS 3 system on both the nodes, please go to http://oss.oracle.com/projects/firewire/files and download the appropriate Firewire kernel to support the Firewire HD. I downloaded and installed the below RPM.
[root@localhost root]# uname -r
2.4.21-37.EL
kernel-2.4.21-27.0.2.ELorafw1.i686.rpm
[root@localhost root]# rpm -ivh --force kernel-2.4.21-27.0.2.ELorafw1.i686.rpm
This will also update the /etc/grub.conf file with an added entry for the new Firewire kernel. In the original file, default is set to 1, which means that the system will boot the original kernel by default. If you want to make the newly added Firewire kernel the default, simply change default=1 to default=0. Setting this kernel as the default is required so that if this node is restarted by the hangcheck-timer or for any other reason, it reboots with the right kernel.
[root@node2-pub root]# cat /etc/grub.conf
# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE:  You have a /boot partition.  This means that
#          all kernel and initrd paths are relative to /boot/, eg.
#          root (hd0,0)
#          kernel /vmlinuz-version ro root=/dev/hda2
#          initrd /initrd-version.img
#boot=/dev/hda
default=0
timeout=10
splashimage=(hd0,0)/grub/splash.xpm.gz
title CentOS (2.4.21-27.0.2.ELorafw1)
        root (hd0,0)
        kernel /vmlinuz-2.4.21-27.0.2.ELorafw1 ro root=LABEL=/
        initrd /initrd-2.4.21-27.0.2.ELorafw1.img
title CentOS-3 (2.4.21-37.EL)
        root (hd0,0)
        kernel /vmlinuz-2.4.21-37.EL ro root=LABEL=/
        initrd /initrd-2.4.21-37.EL.img
Also update the /etc/modules.conf file and add the below lines at the end of the file on BOTH THE NODES. This will load the Firewire kernel modules and drivers at reboot.
alias ieee1394-controller ohci1394
options sbp2 sbp2_exclusive_login=0
post-install sbp2 insmod sd_mod
post-install sbp2 insmod ohci1394
post-remove sbp2 rmmod sd_mod
Now, shut down both the nodes and then connect the Firewire shared disk to them. Power on the Firewire disk and then restart both the nodes, one by one, using the new Firewire kernel 2.4.21-27.0.2.ELorafw1. Confirm that the Firewire disk is visible from both the nodes by running the below commands as root on both the nodes.
[root@localhost root]# dmesg | grep ieee1394
ieee1394: Host added: Node[00:1023] GUID[0000d1008016f8e8] [Linux OHCI-1394]
ieee1394: Device added: Node[01:1023] GUID[00d04b3b1905e049] [LaCie Group SA ]
ieee1394: sbp2: Query logins to SBP-2 device successful
ieee1394: sbp2: Maximum concurrent logins supported: 4
ieee1394: sbp2: Number of active logins: 0
ieee1394: sbp2: Logged into SBP-2 device
ieee1394: sbp2: Node[01:1023]: Max speed [S400] - Max payload [2048]
ieee1394: Device added: Node[00:1023] GUID[00309500a0042ef9] [Linux OHCI-1394]
ieee1394: Node 00:1023 changed to 01:1023
ieee1394: Node 01:1023 changed to 02:1023
ieee1394: sbp2: Reconnected to SBP-2 device
ieee1394: sbp2: Node[02:1023]: Max speed [S400] - Max payload [2048]
[root@localhost root]# dmesg | grep sda
Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
SCSI device sda: 240121728 512-byte hdwr sectors (122942 MB)
sda:
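It is also worth confirming that the Firewire modules listed in /etc/modules.conf were actually loaded (a quick check; module names as configured above):
[root@node2-pub root]# lsmod | egrep 'ohci1394|sbp2|sd_mod'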
Partitioning the Shared Disk:
You need at least two partitions to be created on the shared disk if you want to go with ASM as the storage option. One or more partitions will be used as ASM disk(s), and one partition is required to store Oracle's CRS (Cluster Ready Services) files. You cannot create these files under ASM; they need to be created either on a raw partition (device) or on ocfs. This document covers all the options for storing the database files on the shared disk. If you want to use the entire disk as ocfs or as a raw device (volume), then there is no need to create a separate partition for the CRS files. I partitioned the disk as shown below, connected to any one of the nodes. As I am going to use ASM for the database files, I will use ocfs for the CRS files.
[root@node2-pub root]# fdisk /dev/sda
The number of cylinders for this disk is set to 24792.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
(e.g., DOS FDISK, OS/2 FDISK)
Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4):
Value out of range.
Partition number (1-4): 1
First cylinder (1-24792, default 1):
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-24792, default 24792): +300M
Command (m for help): p

Disk /dev/sda: 203.9 GB, 203928109056 bytes
255 heads, 63 sectors/track, 24792 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot    Start       End    Blocks   Id  System
/dev/sda1             1        37    297171   83  Linux
Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 2
First cylinder (38-24792, default 38):
Using default value 38
Last cylinder or +size or +sizeM or +sizeK (38-24792, default 24792): +70000M
Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 3
First cylinder (8549-24792, default 8549):
Using default value 8549
Last cylinder or +size or +sizeM or +sizeK (8549-24792, default 24792): +70000M
Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Selected partition 4
First cylinder (17060-24792, default 17060):
Using default value 17060
Last cylinder or +size or +sizeM or +sizeK (17060-24792, default 24792):
Using default value 24792
Command (m for help): p

Disk /dev/sda: 203.9 GB, 203928109056 bytes
255 heads, 63 sectors/track, 24792 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot    Start       End    Blocks   Id  System
/dev/sda1             1        37    297171   83  Linux   -- will be used by CRS files (ocfs)
/dev/sda2            38      8548  68364607+  83  Linux   -- will be used for ASM DSK1
/dev/sda3          8549     17059  68364607+  83  Linux   -- reserved for another Clustered database on ocfs
/dev/sda4         17060     24792  62115322+  83  Linux   -- reserved for another Clustered database on raw device
Command (m for help): w
The partition table has been altered!
Calling ioctl() to re-read partition table.
WARNING: Re-reading the partition table failed with error 16: Device or
resource busy.
The kernel still uses the old table.
The new table will be used at the next reboot.
Syncing disks.
[root@node2-pub root]# partprobe   -- [ Perform this step on all the nodes in cluster ]
[root@node2-pub root]#
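After running partprobe, a simple way to confirm from the other node that all four partitions are visible (not part of the original steps):
[root@node1-pub root]# fdisk -l /dev/sda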
Installing and Configuring OCFS (Oracle Cluster File System):
We have 3 storage options for storing the clustered database files on the shared disk:
(1) Traditional raw devices (as in 9i).
(2) ocfs (also available in 9i).
(3) ASM (only in 10g and above).
I prefer to use ASM over the ocfs file system. I will only use an ocfs partition to store the OCR file, as this file needs to be on the shared disk. You can also use the raw device option to store it.
Download and Install the required rpms:
Please download the below RPMs from Oracle's website and install them as shown.
ocfs-2.4.21-EL-1.0.14-1.i686.rpm (For UniProcessor)
ocfs-2.4.21-EL-smp-1.0.14-1.i686.rpm (For SMPs)
ocfs-tools-1.0.10-1.i386.rpm
ocfs-support-1.1.5-1.i386.rpm
[root@node2-pub root]# rpm -Uvh /rpms/ocfs-2.4.21-EL-1.0.14-1.i686.rpm \
> /rpms/ocfs-tools-1.0.10-1.i386.rpm \
> /rpms/ocfs-support-1.1.5-1.i386.rpm
Preparing...                ########################################### [100%]
   1:ocfs-support           ########################################### [ 33%]
   2:ocfs-2.4.21-EL         ########################################### [ 67%]
Linking OCFS module into the module path [ OK ]
   3:ocfs-tools             ########################################### [100%]
[root@node2-pub root]#
[root@node2-pub root]# cat /etc/ocfs.conf
#
# ocfs config
# Ensure this file exists in /etc
#
node_name = node2-prv
ip_address = 192.168.203.2
ip_port = 7000
comm_voting = 1
guid = 238426EC6845F952C83A00065BAEAE7F
Loading OCFS Module:
[root@node2-pub root]# load_ocfs
/sbin/modprobe ocfs node_name=node2-pub ip_address=192.168.203.2 cs=1843
guid=238426EC6845F952C83A00065BAEAE7F ip_port=7000 comm_voting=1
modprobe: Can't locate module ocfs
load_ocfs: insmod failed
If you get the above error, follow the below steps to fix it:
Verify that you have the ocfs.o module under the /lib/modules/2.4.21-27.0.2.ELorafw1/kernel/drivers/addon/ocfs/ directory.
[root@node2-pub root]# ls -l /lib/modules/2.4.21-27.0.2.ELorafw1/kernel/drivers/addon/ocfs/ocfs.o
lrwxrwxrwx    1 root     root           38 Dec 19 23:14 /lib/modules/2.4.21-27.0.2.ELorafw1/kernel/drivers/addon/ocfs/ocfs.o -> /lib/modules/2.4.21-EL-ABI/ocfs/ocfs.o
IF THIS FILE EXISTS:
Open the /sbin/load_ocfs file using vi or another editor and change the below line as shown (around line number 93).
# If you must hardcode an absolute module path for testing, do it HERE.
# MODULE=/path/to/test/module/ocfsX.o
Change to:
# If you must hardcode an absolute module path for testing, do it HERE.
MODULE=/lib/modules/2.4.21-27.0.2.ELorafw1/kernel/drivers/addon/ocfs/ocfs.o
IF THIS FILE DOES NOT EXIST:
Create a symbolic link as shown below.
mkdir -p /lib/modules/2.4.21-27.0.2.ELorafw1/kernel/drivers/addon/ocfs/
ln -s /lib/modules/2.4.21-EL-ABI/ocfs/ocfs.o /lib/modules/2.4.21-27.0.2.ELorafw1/kernel/drivers/addon/ocfs/ocfs.o
Now try to load the module again:
[root@node2-pub root]# load_ocfs
If you get the error again, modify the /sbin/load_ocfs file as shown in the above step after creating the symbolic link.
[root@node2-pub root]# load_ocfs
/sbin/insmod /lib/modules/2.4.21-27.0.2.ELorafw1/kernel/drivers/addon/ocfs/ocfs.o node_name=node2-prv ip_address=192.168.203.2 cs=1843 guid=238426EC6845F952C83A00065BAEAE7F ip_port=7000 comm_voting=1
Warning: kernel-module version mismatch
        /lib/modules/2.4.21-27.0.2.ELorafw1/kernel/drivers/addon/ocfs/ocfs.o was compiled for kernel version 2.4.21-27.EL
        while this kernel is version 2.4.21-27.0.2.ELorafw1
Warning: loading /lib/modules/2.4.21-27.0.2.ELorafw1/kernel/drivers/addon/ocfs/ocfs.o will taint the kernel: forced load
  See http://www.tux.org/lkml/#export-tainted for information about tainted modules
Module ocfs loaded, with warnings
You may get the above warnings, but that is OK. Verify that the ocfs module is loaded by executing the below command.
[root@node2-pub root]# lsmod | grep ocfs
ocfs                  299104   0  (unused)
Creating and Mounting OCFS (Oracle Cluster File System):
Create the file system using mkfs:
Execute the below command from any one node to format the /dev/sda1 partition with ocfs.
[root@node2-pub root]# mkfs.ocfs -F -b 128 -L /u02/oradata/ocr -m /u02/oradata/ocr -u 900 -g 901 -p 0755 /dev/sda1
Where: b = blocksize, L = volume label, m = mountpoint, u = UID of the oracle user, g = GID of the oinstall group, p = permissions
Mounting OCFS (Oracle Cluster File System): (Do this on both the nodes)
[root@node2-pub root]# mount -t ocfs /dev/sda1 /u02/oradata/ocr
Add the below line into the /etc/fstab file to mount the ocfs automatically on every reboot of the system.
/dev/sda1               /u02/oradata/ocr        ocfs    _netdev         0 0
[root@node2-pub root]# service ocfs start
Loading OCFS:                                              [  OK  ]
[root@node2-pub root]# chkconfig ocfs on
Create the file system using the "ocfstool" utility:
Alternatively, you can create and mount the ocfs file system with the ocfstool GUI by following its screens. Run ocfstool from the command line as shown below. Perform this step on both the nodes.
[root@node1-pub root]# ocfstool
Click on the "Tasks" button --> Select "Generate Config" --> Select Interface = eth1, Port = 7000 and Node Name = node1-prv (for node2, it would be node2-prv).
Confirm the changes by looking into the /etc/ocfs.conf file. The contents of this file should look like the /etc/ocfs.conf listing shown above.
Now, click on the "Tasks" button and select "Format". You will see a screen like the one below. Select the appropriate values and click the OK button. You need to perform this step from one node only.
Now /dev/sda1 is formatted with ocfs, and it is time to mount the file system. Please perform this step on both the nodes. Click on the "Mount" button. You should see /dev/sda1 mounted under the /u02/oradata/ocr mountpoint. Also confirm that you see both the nodes in the "Configured Nodes" section.
Add the below line into the /etc/fstab file on both the nodes to mount the ocfs automatically on every reboot.
/dev/sda1               /u02/oradata/ocr        ocfs    _netdev         0 0
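To confirm the ocfs mount on each node (a simple check, not part of the original steps):
[root@node1-pub root]# mount | grep ocfs
[root@node1-pub root]# df -h /u02/oradata/ocr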
Creating Automatic Storage Management (ASM) Disks for the Clustered Database (both the RAC Nodes):
I will show you how to create the ASM disks (stamping the disks as ASM disks) on the Firewire device. I am going to use one partition, /dev/sda2, for the ASM disks. I am going to use ASMLib IO for the ASM disks, and for that I need the oracleasm kernel driver as well as the binaries and support packages downloaded from Oracle's site. Please go to Creating and Configuring ASM instance and Database for detailed information on how to create an ASM instance and diskgroups and how to use them with an existing database or a new database. I downloaded the below RPMs and installed them as the root user on both the nodes.
[root@node2-pub rpms]# rpm -Uvh oracleasm-support-2.0.1-1.i386.rpm \
> oracleasm-2.4.21-27.0.2.ELorafw1-1.0.4-1.i686.rpm \
> oracleasmlib-2.0.1-1.i386.rpm
Preparing...                ########################################### [100%]
   1:oracleasm-support      ########################################### [ 33%]
   2:oracleasm-2.4.21-27.0.2########################################### [ 67%]
   3:oracleasmlib           ########################################### [100%]
[root@node2-pub rpms]#
Enter the following command to run the oracleasm init script with the configure option on both the nodes.
[root@node2-pub root]# /etc/init.d/oracleasm configure
Configuring the Oracle ASM library driver.

This will configure the on-boot properties of the Oracle ASM library
driver. The following questions will determine whether the driver is
loaded on boot and what permissions it will have. The current values
will be shown in brackets ('[]'). Hitting <ENTER> without typing an
answer will keep that current value. Ctrl-C will abort.

Default user to own the driver interface []: oracle
Default group to own the driver interface []: oinstall
Start Oracle ASM library driver on boot (y/n) [n]: y
Fix permissions of Oracle ASM disks on boot (y/n) [y]:
Writing Oracle ASM library driver configuration:           [  OK  ]
Creating /dev/oracleasm mount point:                       [  OK  ]
Loading module "oracleasm":                                [  OK  ]
Mounting ASMlib driver file system:                        [  OK  ]
Scanning system for ASM disks:                             [  OK  ]
[root@node2-pub root]#
ONLY on one NODE:
[root@node2-pub root]# /etc/init.d/oracleasm createdisk DSK1 /dev/sda2
Marking disk "/dev/sda2" as an ASM disk:                   [  OK  ]
[root@node2-pub root]# /etc/init.d/oracleasm listdisks
DSK1
On the Remaining Nodes:
You only need to execute the below commands for these disks to show up there.
[root@node1-pub root]# /etc/init.d/oracleasm scandisks
[root@node1-pub root]# /etc/init.d/oracleasm listdisks
DSK1
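ASMLib also exposes the stamped disks under /dev/oracleasm/disks, so a quick way to confirm that each node sees the disk (assuming the default ASMLib layout):
[root@node1-pub root]# ls -l /dev/oracleasm/disks/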
Binding the Partitions with the Raw Devices (Both the RAC Nodes):
Add the below lines into /etc/sysconfig/rawdevices and restart the rawdevices service (on both the nodes).
[root@node2-pub root]# cat /etc/sysconfig/rawdevices
# raw device bindings
# format: <rawdev> <major> <minor>
# <rawdev> <blockdev>
# example: /dev/raw/raw1 /dev/sda1
# /dev/raw/raw2 8 5
/dev/raw/raw2 /dev/sda2
[root@shree ~]# service rawdevices restart
Also, you need to change the ownership of these devices to the oracle user.
[root@node2-pub root]# chown oracle.dba /dev/raw/raw2
[root@node2-pub root]# chmod 660 /dev/raw/raw2
Please add the below lines to /etc/rc.local so that these settings are restored at reboot.
for i in `seq 2 2`
do
chown oracle.dba /dev/raw/raw$i
chmod 660 /dev/raw/raw$i
done
Checking the Configuration of the hangcheck-timer Module:
Before installing Oracle Real Application Clusters, we need to verify that the hangcheck-timer module is loaded and configured correctly. The hangcheck-timer module monitors the Linux kernel for extended operating system hangs that could affect the reliability of a RAC node and cause database corruption. If a hang occurs, the module restarts the node within seconds. Two parameters, hangcheck_tick and hangcheck_margin, govern the behavior of the module:
The hangcheck_tick parameter defines how often, in seconds, the hangcheck-timer checks the node for hangs. The default value is 60 seconds.
The hangcheck_margin parameter defines how long the hangcheck-timer waits, in seconds, for a response from the kernel. The default value is 180 seconds.
If the kernel fails to respond within a total of (hangcheck_tick + hangcheck_margin) seconds, the hangcheck-timer module restarts the system. For example, with hangcheck_tick=30 and hangcheck_margin=180 as used below, a node is restarted if the kernel stays unresponsive for more than 30 + 180 = 210 seconds.
Verify that the hangcheck-timer module is running:
(1) Enter the below command on each node.
[root@node2-pub root]# lsmod | grep hangcheck-timer
hangcheck-timer         2648   0  (unused)
(2) If the module is not listed by the above command, enter the below command to load the module on all the nodes.
[root@node2-pub root]# insmod hangcheck-timer hangcheck_tick=30 hangcheck_margin=180
(3) Also add the below line at the end of the /etc/rc.local file to ensure that the module is loaded at every reboot.
insmod hangcheck-timer hangcheck_tick=30 hangcheck_margin=180
Alternatively, you could add the same settings to the /etc/modules.conf file as shown below.
options hangcheck-timer hangcheck_tick=30 hangcheck_margin=180
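Whichever way the module is loaded, the kernel log normally records the tick and margin values it started with, so a quick check (not part of the original steps) is:
[root@node2-pub root]# dmesg | grep -i hangcheck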