朋友搭建11g两节点的rac的时候,节点1能够正常用sqlplus / as sysdba登录数据库,但是节点2使用该命令报错,如下:
-
[oracle@oracle ~]$ sqlplus / as sysdba
-
Error 6 initializing SQL*Plus
-
SP2-0667: Message file sp1<lang>.msb not found
- SP2-0750: You may need to set ORACLE_HOME to your Oracle software directory
解决过程:
说明:由于是远程解决,解决过程并没有记录,不过该问题很容易复现。
后序会有另外一个复现该错误的方法,和此处遇到的案例不同。
-
[oracle@oracle ~]$ sqlplus / as sysdba
-
Error 6 initializing SQL*Plus
-
SP2-0667: Message file sp1<lang>.msb not found
-
SP2-0750: You may need to set ORACLE_HOME to your Oracle software directory
-
[oracle@oracle ~]$ echo $ORACLE_HOME
-
/u01/app/oracle/product/11.2.0/db_1
-
[oracle@oracle ~]$ cat .bash_profile
-
# .bash_profile
-
-
# Get the aliases and functions
-
if [ -f ~/.bashrc ]; then
-
. ~/.bashrc
-
fi
-
-
# User specific environment and startup programs
-
-
PATH=$PATH:$HOME/bin
-
-
export PATH
-
export ORACLE_SID=proc
-
export ORACLE_BASE=/u01/app/oracle
-
export ORACLE_HOME=$ORACLE_BASE/product/11.2.0/db_1
-
export PATH=$PATH:$ORACLE_HOME/bin
-
[oracle@oracle ~]$ which sqlplus
-
/u01/app/oracle/product/11.2.0/db_1/bin/sqlplus
-
[oracle@oracle ~]$ oerr sp2 667
-
oerr: Cannot access the message file /u01/app/oracle/product/11.2.0/db_1/sqlplus/mesg/sp2us.msg
-
[oracle@oracle ~]$ cd $ORACLE_HOME/sqlplus/mesg
- [oracle@oracle mesg]$ ls
后边红色底色的报错却是提供了思路,进去指定目录ls查看,果然没有文件,后边解决方法就很简单:
-
[oracle@oracle mesg]$ cp ~/bak/* ./
-
[oracle@oracle mesg]$ ll
-
total 216
-
-rw-r--r--. 1 oracle oinstall 4096 Dec 30 01:13 cpyus.msb
-
-rw-r--r--. 1 oracle oinstall 4369 Dec 30 01:13 cpyus.msg
-
-rw-r--r--. 1 oracle oinstall 12288 Dec 30 01:13 sp1us.msb
-
-rw-r--r--. 1 oracle oinstall 20123 Dec 30 01:13 sp1us.msg
-
-rw-r--r--. 1 oracle oinstall 34816 Dec 30 01:13 sp2us.msb
-
-rw-r--r--. 1 oracle oinstall 137350 Dec 30 01:13 sp2us.msg
-
[oracle@oracle mesg]$ sqlplus / as sysdba
-
-
SQL*Plus: Release 11.2.0.4.0 Production on Fri Dec 30 01:13:38 2016
-
-
Copyright (c) 1982, 2013, Oracle. All rights reserved.
-
-
-
Connected to:
-
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
-
With the Partitioning, OLAP, Data Mining and Real Application Testing options
-
-
SYS@proc> !oerr sp2 667
-
00667, 0, "Message file %s
.msb not found\n"
-
// *Cause: The SP1, SP2, or CPY message file could not be
-
// found. SQL*Plus cannot run.
-
// *Action: Check the Oracle platform specific documentation to make
-
// sure SQL*Plus is installed correctly. This may occur
-
// because the ORACLE_HOME environment variable or registry
-
// equivalent is not set to the location of the Oracle
-
// software. Make sure this value is set correctly. Check
-
// that the SQL*Plus binary message files exist in the
-
// SQL*Plus message directory, for example
-
// $ORACLE_HOME/sqplus/mesg. Check the value of NLS_LANG
-
// environment variable or registry equivalent is correct.
-
- SYS@proc>
有一个问题是搭建rac的过程为何节点2会缺失文件,这里也没法弄清楚。
该案例的报错有误导作用,因为该出并非是由于环境变量而引起的问题。
下边探究下环境变量引起该问题的情况。
数据库环境:
-
SYS@proc> select * from v$version where rownum=1;
-
-
BANNER
-
--------------------------------------------------------------------------------
- Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
构造之后的故障环境:
- [oracle@oracle ~]$ echo $ORACLE_HOME
- /u01/app/oracle/product/11.2.0/db_1
-
[oracle@oracle ~]$ sqlplus / as sysdba
-
Error 6 initializing SQL*Plus
-
SP2-0667: Message file sp1<lang>.msb not found
-
SP2-0750: You may need to set ORACLE_HOME to your Oracle software directory
-
[oracle@oracle ~]$ ls $ORACLE_HOME/sqlplus/mesg
-
cpyus.msb cpyus.msg sp1us.msb sp1us.msg sp2us.msb sp2us.msg
- [oracle@oracle ~]$
以下从一般的解决思路入手:
-
[oracle@oracle ~]$ cat .bash_profile
-
# .bash_profile
-
-
# Get the aliases and functions
-
if [ -f ~/.bashrc ]; then
-
. ~/.bashrc
-
fi
-
-
# User specific environment and startup programs
-
-
PATH=$PATH:$HOME/bin
-
-
export PATH
-
export ORACLE_SID=proc
-
export ORACLE_BASE=/u01/app/oracle
-
ORACLE_HOME=$ORACLE_BASE/product/11.2.0/db_1
- export PATH=$PATH:$ORACLE_HOME/bin
这里引出两个疑问:
1.在构造故障的时候,修改完.bash_profile文件之后,source .bash_profile之后使用sqlplus / as sysdba是不会报错的,需要重新开启一个会话。
为什么?这里先留作第一个问题。
2.就算是去掉export,echo $ORACLE_HOME还是会有值,为什么还是报了错误。
网友们也可以自己构造环境去思考。
以下是我尝试解决故障的过程,会有一些不太有用处的过程:
-
[oracle@oracle ~]$ strace sqlplus
-
execve("/u01/app/oracle/product/11.2.0/db_1/bin/sqlplus", ["sqlplus"], [/* 35 vars */]) = 0
-
brk(0) = 0x1533000
-
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5f88dab000
-
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
-
open("/u01/app/oracle/product/11.2.0/db_1/lib/tls/x86_64/libsqlplus.so", O_RDONLY) = -1 ENOENT (No such file or directory)
-
stat("/u01/app/oracle/product/11.2.0/db_1/lib/tls/x86_64", 0x7fff8cb17860) = -1 ENOENT (No such file or directory)
-
open("/u01/app/oracle/product/11.2.0/db_1/lib/tls/libsqlplus.so", O_RDONLY) = -1 ENOENT (No such file or directory)
-
stat("/u01/app/oracle/product/11.2.0/db_1/lib/tls", 0x7fff8cb17860) = -1 ENOENT (No such file or directory)
-
open("/u01/app/oracle/product/11.2.0/db_1/lib/x86_64/libsqlplus.so", O_RDONLY) = -1 ENOENT (No such file or directory)
-
stat("/u01/app/oracle/product/11.2.0/db_1/lib/x86_64", 0x7fff8cb17860) = -1 ENOENT (No such file or directory)
-
open("/u01/app/oracle/product/11.2.0/db_1/lib/libsqlplus.so", O_RDONLY) = 3
-
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0@\370\1\0\0\0\0\0"..., 832) = 832
-
fstat(3, {st_mode=S_IFREG|0644, st_size=1469542, ...}) = 0
-
mmap(NULL, 1985056, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f5f88bc6000
-
mprotect(0x7f5f88c9c000, 1048576, PROT_NONE) = 0
-
mmap(0x7f5f88d9c000, 57344, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0xd6000) = 0x7f5f88d9c000
-
mmap(0x7f5f88daa000, 2592, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f5f88daa000
-
close(3) = 0
-
open("/u01/app/oracle/product/11.2.0/db_1/lib/libclntsh.so.11.1", O_RDONLY) = 3
-
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\20\326G\0\0\0\0\0"..., 832) = 832
- ...省略部分内容...
-
open("/u01/app/oracle/product/11.2.0/db_1/lib/libsqlplusic.so", O_RDONLY) = -1 ENOENT (No such file or directory)
-
open("/u01/app/oracle/product/11.2.0/db_1/lib/libociicus.so", O_RDONLY) = -1 ENOENT (No such file or directory)
-
open("/u01/app/oracle/product/11.2.0/db_1/lib/libociei.so", O_RDONLY) = -1 ENOENT (No such file or directory)
-
write(2, "Error 6 initializing SQL*Plus\n", 30Error 6 initializing SQL*Plus
-
) = 30
-
write(2, "SP2-0667: Message file sp1
" ..., 47SP2-0667: Message file sp1<lang>.msb not found
-
) = 47
-
write(2, "SP2-0750: You may need to set OR"..., 76SP2-0750: You may need to set ORACLE_HOME to your Oracle software directory
-
) = 76
- exit_group(1) = ?
-
[oracle@oracle bin]$ strace ./sqlplus
-
execve("./sqlplus", ["./sqlplus"], [/* 36 vars */]) = 0
-
brk(0) = 0x1a0d000
-
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f4f0a3c0000
-
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
- ...省略部分内容...
-
open("/u01/app/oracle/product/11.2.0/db_1/lib/libsqlplusic.so", O_RDONLY) = -1 ENOENT (No such file or directory)
-
open("/u01/app/oracle/product/11.2.0/db_1/lib/libociicus.so", O_RDONLY) = -1 ENOENT (No such file or directory)
-
open("/u01/app/oracle/product/11.2.0/db_1/lib/libociei.so", O_RDONLY) = -1 ENOENT (No such file or directory)
-
write(2, "Error 6 initializing SQL*Plus\n", 30Error 6 initializing SQL*Plus
-
) = 30
-
write(2, "SP2-0667: Message file sp1
" ..., 47SP2-0667: Message file sp1<lang>.msb not found
-
) = 47
-
write(2, "SP2-0750: You may need to set OR"..., 76SP2-0750: You may need to set ORACLE_HOME to your Oracle software directory
-
) = 76
- exit_group(1) = ?
2.通过对比正常时候的strace sqlplus输出文本发现,前边部分的调用文本几乎相同,除了一些函数的参数值不同。因此strace并不能找出异常时候的信息,strace并不能更详细的追踪出不同之处,比较遗憾。
只能从另外一个角度重新思考了,由于ORACLE_HOME=/u01/app/oracle/product/11.2.0/db_1引起报错是因为export有无的原因,于是:
- [oracle@oracle ~]$ env | grep ORACLE_HOME
- [oracle@oracle ~]$ export | grep ORACLE_HOME
-
[oracle@oracle ~]$ export | grep ORACLE_BASE
-
declare -x ORACLE_BASE="/u01/app/oracle"
-
[oracle@oracle ~]$ env | grep ORACLE_BASE
- ORACLE_BASE=/u01/app/oracle
-
[oracle@oracle ~]$ echo $ORACLE_HOME
/u01/app/oracle/product/11.2.0/db_1
首先,ORACLE从ORACLE_HOME获取值,而ORACLE_HOME也确实有值,报错说明这里的值没起来作用,于是尝试通过调用脚本获取ORACLE_HOME的值,再去观察结果。
-
[oracle@oracle piscescanon]$ ll
-
total 4
-
-rwxr--r--. 1 oracle oinstall 18 Dec 30 00:32 test.sh
-
[oracle@oracle piscescanon]$ cat test.sh
-
echo $ORACLE_HOME
- [oracle@oracle piscescanon]$ ./test.sh
-
那么上边的两个问题也很明显了,第一个问题是因为ORACLE_HOME已经在环境变量里边了,就算是更改之后再次source也还是一样,因此需要重新开启会话,第二个问题就很明显了。
从该案例看的话,ORACLE_HOME=/u01/app/oracle/product/11.2.0/db_1,ORACLE_HOME只是局部变量,局部变量只对当前shell生效。
而export ORACLE_HOME=/u01/app/oracle/product/11.2.0/db_1使得ORACLE_HOME变成系统环境变量,系统环境变量对当前shell以及子shell、子子shell等下一级有效。
-
[oracle@oracle piscescanon]$ export ORACLE_HOME=/u01/app/oracle/product/11.2.0/db_1
-
[oracle@oracle piscescanon]$ ./test.sh
- /u01/app/oracle/product/11.2.0/db_1
网上的资料:
用户登录到Linux系统后,系统将启动一个用户shell。在这个shell中,可以使用shell命令或声明变量,也可以创建并运行 shell脚本程序。运行shell脚本程序时,系统将创建一个子shell。
此时,系统中将有两个shell,一个是登录时系统启动的shell,另一 个是系统为运行脚本程序创建的shell。当一个脚本程序运行完毕,它的脚本shell将终止,可以返回到执行该脚本之前的shell。
从这种意义上来 说,用户可以有许多 shell,每个shell都是由某个shell(称为父shell)派生的。
在子 shell中定义的变量只在该子shell内有效。如果在一个shell脚本程序中定义了一个变量,当该脚本程序运行时,这个定义的变量只是该脚本程序内 的一个局部变量,其他的shell不能引用它,要使某个变量的值可以在其他shell中被改变,可以使用export命令对已定义的变量进行输出。
export命令将使系统在创建每一个新的shell时定义这个变量的一个拷贝。这个过程称之为变量输出。
对于export和shell的关系,有结论如下:
1、执行脚本时是在一个子shell环境运行的,脚本执行完后该子shell自动退出;
2、一个shell中的系统环境变量才会被复制到子 shell中(用export定义的变量);
3、一个shell中的系统环境变量只对该shell或者它的子shell有效,该shell结束时变量消失 (并不能返回到父shell中);
3、不用export定义的变量只对该shell有效,对子shell也是无效的。