root.sh fails to start oc4j on node1
This question has been Answered.
User11343197-Oracle Jun 6, 2016 6:13 AM
Hello all,
I am trying to install RAC 12.1.0 on two nodes running OL 6.4 using NFS shares, following the instructions at https://oracle-base.com/articles/11g/oracle-db-11gr2-rac-installation-on-linux-using-nfs. During the grid installation, running root.sh failed as shown below:
CRS-4123: Oracle High Availability Services has been started.
2016/06/03 09:57:24 CLSRSC-343: Successfully started Oracle Clusterware stack
2016/06/03 10:03:55 CLSRSC-1003: Failed to start resource OC4J
2016/06/03 10:03:57 CLSRSC-287: FirstNode configuration failed
Died at /u01/app/12.1.0/grid/crs/install/crsinstall.pm line 2398.
The command '/u01/app/12.1.0/grid/perl/bin/perl -I/u01/app/12.1.0/grid/perl/lib -I/u01/app/12.1.0/grid/crs/install /u01/app/12.1.0/grid/crs/install/rootcrs.pl ' execution failed
Looking at the OC4J logs, I found the root cause shown below. Adding dms.jar and dmsapp.jar fixed this particular error.
16/06/03 09:59:06 oracle.classloader.util.AnnotatedClassNotFoundException:
Missing class: oracle.dms.jmx.AggreMBean
Dependent class: com.evermind.server.Application
Loader: oc4j:10.1.3
Code-Source: /u01/app/12.1.0/grid/oc4j/j2ee/home/lib/oc4j-internal.jar
Configuration: <code-source> in META-INF/boot.xml in /u01/app/12.1.0/grid/oc4j/j2ee/home/oc4j.jar
Missing class: oracle.dms.jmx.AggreMBean
Dependent class: com.evermind.server.Application
Loader: oc4j:10.1.3
Code-Source: /u01/app/12.1.0/grid/oc4j/j2ee/home/lib/oc4j-internal.jar
Configuration: <code-source> in META-INF/boot.xml in /u01/app/12.1.0/grid/oc4j/j2ee/home/oc4j.jar
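As a rough way to check which jar, if any, provides a class reported as missing, the sketch below scans a directory tree for jars containing it. The function name `jars_containing` and the idea of searching the grid home are illustrative assumptions, not part of any Oracle tooling.

```python
import zipfile
from pathlib import Path
from typing import List


def jars_containing(class_name: str, search_root: str) -> List[str]:
    """Return paths of .jar files under search_root that contain class_name.

    class_name is a fully qualified Java class name,
    e.g. "oracle.dms.jmx.AggreMBean".
    """
    # A class a.b.C is stored in a jar as the entry a/b/C.class
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in sorted(Path(search_root).rglob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    hits.append(str(jar))
        except (zipfile.BadZipFile, OSError):
            # Skip unreadable or corrupt archives instead of aborting the scan
            continue
    return hits
```

For example, `jars_containing("oracle.dms.jmx.AggreMBean", "/u01/app/12.1.0/grid")` would list any jars under the grid home that ship the class; an empty list suggests the jar is missing from that tree.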
srvctl still fails to start OC4J, without reporting any useful error message in the logs.
The trace from srvctl is given below.
main 2016-06-03 12:53:40.737 UTC CRSNative.genericStartResource:307 Failed to start resource: Name: ora.oc4j, node: null, filter: null, msg CRS-2674: Start of 'ora.oc4j' on 'el01cn04' failed
CRS-2632: There are no more servers to try to place resource 'ora.oc4j' on that would satisfy its placement policy
main 2016-06-03 12:53:40.754 UTC StartAction.executeOC4J:2268 SoftwareModule exception PRCR-1079 : Failed to start resource ora.oc4j
CRS-2674: Start of 'ora.oc4j' on 'el01cn04' failed
CRS-2632: There are no more servers to try to place resource 'ora.oc4j' on that would satisfy its placement policy
OC4J could not be started
main 2016-06-03 12:53:40.795 UTC InterruptHandler.unRegisterInterruptHandler:76 UNRegistering shutdown hook.....
main 2016-06-03 12:53:40.795 UTC InterruptHandler.unRegisterInterruptHandler:81 UnRegistered shutdown hook.....
main 2016-06-03 12:53:40.795 UTC OPSCTLDriver.main:242 OPSCTL execute() failed. Unregistered OPSCTL driver's interrupt handler
main 2016-06-03 12:53:40.795 UTC OPSCTLDriver.main:247 exiting abnormally due to FrameworkException
PRCR-1079 : Failed to start resource ora.oc4j
CRS-2674: Start of 'ora.oc4j' on 'el01cn04' failed
CRS-2632: There are no more servers to try to place resource 'ora.oc4j' on that would satisfy its placement policy
main 2016-06-03 12:53:40.796 UTC OPSCTLDriver.main:249 PRCR-1079 : Failed to start resource ora.oc4j
CRS-2674: Start of 'ora.oc4j' on 'el01cn04' failed
CRS-2632: There are no more servers to try to place resource 'ora.oc4j' on that would satisfy its placement policy
oracle.ops.opsctl.StartAction.executeOC4J(StartAction.java:2271)
oracle.ops.opsctl.Action.execute(Action.java:436)
oracle.ops.opsctl.OPSCTLDriver.execute(OPSCTLDriver.java:502)
oracle.ops.opsctl.OPSCTLDriver.main(OPSCTLDriver.java:231)
main 2016-06-03 12:53:40.796 UTC SRVMContext.term:151 Performing SRVM Context Term. Term counter is 2
I found this in /u01/app/oracle/diag/crs/el01cn04/crs/trace/crsd.trc:
2016-06-03 13:37:08.292212 : AGFW:515852032: {1:63881:1910} Received the reply to the message: RESOURCE_START[ora.oc4j 1 1] ID 4098:8367 from the agent /u01/app/12.1.0/grid/bin/scriptagent_oracle
2016-06-03 13:37:08.292665 : AGFW:515852032: {1:63881:1910} Agfw Proxy Server sending the reply to PE for message:RESOURCE_START[ora.oc4j 1 1] ID 4098:8351
2016-06-03 13:37:08.292928 : CRSPE:505345792: {1:63881:1910} Received reply to action [Start] message ID: 8351
2016-06-03 13:37:08.295089 : CRSD:505345792: {1:63881:1910} {1:63881:1910} Created alert : (:CRSPE00163:) : Start action timed out!
2016-06-03 13:37:08.295106 : CRSPE:505345792: {1:63881:1910} Start action failed with error code: 3
2016-06-03 13:37:08.295944 : CRSRPT:503244544: {1:63881:1910} Published to EVM CRS_ACTION_FAILURE for ora.oc4j
/etc/hosts has a 127.0.0.1 localhost entry.
I have been stuck at this point for a while and would really appreciate it if anyone could help me resolve this.
Thanks
Shravan
Correct Answer by User11343197-Oracle on Jun 7, 2016 2:40 PM
After looking through the decompiled Java classes and the Perl script they eventually call, I found that jps was not returning any PIDs, which causes the Perl script to report an OC4J failure after a timeout. Although the oracle user has rw permissions on the /tmp/hsperfdata_oracle directory, for some reason the PID files were not created when OC4J was launched, so jps returned nothing. Deleting the /tmp/hsperfdata_oracle directory fixed the issue.
-Shravan
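For anyone hitting the same symptom, a minimal sketch of a check on the directory jps reads is given below. It relies on the HotSpot convention that each running JVM writes a file named after its PID into hsperfdata_<user> under the temp directory; the function name `inspect_hsperfdata` and the shape of the returned report are illustrative assumptions.

```python
import os
from pathlib import Path


def inspect_hsperfdata(user: str, tmp_root: str = "/tmp") -> dict:
    """Report the state of the JVM perf-data directory for the given user."""
    perf_dir = Path(tmp_root) / f"hsperfdata_{user}"
    report = {
        "path": str(perf_dir),
        "exists": perf_dir.is_dir(),
        "writable": False,
        "pids": [],
    }
    if report["exists"]:
        # The JVM needs write and search permission to create its PID file
        report["writable"] = os.access(perf_dir, os.W_OK | os.X_OK)
        # Each running JVM is expected to leave a file named after its PID
        report["pids"] = sorted(
            int(p.name) for p in perf_dir.iterdir() if p.name.isdigit()
        )
    return report
```

Calling `inspect_hsperfdata("oracle")` while OC4J is supposedly running and getting an empty `pids` list, even though the directory exists and is writable, would match the situation described above, where jps had nothing to report.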