One of the reasons you put systems into a datacenter is to have redundant power, so your precious servers never lose power.
No power, no server.
No servers, no database!
No databases, unhappy users armed with torches and pitchforks!
All that being said, you can hit an interesting potential problem after restoring power to a virtualized ODA. After the reboot, none of your VMs are running and it appears that all of your shared repositories are missing!
Log into ODA_BASE and check the repositories:
[root@oda1a bin]# ./oakcli show repo
NAME      TYPE    NODENUM  STATE
odarepo1  local   0        N/A
odarepo2  local   1        N/A
See, no shared repositories!
The fix is simple: restart oak on both nodes, one at a time.
[root@oda1a bin]# ./oakcli restart oak
Restarting the oakd..
Killing the running oakd with pid 13298
Successfully re-started the oakd..
and
[root@oda1b bin]# ./oakcli restart oak
Restarting the oakd..
Killing the running oakd with pid 13142
Successfully re-started the oakd..
Now, after restarting, check again:
[root@oda1a bin]# ./oakcli show repo
NAME      TYPE    NODENUM  STATE
odarepo1  local   0        N/A
odarepo2  local   1        N/A
repo0     shared  0        UNKNOWN
repo0     shared  1        ONLINE
You will see them coming back online. Wait a few minutes more and all will be good.
[root@oda1a bin]# ./oakcli show repo
NAME      TYPE    NODENUM  STATE
odarepo1  local   0        N/A
odarepo2  local   1        N/A
repo0     shared  0        ONLINE
repo0     shared  1        ONLINE
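If you would rather not re-run the check by hand, the waiting step can be scripted. The following is only a rough sketch: it assumes the four-column layout shown above (NAME, TYPE, NODENUM, STATE), and the function names, the OAKCLI variable, the 20 attempts and the 30-second interval are all made up for illustration, not part of oakcli.

```shell
# Count shared-repository rows whose STATE column is not ONLINE.
# Reads `oakcli show repo` output on stdin (assumed 4-column layout).
count_pending_shared() {
    awk '$2 == "shared" && $4 != "ONLINE" { n++ } END { print n + 0 }'
}

# Count shared-repository rows in any state.
count_shared() {
    awk '$2 == "shared" { n++ } END { print n + 0 }'
}

# Poll until at least one shared repository is listed and all are ONLINE.
# $1 = number of checks (default 20), $2 = seconds between checks (default 30).
# OAKCLI is a hypothetical override for the path to oakcli.
wait_for_shared_repos() {
    _tries="${1:-20}"
    _delay="${2:-30}"
    _i=1
    while [ "$_i" -le "$_tries" ]; do
        _out="$("${OAKCLI:-./oakcli}" show repo)"
        if [ "$(printf '%s\n' "$_out" | count_shared)" -gt 0 ] &&
           [ "$(printf '%s\n' "$_out" | count_pending_shared)" -eq 0 ]; then
            echo "all shared repositories ONLINE"
            return 0
        fi
        sleep "$_delay"
        _i=$((_i + 1))
    done
    echo "shared repositories still not ONLINE after $_tries checks" >&2
    return 1
}

# Usage, from the directory that holds oakcli on ODA_BASE:
#   wait_for_shared_repos
```

Requiring at least one shared row matters here, because right after the reboot the listing shows no shared repositories at all, and "nothing pending" would otherwise look like success.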
The cause is a timing issue. When the ODA is restarted, the ASM instances are restarted too. While ASM is still mounting the ACFS file systems at boot, oak is already checking for the repositories. Because ACFS is not yet mounted, oak finds no shared repositories.
A quick fix to a simple problem. Now all you need to deal with are the pitchforks!
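Given that timing issue, it can be worth confirming that ACFS is actually mounted before restarting oak. A minimal sketch, assuming ACFS file systems show up with type "acfs" in /proc/mounts; the function name is hypothetical and the check is a convenience, not an Oracle-documented step.

```shell
# Print the mount point of every ACFS file system found in a mounts table
# read from stdin (same layout as /proc/mounts: device, mount point, type, ...).
list_acfs_mounts() {
    awk '$3 == "acfs" { print $2 }'
}

# Usage on ODA_BASE (illustrative):
#   if [ -n "$(list_acfs_mounts < /proc/mounts)" ]; then
#       ./oakcli restart oak
#   else
#       echo "no ACFS file systems mounted yet, wait and retry"
#   fi
```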
Good point Erik. Thanks for the tip!
The nodes on our ODA X5-2 went down due to an overtemp condition in our datacenter. The air handling issue was resolved and we got the nodes back online with assistance from Oracle through an SR, but oakd did not start on node 0, and node 1 became the master, as verified with ‘oakcli show ismaster’.
At Oracle's direction, we manually ran ‘oakcli restart oak’ on node 0 and it came back up, but in slave mode.
I have a question in to Oracle via the SR, but will ask here also…will restarting the oak on node 1 force the Master state back on node 0? Or are the steps to flip this more involved?
Thank you.
Generally the master should be the “0” server when available. The main thing to watch for is that you should always run the oakcli commands that manage the ODA on the master. If you want to move it when “1” is the master, run oakcli restart oak on the “0” node and then on the “1” node.