Solving MGMTDB errors during 18c GI RU apply
Category: Engineer System Author: Fernando Simon (Board Member) Date: 6 years ago Comments: 0


Recently I performed the upgrade of Oracle GI to 19c, from 18.6.0.0 to 19.5.0.0. One step that was not shown there is that, because of requirements, the GI was first upgraded from 18.2.0.0 to 18.6.0.0. This upgrade is just a Release Update (RU) apply using the opatchauto command.
But during this upgrade from 18.2 to 18.6, I faced errors (more than once – five times, to be precise) caused by the MGMTDB. I got these errors:
  • ORA-12514, TNS: Listener does not currently know of service requested in connect descriptor
  • ORA-01017: invalid username/password; logon denied
  • MGTCA-1005 : Could not connect to the GIMR.
  • CRS-10407: (:CLSCRED1079:)Credential domain does not exist.
Here I will show how to solve these errors, how to identify whether everything is fine, and whether you can continue. Keep in mind that this is just an example; always open a support SR to identify the source of the error.

 

RU and Patch Process

Applying an 18c RU over GI is simple: basically, you just need to call opatchauto on every node. One example of a correct execution is:

 

[root@exa01vm01 ~]# opatchauto apply /u01/patches/grid/29301682 -oh /u01/app/18.0.0/grid




OPatchauto session is initiated at Mon Jan 20 12:48:25 2020




System initialization log file is /u01/app/18.0.0/grid/cfgtoollogs/opatchautodb/systemconfig2020-01-20_12-48-48PM.log.




Session log file is /u01/app/18.0.0/grid/cfgtoollogs/opatchauto/opatchauto2020-01-20_12-49-44PM.log

The id for this session is RQ3F




Executing OPatch prereq operations to verify patch applicability on home /u01/app/18.0.0/grid

Patch applicability verified successfully on home /u01/app/18.0.0/grid







Bringing down CRS service on home /u01/app/18.0.0/grid

CRS service brought down successfully on home /u01/app/18.0.0/grid







Start applying binary patch on home /u01/app/18.0.0/grid

Binary patch applied successfully on home /u01/app/18.0.0/grid







Starting CRS service on home /u01/app/18.0.0/grid

CRS service started successfully on home /u01/app/18.0.0/grid




OPatchAuto successful.




--------------------------------Summary--------------------------------




Patching is completed successfully. Please find the summary as follows:




Host:exa01vm01

CRS Home:/u01/app/18.0.0/grid

Version:18.0.0.0.0

Summary:




==Following patches were SUCCESSFULLY applied:




Patch: /u01/patches/grid/29301682/28435192

Log: /u01/app/18.0.0/grid/cfgtoollogs/opatchauto/core/opatch/opatch2020-01-20_12-51-42PM_1.log




Patch: /u01/patches/grid/29301682/28547619

Log: /u01/app/18.0.0/grid/cfgtoollogs/opatchauto/core/opatch/opatch2020-01-20_12-51-42PM_1.log




Patch: /u01/patches/grid/29301682/29301631

Log: /u01/app/18.0.0/grid/cfgtoollogs/opatchauto/core/opatch/opatch2020-01-20_12-51-42PM_1.log




Patch: /u01/patches/grid/29301682/29301643

Log: /u01/app/18.0.0/grid/cfgtoollogs/opatchauto/core/opatch/opatch2020-01-20_12-51-42PM_1.log




Patch: /u01/patches/grid/29301682/29302264

Log: /u01/app/18.0.0/grid/cfgtoollogs/opatchauto/core/opatch/opatch2020-01-20_12-51-42PM_1.log










OPatchauto session completed at Mon Jan 20 13:00:01 2020

Time taken to complete the session 11 minutes, 36 seconds

[root@exa01vm01 ~]#










####################################################

#Execute in node 2

####################################################







[root@exa02vm01 ~]# opatchauto apply /u01/patches/grid/29301682 -oh /u01/app/18.0.0/grid




OPatchauto session is initiated at Tue Jan 21 13:12:37 2020




System initialization log file is /u01/app/18.0.0/grid/cfgtoollogs/opatchautodb/systemconfig2020-01-21_01-12-51PM.log.




Session log file is /u01/app/18.0.0/grid/cfgtoollogs/opatchauto/opatchauto2020-01-21_01-13-37PM.log

The id for this session is NFXL




Executing OPatch prereq operations to verify patch applicability on home /u01/app/18.0.0/grid

Patch applicability verified successfully on home /u01/app/18.0.0/grid







Bringing down CRS service on home /u01/app/18.0.0/grid

CRS service brought down successfully on home /u01/app/18.0.0/grid







Start applying binary patch on home /u01/app/18.0.0/grid

Binary patch applied successfully on home /u01/app/18.0.0/grid







Starting CRS service on home /u01/app/18.0.0/grid

CRS service started successfully on home /u01/app/18.0.0/grid




OPatchAuto successful.




--------------------------------Summary--------------------------------




Patching is completed successfully. Please find the summary as follows:




Host:exa02vm01

CRS Home:/u01/app/18.0.0/grid

Version:18.0.0.0.0

Summary:




==Following patches were SUCCESSFULLY applied:




Patch: /u01/patches/grid/29301682/28435192

Log: /u01/app/18.0.0/grid/cfgtoollogs/opatchauto/core/opatch/opatch2020-01-21_13-15-58PM_1.log




Patch: /u01/patches/grid/29301682/28547619

Log: /u01/app/18.0.0/grid/cfgtoollogs/opatchauto/core/opatch/opatch2020-01-21_13-15-58PM_1.log




Patch: /u01/patches/grid/29301682/29301631

Log: /u01/app/18.0.0/grid/cfgtoollogs/opatchauto/core/opatch/opatch2020-01-21_13-15-58PM_1.log




Patch: /u01/patches/grid/29301682/29301643

Log: /u01/app/18.0.0/grid/cfgtoollogs/opatchauto/core/opatch/opatch2020-01-21_13-15-58PM_1.log




Patch: /u01/patches/grid/29301682/29302264

Log: /u01/app/18.0.0/grid/cfgtoollogs/opatchauto/core/opatch/opatch2020-01-21_13-15-58PM_1.log










OPatchauto session completed at Tue Jan 21 13:35:21 2020

Time taken to complete the session 22 minutes, 44 seconds

[root@exa02vm01 ~]#

 

As you can see, you call it on one node first and, after it finishes, on the second node. Basically, during this process opatchauto applies the binary updates to the GI home, restarts the GI (so it comes up with the new version), and applies the needed SQL patches over the MGMTDB and ASM.
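
If you want to double-check the result on each node, a quick verification (a hedged sketch; run from the GI home, adjust the paths to your environment) is to list the patches present in the home and query the cluster patch level:

export ORACLE_HOME=/u01/app/18.0.0/grid
$ORACLE_HOME/OPatch/opatch lspatches                  # lists the patches now installed in the GI home
$ORACLE_HOME/bin/crsctl query crs activeversion -f    # cluster active version, upgrade state and active patch level
$ORACLE_HOME/bin/crsctl query crs softwarepatch       # patch level of the local node software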

 

MGMTDB and errors

But some errors can occur during the apply, mainly related to the MGMTDB. These errors are usually related to the communication between the MGMTDB and CRS resources (listener and others). They usually occur when applying the patch on the last node of the cluster, because the MGMTDB is relocated from node to node (until the last one) during the patch application.
Some examples of errors that I faced:
ORA-12514, TNS:listener does not currently know of service requested in connect descriptor

 

...

CRS-4123: Oracle High Availability Services has been started.

Oracle Clusterware active version on the cluster is [18.0.0.0.0]. The cluster upgrade state is [NORMAL]. The cluster active patch level is [4127601284].

SQL Patching tool version 18.0.0.0.0 Production on Wed Jan 15 13:57:08 2020

Copyright (c) 2012, 2019, Oracle.  All rights reserved.




Log file for this invocation: /u01/app/grid/cfgtoollogs/sqlpatch/sqlpatch_360667_2020_01_15_13_57_08/sqlpatch_invocation.log




Connecting to database...OK

Gathering database info...done




Note:  Datapatch will only apply or rollback SQL fixes for PDBs

       that are in an open state, no patches will be applied to closed PDBs.

       Please refer to Note: Datapatch: Database 12c Post Patch SQL Automation

       (Doc ID 1585822.1)




Bootstrapping registry and package to current versions...done

Determining current state...done




Current state of interim SQL patches:

  No interim patches found




Current state of release update SQL patches:

  Binary registry:

    18.6.0.0.0 Release_Update 1903190102: Installed

  PDB CDB$ROOT:

    Applied 18.2.0.0.0 Release_Update 1804041635 successfully on 02-OCT-18 03.11.47.950696 PM

  PDB PDB$SEED:

    Applied 18.2.0.0.0 Release_Update 1804041635 successfully on 02-OCT-18 03.16.34.680672 PM




Adding patches to installation queue and performing prereq checks...done

Installation queue:

  For the following PDBs: CDB$ROOT PDB$SEED

    No interim patches need to be rolled back

    Patch 29301631 (Database Release Update : 18.6.0.0.190416 (29301631)):

      Apply from 18.2.0.0.0 Release_Update 1804041635 to 18.6.0.0.0 Release_Update 1903190102

    No interim patches need to be applied




Installing patches...

Patch installation complete.  Total patches installed: 2




Validating logfiles...done

Patch 29301631 apply (pdb CDB$ROOT): SUCCESS

  logfile: /u01/app/grid/cfgtoollogs/sqlpatch/29301631/22832106/29301631_apply__MGMTDB_CDBROOT_2020Jan15_13_58_14.log (no errors)

Patch 29301631 apply (pdb PDB$SEED): SUCCESS

  logfile: /u01/app/grid/cfgtoollogs/sqlpatch/29301631/22832106/29301631_apply__MGMTDB_PDBSEED_2020Jan15_14_01_22.log (no errors)

SQL Patching tool complete on Wed Jan 15 14:02:41 2020

MGTCA-1005 : Could not connect to the GIMR.




Listener refused the connection with the following error:




ORA-12514, TNS:listener does not currently know of service requested in connect descriptor










2020/01/15 14:02:49 CLSRSC-180: An error occurred while executing the command '/u01/app/18.0.0/grid/bin/mgmtca applysql'




After fixing the cause of failure Run opatchauto resume




]

OPATCHAUTO-68061: The orchestration engine failed.

OPATCHAUTO-68061: The orchestration engine failed with return code 1

OPATCHAUTO-68061: Check the log for more details.

OPatchAuto failed.




OPatchauto session completed at Wed Jan 15 14:02:50 2020

Time taken to complete the session 28 minutes, 34 seconds




 opatchauto failed with error code 42

[root@exa03vm02 ~]#

 

ORA-01017: invalid username/password; logon denied

 

CRS-4123: Oracle High Availability Services has been started.

Oracle Clusterware active version on the cluster is [18.0.0.0.0]. The cluster upgrade state is [NORMAL]. The cluster active patch level is [4127601284].

SQL Patching tool version 18.0.0.0.0 Production on Fri Nov 29 10:32:41 2019

Copyright (c) 2012, 2019, Oracle.  All rights reserved.




Log file for this invocation: /u01/app/grid/cfgtoollogs/sqlpatch/sqlpatch_128123_2019_11_29_10_32_41/sqlpatch_invocation.log




Connecting to database...OK

Gathering database info...done




Note:  Datapatch will only apply or rollback SQL fixes for PDBs

       that are in an open state, no patches will be applied to closed PDBs.

       Please refer to Note: Datapatch: Database 12c Post Patch SQL Automation

       (Doc ID 1585822.1)




Bootstrapping registry and package to current versions...done

Determining current state...done




Current state of interim SQL patches:

  No interim patches found




Current state of release update SQL patches:

  Binary registry:

    18.6.0.0.0 Release_Update 1903190102: Installed

  PDB CDB$ROOT:

    Applied 18.2.0.0.0 Release_Update 1804041635 successfully on 02-OCT-18 01.32.26.561113 PM

  PDB GIMR_DSCREP_10:

    Applied 18.2.0.0.0 Release_Update 1804041635 successfully on 02-OCT-18 01.37.12.290147 PM

  PDB PDB$SEED:

    Applied 18.2.0.0.0 Release_Update 1804041635 successfully on 02-OCT-18 01.37.12.290147 PM




Adding patches to installation queue and performing prereq checks...done

Installation queue:

  For the following PDBs: CDB$ROOT PDB$SEED GIMR_DSCREP_10

    No interim patches need to be rolled back

    Patch 29301631 (Database Release Update : 18.6.0.0.190416 (29301631)):

      Apply from 18.2.0.0.0 Release_Update 1804041635 to 18.6.0.0.0 Release_Update 1903190102

    No interim patches need to be applied




Installing patches...

Patch installation complete.  Total patches installed: 3




Validating logfiles...done

Patch 29301631 apply (pdb CDB$ROOT): SUCCESS

  logfile: /u01/app/grid/cfgtoollogs/sqlpatch/29301631/22832106/29301631_apply__MGMTDB_CDBROOT_2019Nov29_10_33_45.log (no errors)

Patch 29301631 apply (pdb PDB$SEED): SUCCESS

  logfile: /u01/app/grid/cfgtoollogs/sqlpatch/29301631/22832106/29301631_apply__MGMTDB_PDBSEED_2019Nov29_10_36_02.log (no errors)

Patch 29301631 apply (pdb GIMR_DSCREP_10): SUCCESS

  logfile: /u01/app/grid/cfgtoollogs/sqlpatch/29301631/22832106/29301631_apply__MGMTDB_GIMR_DSCREP_10_2019Nov29_10_36_02.log (no errors)

SQL Patching tool complete on Fri Nov 29 10:38:04 2019

MGTCA-1005 : Could not connect to the GIMR.




ORA-01017: invalid username/password; logon denied










2019/11/29 10:38:13 CLSRSC-180: An error occurred while executing the command '/u01/app/18.0.0/grid/bin/mgmtca applysql'




After fixing the cause of failure Run opatchauto resume




]

OPATCHAUTO-68061: The orchestration engine failed.

OPATCHAUTO-68061: The orchestration engine failed with return code 1

OPATCHAUTO-68061: Check the log for more details.

OPatchAuto failed.




OPatchauto session completed at Fri Nov 29 10:38:15 2019

Time taken to complete the session 26 minutes, 48 seconds




 opatchauto failed with error code 42

[root@exa01vm02 ~]#

 

And inside the opatchauto log you can see something like:

 

SQL Patching tool complete on Fri Nov 29 10:44:09 2019

MGTCA-1005 : Could not connect to the GIMR.




ORA-01017: invalid username/password; logon denied










2019/11/29 10:44:18 CLSRSC-180: An error occurred while executing the command '/u01/app/18.0.0/grid/bin/mgmtca applysql'




After fixing the cause of failure Run opatchauto resume




]]. Failures:

OPATCHAUTO-68067: Check the details to determine the cause of the failure.

        at com.oracle.glcm.patch.auto.action.PatchActionExecutor.execute(PatchActionExecutor.java:172)

        at com.oracle.glcm.patch.auto.wizard.silent.tasks.PatchActionTask.execute(PatchActionTask.java:102)

        ... 2 more

2019-11-29 10:44:19,660 INFO  [1] com.oracle.glcm.patch.auto.db.integration.model.productsupport.DBBaseProductSupport - Space available after session: 29898 MB

2019-11-29 10:44:19,728 SEVERE [1] com.oracle.glcm.patch.auto.OPatchAuto - OPatchAuto failed.

com.oracle.glcm.patch.auto.OPatchAutoException: OPATCHAUTO-68061: The orchestration engine failed.

OPATCHAUTO-68061: The orchestration engine failed with return code 1

OPATCHAUTO-68061: Check the log for more details.

        at com.oracle.glcm.patch.auto.OrchestrationEngineImpl.orchestrate(OrchestrationEngineImpl.java:40)

        at com.oracle.glcm.patch.auto.OPatchAuto.orchestrate(OPatchAuto.java:858)

        at com.oracle.glcm.patch.auto.OPatchAuto.orchestrate(OPatchAuto.java:398)

        at com.oracle.glcm.patch.auto.OPatchAuto.orchestrate(OPatchAuto.java:344)

        at com.oracle.glcm.patch.auto.OPatchAuto.main(OPatchAuto.java:212)

2019-11-29 10:44:19,729 INFO  [1] com.oracle.cie.common.util.reporting.CommonReporter - Reporting console output : Message{id='null', message='OPATCHAUTO-68061: The orchestration engine failed.

OPATCHAUTO-68061: The orchestration engine failed with return code 1

OPATCHAUTO-68061: Check the log for more details.'}

2019-11-29 10:44:19,729 INFO  [1] com.oracle.cie.common.util.reporting.CommonReporter - Reporting console output : Message{id='null', message='OPatchAuto failed.'}

^C

[root@exa01vm02 ~]#

 

But the common point in all cases is that the failure occurred during the "/u01/app/18.0.0/grid/bin/mgmtca applysql" call, because opatchauto detected that the patch needed to be applied over the MGMTDB, tried to apply it, and hit the error.
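
Before touching anything, it also helps to confirm where the MGMTDB is running and whether its listener resource is up. A minimal check from the current GI home (a hedged sketch; commands only, output omitted, paths follow the environment above):

/u01/app/18.0.0/grid/bin/srvctl status mgmtdb     # shows on which node the -MGMTDB instance is running
/u01/app/18.0.0/grid/bin/srvctl status mgmtlsnr   # shows whether the MGMTLSNR listener is up
/u01/app/18.0.0/grid/bin/srvctl config mgmtdb     # shows how the repository is registered in the cluster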

 

Solving the issue

After some digging on the web and searching MOS/Metalink for the error, nothing was found: no bug, hint, or anything related. Even after opening an SR, the answer was always the same: inconclusive and not enough to build a fix. But in every case one solution/workaround could be applied, and it worked every time.
First, just to show that if you look inside the MGMTDB itself, you can see that the patch was applied correctly:

 

[grid@exa01vm02 -MGMTDB]$ sqlplus / as sysdba




SQL*Plus: Release 18.0.0.0.0 - Production on Thu Nov 14 09:19:05 2019

Version 18.6.0.0.0




Copyright (c) 1982, 2018, Oracle.  All rights reserved.







Connected to:

Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production

Version 18.6.0.0.0




SQL> show pdbs




    CON_ID CON_NAME                       OPEN MODE  RESTRICTED

---------- ------------------------------ ---------- ----------

         2 PDB$SEED                       READ ONLY  NO

SQL> select instance_name from v$instance;




INSTANCE_NAME

----------------

-MGMTDB




SQL> exit

Disconnected from Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production

Version 18.6.0.0.0

[grid@exa01vm02 -MGMTDB]$

[grid@exa01vm02 -MGMTDB]$

[grid@exa01vm02 -MGMTDB]$ cd $ORACLE_HOME/OPatch

[grid@exa01vm02 -MGMTDB]$ ./datapatch -verbose

SQL Patching tool version 18.0.0.0.0 Production on Thu Nov 14 09:21:33 2019

Copyright (c) 2012, 2019, Oracle.  All rights reserved.




Log file for this invocation: /u01/app/grid/cfgtoollogs/sqlpatch/sqlpatch_242627_2019_11_14_09_21_33/sqlpatch_invocation.log




Connecting to database...OK

Gathering database info...done




Note:  Datapatch will only apply or rollback SQL fixes for PDBs

       that are in an open state, no patches will be applied to closed PDBs.

       Please refer to Note: Datapatch: Database 12c Post Patch SQL Automation

       (Doc ID 1585822.1)




Bootstrapping registry and package to current versions...done

Determining current state...done




Current state of interim SQL patches:

  No interim patches found




Current state of release update SQL patches:

  Binary registry:

    18.6.0.0.0 Release_Update 1903190102: Installed

  PDB CDB$ROOT:

    Applied 18.6.0.0.0 Release_Update 1903190102 successfully on 13-NOV-19 05.17.48.183101 PM

  PDB PDB$SEED:

    Applied 18.6.0.0.0 Release_Update 1903190102 successfully on 13-NOV-19 05.17.50.152110 PM




Adding patches to installation queue and performing prereq checks...done

Installation queue:

  For the following PDBs: CDB$ROOT PDB$SEED

    No interim patches need to be rolled back

    No release update patches need to be installed

    No interim patches need to be applied




SQL Patching tool complete on Thu Nov 14 09:21:48 2019

[grid@exa01vm02 -MGMTDB]$

[grid@exa01vm02 -MGMTDB]$

[grid@exa01vm02 -MGMTDB]$

[grid@exa01vm02 -MGMTDB]$ sqlplus / as sysdba




SQL*Plus: Release 18.0.0.0.0 - Production on Thu Nov 14 09:23:35 2019

Version 18.6.0.0.0




Copyright (c) 1982, 2018, Oracle.  All rights reserved.







Connected to:

Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production

Version 18.6.0.0.0




SQL> set linesize 250

SQL> col comments format a50

SQL> select ACTION_TIME,VERSION,COMMENTS from dba_registry_history;




ACTION_TIME                                                                 VERSION                        COMMENTS

--------------------------------------------------------------------------- ------------------------------ --------------------------------------------------

                                                                            18                             RDBMS_18.6.0.0.0DBRU_LINUX.X64_190313.1

19-JUN-18 01.51.48.687187 PM                                                18.0.0.0.0                     Patch applied from 18.1.0.0.0 to 18.2.0.0.0

13-NOV-19 05.16.14.040821 PM                                                18.0.0.0.0                     Patch applied from 18.2.0.0.0 to 18.6.0.0.0




SQL> exit

Disconnected from Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production

Version 18.6.0.0.0

[grid@exa01vm02 -MGMTDB]$

 

As you can see above, both datapatch and the database registry itself report that the patch was applied correctly.

 


 

The Workaround

The workaround bypasses some of the checks executed by opatchauto against the MGMTDB. If you look at the errors above, they were caused because CRS could not communicate correctly with the database, maybe because of some credential failure inside CRS or a listener registration problem.
The first step is to remove the MGMTDB completely. To do that you can use dbca (from the GI home, and on one node only):

 

[grid@exa01vm02 +ASM2]$ $ORACLE_HOME/bin/dbca -silent -deleteDatabase -sourceDB -MGMTDB

[WARNING] [DBT-19202] The Database Configuration Assistant will delete the Oracle instances and datafiles for your database. All information in the database will be destroyed.

Prepare for db operation

32% complete

Connecting to database

35% complete

39% complete

42% complete

45% complete

48% complete

52% complete

65% complete

Updating network configuration files

68% complete

Deleting instance and datafiles

84% complete

100% complete

Database deletion completed.

Look at the log file "/u01/app/grid/cfgtoollogs/dbca/_mgmtdb/_mgmtdb.log" for further details.

[grid@exa01vm02 +ASM2]$

 

After removing it completely, you just need to recreate the MGMTDB. This can be done manually or using a script from MOS. One good source of information is the note How to Move/Recreate GI Management Repository (GIMR / MGMTDB) to Different Shared Storage (Diskgroup, CFS or NFS etc) (Doc ID 1589394.1), where you can find how to create it manually or with the script (the script just calls dbca).
Here I used the script; you can download it from the note MDBUtil: GI Management Repository configuration tool (Doc ID 2065175.1). The script calls DBCA, and you just need to specify the diskgroup that you want to use to store the database files.
In this case, I ran:

 

[grid@exa01vm02 +ASM2]$ /tmp/mdbutil.pl --addmdb --target=+DATAC2

mdbutil.pl version : 1.98

2019-11-14 09:54:58: I Starting To Configure MGMTDB at +DATAC2...

2019-11-14 09:55:19: I Container database creation in progress... for GI 18.0.0.0.0

2019-11-14 10:12:50: I Plugable database creation in progress...

2019-11-14 10:16:59: I Executing "/tmp/mdbutil.pl --addchm" on exa01vm02 as root to configure CHM.

root@exa01vm02's password:

2019-11-14 10:17:26: I Executing "/tmp/mdbutil.pl --addchm" on exa01vm02 as root to configure CHM.

root@exa01vm02's password:

2019-11-14 10:17:34: I MGMTDB & CHM configuration done!

[grid@exa01vm02 +ASM2]$

 

You can even call it in debug mode:

 

[grid@exa02vm03 -MGMTDB]$ /tmp/mdbutil.pl --addmdb --target=+DATAC3 --debug

mdbutil.pl version : 1.98

2020-01-15 15:30:58: D Executing: /u01/app/18.0.0/grid/bin/srvctl status diskgroup -g DATAC3

2020-01-15 15:30:59: D Exit code: 0

2020-01-15 15:30:59: D Output of last command execution:

Disk Group DATAC3 is running on exa02vm03,exa01vm03

2020-01-15 15:30:59: I Starting To Configure MGMTDB at +DATAC3...

2020-01-15 15:30:59: D Executing: /u01/app/18.0.0/grid/bin/srvctl status mgmtlsnr

2020-01-15 15:30:59: D Exit code: 0

2020-01-15 15:30:59: D Output of last command execution:

Listener MGMTLSNR is enabled

2020-01-15 15:30:59: D Executing: /u01/app/18.0.0/grid/bin/srvctl status mgmtdb

2020-01-15 15:31:00: D Exit code: 1

2020-01-15 15:31:00: D Output of last command execution:

PRCD-1120 : The resource for database _mgmtdb could not be found.

2020-01-15 15:31:00: D Executing: /u01/app/18.0.0/grid/bin/srvctl status mgmtdb

2020-01-15 15:31:01: D Exit code: 1

2020-01-15 15:31:01: D Output of last command execution:

PRCD-1120 : The resource for database _mgmtdb could not be found.

2020-01-15 15:31:01: D Executing: /u01/app/18.0.0/grid/bin/srvctl stop mgmtlsnr

2020-01-15 15:31:05: D Exit code: 0

2020-01-15 15:31:05: D Output of last command execution:

2020-01-15 15:31:05: D Executing: /u01/app/18.0.0/grid/bin/crsctl query crs activeversion

2020-01-15 15:31:05: D Exit code: 0

2020-01-15 15:31:05: D Output of last command execution:

Oracle Clusterware active version on the cluster is [18.0.0.0.0]

2020-01-15 15:31:05: D Executing: /u01/app/18.0.0/grid/bin/srvctl enable qosmserver

2020-01-15 15:31:06: D Exit code: 2

2020-01-15 15:31:06: D Output of last command execution:

PRKF-1321 : QoS Management Server is already enabled.

2020-01-15 15:31:06: D Executing: /u01/app/18.0.0/grid/bin/srvctl start qosmserver

2020-01-15 15:31:06: D Exit code: 2

2020-01-15 15:31:06: D Output of last command execution:

PRCC-1014 : qosmserver was already running

2020-01-15 15:31:06: I Container database creation in progress... for GI 18.0.0.0.0

2020-01-15 15:31:06: D Executing: /u01/app/18.0.0/grid/bin/dbca  -silent -createDatabase -createAsContainerDatabase true -templateName MGMTSeed_Database.dbc -sid -MGMTDB -gdbName _mgmtdb -storageType ASM -diskGroupName DATAC3 -datafileJarLocation /u01/app/18.0.0/grid/assistants/dbca/templates -characterset AL32UTF8 -autoGeneratePasswords -skipUserTemplateCheck

2020-01-15 15:47:19: D Exit code: 0

2020-01-15 15:47:19: D Output of last command execution:

Prepare for db operation

2020-01-15 15:47:19: I Plugable database creation in progress...

2020-01-15 15:47:19: D Executing: /u01/app/18.0.0/grid/bin/mgmtca -local

2020-01-15 15:49:28: D Exit code: 0

2020-01-15 15:49:28: D Output of last command execution:

2020-01-15 15:49:28: D Executing: scp /tmp/mdbutil.pl exa01vm03:/tmp/

2020-01-15 15:49:28: D Exit code: 0

2020-01-15 15:49:28: D Output of last command execution:

2020-01-15 15:49:28: I Executing "/tmp/mdbutil.pl --addchm" on exa01vm03 as root to configure CHM.

2020-01-15 15:49:28: D Executing: ssh root@exa01vm03 "/tmp/mdbutil.pl --addchm"

root@exa01vm03's password:

2020-01-15 15:49:39: D Exit code: 0

2020-01-15 15:49:39: D Output of last command execution:

mdbutil.pl version : 1.98

2020-01-15 15:49:39: D Executing: scp /tmp/mdbutil.pl exa02vm03:/tmp/

2020-01-15 15:49:39: D Exit code: 0

2020-01-15 15:49:39: D Output of last command execution:

2020-01-15 15:49:39: I Executing "/tmp/mdbutil.pl --addchm" on exa02vm03 as root to configure CHM.

2020-01-15 15:49:39: D Executing: ssh root@exa02vm03 "/tmp/mdbutil.pl --addchm"

root@exa02vm03's password:

2020-01-15 15:49:46: D Exit code: 0

2020-01-15 15:49:46: D Output of last command execution:

mdbutil.pl version : 1.98

2020-01-15 15:49:46: I MGMTDB & CHM configuration done!

[grid@exa02vm03 -MGMTDB]$
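
Before resuming, it is worth confirming that the recreated repository is registered and running again. A quick hedged check (same srvctl commands the script uses internally; output omitted):

/u01/app/18.0.0/grid/bin/srvctl status mgmtdb
/u01/app/18.0.0/grid/bin/srvctl status mgmtlsnr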

 

Resume and finish

So, after recreating the MGMTDB, we can resume opatchauto:

 

[root@exa02vm03 ~]# opatchauto resume




OPatchauto session is initiated at Wed Jan 15 15:50:15 2020

Session log file is /u01/app/18.0.0/grid/cfgtoollogs/opatchauto/opatchauto2020-01-15_03-50-16PM.log

Resuming existing session with id NRZ1




Starting CRS service on home /u01/app/18.0.0/grid

CRS service started successfully on home /u01/app/18.0.0/grid




OPatchAuto successful.




--------------------------------Summary--------------------------------




Patching is completed successfully. Please find the summary as follows:




Host:exa02vm03

CRS Home:/u01/app/18.0.0/grid

Version:18.0.0.0.0

Summary:




==Following patches were SUCCESSFULLY applied:




Patch: /u01/patches/grid/29301682/28435192

Log: /u01/app/18.0.0/grid/cfgtoollogs/opatchauto/core/opatch/opatch2020-01-15_13-40-42PM_1.log




Patch: /u01/patches/grid/29301682/28547619

Log: /u01/app/18.0.0/grid/cfgtoollogs/opatchauto/core/opatch/opatch2020-01-15_13-40-42PM_1.log




Patch: /u01/patches/grid/29301682/29301631

Log: /u01/app/18.0.0/grid/cfgtoollogs/opatchauto/core/opatch/opatch2020-01-15_13-40-42PM_1.log




Patch: /u01/patches/grid/29301682/29301643

Log: /u01/app/18.0.0/grid/cfgtoollogs/opatchauto/core/opatch/opatch2020-01-15_13-40-42PM_1.log




Patch: /u01/patches/grid/29301682/29302264

Log: /u01/app/18.0.0/grid/cfgtoollogs/opatchauto/core/opatch/opatch2020-01-15_13-40-42PM_1.log










OPatchauto session completed at Wed Jan 15 15:57:24 2020

Time taken to complete the session 7 minutes, 9 seconds

[root@exa02vm03 ~]#

 

But why does this work? The idea is that when opatchauto checks whether the patch needs to be applied over the MGMTDB, it can now reach it, verify that everything is already there, and the call to mgmtca applysql succeeds. And since all the credentials between GI and MGMTDB are freshly created and fine, the check can be done correctly. Another point is that you are recreating the database with a binary that already has the RU applied.
Again, this solution can be applied for these errors when the communication between MGMTDB and GI fails because of credential or listener errors. I also included the complete check of the opatchauto error, to identify exactly where it fails (it is just the mgmtca applysql call), and to verify that the SQL was applied by datapatch and is registered inside the database. To finish, always open an SR.

 

Disclaimer: “The postings on this site are my own and don’t necessarily represent my actual employer positions, strategies or opinions. The information here was edited to be useful for general purpose, specific data and identifications were removed to allow reach the generic audience and to be useful for the community.”


19c Grid Infrastructure Upgrade
Category: Engineer System Author: Fernando Simon (Board Member) Date: 6 years ago Comments: 0


 
Upgrading Grid Infrastructure is an activity that is usually postponed because it involves a sensitive area that, when it does not work, causes big downtime until it is fixed. But in the latest versions it is not a complicated task, and if you follow the basic rules it works without problems.
Here I will show a small example of how to upgrade GI from 18.6.0 to 19.5. The steps below were executed on an Exadata running version 19.2.7.0.0.191012 and GI 18.6.0.0, but they can be done in any environment that supports Oracle GI.

 

Requirements and Plan

 

The basis for every upgrade is, first, to check the requirements. This means not just the operating system version, but also the current GI version (the one that will be upgraded). For Exadata and GI 19 you can follow the note 19c Grid Infrastructure and Database Upgrade steps for Exadata Database Machine running on Oracle Linux (Doc ID 2542082.1), and it can be used as a guide even in non-Exadata environments.
Every upgrade or patch needs to be planned to be successful. Checking the current versions, backups, and so on beforehand is the basis. There is no rule of thumb, but a base plan includes:
  • Check operating system requirements (like version).
  • Check current GI requirements (maybe install additional patches, like 28553832 for Exadata).
  • Check system requirements (space and others).
  • Check Oracle Home requirements (maybe you need to apply additional patches to the Oracle Homes).
  • Download the software and copy it to the server that will be patched (this includes the patches to be applied over the new GI).
  • Prepare X forwarding (you can use xauth and redirection through ssh).
You can follow the note mentioned previously to see all the requirements. Another REQUIRED reference is the blog of Mike Dietrich, which is dedicated to Oracle upgrade practices. Since I am doing this on Oracle Exadata, I downloaded the compatible versions from the note Exadata Database Machine and Exadata Storage Server Supported Versions (Doc ID 888828.1).

 

Pre-Upgrade

 

When you start the upgrade phase, you should already have all the requirements in place (operating system, previous GI patched). The steps here are based on the note 19c Grid Infrastructure and Database Upgrade steps for Exadata Database Machine running on Oracle Linux (Doc ID 2542082.1), but I will include some extra details and tricks that will help you.

 

Memory parameters

It is necessary to have at least 3GB of SGA to correctly upgrade (and run) GI 19c. Check this in the current GI to guarantee that it is OK (executed as the grid user):

 

[grid@exacl04n1 +ASM1]$ sqlplus / as sysdba




SQL*Plus: Release 18.0.0.0.0 - Production on Fri Jan 24 14:57:24 2020

Version 18.6.0.0.0




Copyright (c) 1982, 2018, Oracle.  All rights reserved.







Connected to:

Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production

Version 18.6.0.0.0




SQL> show parameter memory_target




NAME                                 TYPE        VALUE

------------------------------------ ----------- ------------------------------

memory_target                        big integer 0

SQL> show parameter memory_max_target




NAME                                 TYPE        VALUE

------------------------------------ ----------- ------------------------------

memory_max_target                    big integer 0

SQL> show parameter use_large_pages




NAME                                 TYPE        VALUE

------------------------------------ ----------- ------------------------------

use_large_pages                      string      TRUE

SQL> show parameter sga_max_size




NAME                                 TYPE        VALUE

------------------------------------ ----------- ------------------------------

sga_max_size                         big integer 3G

SQL> show parameter sga_target




NAME                                 TYPE        VALUE

------------------------------------ ----------- ------------------------------

sga_target                           big integer 3G

SQL> exit

Disconnected from Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production

Version 18.6.0.0.0

[grid@exacl04n1 +ASM1]$
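
If any of these values is below 3GB, the SGA of the ASM instance can be raised before the upgrade. A minimal sketch, assuming the ASM instance uses an spfile (the change only takes effect after a restart of the GI stack on each node):

SQL> ALTER SYSTEM SET sga_max_size=3G SCOPE=SPFILE SID='*';
SQL> ALTER SYSTEM SET sga_target=3G SCOPE=SPFILE SID='*';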

 

Folders and Unzip

Create all the folders for the Oracle GI installation. This (usually) means /u01/app/19.0.0.0/grid for GI on Exadata.

 

[root@exacl04n1 ~]# dcli -g /zfs/EXADATA/exacl04.txt -l root mkdir -p /u01/app/19.0.0.0/grid

[root@exacl04n1 ~]# dcli -g /zfs/EXADATA/exacl04.txt -l root chown grid /u01/app/19.0.0.0/grid

[root@exacl04n1 ~]# dcli -g /zfs/EXADATA/exacl04.txt -l root chgrp -R oinstall /u01/app/19.0.0.0/grid

[root@exacl04n1 ~]#

 

On Exadata, I used dcli to do this automatically on every node of the cluster. But remember to check the ownership of the folder on every node.
After you create the folders, on the first node (where +ASM1 runs), you can unzip the 19c GI base version (downloaded from edelivery.oracle.com) into the GI destination folder. Remember to execute this as the user that will run GI; in this case, I did it with the grid user and unzipped directly from an NFS folder (but you can copy it to a local folder and unzip):

 

[grid@exacl04n1 +ASM1]$ unzip -q /nfs/19c/Exadata-Patch/19c-Grid/V982068-01.zip -d /u01/app/19.0.0.0/grid

[grid@exacl04n1 +ASM1]$

 

runcluvfy

The cluster verify utility is amazing because it automates a lot of tests and checks. It needs to be executed before you continue, because it will report either success (if everything is OK) or the errors and points to be fixed.
In this case, I ran (the output was cropped, but a full output can be checked here):

 

[grid@exacl04n1 +ASM1]$ cd /u01/app/19.0.0.0/grid/

[grid@exacl04n1 +ASM1]$

[grid@exacl04n1 +ASM1]$ ./runcluvfy.sh stage -pre crsinst -upgrade -rolling -src_crshome /u01/app/18.0.0/grid -dest_crshome /u01/app/19.0.0.0/grid -dest_version 19.0.0.0.0 -fixup -verbose




Verifying Physical Memory ...

  Node Name     Available                 Required                  Status

  ------------  ------------------------  ------------------------  ----------

  exacl04n2     84.1372GB (8.8224204E7KB)  8GB (8388608.0KB)         passed

  exacl04n1     84.1372GB (8.8224204E7KB)  8GB (8388608.0KB)         passed



Verifying ASM Filter Driver configuration ...PASSED

Verifying Systemd login manager IPC parameter ...PASSED




Pre-check for cluster services setup was successful.

Verifying RPM Package Manager database ...INFORMATION

PRVG-11250 : The check "RPM Package Manager database" was not performed because

it needs 'root' user privileges.







CVU operation performed:      stage -pre crsinst

Date:                         Jan 24, 2020 3:17:24 PM

CVU home:                     /u01/app/19.0.0.0/grid/

User:                         grid

[grid@exacl04n1 +ASM1]$

 

As you can see, it was a success. One example of a failed report is:

 

Checks did not pass for the following nodes:

        zeroing02,zeroing01







Failures were encountered during execution of CVU verification request "stage -pre crsinst".




Verifying Node Connectivity ...FAILED

zeroing02: PRVG-11068 : Highly Available IP (HAIP) is enabled on the nodes

            "zeroing01,zeroing02".




zeroing01: PRVG-11068 : Highly Available IP (HAIP) is enabled on the nodes

            "zeroing01,zeroing02".




Verifying RPM Package Manager database ...INFORMATION

PRVG-11250 : The check "RPM Package Manager database" was not performed because

it needs 'root' user privileges.







CVU operation performed:      stage -pre crsinst

Date:                         Dec 9, 2019 1:38:46 PM

CVU home:                     /u01/app/19.0.0.0/grid/

User:                         oracle

[root@zeroing01 ~]#

 

Prepare Patch for the New GI

It is possible to patch the new GI even before installing the 19c GI (the same was possible with 12c and 18c too), and it is recommended, by the way. Fortunately, you don't need to patch manually; you just call gridSetup.sh with -applyRU if it is a Release Update (or -applyRUR for an RUR patch).
Since some files will be overwritten, I created on node 1 a new folder to back them up (executed as the same owner as the GI):

 

[grid@exacl04n1 +ASM1]$ mkdir /u01/patches/grid-19c

[grid@exacl04n1 +ASM1]$

 

To apply these patches correctly, it is necessary to update the OPatch in the GI base release home (that was unzipped previously). Execute this as the same user that will run GI:

 

[grid@exacl04n1 +ASM1]$ #backup current opatch to the folder created before

[grid@exacl04n1 +ASM1]$

[grid@exacl04n1 +ASM1]$ cp /u01/app/19.0.0.0/grid/OPatch /u01/patches/grid-19c/OPatch-ORG -R

[grid@exacl04n1 +ASM1]$

[grid@exacl04n1 +ASM1]$ # go to GI home folder

[grid@exacl04n1 +ASM1]$ cd /u01/app/19.0.0.0/grid

[grid@exacl04n1 +ASM1]$

[grid@exacl04n1 +ASM1]$ # unzip the new version of Opatch for 19c

[grid@exacl04n1 +ASM1]$ unzip -qa /zfs/EXADATA_PATCHING/19c/Exadata-Patch/p6880880_190000_Linux-x86-64.zip

replace OPatch/emdpatch.pl? [y]es, [n]o, [A]ll, [N]one, [r]ename: A

[grid@exacl04n1 +ASM1]$
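
Before moving on, a quick way to confirm that the new GI home now carries the updated OPatch (a hedged check; run as the grid user):

/u01/app/19.0.0.0/grid/OPatch/opatch version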

 

As you saw, I created a backup of OPatch and overwrote it with the new version. Remember to execute this with the same user that will run GI.
Another point is unzipping the patch that you want to apply. Here I used the same folder created before to store the patch. Everything was executed on the first node as the GI user:

 

[grid@exacl04n1 +ASM1]$ cd /u01/patches/grid-19c/

[grid@exacl04n1 +ASM1]$ unzip -qa /zfs/EXADATA_PATCHING/19c/Exadata-Patch/19c-Grid/p30116789_190000_Linux-x86-64.zip

[grid@exacl04n1 +ASM1]$

[grid@exacl04n1 +ASM1]$ ls -l

total 232

drwxr-x---  7 grid oinstall   4096 Oct  9 17:11 30116789

drwxr-x--- 14 grid oinstall   4096 Jan 24 15:23 OPatch-ORG

-rw-rw-r--  1 grid oinstall 225499 Oct 15 13:24 PatchSearch.xml

[grid@exacl04n1 +ASM1]$

 

The version here was downloaded from note 888828.1 for Exadata and it is the GI Release Update 19.5.0.0 – Patch 30116789.

 

Fix known issues

Some small known issues need to be fixed before you upgrade the GI. These are based on my experience during past upgrades (they were not reported by the cluster verify utility).

 

Lock for oraInventory

The first issue is related to the inventory lock files. It is common to have an environment with role separation (where GI runs with a different user than the OH), and it can happen that the lock files end up owned by another user. To fix it, execute on both nodes (or with dcli on Exadata):

 

[root@exacl04n1 ~]# dcli -g /zfs/EXADATA/exacl04.txt -l root "chown grid:oinstall /u01/app/oraInventory/locks -R"

[root@exacl04n1 ~]#

[root@exacl04n1 ~]# dcli -g /zfs/EXADATA/exacl04.txt -l root "ls -l /u01/app/oraInventory/locks"

exacl04n1: total 0

exacl04n1: -rw-r--r-- 1 grid oinstall 0 Jan 20 13:07 inventory.lock

exacl04n2: total 0

exacl04n2: -rw-r--r-- 1 grid oinstall 0 Jan 21 13:36 inventory.lock

[root@exacl04n1 ~]#

 

ACFS for ghchkpt

It can happen that the GHCHKPT filesystem was removed (intentionally or not) from the current GI home, and during rootupgrade.sh you can hit the error below because it does not exist:

 

2019/12/04 17:09:28 CLSRSC-692: Checking whether CRS entities are ready for upgrade. This operation may take a few minutes.

PRCR-1129 : Failed to upgrade resource type ora.rhpserver.type

PRCR-1071 : Failed to register or update resource ora.rhpserver

CRS-2510: Resource 'ora.datac3.ghchkpt.acfs' used in dependency 'hard' does not exist or is not registered.

CRS-2514: Dependency attribute specification 'hard' is invalid in resource 'ora.rhpserver'

2019/12/04 17:11:14 CLSRSC-180: An error occurred while executing the command '/u01/app/19.0.0.0/grid/bin/srvctl upgrade model  -s 18.0.0.0.0 -d 19.0.0.0.0 -p first'

2019/12/04 17:11:17 CLSRSC-694: failed to validate CRS entities for upgrade, aborting the upgrade

2019/12/04 17:11:17 CLSRSC-362: The pre-upgrade checks failed, aborting the upgrade

Died at /u01/app/19.0.0.0/grid/crs/install/crsupgrade.pm line 3772.

[root@exacl03n1 ~]#

 

To avoid this, you need to recreate it manually like this:

 

[root@exacl04n1 ~]# su - grid

Last login: Fri Jan 24 14:51:16 CET 2020 from 35.213.248.138 on ssh

[grid@exacl04n1 +ASM1]$

[grid@exacl04n1 +ASM1]$ #CHECK if ghchkpt exists. In this case NO

[grid@exacl04n1 +ASM1]$

[grid@exacl04n1 +ASM1]$ crsctl stat res -w "TYPE = ora.acfs.type" -p | grep VOLUME

AUX_VOLUMES=

CANONICAL_VOLUME_DEVICE=/dev/asm/acfsoh-256

VOLUME_DEVICE=/dev/asm/acfsoh-256

AUX_VOLUMES=

CANONICAL_VOLUME_DEVICE=/dev/asm/acfsoh-256

VOLUME_DEVICE=/dev/asm/acfsoh-256

[grid@exacl04n1 +ASM1]$

[grid@exacl04n1 +ASM1]$

[grid@exacl04n1 +ASM1]$ #Create the volume at ASM. The size it is important

[grid@exacl04n1 +ASM1]$

[grid@exacl04n1 +ASM1]$ asmcmd

ASMCMD>

ASMCMD> volcreate -G DATAC4 -s 1536M ghchkpt

ASMCMD>

ASMCMD> volinfo -G DATAC4 ghchkpt

Diskgroup Name: DATAC4




         Volume Name: GHCHKPT

         Volume Device: /dev/asm/ghchkpt-256

         State: ENABLED

         Size (MB): 1536

         Resize Unit (MB): 512

         Redundancy: MIRROR

         Stripe Columns: 8

         Stripe Width (K): 1024

         Usage:

         Mountpath:




ASMCMD>

ASMCMD> exit

[grid@exacl04n1 +ASM1]$

[grid@exacl04n1 +ASM1]$ #Format the volume as ACFS

[grid@exacl04n1 +ASM1]$

[grid@exacl04n1 +ASM1]$ /sbin/mkfs -t acfs /dev/asm/ghchkpt-256

mkfs.acfs: version                   = 18.0.0.0.0

mkfs.acfs: on-disk version           = 46.0

mkfs.acfs: volume                    = /dev/asm/ghchkpt-256

mkfs.acfs: volume size               = 1610612736  (   1.50 GB )

mkfs.acfs: Format complete.

[grid@exacl04n1 +ASM1]$

[grid@exacl04n1 +ASM1]$

[grid@exacl04n1 +ASM1]$




[root@exacl04n1 ~]#

[root@exacl04n1 ~]# #As root, add the filesystem at current GI

[root@exacl04n1 ~]#

[root@exacl04n1 ~]# export ORACLE_HOME=/u01/app/18.0.0/grid

[root@exacl04n1 ~]# $ORACLE_HOME/bin/srvctl add filesystem -d /dev/asm/ghchkpt-256 -g DATAC4 -v GHCHKPT -m /mnt/oracle/rhpimages/chkbase/ -user oracle,grid

[root@exacl04n1 ~]#

[root@exacl04n1 ~]#

[root@exacl04n1 ~]# #Check that everything is OK

[root@exacl04n1 ~]#

[root@exacl04n1 ~]# $ORACLE_HOME/bin/crsctl stat res -t |grep acfs

ora.datac4.acfsoh.acfs

               ONLINE  ONLINE       exacl04n1                mounted on /u01/acfs

               ONLINE  ONLINE       exacl04n2                mounted on /u01/acfs

ora.datac4.ghchkpt.acfs

[root@exacl04n1 ~]#

[root@exacl04n1 ~]#

[root@exacl04n1 ~]# $ORACLE_HOME/bin/crsctl stat res -w "TYPE = ora.acfs.type" -p | grep VOLUME

AUX_VOLUMES=

CANONICAL_VOLUME_DEVICE=/dev/asm/acfsoh-256

VOLUME_DEVICE=/dev/asm/acfsoh-256

AUX_VOLUMES=

CANONICAL_VOLUME_DEVICE=/dev/asm/acfsoh-256

VOLUME_DEVICE=/dev/asm/acfsoh-256

AUX_VOLUMES=

CANONICAL_VOLUME_DEVICE=/dev/asm/ghchkpt-256

VOLUME_DEVICE=/dev/asm/ghchkpt-256

AUX_VOLUMES=

CANONICAL_VOLUME_DEVICE=/dev/asm/ghchkpt-256

VOLUME_DEVICE=/dev/asm/ghchkpt-256

[root@exacl04n1 ~]#

[root@exacl04n1 ~]#

 

The important points here are the size of the GHCHKPT volume, which needs to be 1536M, and that it is registered with the current GI. When registering the filesystem, remember to set the volume name correctly.
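
To double-check how the new filesystem ended up registered in the current GI, a hedged verification (output omitted):

/u01/app/18.0.0/grid/bin/srvctl config filesystem -d /dev/asm/ghchkpt-256
/u01/app/18.0.0/grid/bin/srvctl status filesystem -d /dev/asm/ghchkpt-256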

 

ACFS filesystem and CRS resources

If you have mountpoints over ACFS, it is recommended to stop them "safely" before executing gridSetup. This is because, during the upgrade phase, the script will try to shut down the entire cluster stack on the node, and if the unmount of ACFS goes wrong you will receive an error (and this can be boring and stressful to handle). The same is valid for resources that you registered in CRS.
So, just verify ACFS and unmount it (including stopping the ACFS filesystem) and stop any additional CRS resources:

 

[root@exacl04n1 ~]# /u01/app/18.0.0/grid/bin/crsctl stop res flk.mountbind -n exacl04n1

CRS-2673: Attempting to stop 'flk.mountbind' on 'exacl04n1'

CRS-2675: Stop of 'flk.mountbind' on 'exacl04n1' failed

CRS-2679: Attempting to clean 'flk.mountbind' on 'exacl04n1'

CRS-2681: Clean of 'flk.mountbind' on 'exacl04n1' succeeded

[root@exacl04n1 ~]#

[root@exacl04n1 ~]# #Stopped two times (the error above was expected)

[root@exacl04n1 ~]#

[root@exacl04n1 ~]# /u01/app/18.0.0/grid/bin/crsctl stop res flk.mountbind -n exacl04n1

CRS-2500: Cannot stop resource 'flk.mountbind' as it is not running

CRS-4000: Command Stop failed, or completed with errors.

[root@exacl04n1 ~]#

[root@exacl04n1 ~]# #Stop ACFS

[root@exacl04n1 ~]#

[root@exacl04n1 ~]# /u01/app/18.0.0/grid/bin/srvctl stop filesystem -d /dev/asm/acfsoh-256 -n exacl04n1

[root@exacl04n1 ~]#
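
Before calling gridSetup.sh, it is worth confirming that no ACFS filesystem is still mounted on the node. A quick hedged check:

/u01/app/18.0.0/grid/bin/srvctl status filesystem     # state of all registered ACFS filesystems
mount -t acfs                                          # should return nothing on this node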

 

Run gridSetup.sh

After preparing everything (patches, folders) and fixing some issues, we can call gridSetup.sh to install Oracle GI 19c.
The first step is to clear all the current session variables that point to the current GI, set some variables (such as DISPLAY for X), and move to the new GI home to call the grid setup:

 

[root@exacl04n1 ~]# su - grid

.Last login: Fri Jan 24 16:06:12 CET 2020 from 35.213.248.138 on ssh

[grid@exacl04n1 +ASM1]$

[grid@exacl04n1 +ASM1]$ unset ORACLE_HOME

[grid@exacl04n1 +ASM1]$ unset ORACLE_BASE

[grid@exacl04n1 +ASM1]$ unset ORACLE_SID

[grid@exacl04n1 ]$

[grid@exacl04n1 ]$ cd /u01/app/19.0.0.0/grid/

[grid@exacl04n1 ]$

[grid@exacl04n1 ]$ export DISPLAY=35.213.248.116:1.0

[grid@exacl04n1 ]$

 

Now gridSetup.sh can be called, and I do this with some special parameters:
  • -applyRU: This applies the Release Update passed as a parameter BEFORE starting to install the 19c GI.
  • mgmtDB parameters: With 19c the MGMTDB is not needed anymore; these parameters disable its installation and the related options are not even shown during the graphical installation.
So, the call is:

 

[grid@exacl04n1 ]$ ./gridSetup.sh -applyRU /u01/patches/grid-19c/30116789 -J-Doracle.install.mgmtDB=false -J-Doracle.install.mgmtDB.CDB=false -J-Doracle.install.crs.enableRemoteGIMR=false

Preparing the home to patch...

Applying the patch /u01/patches/grid-19c/30116789...

Successfully applied the patch.

The log can be found at: /u01/app/oraInventory/logs/GridSetupActions2020-01-17_04-49-57PM/installerPatchActions_2020-01-17_04-49-57PM.log

Launching Oracle Grid Infrastructure Setup Wizard...




The response file for this session can be found at:

 /u01/app/19.0.0.0/grid/install/response/grid_2020-01-17_04-49-57PM.rsp

 

As you saw, the report says that patch 30116789 was installed over the GI home (this took around 10 minutes). Just after that, the graphical installer appears and the installation is basically NNF (next, next, finish) until the request to run the root upgrade scripts appears.

 

 

rootupgrade.sh – Node 1

At this point it is necessary to execute, node by node, the rootupgrade.sh script to "finish" the installation. The rootupgrade.sh executes important steps like:
  • TFA upgrade
  • OCR upgrade
  • ACFS driver upgrade
  • GI scripts upgrade
If something goes wrong here (mainly after step 3), the recommendation is to open an SR to verify the errors and the issue itself.
Another point is to stop all Oracle Databases running on the node before executing the script, as illustrated below.
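
As an illustration (hypothetical database name ORCL; run with the database home environment as the oracle user, not from the GI home), the instances on the node could be checked and stopped with srvctl:

srvctl status database -d ORCL                            # shows on which nodes the instances are running
srvctl stop instance -d ORCL -n exacl04n1 -o immediate    # stops only the instance on the node being upgraded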
So, on the first node, rootupgrade.sh was called:

 

[root@exacl04n1 ~]# /u01/app/19.0.0.0/grid/rootupgrade.sh

Performing root user operation.




The following environment variables are set as:

    ORACLE_OWNER= grid

    ORACLE_HOME=  /u01/app/19.0.0.0/grid




Enter the full pathname of the local bin directory: [/usr/local/bin]:

The contents of "dbhome" have not changed. No need to overwrite.

The file "oraenv" already exists in /usr/local/bin.  Overwrite it? (y/n)

[n]: y

   Copying oraenv to /usr/local/bin ...

The file "coraenv" already exists in /usr/local/bin.  Overwrite it? (y/n)

[n]: y

   Copying coraenv to /usr/local/bin ...




Entries will be added to the /etc/oratab file as needed by

Database Configuration Assistant when a database is created

Finished running generic part of root script.

Now product-specific root actions will be performed.

Relinking oracle with rac_on option

Using configuration parameter file: /u01/app/19.0.0.0/grid/crs/install/crsconfig_params

The log of current session can be found at:

  /u01/app/grid/crsdata/exacl04n1/crsconfig/rootcrs_exacl04n1_2020-01-24_04-42-49PM.log

2020/01/24 16:43:14 CLSRSC-595: Executing upgrade step 1 of 18: 'UpgradeTFA'.

2020/01/24 16:43:14 CLSRSC-4015: Performing install or upgrade action for Oracle Trace File Analyzer (TFA) Collector.

2020/01/24 16:43:14 CLSRSC-595: Executing upgrade step 2 of 18: 'ValidateEnv'.

2020/01/24 16:43:19 CLSRSC-595: Executing upgrade step 3 of 18: 'GetOldConfig'.

2020/01/24 16:43:19 CLSRSC-464: Starting retrieval of the cluster configuration data

2020/01/24 16:47:50 CLSRSC-692: Checking whether CRS entities are ready for upgrade. This operation may take a few minutes.

2020/01/24 16:48:25 CLSRSC-4003: Successfully patched Oracle Trace File Analyzer (TFA) Collector.

2020/01/24 16:50:06 CLSRSC-693: CRS entities validation completed successfully.

2020/01/24 16:50:12 CLSRSC-515: Starting OCR manual backup.

2020/01/24 16:50:48 CLSRSC-516: OCR manual backup successful.

2020/01/24 16:50:57 CLSRSC-486:

 At this stage of upgrade, the OCR has changed.

 Any attempt to downgrade the cluster after this point will require a complete cluster outage to restore the OCR.

2020/01/24 16:50:57 CLSRSC-541:

 To downgrade the cluster:
  1. All nodes that have been upgraded must be downgraded.
2020/01/24 16:50:57 CLSRSC-542:
  1. Before downgrading the last node, the Grid Infrastructure stack on all other cluster nodes must be down.
2020/01/24 16:51:07 CLSRSC-465: Retrieval of the cluster configuration data has successfully completed.

2020/01/24 16:51:07 CLSRSC-595: Executing upgrade step 4 of 18: 'GenSiteGUIDs'.

2020/01/24 16:51:08 CLSRSC-595: Executing upgrade step 5 of 18: 'UpgPrechecks'.

2020/01/24 16:51:13 CLSRSC-363: User ignored prerequisites during installation

2020/01/24 16:51:25 CLSRSC-595: Executing upgrade step 6 of 18: 'SetupOSD'.

2020/01/24 16:51:25 CLSRSC-595: Executing upgrade step 7 of 18: 'PreUpgrade'.

2020/01/24 16:55:02 CLSRSC-468: Setting Oracle Clusterware and ASM to rolling migration mode

2020/01/24 16:55:02 CLSRSC-482: Running command: '/u01/app/18.0.0/grid/bin/crsctl start rollingupgrade 19.0.0.0.0'

CRS-1131: The cluster was successfully set to rolling upgrade mode.

2020/01/24 16:55:09 CLSRSC-482: Running command: '/u01/app/19.0.0.0/grid/bin/asmca -silent -upgradeNodeASM -nonRolling false -oldCRSHome /u01/app/18.0.0/grid -oldCRSVersion 18.0.0.0.0 -firstNode true -startRolling false '




ASM configuration upgraded in local node successfully.




2020/01/24 16:55:12 CLSRSC-469: Successfully set Oracle Clusterware and ASM to rolling migration mode

2020/01/24 16:55:18 CLSRSC-466: Starting shutdown of the current Oracle Grid Infrastructure stack

2020/01/24 16:56:08 CLSRSC-467: Shutdown of the current Oracle Grid Infrastructure stack has successfully completed.

2020/01/24 16:56:11 CLSRSC-595: Executing upgrade step 8 of 18: 'CheckCRSConfig'.

2020/01/24 16:56:12 CLSRSC-595: Executing upgrade step 9 of 18: 'UpgradeOLR'.

2020/01/24 16:56:23 CLSRSC-595: Executing upgrade step 10 of 18: 'ConfigCHMOS'.

2020/01/24 16:56:23 CLSRSC-595: Executing upgrade step 11 of 18: 'UpgradeAFD'.

2020/01/24 16:56:30 CLSRSC-595: Executing upgrade step 12 of 18: 'createOHASD'.

2020/01/24 16:56:37 CLSRSC-595: Executing upgrade step 13 of 18: 'ConfigOHASD'.

2020/01/24 16:56:37 CLSRSC-329: Replacing Clusterware entries in file 'oracle-ohasd.service'

2020/01/24 16:57:29 CLSRSC-595: Executing upgrade step 14 of 18: 'InstallACFS'.

2020/01/24 16:58:01 CLSRSC-595: Executing upgrade step 15 of 18: 'InstallKA'.

2020/01/24 16:58:21 CLSRSC-595: Executing upgrade step 16 of 18: 'UpgradeCluster'.

2020/01/24 17:00:27 CLSRSC-343: Successfully started Oracle Clusterware stack

clscfg: EXISTING configuration version 5 detected.

Successfully taken the backup of node specific configuration in OCR.

Successfully accumulated necessary OCR keys.

Creating OCR keys for user 'root', privgrp 'root'..

Operation successful.

2020/01/24 17:00:52 CLSRSC-595: Executing upgrade step 17 of 18: 'UpgradeNode'.

2020/01/24 17:00:57 CLSRSC-474: Initiating upgrade of resource types

2020/01/24 17:02:09 CLSRSC-475: Upgrade of resource types successfully initiated.

2020/01/24 17:02:24 CLSRSC-595: Executing upgrade step 18 of 18: 'PostUpgrade'.

2020/01/24 17:02:34 CLSRSC-325: Configure Oracle Grid Infrastructure for a Cluster ... succeeded

[root@exacl04n1 ~]#

 

After it finishes successfully, you can start the Oracle Databases on this node, or even relocate RACOneNode databases to this node, as shown below.
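
At this stage the cluster is still in rolling upgrade mode. A hedged way to confirm the state, and to relocate a RACOneNode database (hypothetical name ORCL; the relocate is run from the database home), is:

/u01/app/19.0.0.0/grid/bin/crsctl query crs activeversion -f    # reports the upgrade state until the last node is done
srvctl relocate database -d ORCL -n exacl04n1                   # moves a RACOneNode database to this upgraded node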

 

rootupgrade.sh – Node 2

Here the script is called on the second node. But check some details before executing it:
  • Stop and unmount any ACFS filesystems running on the node
  • Stop any additional CRS resources running on this node
  • Stop all databases running on the node (or relocate them to nodes already upgraded)
After checking the points above, you can call the script:

 

[root@exacl04n2 ~]# /u01/app/18.0.0/grid/bin/crsctl stop res flk.mountbind -n exacl04n2

CRS-2673: Attempting to stop 'flk.mountbind' on 'exacl04n2'

CRS-2675: Stop of 'flk.mountbind' on 'exacl04n2' failed

CRS-2679: Attempting to clean 'flk.mountbind' on 'exacl04n2'

CRS-2681: Clean of 'flk.mountbind' on 'exacl04n2' succeeded

[root@exacl04n2 ~]#

[root@exacl04n2 ~]#

[root@exacl04n2 ~]# /u01/app/18.0.0/grid/bin/crsctl stop res flk.mountbind -n exacl04n2

CRS-2500: Cannot stop resource 'flk.mountbind' as it is not running

CRS-4000: Command Stop failed, or completed with errors.

[root@exacl04n2 ~]#

[root@exacl04n2 ~]#

[root@exacl04n2 ~]# /u01/app/18.0.0/grid/bin/srvctl stop filesystem -d /dev/asm/acfsoh-256 -n exacl04n2

[root@exacl04n2 ~]#

[root@exacl04n2 ~]#

[root@exacl04n2 ~]# /u01/app/19.0.0.0/grid/rootupgrade.sh

Performing root user operation.




The following environment variables are set as:

    ORACLE_OWNER= grid

    ORACLE_HOME=  /u01/app/19.0.0.0/grid




Enter the full pathname of the local bin directory: [/usr/local/bin]:

The contents of "dbhome" have not changed. No need to overwrite.

The file "oraenv" already exists in /usr/local/bin.  Overwrite it? (y/n)

[n]: y

   Copying oraenv to /usr/local/bin ...

The file "coraenv" already exists in /usr/local/bin.  Overwrite it? (y/n)

[n]: y

   Copying coraenv to /usr/local/bin ...




Entries will be added to the /etc/oratab file as needed by

Database Configuration Assistant when a database is created

Finished running generic part of root script.

Now product-specific root actions will be performed.

Relinking oracle with rac_on option

Using configuration parameter file: /u01/app/19.0.0.0/grid/crs/install/crsconfig_params

The log of current session can be found at:

  /u01/app/grid/crsdata/exacl04n2/crsconfig/rootcrs_exacl04n2_2020-01-24_06-01-55PM.log

2020/01/24 18:02:12 CLSRSC-595: Executing upgrade step 1 of 18: 'UpgradeTFA'.

2020/01/24 18:02:12 CLSRSC-4015: Performing install or upgrade action for Oracle Trace File Analyzer (TFA) Collector.

2020/01/24 18:02:12 CLSRSC-595: Executing upgrade step 2 of 18: 'ValidateEnv'.

2020/01/24 18:02:13 CLSRSC-595: Executing upgrade step 3 of 18: 'GetOldConfig'.

2020/01/24 18:02:13 CLSRSC-464: Starting retrieval of the cluster configuration data

2020/01/24 18:02:42 CLSRSC-465: Retrieval of the cluster configuration data has successfully completed.

2020/01/24 18:02:42 CLSRSC-595: Executing upgrade step 4 of 18: 'GenSiteGUIDs'.

2020/01/24 18:02:43 CLSRSC-595: Executing upgrade step 5 of 18: 'UpgPrechecks'.

2020/01/24 18:02:43 CLSRSC-363: User ignored prerequisites during installation

2020/01/24 18:02:44 CLSRSC-595: Executing upgrade step 6 of 18: 'SetupOSD'.

2020/01/24 18:02:44 CLSRSC-595: Executing upgrade step 7 of 18: 'PreUpgrade'.




ASM configuration upgraded in local node successfully.




2020/01/24 18:03:01 CLSRSC-466: Starting shutdown of the current Oracle Grid Infrastructure stack

2020/01/24 18:04:45 CLSRSC-467: Shutdown of the current Oracle Grid Infrastructure stack has successfully completed.

2020/01/24 18:04:57 CLSRSC-595: Executing upgrade step 8 of 18: 'CheckCRSConfig'.

2020/01/24 18:04:57 CLSRSC-595: Executing upgrade step 9 of 18: 'UpgradeOLR'.

2020/01/24 18:05:05 CLSRSC-595: Executing upgrade step 10 of 18: 'ConfigCHMOS'.

2020/01/24 18:05:06 CLSRSC-595: Executing upgrade step 11 of 18: 'UpgradeAFD'.

2020/01/24 18:05:07 CLSRSC-595: Executing upgrade step 12 of 18: 'createOHASD'.

2020/01/24 18:05:09 CLSRSC-595: Executing upgrade step 13 of 18: 'ConfigOHASD'.

2020/01/24 18:05:09 CLSRSC-329: Replacing Clusterware entries in file 'oracle-ohasd.service'

2020/01/24 18:05:54 CLSRSC-595: Executing upgrade step 14 of 18: 'InstallACFS'.

2020/01/24 18:06:18 CLSRSC-4003: Successfully patched Oracle Trace File Analyzer (TFA) Collector.

2020/01/24 18:06:20 CLSRSC-595: Executing upgrade step 15 of 18: 'InstallKA'.

2020/01/24 18:06:34 CLSRSC-595: Executing upgrade step 16 of 18: 'UpgradeCluster'.

2020/01/24 18:08:31 CLSRSC-343: Successfully started Oracle Clusterware stack

clscfg: EXISTING configuration version 19 detected.

Successfully taken the backup of node specific configuration in OCR.

Successfully accumulated necessary OCR keys.

Creating OCR keys for user 'root', privgrp 'root'..

Operation successful.

2020/01/24 18:08:46 CLSRSC-595: Executing upgrade step 17 of 18: 'UpgradeNode'.

Start upgrade invoked..

2020/01/24 18:08:50 CLSRSC-478: Setting Oracle Clusterware active version on the last node to be upgraded

2020/01/24 18:08:50 CLSRSC-482: Running command: '/u01/app/19.0.0.0/grid/bin/crsctl set crs activeversion'

Started to upgrade the active version of Oracle Clusterware. This operation may take a few minutes.

Started to upgrade CSS.

CSS was successfully upgraded.

Started to upgrade Oracle ASM.

Started to upgrade CRS.

CRS was successfully upgraded.

Started to upgrade Oracle ACFS.

Oracle ACFS was successfully upgraded.

Successfully upgraded the active version of Oracle Clusterware.

Oracle Clusterware active version was successfully set to 19.0.0.0.0.

2020/01/24 18:10:02 CLSRSC-479: Successfully set Oracle Clusterware active version

2020/01/24 18:10:04 CLSRSC-476: Finishing upgrade of resource types

2020/01/24 18:10:18 CLSRSC-477: Successfully completed upgrade of resource types

2020/01/24 18:10:45 CLSRSC-595: Executing upgrade step 18 of 18: 'PostUpgrade'.

Successfully updated XAG resources.

2020/01/24 18:11:21 CLSRSC-325: Configure Oracle Grid Infrastructure for a Cluster ... succeeded

[root@exacl04n2 ~]#

[root@exacl04n2 ~]#

 

As you can see, everything went fine and the cluster was upgraded on both nodes. If you have more nodes, continue with the remaining ones.

Continue with Graphical Installation – and an Expected Error

After that, it is possible to continue the installation on the graphical side. But we will hit an expected error.

 

 

During the "Upgrade RHP Repository" phase, one error will occur. But this is expected. Remember that during gridSetup.sh the mgmtDB parameters were disabled? Since it was not created, the RHP upgrade fails because the mgmtDB was not found (this occurred in all 10 clusters that I upgraded recently):
If you check the log, you will see details reporting that the mgmtDB was not found. To solve it, just click OK and Skip. Note below that the Upgrade RHP step was marked as Ignored.
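If you want to double-check before skipping, you can confirm that the MGMTDB really does not exist in the cluster (a quick check; both commands should report that the database/resource is not found):

[grid@exacl04n1 +ASM1]$ srvctl status mgmtdb
[grid@exacl04n1 +ASM1]$ crsctl stat res ora.mgmtdb -t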

Post Upgrade

After finishing the upgrade above, everything will be working correctly. But some details still need to be checked to deliver everything at 100%.

 

ASM Compatibility

Remember that the diskgroup attribute COMPATIBLE.ASM can now be raised to allow you to use some new features. Remember: raise just COMPATIBLE.ASM (not COMPATIBLE.RDBMS, which would prevent older database versions from using the diskgroup) for all diskgroups:

 

[grid@exacl04n1 +ASM1]$ echo "ALTER DISKGROUP DATAC4 SET ATTRIBUTE 'compatible.asm' = '19.3.0.0.0';" | sqlplus -s / as sysasm




Diskgroup altered.




[grid@exacl04n1 +ASM1]$ echo "ALTER DISKGROUP RECOC4 SET ATTRIBUTE 'compatible.asm' = '19.3.0.0.0';" | sqlplus -s / as sysasm




Diskgroup altered.




[grid@exacl04n1 +ASM1]$
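To verify the change, you can query V$ASM_DISKGROUP and check the compatibility attributes of each diskgroup (a simple check, run as the grid user):

[grid@exacl04n1 +ASM1]$ echo "SELECT name, compatibility, database_compatibility FROM v\$asm_diskgroup;" | sqlplus -s / as sysasm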

 

Inventory Update

Since Oracle 19c, the GI no longer registers the cluster nodes in the oraInventory. So, it is necessary to add them to allow older versions (11g, 12c and 18c) to see the GI nodes. A simple call resolves this (pay attention to the correct path and node names):

 

[grid@exacl04n1 +ASM1]$ /u01/app/19.0.0.0/grid/oui/bin/runInstaller -nowait -waitforcompletion -ignoreSysPrereqs -updateNodeList ORACLE_HOME=/u01/app/19.0.0.0/grid "CLUSTER_NODES={exacl04n1,exacl04n2}" CRS=true LOCAL_NODE=exacl04n1

Starting Oracle Universal Installer...




Checking swap space: must be greater than 500 MB.   Actual 16174 MB    Passed

The inventory pointer is located at /etc/oraInst.loc

[grid@exacl04n1 +ASM1]$
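To confirm that the node list was registered, you can check the 19c GI home entry in the inventory (a simple verification; the XML layout may differ slightly per version):

[grid@exacl04n1 +ASM1]$ grep -A 3 "OraGI19Home1" /u01/app/oraInventory/ContentsXML/inventory.xml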

 

Remove previous GI

Remember to delete (or at least mark in the Oracle inventory) that the previous GI home was removed. To do that, you need to call runInstaller (from the previous Oracle Home) with the detachHome option. Be careful to set ORACLE_HOME correctly (and run it on just one node):

 

[grid@exacl04n1 +ASM1]$ cat /u01/app/oraInventory/ContentsXML/inventory.xml |grep grid

<HOME NAME="OraGI18Home1" LOC="/u01/app/18.0.0/grid" TYPE="O" IDX="6">

<HOME NAME="OraGI19Home1" LOC="/u01/app/19.0.0.0/grid" TYPE="O" IDX="11" CRS="true">

<HOME NAME="OraGridHome" LOC="/u01/app/12.1.0.2/grid" TYPE="O" IDX="1" REMOVED="T"/>

<HOME NAME="OraGI12Home1" LOC="/u01/app/12.2.0.1/grid" TYPE="O" IDX="5" REMOVED="T"/>

[grid@exacl04n1 +ASM1]$

[grid@exacl04n1 +ASM1]$

[grid@exacl04n1 +ASM1]$ unset ORACLE_HOME

[grid@exacl04n1 +ASM1]$ export ORACLE_HOME=/u01/app/18.0.0/grid/

[grid@exacl04n1 +ASM1]$ $ORACLE_HOME/oui/bin/runInstaller -detachHome -silent ORACLE_HOME=/u01/app/18.0.0/grid

Starting Oracle Universal Installer...




Checking swap space: must be greater than 500 MB.   Actual 16174 MB    Passed

The inventory pointer is located at /etc/oraInst.loc

'DetachHome' was successful.

[grid@exacl04n1 +ASM1]$

[grid@exacl04n1 +ASM1]$

[grid@exacl04n1 +ASM1]$ cat /u01/app/oraInventory/ContentsXML/inventory.xml |grep grid

<HOME NAME="OraGI19Home1" LOC="/u01/app/19.0.0.0/grid" TYPE="O" IDX="11" CRS="true">

<HOME NAME="OraGridHome" LOC="/u01/app/12.1.0.2/grid" TYPE="O" IDX="1" REMOVED="T"/>

<HOME NAME="OraGI12Home1" LOC="/u01/app/12.2.0.1/grid" TYPE="O" IDX="5" REMOVED="T"/>

<HOME NAME="OraGI18Home1" LOC="/u01/app/18.0.0/grid" TYPE="O" IDX="6" REMOVED="T"/>

[grid@exacl04n1 +ASM1]$

[grid@exacl04n1 +ASM1]$

 

After that, you can delete the previous GI Home.
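Before removing the files, it is a good idea to confirm that nothing is still running from the old home (a minimal sketch; run it on every node and only remove the directory after the check returns nothing):

[root@exacl04n1 ~]# ps -ef | grep '/u01/app/18.0.0/grid' | grep -v grep
[root@exacl04n1 ~]# rm -rf /u01/app/18.0.0/grid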

 

Other details

To finish, remember to fix the .bash_profile and the oratab (if you have something particular set there). Remember to do this on both nodes:

 

[grid@exacl04n1 +ASM1]$

[grid@exacl04n1 +ASM1]$ cat .bash_profile |grep HOME

export ORACLE_HOME=/u01/app/18.0.0/grid

export PATH=$ORACLE_HOME/bin:/usr/local/bin:/bin:/usr/bin

[grid@exacl04n1 +ASM1]$

[grid@exacl04n1 +ASM1]$ vi .bash_profile

[grid@exacl04n1 +ASM1]$

[grid@exacl04n1 +ASM1]$ cat .bash_profile |grep HOME

export ORACLE_HOME=/u01/app/19.0.0.0/grid

export PATH=$ORACLE_HOME/bin:/usr/local/bin:/bin:/usr/bin

[grid@exacl04n1 +ASM1]$




#################

#Second node

#################




[root@exacl04n2 ~]# su - grid

Last login: Fri Jan 24 18:17:43 CET 2020 from exacl04n1.flisk.rulz.org on ssh

[grid@exacl04n2 +ASM2]$ cat .bash_profile |grep HOME

export ORACLE_HOME=/u01/app/18.0.0/grid

export PATH=$ORACLE_HOME/bin:/usr/local/bin:/bin:/usr/bin

[grid@exacl04n2 +ASM2]$

[grid@exacl04n2 +ASM2]$ vi .bash_profile

[grid@exacl04n2 +ASM2]$

[grid@exacl04n2 +ASM2]$ cat .bash_profile |grep HOME

export ORACLE_HOME=/u01/app/19.0.0.0/grid

export PATH=$ORACLE_HOME/bin:/usr/local/bin:/bin:/usr/bin

[grid@exacl04n2 +ASM2]$
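For the oratab, the idea is the same: any entry still pointing to the old 18c home must be updated to the 19c home. A quick way to check (edit the file as root if anything shows up):

[grid@exacl04n2 +ASM2]$ grep "18.0.0" /etc/oratab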

 

Conclusion

Upgrading the GI to 19c is easier than in previous versions. If you follow the requirements correctly, you probably will not see errors. At the beginning of 2018, I did the same for the 18c upgrade (migrating from 12c) and the process was similar. You can check it in my post Reaching Exadata 18c.
But be careful with ACFS and the report from the cluster verification script. Its output (even if it still requires additional checks, as I showed before) is a good indication of system health and an "OK to continue".
The process that I showed above was executed on Exadata, but with small adaptations (mainly for dcli), it can be used in other environments as well.
The post-upgrade has some additional tricks, as I showed in the previous posts about Exadata: the workaround for the oracka.ko error and the TFA error after GI upgrade to 19c. Nothing special, but they require additional attention too.

 

 

 

Disclaimer: “The postings on this site are my own and don’t necessarily represent my actual employer positions, strategies or opinions. The information here was edited to be useful for general purpose, specific data and identifications were removed to allow reach the generic audience and to be useful for the community.”


Cleaning up JOBs with errors in OEM 13c
Category: Database Author: Andre Luiz Dutra Ontalba (Board Member) Date: 6 years ago Comments: 0

Cleaning up JOBs with errors in OEM 13c

Today, a simple but useful article: in OEM 13c we have the very useful job scheduling system.
 
However, as we can see in the image below, when we have several errors it is difficult to clean up the jobs using the web interface or EMCLI.

 

The EMCLI verb “get_jobs” reports all runs of the jobs, and “delete_job” can delete a job and all of its runs, but it is not possible to delete a specific run of a job. For example, if you want to delete hundreds of “Failed” (or “Skipped”) runs of a job, EMCLI won’t help you, and doing it through the web interface will be very time consuming.
 
We can easily solve this with a PL/SQL script to clean up jobs.

 

Connect to sqlplus with the sysman account:




[oracle@dbadutra:/home/oracle] sqlplus sysman@OEM13c




SQL*Plus: Release 19.0.0.0.0 – Production on Fri 28 08:15:03 2020

Version 19.3.0.0.0




Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 – Production

Version 19.3.0.0.0




SQL>

BEGIN
   -- Delete every 'Error' or 'Failed' run of jobs owned by SYSMAN
   FOR C IN
   (SELECT SCHEDULED_TIME, JOB_ID
      FROM MGMT$JOB_EXECUTION_HISTORY
     WHERE JOB_OWNER = 'SYSMAN'
       AND STATUS IN ('Error', 'Failed'))
   LOOP
      EM_JOB_OPS.DELETE_JOB_RUN(C.JOB_ID, C.SCHEDULED_TIME);
   END LOOP;
   COMMIT;
END;
/
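After running the block, a quick count confirms whether any runs with errors remain (adjust the JOB_OWNER filter if you cleaned jobs from other owners):

SELECT COUNT(*)
  FROM MGMT$JOB_EXECUTION_HISTORY
 WHERE JOB_OWNER = 'SYSMAN'
   AND STATUS IN ('Error', 'Failed');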

 

Now, going back to the console, the jobs are clean.

 

 

I hope this tip helps you.

 

André Ontalba

 

Disclaimer: “The postings on this site are my own and don’t necessarily represent my actual employer positions, strategies or opinions. The information here was edited to be useful for general purpose, specific data and identifications were removed to allow reach the generic audience and to be useful for the community.”

 


ZDLRA, Protection Policies
Category: Engineer System Author: Fernando Simon (Board Member) Date: 6 years ago Comments: 0

ZDLRA, Protection Policies

For ZDLRA, the protection policies play a significant role in the appliance management, and not just that: in the architecture design too. And usually (and unfortunately) policies do not get as much attention as they deserve.
To create a good ZDLRA design and avoid future problems, it is important to understand all the requirements for the protection policies and all their impacts. You can check the official documentation for this, but I will explain in depth the details that can pass unnoticed in the documentation.

 

DBMS_RA.CREATE_PROTECTION_POLICY

Creating the policy is easy; you just need to call DBMS_RA.CREATE_PROTECTION_POLICY and set the parameters:

 

SQL> BEGIN

  2  DBMS_RA.CREATE_PROTECTION_POLICY(

  3      protection_policy_name => 'ZDLRA_BRONZE'

  4      , description => 'Policy ZDLRA MAA BRONZE'

  5      , storage_location_name => 'DELTA'

  6      , recovery_window_goal => INTERVAL '30' DAY

  7      , max_retention_window => INTERVAL '60' DAY

  8      , recovery_window_sbt => INTERVAL '120' DAY

  9      , guaranteed_copy => 'NO'

 10      , allow_backup_deletion => 'YES'

 11  );

 12  END;

 13  /




PL/SQL procedure successfully completed.




SQL>

 

As you can see, the parameters are self-explanatory: you just define the name, the description, and the recovery/retention goals. But these retention windows are important and need some attention. To check them inside the ZDLRA database, you can query the RASYS.RA_PROTECTION_POLICY table.
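A simple query shows the policies already created (using SELECT * here because the exact column list of the view can vary between versions):

SQL> SELECT * FROM RASYS.RA_PROTECTION_POLICY;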

 

Retention Window

As you saw above, when you create the policy you have three parameters related to the retention window:
  • RECOVERY_WINDOW_GOAL: This parameter defines how long ZDLRA will keep the backups (on the appliance disks) for the databases covered by this policy. In the example above, all the databases will stay for 30 days. Older backups are not guaranteed and can be deleted.
  • MAX_RETENTION_WINDOW: If ZDLRA has free space, backups can be retained inside ZDLRA up to this number of days. In the example above, it will be 60 days. If you do not specify it, backups are kept “while there is space”. If you specify it, ZDLRA will delete everything older than that period.
  • RECOVERY_WINDOW_SBT: This is the window that ZDLRA will retain for backups cloned to tape. In the example above, it will be 120 days; after that, the backup is no longer valid and will be expired.
The important point here is understanding the small details. By the ZDLRA rules, it always tries to support (for each database) point-in-time recovery from the current time back to the recovery window goal. As the documentation says: “For each protected database, the Recovery Appliance attempts to ensure that the oldest backup on disk can support a point-in-time recovery to any time within the specified interval, counting backward from the current time.”

 

One side effect of the RECOVERY_WINDOW_GOAL parameter is that it is global to the policy (and not per database). And, if you remember, when you enroll a database in ZDLRA you need to define the “reserved_space” for it. The detail is that this value (reserved_space) needs to cover the recovery_window_goal. So, if your database changes a lot (or it is a big database), you need to constantly check the “Recovery Window Goal” and adjust the reserved space for your database. You can read some best practices here (page 15).
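Adjusting the reserved space afterwards is a simple call to DBMS_RA.UPDATE_DB (a sketch; the database name and size below are hypothetical, check the DBMS_RA documentation of your version for the exact parameters):

SQL> BEGIN
  2  DBMS_RA.UPDATE_DB(
  3      db_unique_name => 'DBPROD1'
  4      , reserved_space => '2T'
  5  );
  6  END;
  7  /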
MAX_RETENTION_WINDOW is the maximum time that your backups will stay inside ZDLRA. Think of the period between RECOVERY_WINDOW_GOAL and MAX_RETENTION_WINDOW as a bonus: the backups are not guaranteed to remain there. If ZDLRA needs to delete some backups (because of lack of space), it will delete backups between these dates first. And since the management is based on backupsets, it can happen that one backupset of your database is deleted and you can no longer use that point in time to restore the database.
One detail here is that backups that pass MAX_RETENTION_WINDOW are forcibly deleted by ZDLRA. So, if you set RECOVERY_WINDOW_GOAL and MAX_RETENTION_WINDOW too close to each other (like 10 days for the first and 11 for the second), you can put high pressure on ZDLRA because it will never stop running delete tasks for backups. In the ZDLRA best practices (or the old version) there is some vague indication of how to set it, but the idea is not to be aggressive with this value. From experience, I recommend that MAX_RETENTION_WINDOW be at least 20% higher than RECOVERY_WINDOW_GOAL (and the same for reserved_space: at least 20% higher than the database size). But if you have big databases, this gap needs to be larger, because the delete tasks take more time to finish and you can end up with a never-ending delete queue/load inside ZDLRA.
RECOVERY_WINDOW_SBT is the period during which backups will be available (for recovery purposes) in the clone destination (tape or cloud). Since these backups are not inside ZDLRA, they do not compete for its disk space.
More than one Policy

 

For ZDLRA, you will probably have more than one policy to protect your databases, since you will probably have databases (PROD, UAT, TST, DEV) with different recovery window requirements. And even inside the same type (like PROD) it is possible to have different requirements (because of legal regulations, as an example), and these lead/force you to create more than one policy.

 

Whatever the case, all databases will “fight each other” for disk space, and if you design your policies badly, or leave a database in the wrong protection policy, you can end up with a system under high pressure for disk usage. ZDLRA will always accept new backups and, if needed, will delete the oldest backups (which you may consider adequate, since the newest data is probably more important). But it is also true that ZDLRA will try to support point-in-time recovery for all databases to reach what was defined in the policy. If you want to control this behavior, you can set the GUARANTEED_COPY parameter to YES; doing this, ZDLRA will delete old backups only if they have already been copied to tape or replicated.
Don’t be afraid to create more than one policy, because moving one database from one policy to another is a simple command (see the sketch below) and it will make it easier to manage space usage if needed. One drawback of a huge number of protection policies is that clone-to-tape backups are based/scheduled on protection policies. If you have a lot of them, you need to create one clone-to-tape job for each one.
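As an illustration of how simple that move is (database and policy names here are hypothetical), it is a single call to DBMS_RA.UPDATE_DB:

SQL> exec DBMS_RA.UPDATE_DB(db_unique_name => 'DBPROD1', protection_policy_name => 'ZDLRA_GOLD');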

 

 

Protection Policies and ZDLRA Replication

One important detail is ZDLRA replication and how it interacts with protection policies. This is important because the replication between ZDLRAs is purely based on policies; it replicates all databases of the protection policy that you defined as a parameter. So, as you can imagine, if you want to replicate just part of your databases between ZDLRAs, you need to create a specific protection policy for them.
Another interesting point is that on the two sides of a replicated ZDLRA pair the protection policies can have different recovery window goals. As an example, in the primary site the upstream ZDLRA can have a 30-day recovery window goal and guaranteed copy set to YES (because this ZDLRA receives more backups), while in the downstream ZDLRA the destination protection policy can have a 120-day recovery window goal (because this ZDLRA protects fewer databases and the pressure on space usage is lower).
Let’s imagine a protection policy for SILVER databases (https://www.oracle.com/a/tech/docs/maa-overview-onpremise-2019.pdf) where you want to replicate just some of them. In this case, both ZDLRAs (upstream and downstream) will have the “normal” silver protection policy (named policy_silver, as an example), as well as another policy just for the replicated silver databases (named policy_replicated_silver).
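Just to illustrate the link between replication and policies (the server and policy names are hypothetical, and the exact signature should be checked in the DBMS_RA documentation for your version), the replication server is associated with a protection policy, so everything protected by that policy is replicated:

SQL> exec DBMS_RA.ADD_REPLICATION_SERVER(replication_server_name => 'REP_ZDLRA_DR', protection_policy_name => 'POLICY_REPLICATED_SILVER');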

 

Architecture Design

 

The correct definition of your protection policies is important for ZDLRA maintenance and usability. Designing the policies correctly is important to avoid high pressure on the ZDLRA storage location, even if you start to use (or maintain) an already deployed ZDLRA.
Understanding the recovery window goal and max retention window constraints will help you avoid reaching full space utilization. You don’t need to limit yourself to one or two protection policies for your ZDLRA, but be careful with your design if you have replicated ZDLRAs or protect a mix of database types. Group them correctly.
As explained before, there is a direct link between the recovery window and the reserved_space of your databases. If you create a single protection policy for all of your databases, you may end up setting a high value for reserved space, and this can cause problems (like ZDLRA denying new databases because you have already reserved all the available space, even with free space existing).
There is no rule of thumb to follow, like creating policy A or B with X or Y values for the recovery window. The correct way is to check the requirements (and rules) that you need to follow and design the architecture that meets them. Don’t worry if you need to change it in the future; it is possible and easy to do.
So, the most important thing is to know and understand the links that exist between the ZDLRA functionalities. Protection policies, replicated backups, and reserved space are some examples. Time spent understanding them will reduce rework in the future.
 
 

Disclaimer: “The postings on this site are my own and don’t necessarily represent my actual employer positions, strategies or opinions. The information here was edited to be useful for general purpose, specific data and identifications were removed to allow reach the generic audience and to be useful for the community.”

