ZDLRA, Protection Policies
For ZDLRA, protection policies play a significant role not only in managing the appliance but also in the architecture design. Unfortunately, they usually do not get the attention they deserve.
To create a good ZDLRA design and avoid future problems, it is important to understand all the requirements for the protection policies and all their impacts. You can check the official documentation for this, but I will explain in depth the details that can easily go unnoticed there.
Creating the policy is easy: you just need to call DBMS_RA.CREATE_PROTECTION_POLICY and set the parameters:
BEGIN
  -- Create the policy with its disk, maximum, and tape retention windows
  DBMS_RA.CREATE_PROTECTION_POLICY(
      protection_policy_name => 'ZDLRA_BRONZE'
    , description => 'Policy ZDLRA MAA BRONZE'
    , storage_location_name => 'DELTA'
    , recovery_window_goal => INTERVAL '30' DAY
    , max_retention_window => INTERVAL '60' DAY
    , recovery_window_sbt => INTERVAL '120' DAY
    , guaranteed_copy => 'NO'
    , allow_backup_deletion => 'YES'
  );
END;
/
PL/SQL procedure successfully completed.
As you can see, the parameters are self-explanatory: you just define the name, description, and recovery/retention goals. But these retention windows are important and need some attention. To check the policy inside the ZDLRA database, you can query the view RASYS.RA_PROTECTION_POLICY.
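A quick way to review the policy is to query RASYS.RA_PROTECTION_POLICY. The sketch below assumes you are connected as RASYS (or as a user that can read its views). The column names are the ones I recall from the DBMS_RA reference, so treat them as an assumption and confirm them with DESCRIBE on your release.

```sql
-- List the retention settings of every protection policy.
-- Column names are assumed from the documented RA_PROTECTION_POLICY view;
-- verify them first with: DESC rasys.ra_protection_policy
SELECT policy_name
     , recovery_window_goal
     , max_retention_window
     , recovery_window_sbt
     , guaranteed_copy
  FROM rasys.ra_protection_policy
 ORDER BY policy_name;
```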
As you saw above, when you create the policy you have three parameters related to the retention window:
RECOVERY_WINDOW_GOAL: This parameter defines how long ZDLRA will keep the backups (on the appliance disks) for the databases covered by this policy. In the example above, backups of all the databases are kept for 30 days. Anything older is not guaranteed and can be deleted.
MAX_RETENTION_WINDOW: If ZDLRA has free space, the backups can be retained on the appliance up to this number of days. In the example above, that is 60 days. If you do not specify it, backups are kept for as long as there is space. If you specify it, ZDLRA will delete everything older than that period.
RECOVERY_WINDOW_SBT: This is the window that ZDLRA retains for backups cloned to tape. In the example above, it is 120 days; after that, the backup is no longer valid and expires.
The important thing here is understanding the small details. By its rules, ZDLRA always tries to support (for each database) point-in-time recovery from today back to the recovery window goal. As the documentation says: “For each protected database, the Recovery Appliance attempts to ensure that the oldest backup on disk can support a point-in-time recovery to any time within the specified interval, counting backward from the current time.”
One side effect of the RECOVERY_WINDOW_GOAL parameter is that it is global for the policy (and not per database). And if you remember, when you enroll a database in ZDLRA, you need to define the “reserved_space” for it. The detail is that this value (reserved_space) needs to be large enough to cover the recovery_window_goal. So, if your database changes a lot (or it is a big database), you need to constantly check the “Recovery Window Goal” and adjust the reserved space for your database. You can read some best practices here (page 15).
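The check described above can be sketched with the RA_DATABASE view and DBMS_RA.UPDATE_DB. This is only a sketch: the column names (DISK_RESERVED_SPACE, RECOVERY_WINDOW_SPACE, SPACE_USAGE) are the ones I recall from the documentation and should be confirmed on your release, and the database name and size are placeholders.

```sql
-- Databases whose reserved space is below what the appliance estimates it
-- needs to meet the recovery window goal (values assumed to be in GB).
SELECT db_unique_name
     , policy_name
     , disk_reserved_space
     , recovery_window_space
     , space_usage
  FROM rasys.ra_database
 WHERE recovery_window_space > disk_reserved_space
 ORDER BY db_unique_name;

-- Raise the reservation for one database.
-- 'ORCL18C' and '500G' are placeholder values for illustration only.
BEGIN
  DBMS_RA.UPDATE_DB(
      db_unique_name => 'ORCL18C'
    , reserved_space => '500G'
  );
END;
/
```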
The MAX_RETENTION_WINDOW is the maximum time that backups of your databases will stay inside ZDLRA. Think of the period between RECOVERY_WINDOW_GOAL and MAX_RETENTION_WINDOW as a bonus: backups in that range are not guaranteed to remain. If ZDLRA needs to delete some backups (because of lack of space), it will delete the backups between these dates first. And since the management is based on backup sets, one backup set for your database can be deleted, and then you can’t use that point in time to restore the database.
One detail here is that backups that pass MAX_RETENTION_WINDOW are forcibly deleted by ZDLRA. So, if RECOVERY_WINDOW_GOAL and MAX_RETENTION_WINDOW are close to each other (like 10 days for the first and 11 for the second), you can put high pressure on ZDLRA because it will never stop running delete tasks for backups. In the ZDLRA best practices (or the old version) there is some vague indication of how to set it, but the idea is not to be aggressive with this value. From experience, I recommend that MAX_RETENTION_WINDOW be at least 20% higher than RECOVERY_WINDOW_GOAL (and the same for reserved_space, which should be at least 20% higher than the database size). But if you have big databases, this margin needs to be higher, because the delete tasks will take more time to finish and you can end up with a never-ending delete queue/load inside ZDLRA.
RECOVERY_WINDOW_SBT is the period during which backups will be available (for recovery purposes) in the cloned destination (tape or cloud). Since these backups are not inside ZDLRA, they do not compete for space on the appliance.
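To find policies that break the 20% margin suggested above, something like the following can help. It is a sketch under two assumptions: that both columns in RA_PROTECTION_POLICY are INTERVAL DAY TO SECOND values (so they can be multiplied and compared directly), and that 'POLICY_TIGHT' is a hypothetical policy with a 10-day goal and an 11-day maximum.

```sql
-- Policies whose maximum retention is less than 20% above the goal.
-- Assumes both columns are INTERVAL DAY TO SECOND.
SELECT policy_name
     , recovery_window_goal
     , max_retention_window
  FROM rasys.ra_protection_policy
 WHERE max_retention_window IS NOT NULL
   AND max_retention_window < recovery_window_goal * 1.2;

-- Widen the gap for the hypothetical policy: 10 days goal, 12 days maximum.
BEGIN
  DBMS_RA.UPDATE_PROTECTION_POLICY(
      protection_policy_name => 'POLICY_TIGHT'
    , max_retention_window   => INTERVAL '12' DAY
  );
END;
/
```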
More than one Policy
With ZDLRA you will probably have more than one policy to protect your databases, since you will probably have databases (PROD, UAT, TST, DEV) with different requirements for the recovery window. And even within the same type (like PROD) it is possible to have different requirements (because of legal regulations, for example), and these force you to create more than one policy.
Whatever the case, all databases will “fight each other” for disk space, and if you design your policies badly, or leave a database in the wrong protection policy, you can end up with a system under high pressure for disk usage. ZDLRA will always accept new backups and, if needed, will delete the oldest backups (which is reasonable if you think about it, because the newest data is probably more important). But it is also true that ZDLRA will try to support point-in-time recovery for all databases as defined in the policy. If you want to control this behavior, you can set the parameter GUARANTEED_COPY to YES. With this setting, ZDLRA deletes old backups only if they have already been copied to tape or replicated, which also means it may refuse new backups when that is not yet the case.
Don’t be afraid to create more than one policy: moving a database from one policy to another is a simple command, and having several policies makes it easier to manage space usage if needed. One drawback of a huge number of protection policies is that clone-to-tape backups are based on (and scheduled by) protection policies. If you have a lot of them, you need to create one clone-to-tape job for each one.
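As an illustration of how simple the move is, the call below reassigns one database to another policy with DBMS_RA.UPDATE_DB. The database and policy names are placeholders, and the target policy is assumed to exist already.

```sql
-- Move one database to another protection policy.
-- 'ORCL18C' and 'ZDLRA_SILVER' are placeholders; the target policy must exist.
BEGIN
  DBMS_RA.UPDATE_DB(
      db_unique_name         => 'ORCL18C'
    , protection_policy_name => 'ZDLRA_SILVER'
  );
END;
/
```

After the move, the database follows the recovery window settings of the new policy, so it is worth reviewing its reserved space at the same time.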
Protection Policies and ZDLRA Replication
One important detail is ZDLRA replication and how it interacts with protection policies. This matters because replication between ZDLRAs is based purely on policies: it replicates all the databases in the protection policy that you pass as a parameter. So, as you can imagine, if you want to replicate just some of your databases between ZDLRAs, you need to create a specific protection policy for them.
Another interesting point is that the protection policies on each side of a replicated ZDLRA pair can have different recovery window goals. As an example, at the primary site, the upstream ZDLRA can have a recovery window of 30 days and guaranteed copy set to YES (because this ZDLRA receives more backups), while on the downstream ZDLRA, the destination protection policy can have a recovery window goal of 120 days (because this ZDLRA protects fewer databases and the pressure on space usage will be lower).
Let’s imagine a protection policy for SILVER databases (https://www.oracle.com/a/tech/docs/maa-overview-onpremise-2019.pdf), of which you want to replicate only some. In this case both ZDLRAs (upstream and downstream) will have the “normal” silver protection policy (named policy_silver, as an example), as well as another policy just for the silver databases that are replicated (named policy_replicated_silver).
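The policy-related part of that setup on the upstream ZDLRA could look like the sketch below. It is not the complete replication procedure: it assumes the replication server was already defined with DBMS_RA.CREATE_REPLICATION_SERVER, and that the downstream appliance already has its own policy and the databases enrolled. The names 'ORCL18C' and 'ZDLRA_DOWNSTREAM_REP' are hypothetical.

```sql
-- On the upstream ZDLRA: move a database that must be replicated into the
-- dedicated policy, then attach the replication server to that policy only.
-- 'ORCL18C' and 'ZDLRA_DOWNSTREAM_REP' are hypothetical names.
BEGIN
  DBMS_RA.UPDATE_DB(
      db_unique_name         => 'ORCL18C'
    , protection_policy_name => 'policy_replicated_silver'
  );

  DBMS_RA.ADD_REPLICATION_SERVER(
      replication_server_name => 'ZDLRA_DOWNSTREAM_REP'
    , protection_policy_name  => 'policy_replicated_silver'
  );
END;
/
```

Databases left in policy_silver are not touched by this call, so they stay local to the upstream appliance.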
Defining your protection policies correctly is important for ZDLRA maintenance and usability. Designing the policies correctly helps avoid high pressure on the ZDLRA storage location, even if you are starting to use (or maintain) an already deployed ZDLRA.
Understanding the recovery window goal and max retention window constraints will help you avoid reaching full space utilization. You don’t need to limit yourself to one or two protection policies for your ZDLRA, but be careful with your design if you have replicated ZDLRAs or protect a mix of database types. Group them correctly.
As explained before, there is a direct link between recovery_window_goal and reserved_space for your databases. If you create a single protection policy for all of your databases, you can end up setting a high value for reserved space, and this can cause problems (like ZDLRA refusing to add databases because you have already reserved all the available space, even when free space still exists).
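One way to see how close you are to that situation is to compare the sum of the reservations with the size of each storage location. The query below is a sketch: the view and column names (RA_STORAGE_LOCATION.NAME and TOTAL_SPACE, RA_DATABASE.STORAGE_LOCATION and DISK_RESERVED_SPACE) are the ones I recall from the documentation and should be verified on your appliance.

```sql
-- Reserved space per storage location against its total size
-- (values assumed to be in GB).
SELECT sl.name                                      AS storage_location
     , sl.total_space
     , SUM(db.disk_reserved_space)                  AS total_reserved
     , sl.total_space - SUM(db.disk_reserved_space) AS unreserved
  FROM rasys.ra_storage_location sl
  JOIN rasys.ra_database db
    ON db.storage_location = sl.name
 GROUP BY sl.name, sl.total_space
 ORDER BY sl.name;
```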
There is no rule of thumb to follow, like “create policy A or B with X or Y values for the recovery window”. The correct way is to check the requirements (and rules) that you need to follow and design an architecture that meets them. Don’t worry if you need to change it in the future; it is possible and easy to do.
So, the most important thing is to know and understand the links that exist between the ZDLRA functionalities. Protection policies, replicated backups, and reserved space are some examples. Time spent understanding them well will reduce rework in the future.