High Availability, Load Balancing, & Disaster Recovery Options in Oracle
High Availability - Protects against hardware or software failures on a single server in a cluster that would otherwise cause downtime. An example would be a failed processor or a memory fault.
Disaster Recovery - The ability to recover from a catastrophic event, such as a natural disaster or, on a more local level, a data center disaster (fire, flood, cooling outage, etc.), and to continue business operations with minimal data loss and minimal downtime. The DR site can optionally be configured with the same level of performance as production, or with a reduced level to save on hardware costs.
OpSource supports all of the following types of HA & DR:
- RMAN backups - Full, incremental, archivelog backups - Database backups using RMAN are the most basic form of disaster recovery for Oracle and should be practiced in every situation, regardless of what other HA/DR solutions are in place. RMAN requires local, SAN, or NAS space connected directly to the database server for its backup files to be written to. In a cluster environment, the space must be SAN or NAS so that the disk is shared between all nodes in the cluster. Once RMAN writes the backups to disk, the nightly OS backup copies the files to the OS backup location. Local retention of RMAN files is set based on available disk space. OS backup retention is set by the contract with the customer, typically 28 days. When the customer does not have standby or test servers to use for restores, recovery from RMAN backups consists of dropping the current database and restoring over the existing location.
- Exports - Traditional and Data Pump - Database backups using exports can be taken at the full database, tablespace, or schema level. Customers that do not want RMAN for their own reasons may want nightly or on-demand exports. Unlike RMAN backups, an export does not restore every system component of the database; it is used to restore data only, into a pre-configured Oracle database. Exports are also an option for maintenance windows in which the customer is working on certain schemas and would not want a full database restore just to recover one piece of the database. Exports can be restored into either the production or staging/test databases, and into the same or different schemas/tablespaces.
- RAC - Real Application Clusters - Oracle Real Application Clusters (RAC) harnesses the processing power of multiple interconnected servers in a cluster. It allows multiple servers to access a single database at the same time, insulating both applications and database users from server failures while providing performance that scales out on demand at low cost. It is also a vital component of grid computing.
- ODG/ADG - Oracle Data Guard/Active Data Guard - Predominantly protects a Production database against disasters and data corruption. Resource-intensive operations on the Primary (Production), including reporting and backups, can be offloaded to the Standby. In some cases the Standby is used as the source for exports and subsequent refreshes of non-Production schemas. A Physical standby is kept synchronized with the Primary database by Redo Apply, which takes the redo received from the Primary and applies it to the Standby. A standby can step in and take over the Production role should there be an unplanned outage. Data Guard also benefits from automatic gap resolution: any redo that is missing because of a failure in transport from the Primary is automatically transmitted to the standby database when transport resumes. Oracle Active Data Guard, available in 11g as a separately licensed option, enhances the Data Guard offering. It allows the Standby to be open read-only and accessed for queries, sorting, reporting, etc. while continuously applying redo from the Primary.
- Failsafe - Oracle Fail Safe for Windows only - Oracle Fail Safe is a core feature included with every Oracle 11g and Oracle 10g license for Windows 2003 R2 and Windows 2008. It is high availability software, integrated with Microsoft Cluster Server, that provides a fast, easy, and accurate way to configure and verify Windows clusters and to automatically fail over Oracle databases and applications. In the event of a system failure, Oracle Fail Safe works with Microsoft Cluster Server to restart Oracle databases and applications on a surviving cluster node. Oracle Fail Safe also enables flexible failback when nodes are returned to service, planned failover for rolling cluster upgrades and workload balancing, and scripting of high availability tasks through its command line interface.
- GoldenGate - Alternative to ODG - Oracle GoldenGate delivers low-impact, real-time data acquisition, distribution, and delivery across heterogeneous systems, enabling cost-effective real-time data integration and continuous availability solutions. Oracle GoldenGate offers tighter integration with Oracle technologies and applications, support for additional heterogeneous systems, and improved performance. It moves committed transactions with transaction integrity and minimal overhead on your existing infrastructure. Its wide variety of use cases includes real-time business intelligence; query offloading; zero-downtime upgrades and migrations; disaster recovery; and active-active databases for data distribution, data synchronization, and high availability.
- Active/Passive - Similar to Fail Safe but for Linux - Reduces license cost, as only the active server needs to be licensed. The shared storage is presented to both nodes, but is mounted and accessed by only one node at any one time. In the event of a hardware failure, failover consists simply of mounting the storage on the passive node and starting up the database. This failover is not transparent to users, but can happen very quickly. It is a good alternative to RAC when load balancing is not needed but protection against hardware failure is.
- Cold Standby - No license is needed for a cold standby configuration. A server is simply kept Oracle-ready, and we copy RMAN backup logs to it on a regular basis. In the event of a failure, we would initiate an Oracle restore on the standby without having to wait for the primary hardware to be repaired or replaced.
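
To make the gap resolution idea from the ODG/ADG item more concrete, here is a small Python sketch of the bookkeeping involved. It is only an illustration, not how Data Guard is implemented: the function name find_redo_gaps and the sample sequence numbers are made up, and the only assumption taken from Oracle is that redo logs carry increasing sequence numbers. Given the sequences that have arrived at a standby and the last one applied, it reports which ranges are still missing and would need to be re-sent.

```python
def find_redo_gaps(received_sequences, last_applied):
    """Return (start, end) ranges of log sequence numbers missing on a standby.

    Hypothetical helper for illustration only.
    received_sequences: sequence numbers that have arrived at the standby.
    last_applied: highest sequence number already applied on the standby.
    """
    gaps = []
    expected = last_applied + 1
    for seq in sorted(set(received_sequences)):
        if seq < expected:
            continue  # already applied, nothing to do
        if seq > expected:
            # Everything from `expected` up to seq - 1 never arrived.
            gaps.append((expected, seq - 1))
        expected = seq + 1
    return gaps


if __name__ == "__main__":
    # Sequences 103-104 and 107-108 were lost during a transport outage.
    print(find_redo_gaps([101, 102, 105, 106, 109], last_applied=100))
```

With the sample input, the missing ranges are 103-104 and 107-108. In a real Data Guard configuration this detection and re-transmission happens automatically, so nothing like this needs to be written by hand.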