Monday, August 15, 2011

Oracle Golden Gate


Oracle Golden Gate is a tool provided by Oracle for transactional data replication among Oracle databases and other RDBMS products (SQL Server, DB2, etc.). Its modular architecture lets you decouple or combine its components to build the solution that best meets your business requirements.
 Because of this flexibility in the architecture, Golden Gate supports numerous business requirements, including:
  • High Availability
  • Data Integration
  • Zero downtime upgrade and migration
  • Live reporting database
 Oracle Golden Gate Architecture
 Oracle Golden Gate Architecture is composed of the following Components:
 ● Extract
 ● Data pump
 ● Replicat
 ● Trails or extract files
 ● Checkpoints
 ● Manager
 ● Collector

 Below is the architecture diagram of GG:
 
 Oracle Golden Gate runs on both the source and the target server. It is installed as a component external to the database, so it uses few database resources and has little effect on database performance. Oracle Streams, by contrast, relies on built-in packages that run inside the database and consume its resources, so it can slow down both the source and the target database.
 Let us first look at the architectural components of Oracle Golden Gate:
 EXTRACT
Extract runs on the source system and is the capture mechanism of Oracle Golden Gate: it captures the changes that happen at the source database.
 The Extract process reads the necessary data from the database transaction logs. For an Oracle database, the transaction logs are the redo log files. Unlike Streams, which runs inside the Oracle database itself, Extract works outside the database by reading the redo logs, and it extracts only committed transactions.
 A long-running transaction that generates a large amount of redo forces redo log switches, which in turn generate more archived logs. In such cases the Extract process must read the archived log files to get the data.
 Extract captures all changes made to the objects that are configured for synchronization. Multiple Extract processes can operate on different objects at the same time. For example, one process could continuously extract transactional data changes and stream them to a decision support database while another performs batch extracts for periodic reporting. Alternatively, two Extract processes could extract and transmit in parallel to two Replicat processes (with two trails) to minimize target latency when the databases are large.
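The idea of capturing only committed changes for the configured objects can be sketched in a few lines of Python. This is a conceptual model only, not Golden Gate code: the record layout (`txid`, `op`, `table`, `row`) and the function name `extract_committed` are assumptions made up for the illustration.

```python
from collections import defaultdict

def extract_committed(log_records, tracked_tables):
    """Yield changes of committed transactions only, released at commit time."""
    pending = defaultdict(list)  # transaction id -> buffered changes
    for rec in log_records:
        op = rec["op"]
        txid = rec["txid"]
        if op == "COMMIT":
            # Release the buffered changes only once the commit is seen.
            yield from pending.pop(txid, [])
        elif op == "ROLLBACK":
            # Discard everything the rolled-back transaction did.
            pending.pop(txid, None)
        elif rec["table"] in tracked_tables:
            # Buffer changes only for objects configured for synchronization.
            pending[txid].append(rec)

# Hypothetical log stream: transaction 2 rolls back, AUDIT is not tracked.
log = [
    {"txid": 1, "op": "INSERT", "table": "ORDERS", "row": {"id": 10}},
    {"txid": 2, "op": "UPDATE", "table": "ORDERS", "row": {"id": 7}},
    {"txid": 1, "op": "INSERT", "table": "AUDIT", "row": {"id": 99}},
    {"txid": 2, "op": "ROLLBACK"},
    {"txid": 1, "op": "COMMIT"},
]
captured = list(extract_committed(log, {"ORDERS"}))
```

Here `captured` ends up holding only the ORDERS insert from transaction 1, because the other ORDERS change was rolled back and the AUDIT table is not configured for capture.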
 DATAPUMP
 Data pump is a secondary Extract process within the source Oracle Golden Gate configuration. You can configure the source without a data pump, but in that case the primary Extract process has to send the data directly to the trail file on the target. When a data pump is configured, the primary Extract process writes the data to a trail file on the source, and the data pump reads this trail file and propagates the data over the network to the trail file on the target. The data pump adds storage flexibility and isolates the primary Extract process from TCP/IP activity.
 You can configure both the primary Extract process and the data pump to extract online or during batch processing.
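To see why a data pump isolates the primary Extract from network problems, here is a small Python sketch. It is only an illustration under simplifying assumptions: the trails are plain lists, the network is a callable that may raise `ConnectionError`, and the `DataPump` class is a made-up name, not part of the product.

```python
class DataPump:
    """Forwards records from a local trail to a remote trail, tracking its position."""

    def __init__(self, local_trail, send):
        self.local_trail = local_trail  # list written by the primary extract
        self.send = send                # callable standing in for the network hop
        self.position = 0               # index of the next record to forward

    def run_once(self):
        """Forward as many records as possible; stop at the first network error."""
        sent = 0
        while self.position < len(self.local_trail):
            try:
                self.send(self.local_trail[self.position])
            except ConnectionError:
                break  # retry from the same position on the next run
            self.position += 1
            sent += 1
        return sent

local_trail, remote_trail = [], []
network = {"up": False}  # simulated network state

def send(record):
    if not network["up"]:
        raise ConnectionError("target unreachable")
    remote_trail.append(record)

pump = DataPump(local_trail, send)
local_trail.extend(["change-1", "change-2"])  # extract keeps writing locally
first = pump.run_once()                       # network is down, nothing is sent
network["up"] = True
local_trail.append("change-3")
second = pump.run_once()                      # pump catches up in order
```

While the network is down the primary extract still appends to the local trail, and once the target is reachable the pump forwards everything from where it left off.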
 REPLICAT
 The Replicat process runs on the target system. It reads the extracted transactional data changes and, if configured, DDL changes that are specified in the Replicat configuration, and then applies them to the target database.
 TRAILS OR EXTRACTS
 To support the continuous extraction and replication of source database changes, Oracle Golden Gate stores the captured changes temporarily on disk in a series of files called a trail. A trail can exist on the source system, on the target system, or on an intermediate system, depending on how the configuration is done. On the local system it is known as an extract trail, and on the remote system it is known as a remote trail.
 The use of a trail also allows extraction and replication to occur independently of each other. Because the source trail and the target trail are independent, you have more choices for how data is delivered.
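A trail being "a series of files" can be pictured with the following Python sketch, in which a writer rolls over to a new numbered file once the current one is full and a reader replays the files in sequence order. The file naming (a two-character prefix plus a sequence number), the size limit, and the class and function names are all assumptions chosen for illustration; the real trail format is binary and is not reproduced here.

```python
import os
import tempfile

class TrailWriter:
    """Appends records to a sequence of numbered files, rolling over by size."""

    def __init__(self, directory, prefix="aa", max_bytes=64):
        self.directory = directory
        self.prefix = prefix
        self.max_bytes = max_bytes
        self.seqno = 0

    def _path(self):
        return os.path.join(self.directory, f"{self.prefix}{self.seqno:06d}")

    def append(self, record):
        line = (record + "\n").encode("utf-8")
        path = self._path()
        size = os.path.getsize(path) if os.path.exists(path) else 0
        # Start a new file when the record would not fit in the current one.
        if size > 0 and size + len(line) > self.max_bytes:
            self.seqno += 1
            path = self._path()
        with open(path, "ab") as f:
            f.write(line)

def read_trail(directory, prefix="aa"):
    """Yield every record from all trail files in sequence order."""
    names = sorted(n for n in os.listdir(directory) if n.startswith(prefix))
    for name in names:
        with open(os.path.join(directory, name), encoding="utf-8") as f:
            for line in f:
                yield line.rstrip("\n")

trail_dir = tempfile.mkdtemp()
writer = TrailWriter(trail_dir, max_bytes=32)
records = [f"INSERT ORDERS id={i}" for i in range(5)]
for r in records:
    writer.append(r)
trail_files = sorted(os.listdir(trail_dir))
replayed = list(read_trail(trail_dir))
```

Because the reader works only from the files on disk, it can run at its own pace, which is the independence between extraction and replication described above.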
 CHECKPOINT
 Checkpoints store the current read and write positions of a process on disk for recovery purposes. They ensure that data changes marked for synchronization are extracted by Extract and replicated by Replicat.
 Checkpoints work with inter-process acknowledgments to prevent messages from being lost in the network. Oracle Golden Gate has a proprietary guaranteed-message delivery technology.
 Checkpoint information is maintained in checkpoint files within the dirchk subdirectory of the Oracle Golden Gate directory. Optionally, Replicat checkpoints can also be maintained in a checkpoint table within the target database, in addition to the standard checkpoint file.
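The recovery role of a checkpoint can be sketched as follows: a process saves its read position after each applied record and, after a restart, continues from the saved position instead of starting over. This is a simplified model, not the actual checkpoint file format; the JSON layout, the file name, and the function names are assumptions for the example.

```python
import json
import os
import tempfile

def load_checkpoint(path):
    """Return the saved read position, or 0 if no checkpoint exists yet."""
    if not os.path.exists(path):
        return 0
    with open(path, encoding="utf-8") as f:
        return json.load(f)["position"]

def save_checkpoint(path, position):
    """Write the position via a temp file and rename, so the file is never half-written."""
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as f:
        json.dump({"position": position}, f)
    os.replace(tmp, path)

def apply_from_checkpoint(trail, path, apply, limit=None):
    """Apply trail records starting at the checkpoint; stop after `limit` records."""
    position = load_checkpoint(path)
    done = 0
    while position < len(trail) and (limit is None or done < limit):
        apply(trail[position])
        position += 1
        done += 1
        save_checkpoint(path, position)
    return position

trail = ["change-1", "change-2", "change-3", "change-4"]
target = []
ckpt = os.path.join(tempfile.mkdtemp(), "replicat_checkpoint.json")
apply_from_checkpoint(trail, ckpt, target.append, limit=2)          # simulated stop after two records
final_position = apply_from_checkpoint(trail, ckpt, target.append)  # restart resumes at the third record
```

In this sketch a crash between applying a record and saving the position would replay that one record on restart. Keeping the position in a table in the target database, as the optional checkpoint table does, lets it be stored alongside the applied data.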
 MANAGER
 The Manager process runs on both the source and the target system and is the control process of Oracle Golden Gate. Manager must be up and running before you start an Extract or Replicat process. Manager monitors and restarts Oracle Golden Gate processes, reports errors and events, and maintains trail files and logs.
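The monitoring and restarting role of Manager can be pictured as a small supervisor loop. The Python sketch below is a toy model under stated assumptions: the `Manager` class, its methods, the retry limit, and the process names `EXT1` and `REP1` are invented for illustration and do not correspond to real commands or parameters.

```python
class Manager:
    """Tracks worker status and restarts stopped workers up to a retry limit."""

    def __init__(self, max_retries=3):
        self.max_retries = max_retries
        self.workers = {}  # name -> {"start": callable, "running": bool, "retries": int}
        self.events = []   # simple in-memory event log

    def register(self, name, start):
        self.workers[name] = {"start": start, "running": False, "retries": 0}

    def start(self, name):
        w = self.workers[name]
        w["running"] = bool(w["start"]())
        self.events.append((name, "started" if w["running"] else "failed to start"))

    def mark_stopped(self, name):
        """Record that a worker stopped unexpectedly."""
        self.workers[name]["running"] = False
        self.events.append((name, "stopped"))

    def check(self):
        """One monitoring pass: restart any stopped worker that has retries left."""
        for name, w in self.workers.items():
            if not w["running"] and w["retries"] < self.max_retries:
                w["retries"] += 1
                self.start(name)

mgr = Manager(max_retries=2)
mgr.register("EXT1", lambda: True)  # the lambdas stand in for launching a real process
mgr.register("REP1", lambda: True)
mgr.start("EXT1")
mgr.start("REP1")
mgr.mark_stopped("REP1")            # simulate an unexpected stop
mgr.check()                         # the monitoring pass restarts REP1
```

After the monitoring pass, the event log shows REP1 stopping and being started again, while EXT1 is left alone.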
 COLLECTOR
 Collector is a process that runs in the background on the target system. Collector receives extracted database changes that are sent across the TCP/IP network and it writes them to a trail or extract file.

2 comments:

Anonymous said...

Great article, do you know of a similar article that explains high availability usage of this product to eliminate downtime during DB minor and major upgrades?

Oracle DBA said...

Thanks, here are some of good links http://gavinsoorma.com/2010/02/goldengate-concepts-and-architecture/

http://www.oracle.com/technetwork/middleware/goldengate/documentation/index.html

Explore, you can find more
