Data Migration: Summary of a Case

We recently migrated data from an Oracle 10.2.0.4 database to an 11.2.0.4 database. Some lessons learned are summarized below.

* The source database held 160GB of data, but we found that about 90GB of it sat in an error log table. Repeated calls to the table's owner, asking permission to exclude that data from the migration, got no clear answer, so we decided to omit it. That left only 70GB to migrate.

* Of this 70GB, 50GB belonged to one table, which unfortunately stores most of its data in a big LOB segment. Test runs confirmed, as expected, that this was the bottleneck: export of the LOB segment did not appear to be parallelized.

* We decided to use Data Pump for the work because it is simple and the amount of data is not huge. The whole job is one command, run on the target database:

  impdp username/password network_link=sourcedb parallel=4 schemas=...

  The source database servers are a 2-node RAC, each node with 4 (logical) CPUs. Test runs took almost 5 hours.

* Because of a limitation of impdp's network_link parameter, the few LONG and LONG RAW columns were copied separately with the SQL*Plus COPY command:

  copy from username@sourcedb to username@targetdb insert tablename using select * from tablename

* The app team wanted to significantly reduce the downtime, because not all apps query the 50GB table. We checked that table and its owner for dependencies:

  select owner, count(*) from dba_dependencies where referenced_owner = ...
  select referenced_owner, count(*) from dba_dependencies where owner = ...
  select owner, count(*) from dba_constraints where r_owner = ...
  select r_owner, count(*) from dba_constraints where owner = ...
  select owner, count(*) from dba_tab_privs where grantee = ...
  select grantee, count(*) from dba_tab_privs where owner = ...

  Fortunately, we were able to work with the app team to split all 81 schemas into two groups. (The app team's input is crucial, because dependency, or the lack of it, inside the database is not necessarily what matters to the apps.)
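The one-line impdp above can also be driven by a parameter file, which is easier to review and to reuse for the second batch. A minimal sketch: the link name sourcedb comes from the text, while the file names are illustrative and the schema list stays elided as in the original.

```text
# import.par -- Data Pump network import; no dump file is written,
# rows stream over the database link named sourcedb.
NETWORK_LINK=sourcedb
PARALLEL=4
SCHEMAS=...
LOGFILE=import.log
```

Run on the target database as: impdp username/password parfile=import.par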
* Before the data migration, all needed roles were pre-created with the correct system privileges and with object (table) privileges granted from Oracle's internal users (mostly SYS). All 81 to-be-migrated schemas were likewise pre-created with the correct default tablespace, profile, system and role privileges, and object (table) privileges from Oracle's internal users (mostly SYS). This was particularly helpful for the second batch of the migration.

* On the night of the migration, most cron and DBMS jobs were disabled, and TrendMicro was stopped (a stupid InfoSec policy requires that virus scans run even on Linux servers). An important job such as this data migration is done in a PuTTY window launched from a remote Windows terminal session on a production Windows server, not in PuTTY launched from my PC, even with tools such as screen. We first imported the group that acts as parent tables (this group contains the 50GB table) while most apps were still available to users. As a result, the downtime for most users was only about 20 minutes, during which the second impdp ran.

* During the migration, monitoring was done mostly by querying v$session_longops:

  select * from gv$session_longops where sofar != totalwork;

  After an initial 20 to 30 minutes of self-calibration, this view reported the progress of each step quite accurately, mainly the export of the biggest table.

* After the work, running my Oracle Session Statistics Watcher on the 'parse count (failures)' statistic revealed sessions with SQL parse problems. Setting event 10035 on those sessions showed the SQLs that Oracle 11g cannot parse (they were tolerated in 10g due to Bug 5368296). Developers fixed the code.

* One of the slow SQLs in the source database took 10 seconds per execution; now it takes 0.05 seconds, 200 times faster. Buffer gets per execution remain the same, indicating that the execution itself has not improved. (The plan only "syntactically" changed because different versions display plans differently.)
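The bare select above can be dressed up with the percentage done and Oracle's own time estimate, all from columns that gv$session_longops already exposes; a sketch:

```sql
-- Progress of long-running operations across both RAC instances.
-- time_remaining is Oracle's own estimate, in seconds.
select inst_id, sid, opname, target,
       round(sofar / totalwork * 100, 1) as pct_done,
       time_remaining
  from gv$session_longops
 where totalwork > 0
   and sofar != totalwork
 order by time_remaining desc;
```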
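Without a custom watcher, the sessions accumulating parse failures can be found by querying the session statistics directly, and event 10035 can then be set on a chosen session; a sketch, where the SID/serial# values are placeholders:

```sql
-- Sessions that have accumulated SQL parse failures.
select s.sid, s.serial#, s.username, st.value as parse_failures
  from v$sesstat st, v$statname n, v$session s
 where st.statistic# = n.statistic#
   and n.name = 'parse count (failures)'
   and st.sid = s.sid
   and st.value > 0;

-- Then set event 10035 on one such session, e.g. via dbms_system
-- (123 and 456 are placeholder SID and serial#):
exec dbms_system.set_ev(123, 456, 10035, 1, '')
```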
Disk reads for this SQL decreased dramatically (from 2000 to 5). The new 2-node RAC servers each have 48GB of memory (we configured a 40GB SGA with a 32GB buffer cache lower limit) and 24 CPUs, versus the old servers' 16GB of memory (8GB SGA with a 4.5GB buffer cache lower limit) and 4 CPUs.

* A cron job failed to run correctly. It turned out that lftp has replaced ftp on Red Hat Linux 6; we chose to modify the script rather than install ftp. Also, in Bash 4 the heredoc terminator (EOF) is no longer allowed to be immediately followed by a closing parenthesis (as in $(...)) or a closing backtick.

June 2014
http://www.itpub.net/thread-1874277-1-1.html
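The Bash 4 change means a heredoc whose terminator is fused with the closing ")" of a command substitution no longer parses; a minimal sketch of the safe form (the string itself is arbitrary):

```shell
#!/bin/bash
# In Bash 4 this older form is a syntax error, because "EOF)" is not
# recognized as the heredoc terminator:
#   out=$(cat <<EOF
#   hello from heredoc
#   EOF)
# The fix: put the terminator on its own line and close the
# substitution on the next line.
out=$(cat <<EOF
hello from heredoc
EOF
)
echo "$out"
```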