Report Server Notes

Oracle Application Server 10g (specifically 10.1.2.3), unless otherwise noted

* Standalone instead of in-process

  Is it "recommended"? See the wording at
  http://docs.oracle.com/cd/E15523_01/bi.1111/b32121/pbr_arch003.htm#CHDFEFBJ

  Check if it's in-process or standalone at http://server/reports/rwservlet/getserverinfo

  Set SERVER_IN_PROCESS=NO in $ORACLE_HOME/conf/rwservlet.properties (11g path is different)
  opmnctl stopproc process-type=OC4J_BI_Forms
  Run customized start_report_server.sh (see next bullet)
  opmnctl startproc process-type=OC4J_BI_Forms

* Start standalone report server

  You may add to /etc/rc.local:

    su - orappsrv -c "/path_to_dir/start_report_server.sh" > /tmp/start_report_server.out 2>&1

    #!/bin/bash
    #start_report_server.sh: Start standalone report server
    REPORT_SERVER=rep_$(hostname -s)_oracleas1
    #Without batch=yes, you would need to set DISPLAY to a real X server and it cannot be localhost:0
    #even if you have X running locally
    nohup /u01/app/orappsrv/OraHome_1/bin/rwserver.sh $REPORT_SERVER batch=yes &

  Ref: Note 265213.1 (How to start and stop reports server in 10g R1 / R2)

* If starting report server complains about naming server:

    REP-56114: Bind to naming server failed :oracle.reports.RWException: IDL:oracle/reports/RWException:1.0

  even though naming server process exists

    $ opmnctl status
    ...
    namingservice | namingservice | 4728 | Alive

  then, follow Note 428168.1.
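To script the naming-server check from the last bullet, here is a minimal sketch (not part of the original notes). It assumes the opmnctl status table has four |-separated columns, with the process type in the second column and the status in the fourth, as the sample row above suggests. The function name opmn_component_status is made up for illustration.

```shell
#!/bin/bash
# opmn_component_status NAME
# Reads `opmnctl status` output on stdin and prints the status column
# (e.g. Alive) of the first row whose process-type column equals NAME.
# Returns non-zero if no such row is found.
# Assumption: rows look like "component | process-type | pid | status".
opmn_component_status() {
    awk -F'|' -v want="$1" '
        NF >= 4 {
            # trim blanks around every column before comparing
            for (i = 1; i <= NF; i++) gsub(/^[ \t]+|[ \t]+$/, "", $i)
            if ($2 == want) { print $4; found = 1; exit }
        }
        END { exit (found ? 0 : 1) }
    '
}
```

Typical use would be: opmnctl status | opmn_component_status namingservice. If that prints Alive and rwserver.sh still reports REP-56114, you are in the situation Note 428168.1 covers.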
* Check to make sure report server runs OK (by creating the test report; other health check methods are in Note 387208.1)

  Create cron job to run

    #!/bin/bash
    #ck_report_server.sh: Generate a dummy report, notify us if failed
    export ORACLE_HOME=/u01/app/orappsrv/OraHome_1
    export PATH=/usr/bin:/bin:$ORACLE_HOME/bin
    RECIPIENT="emailaddress"
    TIMEOUT=15 #usually 5 seconds is enough
    REPORT_SERVER=rep_$(hostname -s)_oracleas1 #default report server
    cd /path_to_dir #best practice instead of default to ~
    rm -f /tmp/test.pdf #remove the previous output so the existence check at the end means something
    rwclient.sh server=$REPORT_SERVER report=test.rdf destype=FILE desname=/tmp/test.pdf desformat=pdf &
    sleep $TIMEOUT
    #rwclient still running after TIMEOUT: report it as hung, kill it, and stop here
    #(an exit inside a ( ) subshell would not end the script, hence if/fi)
    if [[ $(ps -opid= -p $!) ]]; then
      mail -s "Creating report hangs on $(hostname -s)" $RECIPIENT < /dev/null
      kill $!
      exit
    fi
    [[ -e /tmp/test.pdf ]] || mail -s "Report server on $(hostname -s) has stopped working!" $RECIPIENT < /dev/null

* Check "health" of report server (engine?) process (by strace)

    #find a report server java process
    ps -ef | grep rep_
    #trace it (follow to child process or thread in the process, show string length to 100 chars, trace write call only)
    strace -f -s 100 -e trace=write -p <pid>

  Normally the above should show nothing, or nothing significant. If it shows a process state dump like this [note]

    $ strace -f -s 85 -e trace=write -p 3692
    Process 3713 attached with 14 threads - interrupt to quit
    [ Process PID=3692 runs in 32 bit mode. ]
    [pid 3713] write(1, "/P/- 1 NONE \n 3 85d6e40 85d66bc I/P/A 1 NONE \n 4 85d6ea4 85d6"..., 4096) = 4096
    [pid 3713] write(1, "mespace0002/PKG_ACC_DATA \nhash=3c759a7e7d2a0ed0235c84648db74ce timestamp=NULL\nnamespa"..., 4096) = 4096
    [pid 3713] write(1, "ATA BLOCKS:\n data# heap pointer status pins change\n ----- -------- -------- --"..., 4096) = 4096
    [pid 3713] write(1, "ch can cause a deadlock.\nThis should not be reported to Oracle Support.\nThe following"..., 4096) = 4096
    [pid 3713] write(1, "---------------------\n\n\n---------- DUMP OF WAITING AND BLOCKING LOCKS ----------\n----"..., 4096

  then this report server process is already bad. If other report server processes are running, you may still be able to create reports. Otherwise users will complain, and the newest files in $ORACLE_HOME/reports/cache will not reflect the latest user requests to create reports. Killing this java process immediately lets the "blocked" report creation requests be processed (if any were queued) and the cache directory be updated. If this is an in-process report server, one or more report java processes may be spawned right away.

  Create a cron job to check with strace:

    #!/bin/bash
    #ck_report_server2.sh: strace report server process, notify me if found library cache deadlock (ORA-4020)
    #See www.freelists.org/post/oracle-l/OAS-report-server-sometimes-does-not-create-reports
    export ORACLE_HOME=/u01/app/orappsrv/OraHome_1
    export PATH=/usr/bin:/bin:$ORACLE_HOME/bin
    RECIPIENT="emailaddress"
    TIMEOUT=5 #how long to trace
    REPORT_SERVER=rep_$(hostname -s)_oracleas1 #default report server
    #loop over every report server java process
    for i in $(ps -e -opid,args | grep rep_ | grep java | awk '{print $1}'); do
      { strace -o $i.strace -f -s 1000 -e trace=write -p $i & } #trace in the background, output to <pid>.strace
      sleep $TIMEOUT
      egrep -q 'Namespace0002|status=VALD|deadlock|ORA-04020' $i.strace && echo "Report server process $i probably hangs on $(hostname -s)" | mail -s "Report server on $(hostname -s)" $RECIPIENT
      kill $! #stop the background strace
      [[ -e $i.strace && ! -s $i.strace ]] && rm $i.strace #remove empty file
    done

* More logging

  If logs under $ORACLE_HOME/reports/logs/ are useless because they're too terse, uncomment the trace setting in the report server's .conf file to trace the report server and engine. See Note 237301.1. (Oracle Support frowns on this, worried about how "resource intensive" it is.)
  Ref: my SR https://support.oracle.com/epmos/faces/SrDetail?srNumber=3-6105527791

  Increasing cacheSize, initEngine, maxEngine, minEngine, callbackTimeOut, or JVM memory, or deleting the .dat file, cache files, etc., doesn't seem to have much effect.

[note]
I saved the screen shot below and replaced all \n in the strace output with carriage returns for easy reading. Text starting with <-- is my comment. Note that the package PKG_ACC_DATA is not a database object (it is not in the data dictionary, e.g. dba_objects), but one in the report. The mystery is why the same session both holds and requests an exclusive mode lock. The addresses for the object handle and the waiting or blocking session are not found in the database, which is 64-bit, not 32-bit. The report server java process is 32-bit. (Does the report server implement its own sophisticated RDBMS engine, or at least the library cache locking mechanism?)

[2012-09 Update] The ORA-4020 deadlock error seems to have gone away. Compiling the report locally on the production server probably made it disappear; previously the report was compiled in Dev and the .rep file was scp'ed to Prod, which to the best of our knowledge is set up exactly the same as Dev. After that, report requests failed for reports processing a large amount of data; turning off trace in .conf (see above) and bouncing the report server corrected it.

$ ps -ef | grep rep_
orappsrv 7180 20458 0 12:46 ?
00:00:10 /u01/app/orappsrv/OraHome_1/jdk/jre/bin/java -server -cp /u01/app/orappsrv/OraHome_1/j2ee/home/lib/ojsp.jar:/u01/app/orappsrv/OraHome_1/reports/jlib/rwrun.jar:/u01/app/orappsrv/OraHome_1/jlib/zrclient.jar -Duser.language=en -Duser.region=US -Xmx256M oracle.reports.engine.RWEngine name=rwEng-0 server=rep_dcprpcrisoas1_oracleas1 ORACLE_HOME=/u01/app/orappsrv/OraHome_1 engineimplclass=oracle.reports.engine.EngineImpl cacheDir=/u01/app/orappsrv/OraHome_1/reports/cache server_ior="rep_dcprpcrisoas1_oracleas1_28329183_1345571213451"

$ strace -f -e trace=write -s 100000 -p 7180 #all \n's below are replaced with Carriage Returns
Process 7202 attached with 14 threads - interrupt to quit
[ Process PID=7180 runs in 32 bit mode. ]
[pid 7202] write(1, "P/A 1 NONE
4 878572c 8784b14 I/P/A 1 NONE
7 8785364 87847b8 I/P/A 1 NONE
------------- BLOCKING LOCK ------------
----------------------------------------
SO: 0x86a81fc, type: 6, owner: 0x8692f5c, flag: INIT/-/-/0x00
LIBRARY OBJECT LOCK: lock=86a81fc handle=87825b0 mode=X
call pin=(nil) session pin=(nil) hpc=0000 hlc=0000
htl=0x86a823c[0x86b6dc8,0x86a81e0] htb=0x86b6dc8
user=868fd08 session=868fd08 count=1 flags=[0000] savepoint=0
LIBRARY OBJECT HANDLE: handle=87825b0
name=/Namespace0002/PKG_ACC_DATA <-- Namespace0002 is a good keyword to search, specific to Oracle reports
hash=3c759a7e7d2a0ed0235c84648db74ce timestamp=NULL
namespace=BODY flags=KGHP/TIM/SML/[02000000]
kkkk-dddd-llll=0000-009d-009d lock=X pin=X latch#=1 hpc=fff8 hlc=fff8 <-- 9d is 10011101 in binary, heaps 0,2,3,4,7 are loaded
lwt=0x878260c[0x86a81b4,0x86a81b4] ltm=0x8782614[0x8782614,0x8782614]
pwt=0x87825f0[0x87825f0,0x87825f0] ptm=0x87825f8[0x87825f8,0x87825f8]
ref=0x878262c[0x878262c, 0x878262c] lnd=0x8782638[0x8782638,0x8782638]
LIBRARY OBJECT: object=87855a4
type=PKBD flags=EXS/LOC/ALT[0025] pflags= [00] status=VALD load=0
DEPENDENCIES: count=1 size=64
PARAMETERS are used
DATA BLOCKS:
data# heap     pointer  status pins change
----- -------- -------- ------ ---- ------
0     8829114  0        I/P/A  0    NONE
2     8785664  0        -/P/-  1    NONE
3     87856c8  8784f44  I/P/A  1    NONE
4     878572c  8784b14  I/P/A  1    NONE
7     8785364  87847b8  I/P/A  1    NONE
--------------------------------------------------------
A deadlock among DDL and parse locks is detected.
This deadlock is usually due to user errors in
the design of an application or from issuing a set
of concurrent statements which can cause a deadlock.
This should not be reported to Oracle Support.
The following information may aid in finding
the errors which cause the deadlock:
ORA-04020: deadlock detected while trying to lock object /Namespace0002/PKG_ACC_DATA
--------------------------------------------------------
object    waiting   waiting        blocking  blocking
handle    session   lock      mode session   lock      mode
--------  --------  --------  ---- --------  --------  ----
0x87825b0 0x868fd08 0x86a81a0 X    0x868fd08 0x86a81fc X    <-- waiting and blocking modes are both exclusive, addresses are 32-bit, self-deadlock
--------------------------------------------------------
---------- DUMP OF WAITING AND BLOCKING LOCKS ----------
--------------------------------------------------------
------------- WAITING LOCK -------------
----------------------------------------
SO: 0x86a81a0, type: 6, owner: 0x8692f3c, flag: INIT/-/-/0x00
LIBRARY OBJECT LOCK: lock=86a81a0 handle=87825b0 request=X
call pin=(nil) session pin=(nil) hpc=0000 hlc=0000
htl=0x86a81e0[0x86a823c,0x86b6dc8] htb=0x86b6dc8
user=868fd08 session=868fd08 count=0 flags=[0000] savepoint=0
LIBRARY OBJECT HANDLE: handle=87825b0
name=/Namespace0002/PKG_ACC_DATA
hash=3c759a7e7d2a0ed0235c84648db74ce timestamp=NULL
namespace=BODY flags=KGHP/TIM/SML/[02000000]
kkkk-dddd-llll=0000-009d-009d lock=X pin=X latch#=1 hpc=fff8 hlc=fff8
lwt=0x878260c[0x86a81b4,0x86a81b4] ltm=0x8782614[0x8782614,0x8782614]
pwt=0x87825f0[0x87825f0,0x87825f0] ptm=0x87825f8[0x87825f8,0x87825f8]
ref=0x878262c[0x878262c, 0x878262c] lnd=0x8782638[0x8782638,0x8782638]
LIBRARY OBJECT: object=87855a4
type=PKBD flags=EXS/LOC/ALT[0025] pflags= [00] status=VALD load=0
DEPENDENCIES: count=1 size=64
PARAMETERS are used
DATA BLOCKS:
data# heap     pointer  status pins change
----- -------- -------- ------ ---- ------
0     8829114  0        I/P/A  0    NONE
2     8785664  0        -/P/-  1    NONE
3     87856c8  8784f44  I/P/A  1    NONE
4     878572c  8784b14  I/P/A  1    NONE
7     8785364  87847b8  I/P/A  1    NONE
------------- BLOCKING LOCK ------------
...[more repeated text snipped here]...
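
The <-- comment on the kkkk-dddd-llll line reads 9d as a bitmask (10011101, so heaps 0,2,3,4,7), which matches the data# values listed under DATA BLOCKS. Below is a small sketch that does the same arithmetic for any hex value. heaps_from_mask is a hypothetical helper name, and treating the field as a heap bitmask follows the annotation in the dump rather than any documented Oracle behavior.

```shell
#!/bin/bash
# heaps_from_mask HEX
# Prints the positions of the bits set in a hexadecimal mask, lowest bit
# first, separated by single spaces (prints an empty line for mask 0).
# Assumption: each set bit stands for one loaded heap, per the <-- comment.
heaps_from_mask() {
    local mask bit out
    mask=$((0x$1))   # hex digits without the 0x prefix, e.g. 009d
    bit=0
    out=""
    while [ "$mask" -gt 0 ]; do
        if [ $((mask & 1)) -eq 1 ]; then
            out="$out${out:+ }$bit"   # append, with a space if not first
        fi
        mask=$((mask >> 1))
        bit=$((bit + 1))
    done
    echo "$out"
}
```

For the value in the dump, heaps_from_mask 009d prints 0 2 3 4 7.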