42th Zone: 2017

ORA-00210 ORA-00202 - Missing controlfiles (and redologs)

November 8, 2017

Scenario: Database is shutdown in NOARCHIVELOG with no backups ever made and controlfiles are lost.

When starting the database the following message is shown:

ORA-00205: error in identifying control file, check alert log for more info

On alert log:

ALTER DATABASE   MOUNT

ORA-00210: cannot open the specified control file

ORA-00202: control file: '/oracle/fast_recovery_area/ORCL/control02.ctl'

ORA-27037: unable to obtain file status

Shutdown the database:

SQL> shutdown immediate;

The CONTROLFILE recreation script will need to be created manually, or based on a previous execution of:

SQL> alter database backup controlfile to trace;

Or browse the FS and locate all datafiles and redologs

Example of a CREATE CONTROLFILE script:

I'll use this as create_control.sql below

CREATE CONTROLFILE REUSE DATABASE "ORCL" NORESETLOGS  NOARCHIVELOG

    MAXLOGFILES 16

    MAXLOGMEMBERS 3

    MAXDATAFILES 100

    MAXINSTANCES 8

    MAXLOGHISTORY 4672

LOGFILE

  GROUP 1 '/oracle/oradata/ORCL/redo01.log'  SIZE 50M BLOCKSIZE 512,

  GROUP 2 '/oracle/oradata/ORCL/redo02.log'  SIZE 50M BLOCKSIZE 512,

  GROUP 3 '/oracle/oradata/ORCL/redo03.log'  SIZE 50M BLOCKSIZE 512

-- STANDBY LOGFILE

DATAFILE

  '/oracle/oradata/ORCL/system01.dbf',

  '/oracle/oradata/ORCL/sysaux01.dbf',

  '/oracle/oradata/ORCL/undotbs01.dbf',

  '/oracle/oradata/ORCL/users01.dbf',

  '/oracle/oradata/ORCL/test01.dbf'

CHARACTER SET AL32UTF8

;

Start the database in NOMOUNT a execute the script:

SQL> startup nomount;

SQL> @create_control.sql

Control file created.

Database is left in mount mode and can now be opened:

SQL> alter database open;

alter database open

*

ERROR at line 1:

ORA-01113: file 1 needs media recovery

ORA-01110: data file 1: '/oracle/oradata/ORCL/system01.dbf'

SQL> recover database;

Media recovery complete.

SQL> alter database open;

Database altered.

Information will probably be shown that some file need recover, and RECOVER DATABASE command should help fixing the issue.

After this steps database is OPENED and everything should be OK!

BUT what if REDOLOGS were also lost (alas most times they are on the same mount points of CONTROLFILES) or someone tried to open with resetlogs!?

Changed first line of create_control.sql to:

CREATE CONTROLFILE REUSE SET DATABASE "ORCL" RESETLOGS  NOARCHIVELOG

In this scenario, you probably won't be able to complete media recovered due to missing redologs and inexistent archives:

SQL> startup

ORA-00205: error in identifying control file, check alert log for more info

SQL> shutdown immediate;

ORA-01507: database not mounted

ORACLE instance shut down.

SQL> startup nomount;

SQL> @create_control.sql

Control file created.

SQL> alter database open resetlogs;

alter database open resetlogs

*

ERROR at line 1:

ORA-01194: file 1 needs more recovery to be consistent

ORA-01110: data file 1: '/oracle/oradata/ORCL/system01.dbf'

SQL> recover database;

ORA-00283: recovery session canceled due to errors

ORA-01610: recovery using the BACKUP CONTROLFILE option must be done

SQL> recover database using backup controlfile; -- noarchivelog database no luck

ORA-00279: change 76885912 generated at 11/08/2017 12:50:24 needed for thread 1

ORA-00289: suggestion : /oracle/oradata/arch1/1_1034_950889305.dbf

ORA-00280: change 76885912 for thread 1 is in sequence #1034

Specify log: {=suggested | filename | AUTO | CANCEL}

ORA-00308: cannot open archived log

'/oracle/oradata/arch1/1_1034_950889305.dbf'

ORA-27037: unable to obtain file status

Linux-x86_64 Error: 2: No such file or directory

Additional information: 3

There is a step that can be taken as a last resort.

The hidden parameter _allow_resetlogs_corruption , the name is pretty self explanatory

SQL> alter system set "_allow_resetlogs_corruption" = true scope = spfile; -- last resort

System altered.

SQL> shutdown;

ORA-01109: database not open

Database dismounted.

ORACLE instance shut down.

SQL> startup mount;

SQL> alter database open resetlogs;

Database altered.

On the alert log:

Wed Nov 08 12:59:22 2017

alter database open resetlogs

RESETLOGS is being done without consistancy checks. This may result

in a corrupted database. The database should be recreated.

RESETLOGS after incomplete recovery UNTIL CHANGE 76885912

This has a 50/50 percent change of working. In some test cases database opening ended up aborting after 01555 errors with no more change of recovery.

Verify consistency after it opens successfully reveals no problems:

RMAN> validate database;

...

List of Datafiles

=================

File Status Marked Corrupt Empty Blocks Blocks Examined High SCN

---- ------ -------------- ------------ --------------- ----------

1    OK     0              11751        89602           76886011

  File Name: /oracle/oradata/ORCL/system01.dbf

  Block Type Blocks Failing Blocks Processed

  ---------- -------------- ----------------

  Data       0              29366

  Index      0              7094

  Other      0              41389

File Status Marked Corrupt Empty Blocks Blocks Examined High SCN

---- ------ -------------- ------------ --------------- ----------

2    OK     0              16330        232961          76886020

  File Name: /oracle/oradata/ORCL/sysaux01.dbf

  Block Type Blocks Failing Blocks Processed

  ---------- -------------- ----------------

  Data       0              113974

  Index      0              94231

  Other      0              8425

File Status Marked Corrupt Empty Blocks Blocks Examined High SCN

---- ------ -------------- ------------ --------------- ----------

3    OK     0              1            64000           76886013

  File Name: /oracle/oradata/ORCL/undotbs01.dbf

  Block Type Blocks Failing Blocks Processed

  ---------- -------------- ----------------

  Data       0              0

  Index      0              0

  Other      0              63999

File Status Marked Corrupt Empty Blocks Blocks Examined High SCN

---- ------ -------------- ------------ --------------- ----------

4    OK     0              0            1284            76081965

  File Name: /oracle/oradata/ORCL/users01.dbf

  Block Type Blocks Failing Blocks Processed

  ---------- -------------- ----------------

  Data       0              528

  Index      0              414

  Other      0              338

File Status Marked Corrupt Empty Blocks Blocks Examined High SCN

---- ------ -------------- ------------ --------------- ----------

5    OK     0              588          638976          76662763

  File Name: /oracle/oradata/ORCL/test01.dbf

  Block Type Blocks Failing Blocks Processed

  ---------- -------------- ----------------

  Data       0              577853

  Index      0              57493

  Other      0              3042

channel ORA_DISK_1: starting validation of datafile

channel ORA_DISK_1: specifying datafile(s) for validation

including current control file for validation

including current SPFILE in backup set

channel ORA_DISK_1: validation complete, elapsed time: 00:00:01

List of Control File and SPFILE

===============================

File Type    Status Blocks Failing Blocks Examined

------------ ------ -------------- ---------------

SPFILE       OK     0              2

Control File OK     0              966

Finished validate at 2017-11-08 13:07:42

And do not forget to re-add the tempfiles:

SQL> select * from dba_temp_files;

no rows selected

SQL> alter tablespace temp add tempfile size 1g;

Tablespace altered.

SQL> select file_name, tablespace_name from dba_temp_files;

FILE_NAME                                                    TABLESPACE_NAME

------------------------------------------------------------ ------------------------------

/oracle/oradata/ORCL/datafile/o1_mf_temp_f05xdr2y_.tmp       TEMP

Connection Reset using ORACLE JDBC

June 14, 2017

Applies to:

Oracle 11.2.0.4

on Linux Platform

Symptoms:

example 1:

Caused by: java.sql.SQLRecoverableException: Erreur d'E/S: Connection reset

example 2:

ORACLE CONNECTION PROBLEM (Attempt number 1 of 3) MESSAGE:Cannot create PoolableConnectionFactory (Erreur d'E/S: Connection reset)]

example 3:

return jpype.java.sql.DriverManager.getConnection(*driver_args)

jpype._jexception.SQLRecoverableExceptionPyRaisable: java.sql.SQLRecoverableException: Erreur d'E/S: Connection reset

Cause:

java.security.SecureRandom (#5 ) is a standard API provided by sun. Among various methods offered by this class void nextBytes(byte[]) is one. This method is used for generating random bytes. Oracle 11g JDBC drivers use this API to generate random number during login. Users using Linux have been encountering SQLException("Io exception: Connection reset").

The problem is two fold

The JVM tries to list all the files in the /tmp (or alternate tmp directory set by -Djava.io.tmpdir) when SecureRandom.nextBytes(byte[]) is invoked. If the number of files is large the method takes a long time to respond and hence cause the server to timeout
The method void nextBytes(byte[]) uses /dev/random on Linux and on some machines which lack the random number generating hardware the operation slows down to the extent of bringing the whole login process to a halt. Ultimately the the user encounters SQLException("Io exception: Connection reset")

Users upgrading to 11g can encounter this issue if the underlying OS is Linux which is running on a faulty hardware.

The cause of this has not yet been determined exactly. It could either be a problem in your hardware or the fact that for some reason the software cannot read from dev/random

Solution:

Change the setup for your application, so you add the next parameter to the java command:

-Djava.security.egd=file:/dev/./urandom

Or edit $JAVA_HOME/jre/lib/security/java.security and add the line : securerandom.source=file:/dev/./urandom

Why not securerandom.source=file/dev/urandom (#4)

Note the "/./" characters in the value. They are needed to work around known Oracle JRE bug #6202721.

See also JDK Enhancement Proposal 123. It is known that implementation of SecureRandom was improved in Java 8 onwards.

Sources:

https://community.oracle.com/thread/943911

http://www.usn-it.de/index.php/2009/02/20/oracle-11g-jdbc-driver-hangs-blocked-by-devrandom-entropy-pool-empty/

https://kb.sos-berlin.com/display/PKB/ORACLE+IO+exception+-+Connection+reset+or+Timeout+reached+-+30s+-+for+process

#4 https://docs.oracle.com/cd/E13209_01/wlcp/wlss30/configwlss/jvmrand.html

#5 https://stackoverflow.com/questions/2327220/oracle-jdbc-intermittent-connection-issue

Change TFA default OSWATCHER parameters

May 26, 2017

By default, after TFA installation OSWATCHER runs with:

> ps -edf | grep -i osw
... /bin/sh ./OSWatcher.sh 30 48 ...

Meaning that a snapshot is made every 30 minutes and kept for 48 hours.

Since this 48 hours seem to be insufficient for a problem detection/analysis so I wanted to double it, but nowhere on tfactl was I able to find any way to change it.

Looking into the file $TFA_BASE/$hostname/tfa_home/ext/oswbb/oswbb.pm

Where $TFA_BASE is the path you specified when install TFA:

Enter a location for installing TFA (/tfa will be appended if not supplied) [/root/tfa]

I found that a file called .osw.prop is being read to get the values for the variables:

my $tool = "oswbb";
my $tfa_base = tfactlshare_get_tfa_base($tfa_home);
..
my $tool_base = catfile($tfa_base, "suptools", "$hostname", $tool, $current_user);
my $osw_prop = catfile($tool_base, ".osw.prop");
..
if ( -f $osw_prop )
{
open(ORF, $osw_prop);
while()
{
if ( ! $interval && /interval=(\d+)/ )
{
$interval = $1;
}
elsif ( ! $hours && /hours=(\d+)/ )
{
$hours = $1;
}
elsif ( ! $zip && /zip=(.*)/ )

The location of the file is not clear since it is given by the function tfactlshare_get_tfa_base, so:

find . -name ".osw.prop"

And the file is found in:

> $TFA_BASE/repository/suptools/$hostname/oswbb/$current_user/.osw.prop

Note: after know this the file is easily found by checking the archive location of the running process:

[root@hostname root]# ps -edf | grep -i osw
root 31662 1 0 20:57 pts/0 00:00:00 /bin/sh ./OSWatcher.sh 30 48 NONE /oracle/products/tfa/repository/suptools/hostname/oswbb/root/archive

The .osw.prop resides in the parent directory.

Having located the file is just a matter of editing it:

> cat $TFA_BASE/repository/suptools/$hostname/oswbb//$current_user/.osw.prop
interval=30
hours=48
zip=NONE
runuser=oracle

> vi $TFA_BASE/repository/suptools/$hostname/oswbb//$current_user/.osw.prop

> cat $TFA_BASE/repository/suptools/$hostname/oswbb//$current_user/.osw.prop
interval=30
hours=96
zip=NONE
runuser=oracle

> tfactl stop oswbb
> tfactl start oswddb

And now the values are modified.

> ps -edf | grep -i osw
... /bin/sh ./OSWatcher.sh 30 96 ...

I think there should be a easier way to do this: Making this change in a cluster environment we'll need to run it in all nodes of the cluster.

ORA-00210 ORA-00202 - Missing controlfiles (and redologs)

November 8, 2017

Connection Reset using ORACLE JDBC

June 14, 2017

Change TFA default OSWATCHER parameters

May 26, 2017

About Me

Category

Archive

Links

Feed