Backup Options   «Prev  Next»

Lesson 5Recovering a Database with inactive Redo Logs
ObjectiveRecover a Database with Inactive Redo Logs

Recovering Database with Inactive Redo Logs

Let us say you just spent 2 hours fixing the problems in your database and put it in ARCHIVELOG mode. Before you make a backup of this database, you decide to test the automatic creation of the archived log function. You issue the
ALTER SYSTEM SWITCH LOGFILE

command. The second time you issue the command you get the following error message:
SQL> alter system switch logfile; 
ORA-00740: LGWR process terminated with error

The error `ORA-00740: LGWR process terminated with error` is still a valid error in Oracle 19c. This error indicates that the Log Writer (LGWR) process has encountered a critical issue and terminated, which can lead to significant database issues since the LGWR is responsible for writing redo log entries from the redo log buffer to the online redo log files.
Common Causes of ORA-00740 in Oracle 19c
  1. Corrupted or Unavailable Redo Log Files:
    • LGWR may fail if the online redo log files are corrupted or inaccessible due to storage failures or file system errors.
  2. Insufficient Disk Space:
    • If there is no space available to write redo log files, the LGWR process will terminate.
  3. Redo Log File Permissions Issues:
    • LGWR may fail to write if the redo log files or their directories have incorrect file permissions.
  4. I/O Errors or Hardware Failures:
    • Issues at the storage level can cause the LGWR process to fail, especially if redo log files are on faulty or misconfigured storage devices.
  5. Redo Log Group Misconfiguration:
    • For example, having missing or mismatched members in a multiplexed redo log group.
  6. Log Writer Bugs:
    • While rare, some specific LGWR-related bugs in Oracle 19c may cause this error. Always check the Oracle Database Release Notes and apply the latest patches.
  7. File System Full or Corruption:
    • If the redo logs reside on a file system that becomes full or corrupted, LGWR will be unable to write to the redo logs.
  8. High Workload or Resource Contention:
    • Excessive workload or contention for I/O resources can prevent LGWR from performing its tasks, leading to this error.

Steps to Diagnose ORA-00740
  1. Check the Alert Log:
    • Review the alert.log file for detailed diagnostic messages about the LGWR termination. It may indicate the root cause of the problem.
      tail -f $ORACLE_BASE/diag/rdbms/<db_name>/<db_name>/trace/alert_<db_name>.log
              
  2. Examine LGWR Trace Files:
    • Look for the LGWR trace file in the Oracle Diagnostic Destination (ADR) for more specific error messages.
  3. Inspect Redo Log Status:
    • Query V$LOG and V$LOGFILE to verify the status of the redo logs.
      SELECT GROUP#, STATUS, ARCHIVED FROM V$LOG;
      SELECT GROUP#, MEMBER FROM V$LOGFILE;
              
  4. Check Disk Space and I/O Errors:
    • Use OS commands to verify disk space and check for any I/O-related issues:
      df -h        # Check file system space
      dmesg | grep I/O    # Look for disk I/O errors
              
  5. Validate Permissions:
    • Verify that redo log files and directories have the correct permissions and ownership.
  6. Check Multiplexed Redo Logs:
    • If multiplexing is used, ensure all members of each redo log group are intact and available.

Steps to Resolve ORA-00740
  1. Repair or Recreate Redo Log Files:
    • If redo log files are corrupt, use the following commands to drop and recreate the redo log group:
      ALTER DATABASE DROP LOGFILE GROUP <group_number>;
      ALTER DATABASE ADD LOGFILE GROUP <group_number> ('<path>') SIZE <size>;
              
  2. Increase Disk Space:
    • Free up space or expand the storage where redo logs reside.
  3. Verify and Fix Permissions:
    • Correct any file or directory permissions issues:
      chmod 660 <redo_log_file>
      chown oracle:oinstall <redo_log_file>
              
  4. Recover the Database:
    • If the database fails to open due to this error, you may need to perform a recovery:
      RECOVER DATABASE;
              
  5. Apply Patches:
    • Check for any Oracle 19c patches or updates that address LGWR-related issues.
  6. Reconfigure Multiplexing:
    • Add or repair members of a multiplexed redo log group to ensure redundancy.

Preventing ORA-00740 in Oracle 19c
  1. Implement Redo Log Multiplexing:
    • Configure multiple members in each redo log group on separate disks to increase fault tolerance.
  2. Monitor Disk Space and Storage Health:
    • Use tools like Oracle Enterprise Manager or OS utilities to proactively monitor storage health and capacity.
  3. Regular Backups and Archiving:
    • Ensure redo logs are archived regularly to avoid contention or space issues.
  4. Follow Best Practices for Redo Log Configuration:
    • Use appropriately sized redo log files to minimize log switches.
Conclusion The error `ORA-00740: LGWR process terminated with error` is still relevant in Oracle 19c. Diagnosing and resolving it involves investigating the redo log configuration, storage health, and system resource availability. Oracle's modern diagnostic tools, such as ADR and multiplexed redo logs, provide robust mechanisms to identify and mitigate the underlying issues.
As an experienced DBA, you immediately know that a background process is abnormally terminated and the Oracle instance is aborted. Since you do not have a backup of the fixed database and you do not want to spend another two hours fixing those problems, what can you do?
Solution:
In this situation, the only option you have is to recover the database with "inactive redo logs". The folowing series of images below describes the steps for "Recovering Database with Inactive Redo Log File".

1) Since the database was shut down abnormally, the DBA tries to mount it.
1) Since the database was shut down abnormally, the DBA tries to mount it. The DBA uses the startup open command to mount the database(1) and gets an error message showing what is wrong with the database. From the error message, the DBA knows that the online log #4 is corrupted(2)

2) The DBA queries the data dictionary view V$LOG to find information about the damaged online redo log.
2) The DBA queries the data dictionary view V$LOG to find information about the damaged online redo log. From the result, (4) the DBA knows that the online log #4 is not the current online redo log, and that online log #4 is archived (5). The value zero for the FIRST_CHANGE column indicates that there is some problem with the redo log file. The DBA concludes that no data is lost and, thus, no recovery is needed.

3) To correct the situation, the DBA needs to clear the corrupted online redo log and open the database.
3) To correct the situation, the DBA needs to clear the corrupted online redo log and open the database.

The code and all the command lines are displayed below.
SQL> STARTUP OPEN PFILE='C:\ORANT\DATABASE\INITORCL.ORA'
ORACLE instance started.
Total System Global Area                         11710464 bytes
Fixed Size                                          49152 bytes
Variable Size                                    11177984 bytes
Database Buffers                                   409600 bytes
Redo Buffers                                        73728 bytes
Database mounted.
ORA-00313: open failed for members of log group 4 of thread 1
ORA-00312: online log 4 thread 1: 'C:\ORANT\DATABASE\LOG1ORCL.ORA'
SQL> SELECT * FROM V$LOG; 
GROUP# THREAD# ... MEMBERS    ARC   STATUS     FIRST_CHAN FIRST_TIM
------ -------   -------    ---   ---------- ---------- ---------
     1       1         1     NO    CURRENT      6402439 28-FEB-00
     2       1         1    YES    INACTIVE     6382423 28-FEB-00
     3       1         1    YES    INACTIVE     6322223 25-FEB-00
     4       1         1    YES    INACTIVE     6342407 26-FEB-00
4 rows selected.
SQL> ALTER DATABASE CLEAR LOGFILE GROUP 4; 
Statement processed.
SQL> ALTER DATABASE OPEN; 
Statement processed.
SQL>

"ORA-00313" is still a valid error for the Oracle RDBMS. It indicates that the online redo log file cannot be opened. This error can occur for a variety of reasons, such as a corrupt redo log file, a missing redo log file, or a permissions issue.
The online redo log file is a critical component of the Oracle database. It is used to track changes made to the database so that those changes can be recovered in the event of a system failure or other disaster. If the online redo log file cannot be opened, the database cannot be opened. There are a number of things that can cause the error ORA-00313, including:
  • Corrupt redo log file: If the redo log file is corrupt, it cannot be opened and the error ORA-00313 will occur.
  • Missing redo log file: If the redo log file is missing, it cannot be opened and the error ORA-00313 will occur.
  • Permissions issue: If the user attempting to open the redo log file does not have the necessary permissions, the error ORA-00313 will occur.

If you encounter the error ORA-00313, you should first try to identify the cause of the problem. If the problem is caused by a corrupt redo log file, you will need to restore the database from a backup. If the problem is caused by a missing redo log file, you will need to create a new redo log file. If the problem is caused by a permissions issue, you will need to grant the necessary permissions to the user attempting to open the redo log file.
Here are some of the steps you can take to troubleshoot ORA-00313:
  • Check the Oracle alert log for any error messages that may be related to the online redo log file.
  • Use the V$LOGFILE view to check the status of the online redo log file.
  • Use the RMAN recovery tool to restore the database from a backup.

If you are still unable to resolve the problem, you may need to contact Oracle support for assistance.

Is Incomplete Recovery necessary?

When an online redo log file is lost or damaged due to media failure, it becomes almost impossible to perform a complete recovery. However, you may find that there is no data loss in the following situations:
  1. The damaged or lost online redo log is not current
  2. This online redo log is already archived
  3. The database has mirrored online redo log files
In these situations, incomplete database recovery is unnecessary. You can recover the database by getting rid of the damaged online redo log and creating a new one. The next lesson demonstrates how to recreate an online redo log file.

Recovering Database with Inactive RedoLogs - Quiz

Click the Quiz link below to review your understanding of database recovery with inactive redo logs.
Recovering Database with Inactive RedoLogs - Quiz

SEMrush Software Target 5SEMrush Software Banner 5