Trace File Analyzer (TFA) : pas un simple outil de collecte pour le support Oracle

Dans le cadre de l’ouverture d’une Service Request (SR), il est fréquent que le support Oracle demande de collecter des traces via l’outil TFA : Trace File Analyzer.
Dans le cas d’un problème sur un cluster Oracle, il permet de récupérer les traces sur tous les nœuds et est obligatoire pour qu’une SR soit prise en compte.
TFA n’est pas un simple outil de collecte de traces, il contient de nombreux outils complémentaires.

 
 
La liste des outils est la suivante :

  • ORAchk : Oracle Stack Health Checks on non-engineered systems
  • EXAchk : Oracle Stack Health Checks on Engineered Systems
  • oswatcher : Collect and archive OS metrics, useful for instance / node evictions & performance Issues.
  • procwatcher : Automate & capture database performance diagnostics & session level hangs
  • oratop : Near real-time database monitoring
  • alertsummary : Provides summary of events for one or more database or ASM alert files from all nodes
  • ls / dir : Lists all files TFA knows about for a given file name pattern across all nodes
  • pstack : Generate process stack for specified processes across all nodes
  • grep / findstr : Search alert or trace files with a given database and file name pattern, for a search string
  • summary : High level summary of the configuration
  • vi / notepad : Open alert or trace files for viewing a given database and file name pattern in the vi editor
  • tail : Run a tail on an alert or trace files for a given database and file name pattern
  • param : Show all database and OS parameters that match a specified pattern
  • dbglevel : Set and unset multiple CRS trace levels with one command
  • history : Show the shell history for the tfactl shell
  • changes : Report any noted changes in the system setup over a given time period. This includes database a parameters, OS parameters, patches applied etc
  • calog : Reports major events from the Cluster Event log
  • events : Reports warnings and errors seen in the logs
  • managelogs : Shows disk space usage and purges ADR log and trace files
  • ps / tasklist : Finds processes
  • triage : Summarize oswatcher/exawatcher data

TFA est disponible au téléchargement sur le support Oracle. Il est également intégré à certains PSU.
L’installation s’effectue de la manière suivante :

unzip -q TFA-LINUX_v12.2.1.3.1.zip -d tfa
mkdir /u01/app/oracle/admin/tfa
cd /u01/app/oracle/admin/tfa
./installTFA-LINUX -extractto /u01/app/oracle/admin/tfa
TFA Installation Log will be written to File : /tmp/tfa_install_4846_2017_12_14-01_29_48.log
Starting TFA installation
TFA Version: 122130 Build Date: 201711060821
Running Extractto Setup for TFA as user oracle ...
Enabling Access for user oracle on oel74db12cr2...
TFA is successfully installed...
TFA_BASE for user oracle : /u01/app/oracle/admin/oracle.tfa/oel74db12cr2/oracle
JAVA_HOME for running TFA : /u01/app/oracle/admin/tfa/tfa_home/jre
Adding directories to TFA. It might take couple of minutes. Please wait...
Added 1/1 directories to TFA... 100%
Successfully added directories to TFA
.------------------------------------------------------------------------------.
|                         Summary of TFA Configuration                         |
+------------+-----------------------------------------------------------------+
| Parameter  | Value                                                           |
+------------+-----------------------------------------------------------------+
| Repository | /u01/app/oracle/admin/oracle.tfa/oel74db12cr2/oracle/repository |
| TFA_BASE   | /u01/app/oracle/admin/oracle.tfa/oel74db12cr2/oracle            |
| TFA HOME   | /u01/app/oracle/admin/tfa/tfa_home                              |
| JAVA_HOME  | /u01/app/oracle/admin/tfa/tfa_home/jre                          |
'------------+-----------------------------------------------------------------'
Run Inventory process started... It might take a couple of minutes.
Run Inventory process completed.
Successfully updated r.tfar_monitor.output to /u01/app/oracle/admin/oracle.tfa/oel74db12cr2/oracle/oel74db12cr2/output/metadata in TFA...
OSWatcher is already deployed at /u01/app/oracle/admin/tfa/tfa_home/ext/oswbb
Starting OSWatcher
Found new ORACLE_HOME /u01/app/oracle/product/12.2.0/dbhome_1
Usage : /u01/app/oracle/admin/tfa/tfa_home/bin/tfactl  [options]
 commands:diagcollect|print|setupmos|upload
For detailed help on each command use:
 /u01/app/oracle/admin/tfa/tfa_home/bin/tfactl  -help
Moving Install log file to /u01/app/oracle/admin/tfa/tfa_home/log

L’outil en ligne de commande est  tfactl.
La commande suivante permet d’afficher le paramétrage :

$ tfactl print config
.------------------------------------------------------------------------------------.
|                                    oel74db12cr2                                    |
+-----------------------------------------------------------------------+------------+
| Configuration Parameter                                               | Value      |
+-----------------------------------------------------------------------+------------+
| TFA Version                                                           | 12.2.1.3.1 |
| Java Version                                                          | 1.8        |
| Public IP Network                                                     | false      |
| Automatic Diagnostic Collection                                       | true       |
| Alert Log Scan                                                        | true       |
| Disk Usage Monitor                                                    | true       |
| Managelogs Auto Purge                                                 | false      |
| Trimming of files during diagcollection                               | false      |
| Inventory Trace level                                                 | 1          |
| Collection Trace level                                                | 1          |
| Scan Trace level                                                      | 1          |
| Other Trace level                                                     | 1          |
| Repository current size (MB)                                          | 0          |
| Repository maximum size (MB)                                          | 10240      |
| Max Size of TFA Log (MB)                                              | 50         |
| Max Number of TFA Logs                                                | 10         |
| Max Size of Core File (MB)                                            | 50         |
| Max Collection Size of Core Files (MB)                                | 500        |
| Minimum Free Space to enable Alert Log Scan (MB)                      | 500        |
| Time interval between consecutive Disk Usage Snapshot(minutes)        | 60         |
| Time interval between consecutive Managelogs Auto Purge(minutes)      | 60         |
| Logs older than the time period will be auto purged(days[d]|hours[h]) | 30d        |
| Automatic Purging                                                     | true       |
| Age of Purging Collections (Hours)                                    | 12         |
| TFA IPS Pool Size                                                     | 5          |
'-----------------------------------------------------------------------+------------''

Il est possible de lister les répertoires pris en compte pour la collecte :

$ tfactl  print directories
1. /u01/app/oracle/cfgtoollogs
[CFGTOOLS]
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
2. /u01/app/oracle/cfgtoollogs/dbca/TESTDB12
[RDBMS]
{RDBMS|database=TESTDB12}
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
3. /u01/app/oracle/diag/rdbms/testdb12/TESTDB12/cdump
[RDBMS]
{RDBMS|database=testdb12, RDBMS|instance=TESTDB12}
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
4. /u01/app/oracle/diag/rdbms/testdb12/TESTDB12/incident
[RDBMS]
{RDBMS|database=testdb12, RDBMS|instance=TESTDB12}
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
5. /u01/app/oracle/diag/rdbms/testdb12/TESTDB12/trace
[RDBMS]
{RDBMS|database=testdb12, RDBMS|instance=TESTDB12}
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
6. /u01/app/oracle/diag/tnslsnr/oel74db12cr2/listener
[TNS]
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
7. /u01/app/oracle/product/12.2.0/dbhome_1/cfgtoollogs
[CFGTOOLS]
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
8. /var/adm
[OS]
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
9. /var/log
[OS]
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Il est possible de lister rapidement les erreurs internes détectés dans les fichiers d’alerte :

$ tfactl alertsummary
Output from host : oel74db12cr2
 ------------------------------
Reading /u01/app/oracle/diag/rdbms/testdb12/TESTDB12/trace/alert_TESTDB12.log
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-
------------------------------------------------------------------------
 14 12 2017 00:33:50 Database started
 ------------------------------------------------------------------------
 14 12 2017 00:35:27 Database started
 ------------------------------------------------------------------------
 14 12 2017 00:35:53 Database started
 ------------------------------------------------------------------------
 14 12 2017 00:38:23 Database started
 ------------------------------------------------------------------------
 14 12 2017 01:10:41 Database started
 ------------------------------------------------------------------------
 14 12 2017 01:14:32 Database started
Summary: Ora-600=0, Ora-7445=0, Ora-700=0
 ~~~~~~~
 Warning: Only FATAL errors reported
 Warning: These errors were seen and NOT reported
 Ora-00313

La commande DIAGCOLLECT est utilisée pour collecter les traces.

Usage : /u01/app/oracle/admin/tfa/tfa_home/bin/tfactl diagcollect [ [component_name1] [component_name2] ... [component_nameN] [-srdc ]]  [-node ] [-tag ] [-z ] [-last | -from 

Certains composants de collecte sont associés aux AWR et données ASH des instances de bases de données et nécessitent des packs (DIAGNOSTIC).
Le format de date est spécifique : ${GI_HOME}/tfa/bin/tfactl diagcollect -from « MMM/dd/yyyy hh:mm:ss » -to « MMM/dd/yyyy hh:mm:ss »
Par exemple :

  • ${GI_HOME}/tfa/bin/tfactl diagcollect -crs -os -asm -cfgtools -from « May/18/2016 13:00:00 » -to « May/18/2016 15:00:00 » -node node1,node2
  • ${GI_HOME}/tfa/bin/tfactl diagcollect -database db1,db2 -node node1 -from « Oct/30/2017 17:27:00 » -to « Oct/31/2017 17:27:00 »
  • ${GI_HOME}/tfa/bin/tfactl diagcollect -all -from « Oct/30/2017 17:27:00 » -to « Oct/31/2017 17:27:00 »

Par exemple, la commande DIAGCOLLECT permet également de collecter des informations liées à des problèmes spécifiques :

Ci-dessous la configuration des ressources Linux :

$ tfactl diagcollect -srdc dbunixresources
Enter the Database Name [Required for this SRDC] : TESTDB12
Do you have the issue now [Y|y|N|n] [Y]: Y
As you have indicated that the performance issue is currently happening,
will be collecting snapshots for the following periods:
Start time when the performance was bad: Dec/14/2017 01:00:04
Stop  time when the performance was bad: Dec/14/2017 02:00:04
Scripts to be run by this srdc: rdahcve1120 rdahcve1210 rdahcve1110 runawr dbadmin_os_script dbadmin_os_linuxonly srdc_db_NUMA_config.sql cp_tns
Components included in this srdc: DATABASE NOCHMOS OS TNS
Collection Id : 20171214020005oel74db12cr2
Detailed Logging at : /u01/app/oracle/admin/oracle.tfa/oel74db12cr2/oracle/repository/srdc_dbunixresources_collection_Thu_Dec_14_02_00_05_CET_2017_node_local/diagcollect_20171214020005_oel74db12cr2.log
2017/12/14 02:00:09 CET : NOTE : Any file or directory name containing the string .com will be renamed to replace .com with dotcom
2017/12/14 02:00:09 CET : Collection Name : tfa_srdc_dbunixresources_Thu_Dec_14_02_00_05_CET_2017.zip
2017/12/14 02:00:09 CET : Scanning of files for Collection in progress...
2017/12/14 02:00:09 CET : Collecting additional diagnostic information...
2017/12/14 02:00:10 CET : Completed collection of additional diagnostic information...
2017/12/14 02:00:14 CET : Getting list of files satisfying time range [12/13/2017 14:00:09 CET, 12/14/2017 02:00:14 CET]
2017/12/14 02:00:20 CET : Collecting ADR incident files...
2017/12/14 02:00:20 CET : Completed Local Collection
.-----------------------------------------.
|            Collection Summary           |
+--------------+-----------+-------+------+
| Host         | Status    | Size  | Time |
+--------------+-----------+-------+------+
| oel74db12cr2 | Completed | 269kB |  11s |
'--------------+-----------+-------+------'
Logs are being collected to: /u01/app/oracle/admin/oracle.tfa/oel74db12cr2/oracle/repository/srdc_dbunixresources_collection_Thu_Dec_14_02_00_05_CET_2017_node_local
/u01/app/oracle/admin/oracle.tfa/oel74db12cr2/oracle/repository/srdc_dbunixresources_collection_Thu_Dec_14_02_00_05_CET_2017_node_local/oel74db12cr2.tfa_srdc_dbunixresources_Thu_Dec_14_02_00_05_CET_2017.zip

L’outil ORATOP permet de superviser une instances de bases de données (Near Real-time Monitoring of Databases).

Usage : /u01/app/oracle/tfa/oel74db12cr2/tfa_home/bin/tfactl.pl [run] oratop -database   
Options:
-database  Database name to run oratop
 : default will be / as sysdba. Specify a different user using
 {username[/password][@connect_identifier] | / }
 [AS {SYSDBA|SYSOPER}]
 connect_identifier: host[:port]/[service_name]
:
-d : real-time (RT) wait events, section 3 (default is Cumulative)
-k : FILE#:BLOCK#, section 4 lt is (EVENT/LATCH)
-m : MODULE/ACTION, section 4 (default is USERNAME/PROGRAM)
-s : SQL mode, section 4 (default is process mode)
-c : database service mode (default is connect string)
-f : detailed format, 132 columns (default: standard, 80 columns)
-b : batch mode (default is text-based user interface)
-n : maximum number of iterations (requires number)
-i : interval delay, requires value in seconds (default: 5s)
e.g:
 /u01/app/oracle/tfa/oel74db12cr2/tfa_home/bin/tfactl.pl oratop -database testdb1
 /u01/app/oracle/tfa/oel74db12cr2/tfa_home/bin/tfactl.pl oratop -database testdb1 -bn1
/u01/app/oracle/tfa/oel74db12cr2/tfa_home/bin/tfactl.pl run oratop -database testdb1
 /u01/app/oracle/tfa/oel74db12cr2/tfa_home/bin/tfactl.pl run oratop -database testdb1 -bn1

En mode texte :

$ ./tfactl oratop -database TESTDB12 -i 10 -bn10
Oratop successfully tested.
ohome /u01/app/oracle/product/12.2.0/dbhome_1
ouser oracle
osid TESTDB12
db_running 1
running_local 1
oversion 12.2.0.1.0
Cycle 1 - oratop: Release 14.2.1 Production on Thu Dec 21 22:09:52 2017
Oracle 12c - TES 22:09:47 up: 382s,   1 ins,    0 sn,   0 us, 1.1G mt,    0% db
ID %CPU LOAD %DCU   AAS  ASC  ASI  ASW  AST IOPS %FR   PGA UTPS UCPS SSRT  %DBT
-------------------------------------------------------------------------------
 1    8    0    0     0    0    0    0    0    5   5  130M    0    0 190u     0
EVENT (C)                        TOT WAITS   TIME(s)  AVG_MS  PCT    WAIT_CLASS
-------------------------------------------------------------------------------
DB CPU                                            14           36
db file sequential read               4326         8     2.0   22      User I/O
enq: JG - queue lock                    11         7   665.4   18         Other
oracle thread bootstrap                119         5    49.8   15         Other
control file heartbeat                   1         4  4004.0   10         Other
ID   SID     SPID USR PROG S  PGA SQLID/BLOCKER OPN  E/T STA STE EVENT/*LA  W/T
-------------------------------------------------------------------------------
 1   242     5817 SYS orat D 4.2M d7wtth4kv4k1c SEL    0 ACT CPU cpu runqu   2u
Cycle 2 - oratop: Release 14.2.1 Production on Thu Dec 21 22:10:02 2017
Oracle 12c - TES 22:09:47 up: 392s,   1 ins,    0 sn,   0 us, 1.1G mt,    0% db
ID %CPU LOAD %DCU   AAS  ASC  ASI  ASW  AST IOPS %FR   PGA UTPS UCPS SSRT  %DBT
-------------------------------------------------------------------------------
 1    8    0    0     0    0    0    0    0    5   5  130M    0    0 190u     0
EVENT (C)                        TOT WAITS   TIME(s)  AVG_MS  PCT    WAIT_CLASS
-------------------------------------------------------------------------------
DB CPU                                            14           36
db file sequential read               4326         8     2.0   22      User I/O
enq: JG - queue lock                    11         7   665.4   18         Other
oracle thread bootstrap                119         5    49.8   15         Other
control file heartbeat                   1         4  4004.0   10         Other
ID   SID     SPID USR PROG S  PGA SQLID/BLOCKER OPN  E/T STA STE EVENT/*LA  W/T
-------------------------------------------------------------------------------
 1   242     5817 SYS orat D 4.2M d7wtth4kv4k1c SEL    0 ACT CPU cpu runqu   6u

Et en mode graphique :

Pour information, oratop ne fonctionne pas en version 12.2.1.3.0, bug corrigé en 12.2.1.3.1 : Oratop Does Not Work In Trace File Analyzer 12.2.1.3.0 (Doc ID 2339713.1).

1 réflexion sur “Trace File Analyzer (TFA) : pas un simple outil de collecte pour le support Oracle”

Laisser un commentaire

Votre adresse e-mail ne sera pas publiée. Les champs obligatoires sont indiqués avec *