Oracle Files Administration Guide 9.0.3 Part Number A97358-01 |
|
| View PDF |
This chapter includes important information about on-going system maintenance, tuning, and recovery. As with any production system, your implementation of Oracle Files should include a basic disaster recovery plan that ensures 24x7 operations, regardless of system failures. To manage potentially exponential growth of data in an Oracle Files instance, the system provides some unique capabilities, such as LOB (large objects) file aging. This chapter includes these topics.
Planning for failures is one of the most important jobs of any system administrator or DBA. Be sure to implement a daily or weekly backup plan that meets the needs of your business and operations environment. Take advantage of the Oracle database backup capabilities, built right into the database.
Always back up the system before upgrading, migrating new data, or making other major changes.
See Oracle9i Backup and Recovery Concepts for additional information.
Oracle Files data is comprised of content and metadata. The majority of data stored in Oracle Files is content and is stored online in LOBs (Large Objects). Content stored in Oracle Files is stored in database tablespaces. Oracle Files makes use of the Large Object (LOB) facility of the Oracle database. All documents are stored as Binary Large Objects (BLOBs), which is one type of LOB provided by the database. See "Provide Adequate Storage to Improve Performance".
Through LOB management, Oracle Files provides data archiving. Content that has not been accessed for a specified interval is then automatically moved from BLOBs to BFILE. The content is still accessible, of course, and is visible as any normal content would be when users are browsing or searching.
As the amount of content stored increases, it may become desirable to move content to a cheaper medium. BFILE support provides off-line and near-line storage.
In both offline and near-line storage, content that is infrequently accessed is moved from expensive online media, such as a disk array, to a cheaper off-line medium, such as tape. The metadata and search indexes are kept online and are readily available. The difference between off-line and near-line storage is in the automation and transparency of moving the data between storage media. Near-line storage is also known as a Hierarchical Storage Manager (HSM) system.
Oracle Files provides transparent access to content stored as either a LOB (online storage) or a BFILE (near-line storage). Writing to a BFILE results in the content being stored in a BLOB. A BFILE is a read-only Oracle data type consisting of an Oracle Directory object and a filename. When an application writes to a document whose content is stored in a read-only BFILE, the content ends up being stored in a BLOB. The new content will be indexed.
End users will be unaware of where their content is stored. Administrators, however, must be aware of where content is stored. For example, when managing indexing by Oracle Text, administrators need to know that content can be indexed only in a LOB. Once a document is indexed, the read-only nature of BFILEs allows Oracle Files to maintain the index indefinitely.
BFILE aging is not implemented by default. You must activate the Content agent and configure the frequency value. The Content agent moves content that has not been accessed in a specified amount of time to a BFILE. See the Content agent properties in Appendix C, "Server Configuration Properties".
The file the BFILE points to is located in the database's $ORACLE_HOME/ifsbfiles/<ifsschema>
. UNIX users may configure the real location of the BFILE using symbolic links. Windows users must use this BFILE base path.
From the base path, the Content agent associates a relative path (the complete path to the BFILE's file is the base path plus a relative path). The path now looks like this:
$ORACLE_HOME/ifsbfiles/<schemaname>
/<yyyy>/<dd>/<mm>/hhmmss/ifsbfile_<id>
where:
/<yyyy>/<dd>/<mm>/hhmmss/
is the relative path created by the Content agent based on the server configuration properties you set for the Content agent.
ifsbfile_<id>
is the filenaming pattern that associates a unique ID to each piece of content.
You can create a symbolic link to connect the base file location to a different location--an HSM system, a tape backup system, etc.--rather than limit your BFILE storage to the database's Oracle_Home
.
Performance is typically affected by network input/output (I/O), hard-disk drive I/O, memory (random access memory) I/O, or some combination of these three or other factors. Adjusting one of the factors sometimes moves the bottleneck to a new location, so you must approach the tuning task in a logical manner and ensure that you're not simply moving a bottleneck to a new location.
The performance tips in this section cover the basics and include:
See the Oracle9i Database Performance Tuning Guide and Reference for complete information.
Oracle Files uses Oracle's Cost-Based Optimizer (CBO) to determine the most efficient way to execute SQL statements. For the CBO to work properly, the Oracle Files 'analyze.sql
' script should be run as part of regular Oracle Files operations, especially after large volume changes to the data, such as after users have loaded a large number of files into the instance. For more information about the Cost-Based Optimizer, see the Oracle9i Database Performance Tuning Guide and Reference.
Run the script during non-busy periods to avoid impeding performance for users.
The analyze.sql
script, which makes calls to the DBMS_STATS package, exports schema statistics to a backup table, so you can restore statistics later, if necessary, as discussed in "Restoring Prior Statistics". To run the script:
cd $ORACLE_HOME/files/admin/sql sqlplus ifssys/<password> @analyze.sql ifssys
where: ifssys/<password>
is the schema user name and password and ifssys
is the name of the schema to analyze.
This script may take awhile to run, especially if Oracle Files contains a large number of documents.
If, for any reason, you want to restore previous statistics, you can do so by executing the following statements:
SQL> select stat_id from my_stat_backup_table; SQL> call dbms_stats.import_schema_stats ('ifssys', 'my_stat_backup_table', <name of stat id>,'ifssys');
Before gathering new statistics, the analyze.sql
script exports backup statistics to the IFS_BACKUP_STATS table, marking the set of statistics with a timestamp. You can query the table for existing saved sets by executing this SQL statement:
SQL> select distinct statid from ifs_backup_stats;
This query returns a list of all statistics by statid
(the date and time stamp). For example:
STATID ------------------------------ 01-MAY-02 02:15.36 04-MAY-02 20:00.15 08-MAY-02 02:15.48 11-MAY-02 06:21.40 11-MAY-02 20:15.37
You can then restore the statistics from a day and time when you know your performance was better. For example, if you find that after using the statistics from your 8:00 pm running of analyze
that performance is no better or worse, then you can restore your statistics from earlier that day, or from a prior set altogether.
SQL> @import_backup_stats.sql <user_name> '08-MAY-02 06:21.40'
The largest consumption of disk space occurs on the disks that actually contain the documents residing in Oracle Files, namely the Indexed Medias and Non-Indexed Medias tablespaces. This section explains how the documents are stored and how to calculate the amount of space those documents will require.
LOBs provide for transactional semantics much like the normal data stored in a database. To meet the criteria of transactional semantics, LOBs must be broken down into smaller pieces which are individually modifiable and recoverable. These smaller pieces are referred to as chunks. Chunks are actually a group of one or more sequential database blocks from a tablespace that contains a LOB column.
Both database blocks and chunk information within those blocks (BlockOverhead) impose some amount of overhead for the stored data. BlockOverhead is presently 60 bytes per block and consists of the block header, the LOB header, and the block checksum. Oracle Files configures its LOBs to have a 32 K chunk size. As an example, assume that the DB_BLOCK_SIZE parameter of the database is set to 8192 (8 K). A chunk would require four contiguous blocks and impose an overhead of 240 bytes. The usable space within a chunk would be 32768-240=32528 bytes.
Each document stored in Oracle Files will consist of some integral number of chunks. Using the previous example, a 500K document will actually use 512000/32528=15.74=16 chunks. Sixteen chunks will take up 16*32 K = 524288 bytes. The chunking overhead for storing this document would then be 524288-512000=12288 bytes which is 2.4 percent of the original document's size. The chunk size used by Oracle Files is set to optimize access times for documents. Note that small documents, less than one chunk, will incur a greater disk space percentage overhead since they must use at least a single chunk.
Another structure required for transactional semantics on LOBs is the LOB Index. Each lob index entry can point to eight chunks of a specific LOB object (NumLobPerIndexEntry = 8). Continuing the example, whereas a 500 K document takes up 16 chunks, two index entries would be required for that object. Each entry takes 46 bytes (LobIndexEntryOverhead) and is then stored in an Oracle B-Tree index, which in turn has its own overhead depending upon how fragmented that index becomes.
The last factor affecting LOB space utilization is the PCTVERSION parameter used when creating the LOB column. For information about how PCTVERSION works, please consult the Oracle9i SQL Reference Guide.
Oracle Files uses the default PCTVERSION of 10 percent for the LOB columns it creates. This reduces the possibility of "ORA-22924 snapshot too old" errors occurring in read consistent views. By default, a minimum of a 10 percent increase in chunking space must be added in to the expected disk usage to allow for persistent PCTVERSION chunks.
For large systems where disk space is an issue, Oracle recommends that you set the PCTVERSION to 1 to reduce disk storage requirements. This may be done at any time in a running system with these SQL commands:
alter table odmm_contentstore modify lob (globalindexedblob) (pctversion 1); alter table odmm_contentstore modify lob (emailindexedblob) (pctversion 1); alter table odmm_contentstore modify lob (emailindexedblob_t) (pctversion 1); alter table odmm_contentstore modify lob (intermediablob) (pctversion 1); alter table odmm_contentstore modify lob (intermediablob_t) (pctversion 1); alter table odmm_nonindexedstore modify lob (nonindexedblob2) (pctversion 1);
The steps for calculating LOB tablespace usage are as follows:
chunks = roundup(FileSize/(ChunkSize-((ChunkSize/BlockSize) * BlockOverhead)))
For example, if FileSize = 100,000, ChunkSize = 32768, Blocksize = 8192, and BlockOverhead = 60, then Chunks = roundup (100000 /(32768 - ((32768 / 8192) * 60)))= 4 Chunks.
FileDiskSpaceInBytes = roundup(chunks*ChunkSize*PctversionFactor) + roundup(chunks/NumLobPerIndexEntry*LobIndexEntryOverhead)
Continuing the example from above, chunks = 4, ChunkSize = 32768, PctversionFactor = 1.1, NumLobPerIndexEntry = 8, and LobIndexEntryOverhead = 46, so FileDiskSpaceInBytes = roundup (4 * 32768 * 1.1) + (roundup(4/8) * 46) = 144226 FileDiskSpaceInBytes.
TableSpaceUsage = sum(FileDiskSpaceInBytes) for all files stored
Oracle Files creates multiple LOB columns. The space calculation must be made for each tablespace based upon the amount of content that will qualify for storage in each tablespace.
The Oracle Files server keeps persistent information about the file system and the contents of that file system in database tables. These tables and their associated structures are stored in the Oracle Files Primary tablespace. These structures are required to support both the file system and the various protocols and APIs that make use of that file system.
The administration and planning tasks of this space should be very similar to operations on a normal Oracle database installation. You should plan for approximately 6 K of overhead per document to be used from this tablespace, or about 2 percent of the overall content. If there is a significant number of custom metadata, such as attributes, subclasses or categories, this overhead should be much larger.
The initial disk space allocated for this tablespace is approximately 50 MB for a default installation. Of the 50 MB, 16 MB is actually used at the completion of installation. This includes instantiations for all required tables and indexes and the metadata required for the 700+ files that are loaded into Oracle Files as part of the installation. Different tables and indexes within this tablespace will grow at different rates depending on which features of Oracle Files get used in a particular installation.
After ensuring that you've run statistics properly and have enough free hard-disk drive to support the tablespaces, you may still have performance problems. If that's the case, you must determine whether the performance bottleneck is caused by the Oracle9i Database Server or by Oracle Files.
To isolate the problem, start by looking at which processes are running and how many resources they are consuming.
top
(on UNIX) or Task Manager(on Windows NT/2000) as you reproduce the problem.If the bottleneck is the Oracle shadow process, get the SQL statement which is causing the largest number of buffer gets, and run Explain Plan on it.
If you see full table scans, then that's the cause of the problem; the optimizer isn't choosing an appropriate plan. Report that problem to Oracle Support. Additional work must be done to isolate the problem.
You may have too little memory. For example, if you see any "java.lang.OutOfMemoryError" errors in your log files, increase your MX settings for that JVM.
If users are complaining about poor response times, and top
(on UNIX) or its equivalent (for example, Task Manager on Windows NT/2000), shows a Java process running at 100 percent of a CPU for a minute or longer, then the MX setting for Java may be too small.
% java -verbosegc
An example of a return message is:
GC[1] in 305 ms: (6144kb, 6% free) -> (14Mb, 61% free)
GC[1]
indicates a complete garbage collection. GC[0]
indicates a young space only garbage collection. In this example, the collection took 305 ms. At the start of the collection, heap size was 6144 kb, with 6 percent free. The heap expanded during collection 14 Mb heap with 61 percent free.
If the bottleneck is an Oracle Files Java process, start by checking the percentage of cache hits for the Oracle Files service. Starting from the Oracle Enterprise Manager Application Server home page (http://hostname:1810
):
The goal is to have a high percentage of Cache Hits; as much as 100 percent is possible. If the percentage of Cache Hits for the service is less than 95 percent, the size of the Committed Data Cache may be too small.
This will increase your memory usage on the middle-tier machine by approximately 3 K per object.
For the ReadOnly or Writable connection pool, increase the "Target max. number of connections" and "Absolute max. number of connections" if any of the following is true:
Each additional Target or Absolute connection will use approximately 8 MB per connection on the middle tier and 1 MB per connection on the database.
For the ReadOnly or Writable Connection Pool, increase the "TargetSize" and "MaximumSize" if any of the following is true:
Note that each additional CurrentSize connection will use approximately 8 MB per connection on the middle tier and 1 MB per connection on the database.
The log file will have connection pool statistics similar to these:
Cache performance for S_LibraryObject cache CACHESIZE=409 OBJECTCOUNT=409 PUTCOUNT=818 REMOVECOUNT=0 FINDCOUNT=14617 HITCOUNT=13949 MISSCOUNT=668 HITRATIO=0.9542997879181775 MISSRATIO=0.04570021208182254 Cache performance for FolderPath cache CACHESIZE=15 CacheSizeEstimate=15 ACCESSSEQUENCE=599 SequenceAtLastPurge=0 PUTCOUNT=15 REMOVECOUNT=0 PURGECOUNT=0 FINDCOUNT=557 HITCOUNT=433 MISSCOUNT=124 HITRATIO=0.77737881508079 MISSRATIO=0.22262118491921004 Cache performance for committed S_LibraryObjectData cache CACHESIZE=473 CacheSizeEstimate=576 ACCESSSEQUENCE=6821 SequenceAtLastPurge=0 PUTCOUNT=576 REMOVECOUNT=0 PURGECOUNT=0 FINDCOUNT=27092 HITCOUNT=26338 MISSCOUNT=754 HITRATIO=0.972168905950096 <=== THIS IS THE NUMBER TO WATCH MISSRATIO=0.02783109404990403 Cache performance for LibraryObject cache CACHESIZE=221 OBJECTCOUNT=221 PUTCOUNT=221 REMOVECOUNT=0 FINDCOUNT=1473 HITCOUNT=1252 MISSCOUNT=221 HITRATIO=0.8499660556687033 MISSRATIO=0.1500339443312967
If you have written a custom Java application, you can use any of the following to get statistics about a session or service (the session variable is your authenticated LibrarySession):
// LO data cache System.out.println(session.invokeServerMethod("DYNGetCommittedLibraryObjectData
CachePerformanceString", null)); // Folder path cache System.out.println(session.invokeServerMethod("DYNGetFolderPathCachePerformanceS tring", null)); // LO cache System.out.println(session.getLibraryObjectCachePerformance().toString()); // Writeable connection pool System.out.println(session.invokeServerMethod("DYNGetWriteableConnectionPoolPerf ormanceString", null)); // Readonly connection pool System.out.println(session.invokeServerMethod("DYNGetReadonlyConnectionPoolPerfo rmanceString", null));
The changes you make on this page apply only to the running service session. To change settings permanently, so that they are used each time you restart the domain or this particular node, you must:
If the Thread Count on a custom application continues to increase over time, and never decreases, you may have a thread leak in your software. Be sure that your code is calling LibrarySession.disconnect()
whenever a user disconnects or is timed out.