|Oracle8i Designing and Tuning for Performance
Release 2 (8.1.6)
Part Number A76992-01
This chapter introduces networking issues that affect tuning.
This chapter contains the following sections:
The techniques used to determine the source of problems vary depending on the configuration. The three types of configurations are:
Table 22-1 lists how to tell what type of database configuration you have.
Pre-Spawn Dedicated Server
It is possible to connect to dedicated server with a database configured for MTS by placing the parameter (
DEDICATED) in the connect descriptor.
LSNRCTL control utility's
services statement lists every dispatcher registered with it. This list includes the dispatchers process ID. You can check the alert log to confirm that the dispatcher have been started successfully.
lsnrctl services: LSNRCTL for Solaris: Version 220.127.116.11.0 - Production on 27-MAY-99 13:38:02 (c) Copyright 1999 Oracle Corporation. All rights reserved. Connecting to (ADDRESS=(PROTOCOL=TCP)(Host=ecdc2)(Port=1521)) Services Summary... ORCL has 2 service handler(s) DEDICATED SERVER established:0 refused:0 LOCAL SERVER DISPATCHER established:0 refused:0 current:0 max:1 state:ready D000 <machine: ecdc2, pid: 16011> (ADDRESS=(PROTOCOL=tcp)(DEV=20)(HOST=18.104.22.168)(PORT=55304)) The command completed successfully.
MTS_DISPATCHERline is correctly set. For example:
MTS_DISPATCHERS = "(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=hostname)(PORT=1492)(queuesize=32 ))) (DISPATCHERS = 1) (LISTENER = alias) (SERVICE = servicename) (SESSIONS = 1000) (CONNECTIONS = 1000) (MULTIPLEX = ON) (POOL = ON) (TICK = 5)"
One, and only one, of the following attributes is required:
DESCRIPTION provide support for the specification of additional network attributes beyond
PROTOCOL. In the example above, the entire line with "
DESCRIPTION" can be substituted by (
PROTOCOL=TCP). The attributes
TICKS are all optional.
MTS_MAX_DISPATCHERline is correctly set. For example:
This line should reflect the total number of dispatchers you may want to start.
MTS_MAX_SERVERSline is correctly set. For example:
This line sets the upper bound on the total number of shared servers PMON can create, based on the peak load of the system. This should be set high enough so that all requests can be serviced, but not so high that the system swaps if they are reached. The purpose of this parameter is to prevent the server from swapping. Run the following script to see what the highwater mark is for the number of servers running, and then set
MTS_MAX_SERVERS to more then this.
MTS_SERVERSline is correctly set. For example:
This is the total number of shared servers started when the database is started. It also represents the total number of shared servers PMON tries to keep. It should be the total number of servers expected to always be used when the database is active.
MTS_MAX_SERVERS is intended to handle peak load.
LSNRCTL control utility's
services statement to see if there are excessive connection refusals. Check the listener's log file to see if this is a connection problem. For example:
LSNRCTL> set displaymode normal Service display mode is NORMAL LSNRCTL> services Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=net)(QUEUESIZE=32))) Services Summary... Service "net.regress.rdbms.dev.us.oracle.com" has 1 instances. Instance "net" Status: READY Total handlers: 3 Relevant handlers: 3 DEDICATED established:0 refused:0 current:0 max:0 state:ready Session: NS D001 established:0 refused:0 current:0 max:16383 state:ready (ADDRESS=(PROTOCOL=tcp)(HOST=dlsun1013.us.oracle.com)(PORT=52217)) Session: NS D000 established:0 refused:0 current:0 max:16383 state:ready (ADDRESS=(PROTOCOL=tcp)(HOST=dlsun1013.us.oracle.com)(PORT=52216)) Session: NS
Under normal conditions, the number refused should be zero. Shut down the listener, and restart it to erase these statistics. If, after the listener restarts, the refused count is increasing, then the connections are being refused. If the refused count stays at zero, and if the problem you are troubleshooting is occurring, then your problem is not with the connections being refused.
Connection refusals can occur for many reasons. Examine the listener log to see what the connect per second rate is. Run the listener log analyzer script to check.
The listener is a queue-based process. It receives connect requests from the lower level protocol stack. It has a limited queue stack (which is configurable to the operating system maximum). It can only process one connection at a time, and there is a limit to the number of connections per second the process can handle.
If the rate at which the connect requests arrive exceeds that limit, then the requests will be queued. The queue stack is also limited, but you can configure it. If there are more listener processes, then the requests made against each process will be fewer and, therefore, will be handled more quickly.
Increasing the listener queue is done in the
ora file. The
ora file can contain many listeners, each by a different name. It is assumed that only one of those listed is having a problem. If not, then apply this method to all applicable listeners. To increase the listener queue, add (
queuesize = number) to the
ora file. For example:
Stop and restart the listener to initialize this new parameter. If you are not currently running an MTS configuration, then you should consider doing so. It is faster for the listener to handle a client request in an MTS configuration than it is in a dedicated server or a pre-spawned dedicated server configuration.
Pre-spawned (pre-started) processes can improve connect time with a dedicated server. This is particularly true of heavily loaded systems not using multi-threaded servers, where connect time is slow. If this is enabled, then the listener can redirect the connection to an existing process with no wait time whenever a connection request arrives. Connection requests do not have to wait for new processes to be started.
Determine if the pre-spawn configuration was properly configured and sized for this system.
Pre-spawning dedicated servers have intents. One is to have faster connect times to the database by having the server shadow processes created before the client makes a connect request. Once connected, the listener creates the next shadow process for the next connect. Pre-spawning is also useful in a controlling resource starved system or access into the system. It lets you cap the number of shadow processes that can be pre-spawned. After this limit is reached, all new connections come in as dedicated.
If there is no activity on the database and there are no users connected, then the number of pre-spawn servers is the number listed in the
ora file for
POOL_SIZE. Otherwise, depending on the number of connections to the database, they will range from the minimum (
POOL_SIZE) to the maximum (
PRESPAWN_MAX). For example:
LISTENER = (ADDRESS_LIST =(ADDRESS= (PROTOCOL= TCP)(Host= ecdc2)(Port= 1521))) SID_LIST_LISTENER = (SID_LIST = (SID_DESC = (ORACLE_HOME = /u01/oracle/product/oracle/8.1.6) (SID_NAME = ORCL) (PRESPAWN_MAX = 12) (PRESPAWN_LIST = (PRESPAWN_DESC = (PROTOCOL = TCP) (POOL_SIZE = 1) (TIMEOUT = 1)))))
In the above example, with no database activity, there will be one pre-spawn process. During periods of high activity there will be a maximum of 12. After this point, any connect requests that arrive have their shadow processes created in the same manner as a dedicated connection.
To check if there are the correct number of pre-spawn processes, use the
services statement and the operating system command to list running processes (
ps on UNIX). For example:
lsnrctl services: LSNRCTL for Solaris: Version 22.214.171.124.0 - Production on 26-MAY-99 18:22:49 (c) Copyright 1999 Oracle Corporation. All rights reserved. Connecting to (ADDRESS=(PROTOCOL=TCP)(Host=ecdc2)(Port=1521)) Services Summary... ORCL has 2 service handler(s) DEDICATED SERVER established:0 refused:0 LOCAL SERVER PRESPAWNED SERVER established:0 refused:0 current:0 max:1 state:ready PID:15587 (ADDRESS=(PROTOCOL=tcp)(DEV=8)(HOST=126.96.36.199)(PORT=55221)) The command completed successfully ps -ef | grep oracle oracle 15587 1 0 17:54:21 ? 0:00 oracleORCL / (DESCRIPTION=(COMMAND=prespawn)(PROTOCOL=TCP)(SERVICE_ID=2)(HANDLER_
The first statement shows that there is one pre-spawn server process, which is confirmed by the
If there were more pre-spawn servers listed by
ps than set by the
PRESPAWN_MAX parameter, then there are processes that are defunct.
If there were other process listed in the
ps command, like
then there may not be enough pre-spawn servers to handle the load for this configuration. Extra processes like this imply that the maximum number of pre-spawn servers needs to be increased.
There should be, at a minimum, the same number of idle pre-spawn servers as the
POOL parameter. This can be examined by looking at
LSNRCTL services to see how many pre-spawn servers have no current connections. For example:
lsnrctl services: LSNRCTL for Solaris: Version 188.8.131.52.0 - Production on 26-MAY-99 18:22:49 (c) Copyright 1999 Oracle Corporation. All rights reserved. Connecting to (ADDRESS=(PROTOCOL=TCP)(Host=ecdc2)(Port=1521)) Services Summary... ORCL has 2 service handler(s) DEDICATED SERVER established:0 refused:0 LOCAL SERVER PRESPAWNED SERVER established:0 refused:0 current:0 max:1 state:ready PID:15587 (ADDRESS=(PROTOCOL=tcp)(DEV=8)(HOST=184.108.40.206)(PORT=55221)) The command completed successfully
To determine if there is a problem with the listener or pre-spawn servers, test to see if the behavior is due to pre-spawn or something else.
Create a listener not configured for pre-spawn. By placing (
SERVER=DEDICATED) in the connect descriptor, it still connects to a pre-spawn server process. You need to create a listener that does not pre-spawn for this test.
For more information on setting up
Determine how much RAM is present so that you do not cause the server to swap when the number of pre-spawn servers is increased.
Find out what the physical size of the pre-spawn server is. On some systems, the command
ps gives a false size to a process. Sometimes it gives you the size of the process plus any common memory it shares, like the SGA. Check with the system administrator of the system to get the proper utility.
Find out the amount of RAM that is free in the system. This amount must not include the amount of swap.
Divide the total amount of free RAM by the size of the pre-spawn server process. This gives you an approximate top number of pre-spawn servers that you can add to the system without the fear of it beginning to swap. The actual number of servers to pre-spawn depends on the suspected number of simultaneous connections plus the expected connect rate. The first number determines the setting for
PRESPAWN_MAX, while the other number determines the setting for
This section encompasses Local Area Network (LAN) and Wide Area Network (WAN) troubleshooting methods.
Networks entail overhead that adds a certain amount of delay to processing. To optimize performance, you must ensure that your network throughput is fast, and you should try to reduce the number of messages that must be sent over the network. It can be difficult to measure the delay the network adds.
Three dynamic performance views are useful for measuring the network delay:
AVERAGE_WAIT column indicates the amount of time that Oracle waits between messages. You can use this statistic as a yardstick to evaluate the effectiveness of the network.
EVENT column lists the events for which active sessions are waiting. The "sqlnet message from client" wait event indicates that the shared or foreground process is waiting for a message from a client. If this wait event has occurred, then you can check to see whether the message has been sent by the user or received by Oracle.
You can investigate hang-ups by looking at
V$SESSION_WAIT to see what the sessions are waiting for. If a client has sent a message, then you can determine whether Oracle is responding to it or is still waiting for it.
V$SESSTAT you can see the number of bytes that have been received from the client, the number of bytes sent to the client, and the number of calls the client has made.
The most critical aspects of a network that contribute to performance are latency and bandwidth.
The term latency refers to a time delay; for example, the gap between the time a device requests access to a network and the time it receives permission to transmit.
Bandwidth is the throughput capacity of a network medium or protocol. Variations in the network signals can cause degradation on the network. Sources of degradation can be cables that are too long or wrong cable type. External noise sources, such as elevators, air handlers, or florescent lights, can also cause problems.
Local Area Network Topologies:
Wide Area Network Topologies:
Table 22-2 lists the most common ratings for various topologies.
|Topology or Carrier||Bandwidth|
1 Gigabit Ethernet
155 Megabits/second (OC3), 622 Megabits/second (OC12)
T-1 (US only)
T-3 (US only)
Committed Information Rate, which can be up to the carrier speed, but usually is not.
This can be up to the carrier speed.
This can be up to the carrier speed. It is usually used with slower modems.
Dial Up Modems
56 Kilobits/second. It is usually accompanied with data compression for faster throughput.
This section describes several techniques for enhancing performance and solving network problems.
The first step in solving network problem is to understand the overall topology. Gather as much information about the network that you can. This kind of information usually manifests itself as a network diagram. Your diagram should contain the types of network technology used in the Local Area Network and the Wide Area Network. It should also contain addresses of the various network segments.
Examine this information. Obvious bottlenecks include:
There are many problems that can cause a performance breakdown. Follow this checklist:
If nothing is revealed, then find the network route from the client to the data server. Understanding the travel times on a network gives you an idea as to the time a transaction will take. Client-server communication requires many small packets. High latency on a network slows the transaction down due to the time interval between sending a request and getting the response.
Use trace route (
trcroute or equivalent) from the client to the server to get address information for each device in the path. For example:
tracert usmail05 Tracing route to usmail05.us.oracle.com [220.127.116.11]over a maximum of 30 hops: 1 <10 ms <10 ms 10 ms whq1davis-rtr-749-f1-0-a.us.oracle.com [18.104.22.168] 2 <10 ms <10 ms <10 ms whq4op3-rtr-723-f0-0.us.oracle.com [22.214.171.124] 3 220 ms 210 ms 231 ms usmail05.us.oracle.com [126.96.36.199] Trace complete.
Ping each device in turn to get the timings. Use large packets to get the slowest times. Make sure you set the "don't fragment bit" so that routers do not spend time disassembling and reassembling the packet. Also note that the packet size is 1472. This is for Ethernet. Ethernet packets are 1536 octets (actual 8 bit bytes) in size. ICPM packets (this is what ping is designed to use) have 64 octets of header. Evaluate the area where the slowness seems to occur. For example:
ping -l 1472 -n 1 -f 188.8.131.52 Pinging 184.108.40.206 with 1472 bytes of data: Reply from 220.127.116.11: bytes=1472 time<10ms TTL=255 ping -l 1472 -n 1 -f 18.104.22.168 Pinging 22.214.171.124 with 1472 bytes of data: Reply from 126.96.36.199: bytes=1472 time=10ms TTL=254 ping -l 1472 -n 1 -f 188.8.131.52 Pinging 184.108.40.206 with 1472 bytes of data: Reply from 220.127.116.11: bytes=1472 time=271ms TTL=253
The above example validates trace route. Ideally, you would ping from the workstation to 18.104.22.168, from 22.214.171.124 to 126.96.36.199, then from 188.8.131.52 to 184.108.40.206. This would show the exact latency on each segment traveled.
This section helps you determine the problem with your bottleneck.
Net8 tracing reveals whether an error is Oracle-specific or due to conditions that the operating system is passing to the Transparent Network Substrate (Oracle TNS layer).
Enable Net8 tracing at the Oracle server, the listener, and at a client suspected of having the problem you are trying to resolve.
To enable tracing at the server, find the
ora file for the server and create the following lines in it:
To enable tracing at the client, find the
ora file for the client and create the following lines in it:
To enable tracing at the listener, find the
ora file and create the following line in it:
Reproduce the problem, so that you generate traces on the client and server. Now analyze the traces generated.
If the problem is with the network and not Net8, then you must determine the following:
For example, perhaps the system is fine in the building where the Data Center is, but it is slow in other buildings that are several miles away.
Not all Oracle error codes represent pure Oracle troubles.
ORA-3113 is the most common error which points to an underlying network problem.
Enabling tracing on the server can generate a large amount of trace, or large number of trace, files. To prevent this, you can set up a separate environment that traces itself. This configuration works for dedicated and pre-spawn connections. First, log into the server's operating system as the Oracle software owner. Create a temporary directory to keep configuration files and trace files that will be created. Copy the
Now, modify the
If you are getting an Oracle error message, then look into the trace file to find the error. For troubleshooting bugs, Net8 trace analysis takes some time to fully find the problem. However, high level simple trace analysis is rather simple.
If the problem is with Net8, then use Net8 tracing to show you where the problem lies. If there are errors in the trace files, then do they appear in only the client traces, only in the server traces, or in both?
The problem is on the client. However, if you are getting
ORA-3114 errors, then the problem is on the server.
The problem is on the server. However, if you are getting
ORA-3114 errors, then the problem is on the client.
If you are getting
ORA-3114 errors, then the problem is on the Network. Troubleshoot the server first. If it is fine, then the client is at fault.
The multi-threaded server (MTS) is an advanced solution for many customers, and it can be more complex to troubleshoot. Check the initialization parameter file for any MTS parameters. Look at the operating system to see if any of the MTS processes are present.
Check for dispatchers by looking for names like
ora_d001, etc. For example:
Check for shared servers by looking for names like
ora_s001, etc. For example:
Reduce network calls by using array interfaces. Instead of fetching one row at a time, it is more efficient to fetch ten rows with a single network round trip.
Before sending data across the network, Net8 buffers data into the Session Data Unit (SDU). It sends the data stored in this buffer when the buffer is full or when an application tries to read the data. When large amounts of data are being retrieved and when packet size is consistently the same, it may speed retrieval to adjust the default SDU size.
Optimal SDU size depends on the normal transport size. Use a sniffer to find out the frame size, or set tracing on to its highest level to check the number of packets sent and received and to determine whether they are fragmented. Tune your system to limit the amount of fragmentation.
Use Net8 Assistant to configure a change to the default SDU size on both the client and the server; SDU size should generally be the same on both.
When a session is established, Net8 packages and sends data between server and client using packets. Use the
NODELAY parameter in the
ora file, which causes packets to be flushed on to the network more frequently. If you are streaming large amounts of data, then there is no buffering and hence no delay.
Although Net8 supports many networking protocols, TCP tends to have the best scalability.
In Net8, you can use the Connection Manager to conserve system resources by multiplexing. Multiplexing means funneling many client sessions through a single transport connection to a server destination. In this way, you can increase the number of sessions that a process can handle. This applies only to MTS configurations.
Alternately, you can use Connection Manager to control client access to dedicated servers. In addition, Connection Manager provides multiple protocol support allowing a client and server with different networking protocols to communicate.
For more information on Connection Manager, see Net8 Administrator's Guide.