A CONNECT_TIMEOUT element specifies the maximum time, in seconds, that the Spider waits for a connection to a host to be established during crawling.
If you do not specify a CONNECT_TIMEOUT value, the Spider waits indefinitely for the connection, unless a TIMEOUT element specifies a value that limits the overall time of crawling operations.
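As a minimal sketch of that fallback, the fragment below omits CONNECT_TIMEOUT and relies on TIMEOUT alone. It assumes the SPIDER_INIT parent element and the TIMEOUT syntax used in this section's example. The 120-second value is illustrative only, and the fragment is not a complete configuration.

```xml
<SPIDER_INIT>
  <!-- No CONNECT_TIMEOUT element: connection attempts have no limit of their own. -->
  <!-- TIMEOUT still limits the overall time (value chosen for illustration). -->
  <TIMEOUT VALUE="120"/>
</SPIDER_INIT>
```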
<!ELEMENT CONNECT_TIMEOUT EMPTY>
<!ATTLIST CONNECT_TIMEOUT VALUE CDATA #REQUIRED>
The following section describes the CONNECT_TIMEOUT element's attribute.
VALUE
Specifies the maximum number of seconds to wait for a host connection. This value must be an integer greater than zero.
The CONNECT_TIMEOUT element has no sub-elements.
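The following fragment sketches what the declaration and attribute description above permit. The specific values are illustrative assumptions. Because the DTD types VALUE as CDATA, the DTD alone does not enforce the integer rule, so a value such as 2.5 would pass DTD validation even though the description above rules it out.

```xml
<!-- Valid: an empty element with a positive integer number of seconds. -->
<CONNECT_TIMEOUT VALUE="30"/>

<!-- Not valid per the attribute description: VALUE="0" or VALUE="2.5". -->
<!-- Not valid per the DTD: a missing VALUE attribute (it is #REQUIRED), -->
<!-- or any content inside the element (it is declared EMPTY). -->
```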
This example shows initialization values of a Spider component, including connection and proxy information. The CONNECT_TIMEOUT value is set to 15 seconds.
<SPIDER_INIT>
  <!-- Abort fetch operations that take longer than 120 seconds. -->
  <TIMEOUT VALUE="120"/>
  <!-- Abort fetch operations when a connection isn't made within 15 seconds. -->
  <CONNECT_TIMEOUT VALUE="15"/>
  <!-- Abort fetch operations if the transfer rate drops below 1024 bytes/second
       for 5 seconds. -->
  <MIN_TRANSFER_RATE MIN_RATE="1024" MAX_TIME="5"/>
  <!-- Have the Spider tell web servers that it's a Netscape client making the
       fetch request. -->
  <AGENT_NAME>Netscape</AGENT_NAME>
  <!-- Configure the Spider to use the proxy server running on host1.acme.com:8080
       when fetching HTTP URLs and host2.acme.com:8443 when fetching HTTPS URLs but
       not to use a proxy for URLs served by either host3.com or host4.com. -->
  <PROXY_CONFIG>
    <PROXY_HTTP HOST="host1.acme.com" PORT="8080"/>
    <PROXY_HTTPS HOST="host2.acme.com" PORT="8443"/>
    <PROXY_BYPASS>host3.com</PROXY_BYPASS>
    <PROXY_BYPASS>host4.com</PROXY_BYPASS>
  </PROXY_CONFIG>
  <!-- Spider will use proxy host1.acme.com:8080 for this URL -->
  <ROOT_URL>http://www.endeca.com</ROOT_URL>
  <!-- Spider will use proxy host2.acme.com:8443 for this URL -->
  <ROOT_URL>https://outlook.endeca.com</ROOT_URL>
  <!-- Spider won't use a proxy for this URL -->
  <ROOT_URL>http://host3.com:6000</ROOT_URL>
  <!-- Spider won't use a proxy for this URL -->
  <ROOT_URL>http://host4.com:6000</ROOT_URL>
</SPIDER_INIT>