The proxy server comes with several command-line utilities that let you configure, change, generate, and repair your cache directory structure. Most of these utilities duplicate functions of the Server Manager pages, but you might want to use them when you need to schedule maintenance (for example, as a cron job). All of the utilities are located in the extras directory.
From the command-line prompt, go to the server_root/proxy-serverid directory.
Type ./start -shell
The following sections describe the various utilities.
The proxy has a utility called cbuild, which is an offline cache database manager. This utility allows you to create a new cache structure or modify an existing cache structure using the command-line interface. You can use the Server Manager pages to enable the proxy to use the newly created cache. cbuild cannot resize a cache that has multiple partitions. The utility does not update the server.xml file. The server.xml file has an element called CACHE that has a cachecapacity parameter. When the cache is created or modified by cbuild, you must manually update the cachecapacity parameter in the server.xml file.
<PARTITION partitionname="part1" partitiondir="/home/build/install9/proxy-server1/cache" maxsize="1600" minspace="5" enabled="true"/>
<CACHE enabled="true" cachecapacity="2000" cachedir="/tmp/cache">
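Because cbuild leaves server.xml untouched, the cachecapacity edit can be scripted. The following is a minimal sketch, not a supported tool: it assumes the opening CACHE tag sits on a single line, as in the fragment above, and the helper name set_cachecapacity and the temporary demo file are hypothetical. Back up server.xml before trying anything like this on a real instance.

```shell
#!/bin/sh
# Sketch: rewrite the cachecapacity attribute of the CACHE element in a
# server.xml-style file. Assumes the opening CACHE tag is on one line.
# set_cachecapacity is a hypothetical helper, not part of the proxy server.
set_cachecapacity() {
    file=$1
    new_capacity=$2
    # Only lines containing the CACHE element are changed.
    sed "/<CACHE /s/cachecapacity=\"[0-9]*\"/cachecapacity=\"$new_capacity\"/" \
        "$file" > "$file.new" && mv "$file.new" "$file"
}

# Demonstration on a throwaway file, not on a real server.xml.
demo=$(mktemp)
cat > "$demo" <<'EOF'
<PARTITION partitionname="part1" maxsize="1600" minspace="5" enabled="true"/>
<CACHE enabled="true" cachecapacity="2000" cachedir="/tmp/cache">
EOF
set_cachecapacity "$demo" 512
cat "$demo"
```

The PARTITION line is left as it was; only the CACHE line's cachecapacity value changes.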
You can invoke the cbuild utility in two modes. The first mode is:
cbuild -d conf-dir -c cache-dir -s cache size
cbuild -d conf-dir -c cache-dir -s cache size -r
For example:
cbuild -d server_root/proxy-serverid/config -c server_root/proxy-serverid/cache -s 512
cbuild -d server_root/proxy-serverid/config -c server_root/proxy-serverid/cache -s 512 -r
where:
conf-dir is the configuration directory of the proxy instance. It is located at server_root/proxy-serverid/config.
cache-dir is the directory for your cache structure.
cache size is the maximum size to which the cache can grow. This option cannot be used along with the cache-dim parameter. The maximum size is 65135 MB.
-r resizes an existing cache structure, provided it has a single partition. This option is not required for creating a new cache.
The second mode is:
cbuild -d conf-dir -c cache-dir -n cache-dim
cbuild -d conf-dir -c cache-dir -n cache-dim -r
For example:
cbuild -d server_root/proxy-serverid/config -c server_root/proxy-serverid/cache -n 3
cbuild -d server_root/proxy-serverid/config -c server_root/proxy-serverid/cache -n 3 -r
where:
conf-dir is the configuration directory of the proxy instance. It is located at server_root/proxy-serverid/config.
cache-dir is the directory for your cache structure.
cache-dim determines the number of sections. For example, in Figure 12–2, in the section shown as s3.4, the 3 indicates the dimension. The default value of cache-dim is 0 and the maximum value is 8.
-r resizes an existing cache structure, provided it has a single partition. This option is not required for creating a new cache.
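The two cbuild modes and their documented limits can be captured in a small wrapper. The sketch below is illustrative only: cbuild_cmd is a hypothetical helper that checks the value against the limits stated above (65135 MB for the size, 8 for cache-dim) and prints the resulting command line rather than running it. Choosing a mode up front also keeps the size and cache-dim options from being combined.

```shell
#!/bin/sh
# Sketch: build a cbuild command line for either mode after checking the
# documented limits. cbuild_cmd is a hypothetical helper; it only prints
# the command and never runs cbuild.
# Usage: cbuild_cmd size|dim conf-dir cache-dir value [resize]
cbuild_cmd() {
    mode=$1 conf_dir=$2 cache_dir=$3 value=$4 resize=$5
    case "$value" in
        ''|*[!0-9]*) echo "value must be a non-negative integer" >&2; return 1 ;;
    esac
    case "$mode" in
        size)  # -s: maximum cache size, at most 65135 MB
            [ "$value" -le 65135 ] || { echo "size exceeds 65135 MB" >&2; return 1; }
            flag=-s ;;
        dim)   # -n: cache dimension, 0 through 8
            [ "$value" -le 8 ] || { echo "cache-dim exceeds 8" >&2; return 1; }
            flag=-n ;;
        *) echo "mode must be size or dim" >&2; return 1 ;;
    esac
    cmd="cbuild -d $conf_dir -c $cache_dir $flag $value"
    # -r is only needed when resizing an existing single-partition cache.
    if [ "$resize" = resize ]; then cmd="$cmd -r"; fi
    printf '%s\n' "$cmd"
}

cbuild_cmd size server_root/proxy-serverid/config server_root/proxy-serverid/cache 512
cbuild_cmd dim server_root/proxy-serverid/config server_root/proxy-serverid/cache 3 resize
```

The two calls at the end print the first example of the first mode and the resize example of the second mode.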
The proxy has a utility called urldb that manages the URL list in the cache. You can use this utility to list the URLs that are cached. You can also selectively expire and remove cached objects from the cache database.
The urldb commands can be categorised into three groups based on the -o option:
domains
sites
urls
To list domains, enter the following at the command line:
urldb -o matching_domains -e reg_exp -d conf-dir
For example:
urldb -o matching_domains -e ".*phoenix.*" -d server_root/proxy-serverid/config
where:
matching_domains lists the domains that match the regular expression.
reg_exp is the regular expression used.
conf-dir is the configuration directory of the proxy instance. It is located at server_root/proxy-serverid/config.
To list all the matching sites in a domain, enter the following at the command line:
urldb -o matching_sites_in_domain -e reg_exp -m domain_name -d conf-dir
For example:
urldb -o matching_sites_in_domain -e ".*atlas" -m phoenix.com -d server_root/proxy-serverid/config
where:
matching_sites_in_domain lists all the sites in a domain that match the regular expression.
reg_exp is the regular expression used.
domain_name is the name of the domain.
conf-dir is the configuration directory of the proxy instance. It is located at server_root/proxy-serverid/config.
To list all the matching sites, enter the following at the command line:
urldb -o all_matching_sites -e reg_exp -d conf-dir
For example:
urldb -o all_matching_sites -e ".*atlas.*" -d server_root/proxy-serverid/config
where:
all_matching_sites lists all the sites that match the regular expression.
reg_exp is the regular expression used.
conf-dir is the configuration directory of the proxy instance. It is located at server_root/proxy-serverid/config.
To list matching URLs in a site, enter the following at the command line:
urldb -o matching_urls_from_site -e reg_exp -s site_name -d conf-dir
For example:
urldb -o matching_urls_from_site -e "http://.*atlas.*" -s atlas.phoenix.com -d server_root/proxy-serverid/config
where:
matching_urls_from_site lists all URLs from the site that match the regular expression.
reg_exp is the regular expression used.
site_name is the name of the site.
conf-dir is the configuration directory of the proxy instance. It is located at server_root/proxy-serverid/config.
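The listing operations differ only in the -o value and in whether a domain (-m) or site (-s) narrows the search. The sketch below shows that mapping in one place. It is a dry run under stated assumptions: urldb_list_cmd is a hypothetical helper that prints the command line and does not execute urldb, and the regular expression is quoted with plain double quotes so the shell passes it through unchanged.

```shell
#!/bin/sh
# Sketch: print the urldb listing command for a given -o operation.
# urldb_list_cmd is a hypothetical helper; nothing is executed.
# Usage: urldb_list_cmd operation reg_exp conf-dir [domain-or-site]
urldb_list_cmd() {
    op=$1 reg_exp=$2 conf_dir=$3 scope=$4
    case "$op" in
        matching_domains|all_matching_sites|all_matching_urls)
            printf 'urldb -o %s -e "%s" -d %s\n' "$op" "$reg_exp" "$conf_dir" ;;
        matching_sites_in_domain)   # scope is the domain name, passed with -m
            printf 'urldb -o %s -e "%s" -m %s -d %s\n' "$op" "$reg_exp" "$scope" "$conf_dir" ;;
        matching_urls_from_site)    # scope is the site name, passed with -s
            printf 'urldb -o %s -e "%s" -s %s -d %s\n' "$op" "$reg_exp" "$scope" "$conf_dir" ;;
        *) echo "unknown operation: $op" >&2; return 1 ;;
    esac
}

# Drill down from domain to site to URL, using the examples from the text.
conf=server_root/proxy-serverid/config
urldb_list_cmd matching_domains '.*phoenix.*' "$conf"
urldb_list_cmd matching_sites_in_domain '.*atlas' "$conf" phoenix.com
urldb_list_cmd matching_urls_from_site 'http://.*atlas.*' "$conf" atlas.phoenix.com
```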
To expire or remove matching URLs in a site, enter the following at the command line:
urldb -o matching_urls_from_site -e reg_exp -s site_name -x e -d conf-dir
urldb -o matching_urls_from_site -e reg_exp -s site_name -x r -d conf-dir
For example:
urldb -o matching_urls_from_site -e "http://.*atlas.*" -s atlas.phoenix.com -x e -d server_root/proxy-serverid/config
where:
matching_urls_from_site lists all URLs from the site that match the regular expression.
reg_exp is the regular expression used.
site_name is the name of the site.
-x e is the option to expire the matching URLs from the cache database. This option cannot be used with the domain and site modes.
-x r is the option to remove the matching URLs from the cache database.
conf-dir is the configuration directory of the proxy instance. It is located at server_root/proxy-serverid/config.
To list all matching URLs, enter the following at the command line:
urldb -o all_matching_urls -e reg_exp -d conf-dir
For example:
urldb -o all_matching_urls -e ".*cgi-bin.*" -d server_root/proxy-serverid/config
where:
all_matching_urls lists all the URLs that match the regular expression.
reg_exp is the regular expression used.
conf-dir is the configuration directory of the proxy instance. It is located at server_root/proxy-serverid/config.
To expire or remove all matching URLs, enter the following at the command line:
urldb -o all_matching_urls -e reg_exp -x e -d conf-dir
urldb -o all_matching_urls -e reg_exp -x r -d conf-dir
For example:
urldb -o all_matching_urls -e ".*cgi-bin.*" -x e -d server_root/proxy-serverid/config
where:
all_matching_urls lists all the URLs that match the regular expression.
reg_exp is the regular expression used.
-x e is the option to expire the matching URLs from the cache database.
-x r is the option to remove the matching URLs from the cache database.
conf-dir is the configuration directory of the proxy instance. It is located at server_root/proxy-serverid/config.
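Since -x e and -x r differ by a single letter but have different effects, a script may be safer if it takes a spelled-out action. The sketch below is one way to do that, assuming a hypothetical helper named urldb_purge_cmd. It only prints the command, so the output can be reviewed before anything is expired or removed.

```shell
#!/bin/sh
# Sketch: map a readable action name to the -x value and print the urldb
# command. urldb_purge_cmd is a hypothetical helper; nothing is executed.
# Usage: urldb_purge_cmd expire|remove reg_exp conf-dir [site_name]
urldb_purge_cmd() {
    action=$1 reg_exp=$2 conf_dir=$3 site=$4
    case "$action" in
        expire) x=e ;;
        remove) x=r ;;
        *) echo "action must be expire or remove" >&2; return 1 ;;
    esac
    if [ -n "$site" ]; then
        # Restrict the operation to one site.
        printf 'urldb -o matching_urls_from_site -e "%s" -s %s -x %s -d %s\n' \
            "$reg_exp" "$site" "$x" "$conf_dir"
    else
        # Apply the operation to every matching URL in the cache.
        printf 'urldb -o all_matching_urls -e "%s" -x %s -d %s\n' \
            "$reg_exp" "$x" "$conf_dir"
    fi
}

conf=server_root/proxy-serverid/config
urldb_purge_cmd expire '.*cgi-bin.*' "$conf"
urldb_purge_cmd remove 'http://.*atlas.*' "$conf" atlas.phoenix.com
```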
To expire or remove a list of URLs, enter the following at the command line:
urldb -l url-list -x e -e reg_exp -d conf-dir
urldb -l url-list -x r -e reg_exp -d conf-dir
For example:
urldb -l url.lst -x e -e ".*cgi-bin.*" -d server_root/proxy-serverid/config
where:
url-list is the list of URLs to be expired or removed. Use the -l option to provide this list.
-x e is the option to expire the matching URLs from the cache database.
-x r is the option to remove the matching URLs from the cache database.
reg_exp is the regular expression used.
conf-dir is the configuration directory of the proxy instance. It is located at server_root/proxy-serverid/config.
The cachegc utility allows you to clean up the cache database by removing objects that have expired or that are too old to be kept in the cache directory because of cache size constraints.
Ensure that the CacheGC is not running in the proxy instance when the cachegc utility is used.
The cachegc utility can be used in the following way:
cachegc -f leave-fs-full-percent -u gc-high-margin-percent -l gc-low-margin-percent -e extra-margin-percent -d conf-dir
For example:
cachegc -f 50 -u 80 -l 60 -e 5 -d server_root/proxy-serverid/config
where:
leave-fs-full-percent determines the percentage of the cache partition size below which garbage collection will not go.
gc-high-margin-percent controls the percentage of the maximum cache size that, when reached, triggers garbage collection.
gc-low-margin-percent controls the percentage of the maximum cache size that the garbage collector targets.
extra-margin-percent is used by the garbage collector to determine the fraction of the cache to remove.
conf-dir is the configuration directory of the proxy instance. It is located at server_root/proxy-serverid/config.
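To make the high and low margins concrete, the sketch below converts them into megabytes for a given maximum cache size. The 2000 MB figure is only an illustrative value, gc_thresholds is a hypothetical helper, and the check that the low margin lies below the high margin is an assumption about sensible settings rather than a documented rule. The -f and -e options are not modeled.

```shell
#!/bin/sh
# Sketch: turn the -u and -l percentages into sizes in MB.
# gc_thresholds is a hypothetical helper; it does not run cachegc.
# Usage: gc_thresholds max-cache-mb high-margin-percent low-margin-percent
gc_thresholds() {
    max_mb=$1 high_pct=$2 low_pct=$3
    # Assumption: a low margin at or above the high margin is a mistake.
    [ "$low_pct" -lt "$high_pct" ] || {
        echo "low margin must be below high margin" >&2
        return 1
    }
    # Integer arithmetic; results are rounded down to whole megabytes.
    echo "collection triggers at $((max_mb * high_pct / 100)) MB"
    echo "collector targets $((max_mb * low_pct / 100)) MB"
}

# Values from the example above (-u 80 -l 60) with a 2000 MB cache.
gc_thresholds 2000 80 60
# prints:
#   collection triggers at 1600 MB
#   collector targets 1200 MB
```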
The bu utility updates the cache and works in two modes. In the first mode, it iterates through the cache database and updates all the URLs that are present in the cache by sending an HTTP request for each. In the second mode, it starts with a given URL, performs a breadth-first traversal of the links from that URL to the depth that you specify, and fetches the pages into the cache. bu is an RFC-compliant robot.
bu -n hostname -p port -t time-lmt -f contact-address -s sleep-time -o object -r n -d conf-dir
For example:
bu -n phoenix -p 80 -t 3600 -f admin@phoenix.com -s 60 -o nova -r n -d server_root/proxy-serverid/config
where:
hostname is the hostname of the machine on which the proxy is running. The default value is localhost.
port is the port on which the proxy server is running. The default port is 8080.
time-lmt is the time limit for which the utility runs.
contact-address is the contact address sent in the HTTP requests that bu issues. The default value is worm@proxy-name.
sleep-time is the sleep time between two consecutive requests. The default value is 5 seconds.
object is the object specified in bu.conf that is to be executed.
-r determines whether the robots.txt policy is followed (y) or not (n). The default value is y.
conf-dir is the configuration directory of the proxy instance. It is located at server_root/proxy-serverid/config.
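A wrapper can fill in the documented defaults so that only the values that differ need to be supplied. The sketch below is a dry run under stated assumptions: bu_cmd and the BU_HOST, BU_PORT, and BU_SLEEP environment variables are hypothetical names, and the -f and -r options are left out on the assumption that bu then falls back to its defaults (worm@proxy-name and y).

```shell
#!/bin/sh
# Sketch: build a bu command line, using the documented defaults unless a
# (hypothetical) environment variable overrides them. Nothing is executed.
# Usage: bu_cmd conf-dir object time-lmt
bu_cmd() {
    conf_dir=$1 object=$2 time_lmt=$3
    host=${BU_HOST:-localhost}   # documented default hostname
    port=${BU_PORT:-8080}        # documented default port
    sleep_time=${BU_SLEEP:-5}    # documented default sleep time, in seconds
    printf 'bu -n %s -p %s -t %s -s %s -o %s -d %s\n' \
        "$host" "$port" "$time_lmt" "$sleep_time" "$object" "$conf_dir"
}

# With no overrides set, the documented defaults are used.
bu_cmd server_root/proxy-serverid/config nova 3600
# Override host, port, and sleep time to match the example in the text.
BU_HOST=phoenix BU_PORT=80 BU_SLEEP=60 bu_cmd server_root/proxy-serverid/config nova 3600
```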