Schedules define how often the index is updated with information about each source.
Object Type
Creatable
Object Key
name
Object Key Command Syntax
--NAME=object_name
-n object_name
Property | Value |
---|---|
lastCrawled | The date of the last scheduled crawl in the format Day, DD MMM YYYY HH:MM:SS GMT |
logFilePath | The full path to the crawler log files |
nextCrawl | The date of the next scheduled crawl in the same format as lastCrawled |
scheduleError | The text of the last error message |
status | DISABLED, EXECUTING, FAILED, LAUNCHING, PARTIALLY_FAILED, SCHEDULED, or STOPPED |
Supported Operations
activate, create, createAll, deactivate, delete, deleteAll, deleteList, export, exportAll, exportList, getAllObjectKeys, getAllStates, getState, getStateList, start, stop, update, updateAll
Administration GUI Page
Home - Schedules - Create or Edit Schedule
A <search:schedules> element describes the schedules for crawling sources:
<search:schedules>
   <search:schedule>
      <search:name>
      <search:crawlingMode>
      <search:recrawlPolicy>
      <search:frequency>
         <!-- For hourly crawls: -->
         <search:hourly>
            <search:hoursBtwnLaunches>
         <!-- For daily crawls: -->
         <search:daily>
            <search:daysBtwnLaunches>
            <search:startHour>
         <!-- For weekly crawls: -->
         <search:weekly>
            <search:weeksBtwnLaunches>
            <search:startDayOfWeek>
            <search:startHour>
         <!-- For monthly crawls: -->
         <search:monthly>
            <search:monthsBtwnLaunches>
            <search:startDayOfMonth>
            <search:startHour>
         <!-- For manual crawls: -->
         <search:manual>
      <!-- For all crawls: -->
      <search:assignedSources>
         <search:assignedSource>
Element Descriptions
<search:schedules>
Contains one or more <search:schedule> elements, one for each schedule.
<search:schedule>
Describes a schedule for crawling sources. It contains these elements: <search:name>, <search:crawlingMode>, <search:recrawlPolicy>, <search:frequency>, and <search:assignedSources>.
<search:name>
The name of the schedule. Required.
<search:crawlingMode>
One of these crawling modes:
ACCEPT_ALL: Crawls and indexes all URLs in the source, and extracts and indexes any links found in the URLs of Web sources. If a URL has been crawled before, then it is reindexed only after it changes.
EXAMINE_URLS: Crawls but does not index any URLs in the source. It also crawls any links found in those URLs. Use this mode when first crawling a new source, so that you can examine the documents and refine the crawling parameters if necessary before indexing.
INDEX_ONLY: Crawls and indexes all URLs in the source. It does not extract any links from those URLs. In general, select this option for a source that has been crawled previously using EXAMINE_URLS.
<search:recrawlPolicy>
One of these recrawl policies:
PROCESS_ALL: Recrawls all documents in the source.
PROCESS_CHANGED: Crawls only documents that changed after the last crawl. For file sources, documents are also crawled if the parent directory changed.
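As an illustration of the two-pass approach described for the crawling modes, the fragments below sketch how a schedule might be set up for a first examination pass and then changed for indexing. These elements belong inside a <search:schedule> element. Pairing each mode with PROCESS_ALL is an assumption made for the example, not a documented requirement.

```xml
<!-- First pass: crawl the new source without indexing it,
     so that the documents can be examined -->
<search:crawlingMode>EXAMINE_URLS</search:crawlingMode>
<search:recrawlPolicy>PROCESS_ALL</search:recrawlPolicy>
```

```xml
<!-- Later pass, after refining the crawling parameters:
     crawl and index the source without extracting links -->
<search:crawlingMode>INDEX_ONLY</search:crawlingMode>
<search:recrawlPolicy>PROCESS_ALL</search:recrawlPolicy>
```

A schedule carries only one crawling mode at a time, so the second fragment would replace the first rather than sit beside it.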
<search:frequency>
Controls the interval between launches of a schedule. It contains one of these elements: <search:hourly>, <search:daily>, <search:weekly>, <search:monthly>, or <search:manual>.
<search:hourly>
Describes an hourly schedule. It contains a <search:hoursBtwnLaunches> element.
<search:hoursBtwnLaunches>
The number of hours between crawl launches, in the range of 1 to 23.
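A minimal sketch of an hourly frequency, following the element nesting in the XML description above. The value 6 is chosen only for illustration.

```xml
<!-- Fragment: belongs inside a <search:schedule> element of a
     <search:config> document that declares the search: prefix -->
<search:frequency>
   <search:hourly>
      <!-- Launch a crawl every 6 hours (allowed range: 1 to 23) -->
      <search:hoursBtwnLaunches>6</search:hoursBtwnLaunches>
   </search:hourly>
</search:frequency>
```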
<search:daily>
Describes a daily schedule. It contains these elements: <search:daysBtwnLaunches> and <search:startHour>.
<search:daysBtwnLaunches>
The number of days between crawl launches, in the range of 1 to 99.
<search:startHour>
The hour at which the crawl begins, using a 24-hour clock, such as 9 for 9:00 a.m. or 23 for 11:00 p.m.
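A daily frequency could be sketched as follows, with the child elements in the order given in the XML description. The interval and start hour are illustrative values only.

```xml
<!-- Fragment: belongs inside a <search:schedule> element of a
     <search:config> document that declares the search: prefix -->
<search:frequency>
   <search:daily>
      <!-- Launch a crawl every 2 days (allowed range: 1 to 99) -->
      <search:daysBtwnLaunches>2</search:daysBtwnLaunches>
      <!-- Start at 9:00 a.m. on the 24-hour clock -->
      <search:startHour>9</search:startHour>
   </search:daily>
</search:frequency>
```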
<search:weekly>
Describes a weekly schedule. It contains these elements: <search:weeksBtwnLaunches>, <search:startDayOfWeek>, and <search:startHour>.
<search:weeksBtwnLaunches>
The number of weeks between crawl launches, in the range of 1 to 12.
<search:startDayOfWeek>
The day of the week on which the crawl begins, such as MONDAY or TUESDAY.
<search:monthly>
Describes a monthly schedule. It contains these elements: <search:monthsBtwnLaunches>, <search:startDayOfMonth>, and <search:startHour>.
<search:monthsBtwnLaunches>
The number of months between crawl launches, in the range of 1 to 12.
<search:startDayOfMonth>
An integer value for the day of the month on which the crawl begins, such as 1 or 15.
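For comparison with the other frequency types, a monthly frequency might look like the sketch below. The values are illustrative: a crawl every three months, on the 15th, at 2:00 a.m.

```xml
<!-- Fragment: belongs inside a <search:schedule> element of a
     <search:config> document that declares the search: prefix -->
<search:frequency>
   <search:monthly>
      <!-- Launch a crawl every 3 months (allowed range: 1 to 12) -->
      <search:monthsBtwnLaunches>3</search:monthsBtwnLaunches>
      <!-- Start on day 15 of the month -->
      <search:startDayOfMonth>15</search:startDayOfMonth>
      <!-- Start at 2:00 a.m. on the 24-hour clock -->
      <search:startHour>2</search:startHour>
   </search:monthly>
</search:frequency>
```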
<search:manual>
Describes a manual schedule.
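A manual frequency has no interval settings in the XML description above, so a sketch of it is short. Writing <search:manual> as an empty element is an assumption based on that outline, which lists no child elements for it.

```xml
<!-- Fragment: belongs inside a <search:schedule> element of a
     <search:config> document that declares the search: prefix -->
<search:frequency>
   <!-- Assumed to be an empty element: the outline lists no children -->
   <search:manual/>
</search:frequency>
```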
<search:assignedSources>
Contains one or more <search:assignedSource> elements, one for each source that is crawled using this schedule.
<search:assignedSource>
The name of a source crawled using this schedule. The source cannot be a mailing-list source or a federated source.
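Because <search:assignedSources> can hold more than one <search:assignedSource>, a single schedule can cover several sources. The sketch below assumes two sources; myOtherSource is a hypothetical name used only for illustration.

```xml
<!-- Fragment: belongs inside a <search:schedule> element of a
     <search:config> document that declares the search: prefix -->
<search:assignedSources>
   <!-- One element per source crawled by this schedule -->
   <search:assignedSource>mySource</search:assignedSource>
   <!-- Hypothetical second source; not a mailing-list or federated source -->
   <search:assignedSource>myOtherSource</search:assignedSource>
</search:assignedSources>
```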
Example
This XML document creates a schedule for mySource that runs every three weeks, on Monday at 11:00 p.m.:
<?xml version="1.0" encoding="UTF-8"?>
<search:config productVersion="11.1.2.0.0" xmlns:search="http://xmlns.oracle.com/search">
   <search:schedules>
      <search:schedule>
         <search:name>schedule1</search:name>
         <search:crawlingMode>INDEX_ONLY</search:crawlingMode>
         <search:recrawlPolicy>PROCESS_ALL</search:recrawlPolicy>
         <search:frequency>
            <search:weekly>
               <search:weeksBtwnLaunches>3</search:weeksBtwnLaunches>
               <search:startDayOfWeek>MONDAY</search:startDayOfWeek>
               <search:startHour>23</search:startHour>
            </search:weekly>
         </search:frequency>
         <search:assignedSources>
            <search:assignedSource>mySource</search:assignedSource>
         </search:assignedSources>
      </search:schedule>
   </search:schedules>
</search:config>