schedule

A schedule defines how often the index is updated with information about each source assigned to it.

Object Type

Creatable

Object Key

name

Object Key Command Syntax

--NAME=object_name

-n object_name

State Properties

Property        Value
lastCrawled     The date of the last scheduled crawl, in the format Day, DD MMM YYYY HH:MM:SS GMT.
logFilePath     The full path to the crawler log files.
nextCrawl       The date of the next scheduled crawl, in the same format as lastCrawled.
scheduleError   The text of the last error message.
status          DISABLED, EXECUTING, FAILED, LAUNCHING, PARTIALLY_FAILED, SCHEDULED, or STOPPED.

Supported Operations

activate
create
createAll
deactivate
delete
deleteAll
deleteList
export
exportAll
exportList
getAllObjectKeys
getAllStates
getState
getStateList
start
stop
update
updateAll

Administration GUI Page

Home - Schedules - Create or Edit Schedule

XML Description

A <search:schedules> element describes the schedules for crawling sources:

<search:schedules>
   <search:schedule>
      <search:name>
      <search:crawlingMode>
      <search:recrawlPolicy>
      <search:frequency>

<!-- For hourly crawls: -->
         <search:hourly>
            <search:hoursBtwnLaunches>

<!-- For daily crawls: -->
         <search:daily>
            <search:daysBtwnLaunches>
            <search:startHour>

<!-- For weekly crawls: -->
         <search:weekly>
            <search:weeksBtwnLaunches>
            <search:startDayOfWeek>
            <search:startHour>

<!-- For monthly crawls: -->
         <search:monthly>
            <search:monthsBtwnLaunches>
            <search:startDayOfMonth>
            <search:startHour>

<!-- For manual crawls: -->
         <search:manual>

<!-- For all crawls: -->
      <search:assignedSources>
         <search:assignedSource>

Element Descriptions 

<search:schedules>

Contains one or more <search:schedule> elements, one for each schedule.

<search:schedule>

Describes a schedule for crawling sources. It contains these elements:

<search:name>
<search:crawlingMode>
<search:recrawlPolicy>
<search:frequency>
<search:assignedSources>

<search:name>

The name of the schedule. Required.

<search:crawlingMode>

One of these crawling modes:

  • ACCEPT_ALL: Crawls and indexes all URLs in the source, and extracts and indexes any links found in the URLs of Web sources. If the URL has been crawled before, then it is reindexed only after it changes.

  • EXAMINE_URLS: Crawls but does not index any URLs in the source. It also crawls any links found in those URLs. Use this mode when first crawling a new source, so that you can examine the documents and refine the crawling parameters if necessary before indexing.

  • INDEX_ONLY: Crawls and indexes all URLs in the source. It does not extract any links from those URLs. In general, select this option for a source that has been crawled previously using EXAMINE_URLS.

<search:recrawlPolicy>

One of these recrawl policies:

  • PROCESS_ALL: Recrawls all documents in the source.

  • PROCESS_CHANGED: Crawls only documents that changed after the last crawl. For file sources, documents are also crawled if the parent directory changed.
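As a rough illustration of how a crawling mode and a recrawl policy sit side by side, the fragment below sketches a first pass over a new source: EXAMINE_URLS so that nothing is indexed yet, with PROCESS_ALL so that every document is visited. The pairing is only an assumption about a reasonable starting point. The fragment belongs inside a <search:schedule> element and relies on the search prefix being declared on the enclosing <search:config> element, as in the Example section.

```xml
<!-- Fragment only: place inside <search:schedule>. -->
<!-- Assumed first-pass settings: crawl without indexing, visit all documents. -->
<search:crawlingMode>EXAMINE_URLS</search:crawlingMode>
<search:recrawlPolicy>PROCESS_ALL</search:recrawlPolicy>
```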

<search:frequency>

Controls the interval between launches of the schedule. It contains one of these elements:

<search:hourly>
<search:daily>
<search:weekly>
<search:monthly>
<search:manual>

<search:hourly>

Describes an hourly schedule. It contains a <search:hoursBtwnLaunches> element.

<search:hoursBtwnLaunches>

Number of hours between starting crawls, in the range of 1 to 23.
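A minimal sketch of an hourly frequency, assuming a crawl every six hours (the value 6 is illustrative; any integer from 1 to 23 fits the stated range). The fragment belongs inside a <search:schedule> element, with the search prefix declared on the enclosing <search:config> element.

```xml
<!-- Fragment only: place inside <search:schedule>. -->
<search:frequency>
   <search:hourly>
      <!-- Illustrative value: launch a crawl every 6 hours -->
      <search:hoursBtwnLaunches>6</search:hoursBtwnLaunches>
   </search:hourly>
</search:frequency>
```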

<search:daily>

Describes a daily schedule. It contains these elements:

<search:daysBtwnLaunches>
<search:startHour>

<search:daysBtwnLaunches>

Number of days between starting crawls, in the range of 1 to 99.

<search:startHour>

The hour at which the crawl begins, on a 24-hour clock, such as 9 for 9:00 a.m. or 23 for 11:00 p.m.
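A daily frequency might look like the following sketch, which assumes a crawl every second day at 9:00 a.m. Both values are illustrative. The fragment belongs inside a <search:schedule> element, with the search prefix declared on the enclosing <search:config> element.

```xml
<!-- Fragment only: place inside <search:schedule>. -->
<search:frequency>
   <search:daily>
      <!-- Illustrative values: every 2 days, starting at 09:00 -->
      <search:daysBtwnLaunches>2</search:daysBtwnLaunches>
      <search:startHour>9</search:startHour>
   </search:daily>
</search:frequency>
```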

<search:weekly>

Describes a weekly schedule. It contains these elements:

<search:weeksBtwnLaunches>
<search:startDayOfWeek>
<search:startHour>

<search:weeksBtwnLaunches>

Number of weeks between starting crawls, in the range of 1 to 12.

<search:startDayOfWeek>

The day of the week that the crawl begins, such as MONDAY or TUESDAY.

<search:monthly>

Describes a monthly schedule. It contains these elements:

<search:monthsBtwnLaunches>
<search:startDayOfMonth>
<search:startHour>

<search:monthsBtwnLaunches>

Number of months between starting crawls, in the range of 1 to 12.

<search:startDayOfMonth>

An integer value for the day of the month that the crawl begins, such as 1 or 15.
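A monthly frequency can be sketched the same way. The fragment below assumes a crawl every month on the 15th at 2:00 a.m.; all three values are illustrative. It belongs inside a <search:schedule> element, with the search prefix declared on the enclosing <search:config> element.

```xml
<!-- Fragment only: place inside <search:schedule>. -->
<search:frequency>
   <search:monthly>
      <!-- Illustrative values: every month, on day 15, starting at 02:00 -->
      <search:monthsBtwnLaunches>1</search:monthsBtwnLaunches>
      <search:startDayOfMonth>15</search:startDayOfMonth>
      <search:startHour>2</search:startHour>
   </search:monthly>
</search:frequency>
```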

<search:manual>

Describes a manual schedule.
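Because <search:manual> is listed without child elements, a manual frequency is presumably written as an empty element. The sketch below rests on that assumption; it belongs inside a <search:schedule> element, with the search prefix declared on the enclosing <search:config> element.

```xml
<!-- Fragment only: place inside <search:schedule>. -->
<search:frequency>
   <!-- Assumption: no child elements; such a schedule would run only when started explicitly -->
   <search:manual/>
</search:frequency>
```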

<search:assignedSources>

Contains one or more <search:assignedSource> elements, one for each source that is crawled using this schedule.

<search:assignedSource>

The name of a source crawled using this schedule. The source cannot be a mailing-list source or a federated source.
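To illustrate a schedule that covers more than one source, the sketch below assigns two sources to a nightly schedule. The names nightlySchedule, webSource1, and fileSource1 are hypothetical, and the choice of ACCEPT_ALL with PROCESS_CHANGED is only an assumption about a typical recurring crawl. The productVersion and namespace are copied from the Example section.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<search:config productVersion="11.1.2.0.0" xmlns:search="http://xmlns.oracle.com/search">
   <search:schedules>
      <search:schedule>
         <!-- Hypothetical schedule name -->
         <search:name>nightlySchedule</search:name>
         <search:crawlingMode>ACCEPT_ALL</search:crawlingMode>
         <search:recrawlPolicy>PROCESS_CHANGED</search:recrawlPolicy>
         <search:frequency>
            <search:daily>
               <!-- Every day, starting at 01:00 -->
               <search:daysBtwnLaunches>1</search:daysBtwnLaunches>
               <search:startHour>1</search:startHour>
            </search:daily>
         </search:frequency>
         <search:assignedSources>
            <!-- Hypothetical source names, one element per source -->
            <search:assignedSource>webSource1</search:assignedSource>
            <search:assignedSource>fileSource1</search:assignedSource>
         </search:assignedSources>
      </search:schedule>
   </search:schedules>
</search:config>
```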

Example

This XML document creates a schedule named schedule1 for mySource that runs every three weeks, on Monday at 11:00 p.m.:

<?xml version="1.0" encoding="UTF-8"?>
<search:config productVersion="11.1.2.0.0" xmlns:search="http://xmlns.oracle.com/search">
   <search:schedules>
      <search:schedule>
         <search:name>schedule1</search:name>
         <search:crawlingMode>INDEX_ONLY</search:crawlingMode>
         <search:recrawlPolicy>PROCESS_ALL</search:recrawlPolicy>
         <search:frequency>
            <search:weekly>
               <search:weeksBtwnLaunches>3</search:weeksBtwnLaunches>
               <search:startDayOfWeek>MONDAY</search:startDayOfWeek>
               <search:startHour>23</search:startHour>
            </search:weekly>
         </search:frequency>
         <search:assignedSources>
            <search:assignedSource>mySource</search:assignedSource>
         </search:assignedSources>
      </search:schedule>
   </search:schedules>
</search:config>