| Package | Description |
|---|---|
| org.apache.nutch.crawl | Crawl control code. |
| org.apache.nutch.fetcher | The Nutch robot. |
| org.apache.nutch.protocol | |

| Modifier and Type | Method and Description |
|---|---|
| static CrawlDatum | CrawlDatum.read(DataInput in) |

| Modifier and Type | Method and Description |
|---|---|
| static boolean | CrawlDatum.hasDbStatus(CrawlDatum datum) |
| static boolean | CrawlDatum.hasFetchStatus(CrawlDatum datum) |
| void | CrawlDatum.set(CrawlDatum that): Copy the contents of another instance into this instance. |

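
The copy semantics described for `set(CrawlDatum that)` can be illustrated with a minimal, self-contained sketch. The `Datum` class below is a hypothetical stand-in, not the Nutch class: its two fields and its accessors are assumptions chosen only to show what "copy the contents of another instance into this instance" means for an object that is reused rather than reallocated.

```java
// Hypothetical stand-in for CrawlDatum; the fields and accessors are
// illustrative assumptions, not the real Nutch class layout.
public class DatumCopySketch {

    static final class Datum {
        private int status;
        private float score;

        Datum(int status, float score) {
            this.status = status;
            this.score = score;
        }

        /** Copy the contents of another instance into this instance. */
        void set(Datum that) {
            this.status = that.status;
            this.score = that.score;
        }

        int getStatus() { return status; }

        float getScore() { return score; }

        void setScore(float score) { this.score = score; }
    }

    public static void main(String[] args) {
        Datum source = new Datum(2, 1.5f);
        Datum reused = new Datum(0, 0.0f);

        // Overwrite the reused object's fields with the source's values.
        reused.set(source);

        // A later change to the source does not affect the copy.
        source.setScore(9.0f);

        System.out.println(reused.getStatus() + " " + reused.getScore()); // prints "2 1.5"
    }
}
```

A copy-into-instance method of this kind lets a caller keep one object and refill it, instead of allocating a new one for every record. Whether the real class copies every field this way is not stated in the table above.
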
| Modifier and Type | Method and Description |
|---|---|
| CrawlDatum | FetcherOutput.getCrawlDatum() |

| Constructor and Description |
|---|
| FetcherOutput(CrawlDatum crawlDatum, Content content, ParseImpl parse) |

| Modifier and Type | Method and Description |
|---|---|
| ProtocolOutput | Protocol.getProtocolOutput(Text url, CrawlDatum datum): Returns the Content for a fetchlist entry. |
| RobotRules | Protocol.getRobotRules(Text url, CrawlDatum datum): Retrieve robot rules applicable for this url. |

Copyright © 2007, 2014, Oracle and/or its affiliates. All rights reserved.