Package | Description |
---|---|
org.apache.nutch.crawl | Crawl control code. |
org.apache.nutch.fetcher | The Nutch robot. |
org.apache.nutch.protocol | |

Modifier and Type | Method and Description |
---|---|
static CrawlDatum | CrawlDatum.read(DataInput in) |

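
A static `read(DataInput in)` method of this shape is usually a factory: it creates an empty instance, fills it from the stream, and returns it. The sketch below illustrates that pattern only. `MiniDatum`, its two fields, and `roundTrip` are hypothetical stand-ins chosen for the example; they are not the Nutch classes, and the actual `CrawlDatum` wire format is not shown in this listing.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class ReadFactorySketch {

    // Hypothetical stand-in for CrawlDatum; both fields are assumptions.
    static class MiniDatum {
        byte status;
        long fetchTime;

        // Writes the fields in a fixed order.
        void write(DataOutput out) throws IOException {
            out.writeByte(status);
            out.writeLong(fetchTime);
        }

        // Reads the fields back in the same order they were written.
        void readFields(DataInput in) throws IOException {
            status = in.readByte();
            fetchTime = in.readLong();
        }

        // Static factory in the style of read(DataInput): build, populate, return.
        static MiniDatum read(DataInput in) throws IOException {
            MiniDatum result = new MiniDatum();
            result.readFields(in);
            return result;
        }
    }

    // Serializes a datum to bytes and reads it back through the static factory.
    static MiniDatum roundTrip(MiniDatum original) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            original.write(new DataOutputStream(buffer));
            DataInput in = new DataInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()));
            return MiniDatum.read(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        MiniDatum original = new MiniDatum();
        original.status = 2;
        original.fetchTime = 1_000L;
        MiniDatum copy = roundTrip(original);
        System.out.println(copy.status + " " + copy.fetchTime);
    }
}
```

The factory returns a new object on every call, so callers that want to avoid allocation would need an instance-level read method instead.
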
Modifier and Type | Method and Description |
---|---|
static boolean | CrawlDatum.hasDbStatus(CrawlDatum datum) |
static boolean | CrawlDatum.hasFetchStatus(CrawlDatum datum) |
void | CrawlDatum.set(CrawlDatum that): Copy the contents of another instance into this instance. |

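
The description of `set(CrawlDatum that)` implies copy-into-place semantics: the receiver is overwritten with the other instance's contents rather than replaced by a new object. A minimal sketch of that idea follows, assuming a hypothetical `MiniDatum` class with three illustrative fields; the real `CrawlDatum` field set is not given in this listing.

```java
public class SetCopySketch {

    // Hypothetical stand-in for CrawlDatum; field names are assumptions.
    static class MiniDatum {
        byte status;
        long fetchTime;
        float score;

        // Copies every field of 'that' into this instance, reusing this object.
        void set(MiniDatum that) {
            this.status = that.status;
            this.fetchTime = that.fetchTime;
            this.score = that.score;
        }
    }

    public static void main(String[] args) {
        MiniDatum source = new MiniDatum();
        source.status = 1;
        source.fetchTime = 42L;
        source.score = 0.5f;

        MiniDatum reused = new MiniDatum();
        reused.set(source);

        // Changing the source afterwards does not affect the copy.
        source.score = 9.0f;
        System.out.println(reused.status + " " + reused.fetchTime + " " + reused.score);
    }
}
```

Copying into an existing instance lets a caller keep one object and refill it for each record, which is a common reason for offering `set` alongside a constructor.
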
Modifier and Type | Method and Description |
---|---|
CrawlDatum | FetcherOutput.getCrawlDatum() |

Constructor and Description |
---|
FetcherOutput(CrawlDatum crawlDatum, Content content, ParseImpl parse) |

Modifier and Type | Method and Description |
---|---|
ProtocolOutput | Protocol.getProtocolOutput(Text url, CrawlDatum datum): Returns the Content for a fetchlist entry. |
RobotRules | Protocol.getRobotRules(Text url, CrawlDatum datum): Retrieve robot rules applicable for this url. |

Copyright © 2007, 2014, Oracle and/or its affiliates. All rights reserved.