| Package | Description |
|---|---|
| com.endeca.itl.web.process | |
| org.apache.nutch.crawl | Crawl control code. |
| org.apache.nutch.fetcher | The Nutch robot. |
| org.apache.nutch.parse | |
| org.apache.nutch.protocol | |

| Modifier and Type | Method and Description |
|---|---|
| List<com.endeca.itl.record.Record> | EndecaRecordGenerator.generate(Content content, Parse parse) |

| Modifier and Type | Method and Description |
|---|---|
| byte[] | TextProfileSignature.calculate(Content content, Parse parse) |
| abstract byte[] | Signature.calculate(Content content, Parse parse) |
| byte[] | MD5Signature.calculate(Content content, Parse parse) |
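
The table above lists only the `calculate` signatures, not what each implementation hashes. As a rough illustration of the pattern (an abstract `calculate` returning `byte[]`, with an MD5-based subclass), here is a minimal sketch. It assumes, from the class name alone, that an MD5 signature is a digest over the raw page bytes. `SignatureSketch`, `Md5SignatureSketch`, and `SignatureDemo` are hypothetical stand-ins that take a plain `byte[]` instead of `Content` and `Parse`, so the code compiles without Nutch on the classpath.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical stand-in for the abstract contract listed above. It accepts
// raw bytes rather than Nutch's Content and Parse (an assumption made so the
// sketch is self-contained).
abstract class SignatureSketch {
    abstract byte[] calculate(byte[] rawContent);
}

// Assumed behaviour: digest the raw bytes with MD5.
class Md5SignatureSketch extends SignatureSketch {
    @Override
    byte[] calculate(byte[] rawContent) {
        try {
            return MessageDigest.getInstance("MD5").digest(rawContent);
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }
}

class SignatureDemo {
    // Renders a digest as lowercase hex for display.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        SignatureSketch signature = new Md5SignatureSketch();
        byte[] first = signature.calculate("same page".getBytes(StandardCharsets.UTF_8));
        byte[] second = signature.calculate("same page".getBytes(StandardCharsets.UTF_8));
        System.out.println(toHex(first));
        // Identical input bytes yield identical signatures.
        System.out.println(Arrays.equals(first, second)); // prints "true"
    }
}
```

A byte-exact digest like this treats any change in the input as a different signature; whether `TextProfileSignature` is more tolerant is not stated on this page.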

| Modifier and Type | Method and Description |
|---|---|
| Content | FetcherOutput.getContent() |

| Constructor and Description |
|---|
| FetcherOutput(CrawlDatum crawlDatum, Content content, ParseImpl parse) |

| Modifier and Type | Method and Description |
|---|---|
| Parse | ParseFilters.filter(Content content, Parse parse)<br>Run all defined filters. |
| Parse | ParseFilter.filter(Content content, Parse parse)<br>Adds metadata or modifies parse. |
| Parse | HtmlParseFilters.filter(Content content, Parse parse, HTMLMetaTags metaTags, DocumentFragment doc)<br>Deprecated. Run all defined filters. |
| Parse | HtmlParseFilter.filter(Content content, Parse parse, HTMLMetaTags metaTags, DocumentFragment doc)<br>Deprecated. Adds metadata or otherwise modifies a parse of HTML content, given the DOM tree of a page. |
| Parse | Parser.getParse(Content c)<br>Creates the parse for some content. |
| Parse | ParseUtil.parse(Content content) |
| Parse | ParseUtil.parseByExtensionId(String extId, Content content) |
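
The descriptions above suggest a chain: a single filter "adds metadata or modifies parse", and the plural class runs "all defined filters". A minimal sketch of that pattern follows, assuming each filter receives the previous filter's result. Every type here (`ContentSketch`, `ParseSketch`, `ParseFilterSketch`, `ParseFiltersSketch`) is a hypothetical stand-in written for illustration, not the real Nutch class, and how the real filters are discovered and ordered is not covered on this page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for Content: only carries the source URL.
final class ContentSketch {
    final String url;

    ContentSketch(String url) {
        this.url = url;
    }
}

// Hypothetical stand-in for Parse: extracted text plus metadata.
final class ParseSketch {
    final String text;
    final Map<String, String> metadata;

    ParseSketch(String text) {
        this(text, new LinkedHashMap<>());
    }

    private ParseSketch(String text, Map<String, String> metadata) {
        this.text = text;
        this.metadata = metadata;
    }

    // Returns a copy with one extra metadata entry; this instance is unchanged.
    ParseSketch with(String key, String value) {
        Map<String, String> copy = new LinkedHashMap<>(metadata);
        copy.put(key, value);
        return new ParseSketch(text, copy);
    }
}

// Mirrors the shape of filter(Content, Parse) returning a Parse.
interface ParseFilterSketch {
    ParseSketch filter(ContentSketch content, ParseSketch parse);
}

// Runs every registered filter in order, feeding each one the previous result
// (the chaining order is an assumption of this sketch).
final class ParseFiltersSketch {
    private final List<ParseFilterSketch> filters;

    ParseFiltersSketch(List<ParseFilterSketch> filters) {
        this.filters = new ArrayList<>(filters);
    }

    ParseSketch filter(ContentSketch content, ParseSketch parse) {
        ParseSketch current = parse;
        for (ParseFilterSketch f : filters) {
            current = f.filter(content, current);
        }
        return current;
    }
}

class ParseFiltersDemo {
    public static void main(String[] args) {
        ParseFilterSketch addSource = (c, p) -> p.with("source", c.url);
        ParseFilterSketch addLength = (c, p) -> p.with("textLength", String.valueOf(p.text.length()));
        ParseFiltersSketch chain = new ParseFiltersSketch(Arrays.asList(addSource, addLength));

        ParseSketch result = chain.filter(new ContentSketch("http://example.org/"), new ParseSketch("hello"));
        System.out.println(result.metadata); // prints "{source=http://example.org/, textLength=5}"
    }
}
```

Returning a new parse from each filter, rather than mutating a shared one, keeps the filters independent of each other; that is a design choice of this sketch, not something the listing above specifies.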

| Modifier and Type | Method and Description |
|---|---|
| Content | ProtocolOutput.getContent() |
| static Content | Content.read(DataInput in) |

| Modifier and Type | Method and Description |
|---|---|
| void | ProtocolOutput.setContent(Content content) |

| Constructor and Description |
|---|
| ProtocolOutput(Content content) |
| ProtocolOutput(Content content, ProtocolStatus status) |

Copyright © 2007, 2014, Oracle and/or its affiliates. All rights reserved.