Package | Description |
---|---|
com.endeca.itl.web.auth | |
com.endeca.itl.web.process | |
org.apache.hadoop.conf | Configuration of system parameters. |
org.apache.hadoop.io | Generic i/o code for use when reading and writing data to the network, to databases, and to files. |
org.apache.nutch.crawl | Crawl control code. |
org.apache.nutch.fetcher | The Nutch robot. |
org.apache.nutch.metadata | A Multi-valued Metadata container, and set of constant fields for Nutch Metadata. |
org.apache.nutch.net | |
org.apache.nutch.parse | |
org.apache.nutch.protocol | |

Modifier and Type | Interface and Description |
---|---|
interface | Authenticator |

Modifier and Type | Interface and Description |
---|---|
interface | EndecaRecordGenerator |
interface | EndecaRecordOutputter |

Modifier and Type | Class and Description |
---|---|
class | Configured: Base class for things that may be configured with a Configuration. |

Modifier and Type | Class and Description |
---|---|
class | ObjectWritable: A polymorphic Writable that writes an instance with its class name. |

Modifier and Type | Class and Description |
---|---|
class | MD5Signature: Default implementation of a page signature. |
class | Signature |
class | TextProfileSignature: An implementation of a page signature. |
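
The idea of a page signature listed above can be sketched as follows. This is a minimal illustration, assuming the default signature is an MD5 digest of the raw page bytes, as the name `MD5Signature` suggests; `PageSignatureSketch`, `md5Signature`, and `toHex` are hypothetical names and are not part of this API.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not a class from this API: hashes raw page bytes with MD5.
final class PageSignatureSketch {
    private PageSignatureSketch() {}

    /** Returns the 16-byte MD5 digest of the given content. */
    static byte[] md5Signature(byte[] content) {
        try {
            return MessageDigest.getInstance("MD5").digest(content);
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide MD5.
            throw new IllegalStateException(e);
        }
    }

    /** Renders a digest as lowercase hex for logging or comparison. */
    static String toHex(byte[] digest) {
        StringBuilder sb = new StringBuilder(digest.length * 2);
        for (byte b : digest) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] first = "same page body".getBytes(StandardCharsets.UTF_8);
        byte[] second = "same page body".getBytes(StandardCharsets.UTF_8);
        // Identical content yields identical signatures, which is what lets
        // a crawler detect exact duplicate pages.
        System.out.println(toHex(md5Signature(first)).equals(toHex(md5Signature(second))));
    }
}
```

A byte-level digest only catches exact duplicates. A text-profile approach such as the one `TextProfileSignature` is named for would presumably tolerate small differences between pages, but its algorithm is not described on this page.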

Modifier and Type | Class and Description |
---|---|
class | FetcherOutput |

Modifier and Type | Class and Description |
---|---|
class | MetaWrapper: A simple decorator that adds metadata to any Object that can be serialized by ObjectWritable. |

Modifier and Type | Interface and Description |
---|---|
interface | URLFilter: Interface used to limit which URLs enter Nutch. |
interface | URLNormalizer: Interface used to convert URLs to normal form and optionally perform substitutions. |
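
The division of labor between a filter and a normalizer can be sketched as below. The method signatures of `URLFilter` and `URLNormalizer` are not shown on this page, so `SketchUrlFilter`, `SketchUrlNormalizer`, and `UrlPipelineSketch` are hypothetical stand-ins declared locally; the convention that a rejected URL is returned as `null` is likewise an assumption.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

// Stand-in interfaces with assumed shapes, declared here so the sketch compiles alone.
interface SketchUrlFilter {
    /** Returns the URL if it is accepted, or null if it is rejected. */
    String filter(String url);
}

interface SketchUrlNormalizer {
    /** Returns the URL in normal form, or null if it cannot be parsed. */
    String normalize(String url);
}

final class UrlPipelineSketch {
    private UrlPipelineSketch() {}

    // Accepts only http and https URLs.
    static final SketchUrlFilter HTTP_ONLY = url -> {
        String lower = url.toLowerCase(Locale.ROOT);
        return (lower.startsWith("http://") || lower.startsWith("https://")) ? url : null;
    };

    // Lowercases scheme and host, resolves "." and ".." segments, drops the fragment.
    static final SketchUrlNormalizer BASIC = url -> {
        try {
            URI u = new URI(url).normalize();
            String scheme = u.getScheme() == null ? null : u.getScheme().toLowerCase(Locale.ROOT);
            String host = u.getHost() == null ? null : u.getHost().toLowerCase(Locale.ROOT);
            return new URI(scheme, u.getUserInfo(), host, u.getPort(),
                    u.getPath(), u.getQuery(), null).toString();
        } catch (URISyntaxException e) {
            return null; // treat unparsable URLs as rejected
        }
    };

    /** Normalizes first, then filters; returns null when the URL is rejected. */
    static String process(String url) {
        String normalized = BASIC.normalize(url);
        return normalized == null ? null : HTTP_ONLY.filter(normalized);
    }

    public static void main(String[] args) {
        System.out.println(process("HTTP://Example.COM/a/../b#frag"));
        System.out.println(process("ftp://example.com/file"));
    }
}
```

Normalizing before filtering means the filter only ever sees one spelling of each URL, so equivalent URLs are accepted or rejected consistently.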

Modifier and Type | Interface and Description |
---|---|
interface | HtmlParseFilter: Deprecated. Extension point for DOM-based HTML parsers. Permits one to add additional metadata to HTML parses. All plugins found which implement this extension point are run sequentially on the parse. |
interface | ParseFilter |
interface | Parser: A parser for content generated by a Protocol implementation. |

Modifier and Type | Class and Description |
---|---|
class | ParseData: Data extracted from a page's content. |
class | ParseImpl: The result of parsing a page's raw content. |

Modifier and Type | Interface and Description |
---|---|
interface | Protocol: A retriever of URL content. |

Copyright © 2007, 2017, Oracle and/or its affiliates. All rights reserved.