
A

ACCESS_DENIED - Static variable in class org.apache.nutch.protocol.ProtocolStatus
Access denied - authorization required, but missing/incorrect.
add(String, String) - Method in class org.apache.nutch.metadata.Metadata
Add a metadata name/value mapping.
add(String, String) - Method in class org.apache.nutch.metadata.SpellCheckedMetadata
 
addAttribute(String, String) - Method in class org.apache.nutch.plugin.Extension
Adds an attribute; used only until model creation at plugin system start-up.
addDefaultResource(String) - Method in class org.apache.hadoop.conf.Configuration
Add a default resource.
addDefaultResource(URL) - Method in class org.apache.hadoop.conf.Configuration
Add a default resource.
addDefaultResource(Path) - Method in class org.apache.hadoop.conf.Configuration
Add a default resource.
addDependency(String) - Method in class org.apache.nutch.plugin.PluginDescriptor
Adds a dependency.
addExportedLibRelative(String) - Method in class org.apache.nutch.plugin.PluginDescriptor
Adds an exported library with a path relative to the plugin directory.
addExtension(Extension) - Method in class org.apache.nutch.plugin.ExtensionPoint
Installs a corresponding extension to this extension point.
addExtension(Extension) - Method in class org.apache.nutch.plugin.PluginDescriptor
Adds an extension.
addExtensionPoint(ExtensionPoint) - Method in class org.apache.nutch.plugin.PluginDescriptor
Adds an extension point.
addFinalResource(String) - Method in class org.apache.hadoop.conf.Configuration
Add a final resource.
addFinalResource(URL) - Method in class org.apache.hadoop.conf.Configuration
Add a final resource.
addFinalResource(Path) - Method in class org.apache.hadoop.conf.Configuration
Add a final resource.
addMeta(String, String) - Method in class org.apache.nutch.metadata.MetaWrapper
Add metadata.
addName(Class, String) - Static method in class org.apache.hadoop.io.WritableName
Add an alternate name for a class.
addNotExportedLibRelative(String) - Method in class org.apache.nutch.plugin.PluginDescriptor
Adds a non-exported library with a path relative to the plugin directory.
addNutchResources(Configuration) - Static method in class org.apache.nutch.util.NutchConfiguration
Add the standard Nutch resources to Configuration.
addPatternBackward(String) - Method in class org.apache.nutch.util.TrieStringMatcher
Adds any necessary nodes to the trie so that the given String can be decoded in reverse and the first character is represented by a terminal node.
addPatternForward(String) - Method in class org.apache.nutch.util.TrieStringMatcher
Adds any necessary nodes to the trie so that the given String can be decoded and the last character is represented by a terminal node.
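The two insertion directions described for TrieStringMatcher can be illustrated with a minimal sketch. The class TrieSketch, its Node type, and the matchesPrefix and matchesSuffix helpers are hypothetical names invented for this example; this is not the Nutch implementation, only one way such a trie might behave. The sketch assumes that forward and backward patterns are kept in separate tries.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a character trie; not the Nutch TrieStringMatcher. */
final class TrieSketch {
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean terminal; // true if a stored pattern ends at this node
    }

    private final Node root = new Node();

    /** Inserts left to right, so the node for the last character is terminal. */
    void addPatternForward(String s) {
        Node n = root;
        for (int i = 0; i < s.length(); i++) {
            n = n.children.computeIfAbsent(s.charAt(i), c -> new Node());
        }
        n.terminal = true;
    }

    /** Inserts right to left, so the node for the first character is terminal. */
    void addPatternBackward(String s) {
        Node n = root;
        for (int i = s.length() - 1; i >= 0; i--) {
            n = n.children.computeIfAbsent(s.charAt(i), c -> new Node());
        }
        n.terminal = true;
    }

    /** True if some pattern added with addPatternForward is a prefix of input. */
    boolean matchesPrefix(String input) {
        Node n = root;
        for (int i = 0; i < input.length(); i++) {
            n = n.children.get(input.charAt(i));
            if (n == null) return false;
            if (n.terminal) return true;
        }
        return false;
    }

    /** True if some pattern added with addPatternBackward is a suffix of input. */
    boolean matchesSuffix(String input) {
        Node n = root;
        for (int i = input.length() - 1; i >= 0; i--) {
            n = n.children.get(input.charAt(i));
            if (n == null) return false;
            if (n.terminal) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        TrieSketch prefixes = new TrieSketch(); // one trie per direction
        prefixes.addPatternForward("http://");
        TrieSketch suffixes = new TrieSketch();
        suffixes.addPatternBackward(".pdf");
        System.out.println(prefixes.matchesPrefix("http://example.org/"));
        System.out.println(suffixes.matchesSuffix("report.pdf"));
    }
}
```

Storing suffix patterns in reverse lets the same downward walk serve both prefix and suffix matching.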
APPLICATION_NAME - Static variable in interface org.apache.nutch.metadata.Office
 
arrayToString(String[]) - Static method in class org.apache.hadoop.util.StringUtils
Given an array of strings, return a comma-separated list of its elements.
ArrayWritable - Class in org.apache.hadoop.io
A Writable for arrays containing instances of a class.
ArrayWritable() - Constructor for class org.apache.hadoop.io.ArrayWritable
 
ArrayWritable(Class) - Constructor for class org.apache.hadoop.io.ArrayWritable
 
ArrayWritable(Class, Writable[]) - Constructor for class org.apache.hadoop.io.ArrayWritable
 
ArrayWritable(String[]) - Constructor for class org.apache.hadoop.io.ArrayWritable
 
authenticate(List<URI>, FetcherOutput, Protocol) - Method in interface com.endeca.itl.web.auth.Authenticator
Authenticates the crawler for a particular site during the crawl, if necessary.
AuthenticationFailedException - Exception in com.endeca.itl.web.auth
 
AuthenticationFailedException() - Constructor for exception com.endeca.itl.web.auth.AuthenticationFailedException
 
AuthenticationFailedException(String) - Constructor for exception com.endeca.itl.web.auth.AuthenticationFailedException
 
AuthenticationFailedException(Throwable) - Constructor for exception com.endeca.itl.web.auth.AuthenticationFailedException
 
AuthenticationFailedException(String, Throwable) - Constructor for exception com.endeca.itl.web.auth.AuthenticationFailedException
 
Authenticator - Interface in com.endeca.itl.web.auth
 
AuthenticatorConfiguration - Class in com.endeca.itl.web.auth
 
AuthenticatorConfiguration() - Constructor for class com.endeca.itl.web.auth.AuthenticatorConfiguration
 
AuthenticatorConfiguration(String, String, String, String, String, boolean, List<NameValuePair>, Map<String, String>) - Constructor for class com.endeca.itl.web.auth.AuthenticatorConfiguration
 
AuthenticatorManager - Class in com.endeca.itl.web.auth
 
AuthenticatorManager(Configuration) - Constructor for class com.endeca.itl.web.auth.AuthenticatorManager
 
AUTHOR - Static variable in interface org.apache.nutch.metadata.Office
 

B

BLOCKED - Static variable in class org.apache.nutch.protocol.ProtocolStatus
Thread was blocked http.max.delays times during fetching.
BooleanWritable - Class in org.apache.hadoop.io
A WritableComparable for booleans.
BooleanWritable() - Constructor for class org.apache.hadoop.io.BooleanWritable
 
BooleanWritable(boolean) - Constructor for class org.apache.hadoop.io.BooleanWritable
 
BooleanWritable.Comparator - Class in org.apache.hadoop.io
A Comparator optimized for BooleanWritable.
BooleanWritable.Comparator() - Constructor for class org.apache.hadoop.io.BooleanWritable.Comparator
 
bytesToCodePoint(ByteBuffer) - Static method in class org.apache.hadoop.io.Text
Returns the next code point at the current position in the buffer.
BytesWritable - Class in org.apache.hadoop.io
A byte sequence that is usable as a key or value.
BytesWritable() - Constructor for class org.apache.hadoop.io.BytesWritable
Create a zero-size sequence.
BytesWritable(byte[]) - Constructor for class org.apache.hadoop.io.BytesWritable
Create a BytesWritable using the byte array as the initial value.
BytesWritable.Comparator - Class in org.apache.hadoop.io
A Comparator optimized for BytesWritable.
BytesWritable.Comparator() - Constructor for class org.apache.hadoop.io.BytesWritable.Comparator
 
byteToHexString(byte[]) - Static method in class org.apache.hadoop.util.StringUtils
Converts an array of bytes to its hex string representation.
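For readers unfamiliar with the conversion that byteToHexString describes, the following standalone sketch shows one common way to do it. HexSketch is a hypothetical class written for this example, and the choice of lowercase digits with no separators is an assumption; the index does not state the exact output format of the Hadoop method.

```java
/** Hypothetical helper; illustrates byte-to-hex conversion, not Hadoop's StringUtils. */
final class HexSketch {
    private static final char[] DIGITS = "0123456789abcdef".toCharArray();

    /** Emits two hex digits per byte, high nibble first (lowercase is an assumption). */
    static String byteToHexString(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(DIGITS[(b >> 4) & 0x0f]); // high nibble
            sb.append(DIGITS[b & 0x0f]);        // low nibble
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints 007fff
        System.out.println(byteToHexString(new byte[] {0x00, 0x7f, (byte) 0xff}));
    }
}
```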

C

CACHING_FORBIDDEN_ALL - Static variable in interface org.apache.nutch.metadata.Nutch
Don't show either original forbidden content or summaries.
CACHING_FORBIDDEN_CONTENT - Static variable in interface org.apache.nutch.metadata.Nutch
Don't show original forbidden content, but show summaries.
CACHING_FORBIDDEN_KEY - Static variable in interface org.apache.nutch.metadata.Nutch
Sites may request that search engines don't provide access to cached documents.
CACHING_FORBIDDEN_NONE - Static variable in interface org.apache.nutch.metadata.Nutch
Show both original forbidden content and summaries (default).
calculate(Content, Parse) - Method in class org.apache.nutch.crawl.MD5Signature
 
calculate(Content, Parse) - Method in class org.apache.nutch.crawl.Signature
 
calculate(Content, Parse) - Method in class org.apache.nutch.crawl.TextProfileSignature
 
CHAR_ENCODING_FOR_CONVERSION - Static variable in interface org.apache.nutch.metadata.Nutch
 
CHARACTER_COUNT - Static variable in interface org.apache.nutch.metadata.Office
 
charAt(int) - Method in class org.apache.hadoop.io.Text
Returns the Unicode Scalar Value (32-bit integer value) for the character at position.
CHECK_BLOCKING - Static variable in interface org.apache.nutch.protocol.Protocol
Property name.
CHECK_ROBOTS - Static variable in interface org.apache.nutch.protocol.Protocol
Property name.
children - Variable in class org.apache.nutch.util.TrieStringMatcher.TrieNode
 
childrenList - Variable in class org.apache.nutch.util.TrieStringMatcher.TrieNode
 
CircularDependencyException - Exception in org.apache.nutch.plugin
CircularDependencyException will be thrown if a circular dependency is detected.
CircularDependencyException(Throwable) - Constructor for exception org.apache.nutch.plugin.CircularDependencyException
 
CircularDependencyException(String) - Constructor for exception org.apache.nutch.plugin.CircularDependencyException
 
clean(String) - Static method in class org.apache.nutch.util.mime.MimeType
Cleans a content-type.
clear() - Method in class org.apache.nutch.crawl.MapWritable
 
clone() - Method in class org.apache.nutch.crawl.CrawlDatum
 
close() - Method in interface com.endeca.itl.web.process.EndecaRecordOutputter
 
com.endeca.itl.web.auth - package com.endeca.itl.web.auth
 
com.endeca.itl.web.process - package com.endeca.itl.web.process
 
COMMENTS - Static variable in interface org.apache.nutch.metadata.Office
 
compare(byte[], int, int, byte[], int, int) - Method in class org.apache.hadoop.io.BooleanWritable.Comparator
 
compare(byte[], int, int, byte[], int, int) - Method in class org.apache.hadoop.io.BytesWritable.Comparator
Compare the buffers in serialized form.
compare(byte[], int, int, byte[], int, int) - Method in class org.apache.hadoop.io.FloatWritable.Comparator
 
compare(byte[], int, int, byte[], int, int) - Method in class org.apache.hadoop.io.IntWritable.Comparator
 
compare(byte[], int, int, byte[], int, int) - Method in class org.apache.hadoop.io.LongWritable.Comparator
 
compare(WritableComparable, WritableComparable) - Method in class org.apache.hadoop.io.LongWritable.DecreasingComparator
 
compare(byte[], int, int, byte[], int, int) - Method in class org.apache.hadoop.io.LongWritable.DecreasingComparator
 
compare(byte[], int, int, byte[], int, int) - Method in class org.apache.hadoop.io.MD5Hash.Comparator
 
compare(byte[], int, int, byte[], int, int) - Method in class org.apache.hadoop.io.Text.Comparator
 
compare(byte[], int, int, byte[], int, int) - Method in class org.apache.hadoop.io.WritableComparator
Optimization hook.
compare(WritableComparable, WritableComparable) - Method in class org.apache.hadoop.io.WritableComparator
Compare two WritableComparables.
compare(Object, Object) - Method in class org.apache.hadoop.io.WritableComparator
 
compare(byte[], int, int, byte[], int, int) - Method in class org.apache.nutch.crawl.CrawlDatum.Comparator
 
compare(Object, Object) - Method in class org.apache.nutch.crawl.SignatureComparator
 
compareBytes(byte[], int, int, byte[], int, int) - Static method in class org.apache.hadoop.io.WritableComparator
Lexicographic order of binary data.
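A lexicographic comparison of binary data, as named in the compareBytes entry, can be sketched as follows. CompareBytesSketch is a hypothetical class for this example. The sketch assumes bytes are compared as unsigned values and that a shorter sequence sorts before a longer one with the same prefix; the index itself does not spell out those details.

```java
/** Hypothetical helper; illustrates lexicographic byte comparison. */
final class CompareBytesSketch {
    /**
     * Compares b1[s1 .. s1+l1) with b2[s2 .. s2+l2).
     * Returns a negative, zero, or positive value.
     */
    static int compareBytes(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        int n = Math.min(l1, l2);
        for (int i = 0; i < n; i++) {
            int a = b1[s1 + i] & 0xff; // treat as unsigned (assumption)
            int b = b2[s2 + i] & 0xff;
            if (a != b) {
                return a - b;
            }
        }
        return l1 - l2; // common prefix is equal: shorter sequence sorts first
    }

    public static void main(String[] args) {
        byte[] x = {1, 2, 3};
        byte[] y = {1, 2, 4};
        System.out.println(compareBytes(x, 0, 3, y, 0, 3) < 0);
    }
}
```

Comparing serialized bytes directly is what lets a comparator order records without deserializing them first.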
compareTo(Object) - Method in class org.apache.hadoop.fs.Path
 
compareTo(Object) - Method in class org.apache.hadoop.io.BooleanWritable
 
compareTo(Object) - Method in class org.apache.hadoop.io.BytesWritable
Define the sort order of the BytesWritable.
compareTo(Object) - Method in class org.apache.hadoop.io.FloatWritable
Compares two FloatWritables.
compareTo(Object) - Method in class org.apache.hadoop.io.IntWritable
Compares two IntWritables.
compareTo(Object) - Method in class org.apache.hadoop.io.LongWritable
Compares two LongWritables.
compareTo(Object) - Method in class org.apache.hadoop.io.MD5Hash
Compares this object with the specified object for order.
compareTo(Object) - Method in class org.apache.hadoop.io.Text
Compare two Texts bytewise using standard UTF8 ordering.
compareTo(Object) - Method in class org.apache.hadoop.io.VIntWritable
Compares two VIntWritables.
compareTo(Object) - Method in class org.apache.hadoop.io.VLongWritable
Compares two VLongWritables.
compareTo(Object) - Method in class org.apache.nutch.crawl.CrawlDatum
Sort by decreasing score.
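A compareTo that sorts by decreasing score can be sketched with a small value class. ScoredSketch and its fields are hypothetical and stand in for CrawlDatum only for illustration; any tie-breaking the real class may apply is not modelled here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for a scored item; not the Nutch CrawlDatum. */
final class ScoredSketch implements Comparable<ScoredSketch> {
    final String url;
    final float score;

    ScoredSketch(String url, float score) {
        this.url = url;
        this.score = score;
    }

    /** Reverses the natural float order so higher scores sort first. */
    @Override
    public int compareTo(ScoredSketch other) {
        return Float.compare(other.score, this.score);
    }

    /** Returns the URLs of the given items, highest score first. */
    static List<String> sortedUrls(List<ScoredSketch> items) {
        List<ScoredSketch> copy = new ArrayList<>(items);
        Collections.sort(copy);
        List<String> urls = new ArrayList<>();
        for (ScoredSketch s : copy) {
            urls.add(s.url);
        }
        return urls;
    }

    public static void main(String[] args) {
        List<ScoredSketch> items = new ArrayList<>();
        items.add(new ScoredSketch("a", 0.2f));
        items.add(new ScoredSketch("b", 0.9f));
        items.add(new ScoredSketch("c", 0.5f));
        System.out.println(sortedUrls(items)); // prints [b, c, a]
    }
}
```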
compareTo(Object) - Method in class org.apache.nutch.util.TrieStringMatcher.TrieNode
 
CompressedWritable - Class in org.apache.hadoop.io
A base-class for Writables which store themselves compressed and lazily inflate on field access.
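The idea of storing data compressed and inflating it lazily on field access can be sketched with the JDK's java.util.zip streams. LazyInflateSketch, its text field, and the inflateCount counter are hypothetical names made up for this example; the sketch does not reproduce the CompressedWritable API or its serialization format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

/** Hypothetical example of lazy decompression; not Hadoop's CompressedWritable. */
final class LazyInflateSketch {
    private byte[] compressed; // non-null until the first field access
    private String text;       // the "field"; valid only after inflation
    int inflateCount = 0;      // counts how often decompression actually ran

    LazyInflateSketch(String value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DeflaterOutputStream out = new DeflaterOutputStream(bytes)) {
            out.write(value.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        compressed = bytes.toByteArray();
    }

    /** Every accessor calls this first; it decompresses at most once. */
    private void ensureInflated() {
        if (compressed == null) {
            return; // already inflated
        }
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (InflaterInputStream in =
                 new InflaterInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buf = new byte[256];
            int n;
            while ((n = in.read(buf)) != -1) {
                bytes.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        text = new String(bytes.toByteArray(), StandardCharsets.UTF_8);
        compressed = null;
        inflateCount++;
    }

    String getText() {
        ensureInflated();
        return text;
    }

    public static void main(String[] args) {
        LazyInflateSketch s = new LazyInflateSketch("hello hello hello");
        System.out.println(s.getText());
    }
}
```

Objects that are only passed along and never read skip the decompression cost entirely, which is the point of deferring the work.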
CompressedWritable() - Constructor for class org.apache.hadoop.io.CompressedWritable
 
conf - Variable in class org.apache.nutch.crawl.Signature
 
conf - Variable in class org.apache.nutch.plugin.Plugin
 
CONFIG_DEFAULT - Static variable in class org.apache.nutch.util.NutchConfiguration
 
CONFIG_SITE - Static variable in class org.apache.nutch.util.NutchConfiguration
 
Configurable - Interface in org.apache.hadoop.conf
Something that may be configured with a Configuration.
Configuration - Class in org.apache.hadoop.conf
Provides access to configuration parameters.
Configuration() - Constructor for class org.apache.hadoop.conf.Configuration
A new configuration.
Configuration(File) - Constructor for class org.apache.hadoop.conf.Configuration
 
Configuration(List<File>) - Constructor for class org.apache.hadoop.conf.Configuration
 
Configuration(Configuration) - Constructor for class org.apache.hadoop.conf.Configuration
A new configuration with the same settings cloned from another.
Configured - Class in org.apache.hadoop.conf
Base class for things that may be configured with a Configuration.
Configured(Configuration) - Constructor for class org.apache.hadoop.conf.Configured
Construct a Configured.
containsKey(Writable) - Method in class org.apache.nutch.crawl.MapWritable
 
containsValue(Writable) - Method in class org.apache.nutch.crawl.MapWritable
 
Content - Class in org.apache.nutch.protocol
 
Content() - Constructor for class org.apache.nutch.protocol.Content
 
Content(String, String, byte[], String, Metadata, Configuration) - Constructor for class org.apache.nutch.protocol.Content
 
CONTENT_DISPOSITION - Static variable in interface org.apache.nutch.metadata.HttpHeaders
 
CONTENT_ENCODING - Static variable in interface org.apache.nutch.metadata.HttpHeaders
 
CONTENT_LANGUAGE - Static variable in interface org.apache.nutch.metadata.HttpHeaders
 
CONTENT_LENGTH - Static variable in interface org.apache.nutch.metadata.HttpHeaders
 
CONTENT_LOCATION - Static variable in interface org.apache.nutch.metadata.HttpHeaders
 
CONTENT_MD5 - Static variable in interface org.apache.nutch.metadata.HttpHeaders
 
CONTENT_TYPE - Static variable in interface org.apache.nutch.metadata.HttpHeaders
 
CONTRIBUTOR - Static variable in interface org.apache.nutch.metadata.DublinCore
An entity responsible for making contributions to the content of the resource.
COVERAGE - Static variable in interface org.apache.nutch.metadata.DublinCore
The extent or scope of the content of the resource.
CrawlDatum - Class in org.apache.nutch.crawl
 
CrawlDatum() - Constructor for class org.apache.nutch.crawl.CrawlDatum
 
CrawlDatum(int, float) - Constructor for class org.apache.nutch.crawl.CrawlDatum
 
CrawlDatum(int, float, float) - Constructor for class org.apache.nutch.crawl.CrawlDatum
 
CrawlDatum.Comparator - Class in org.apache.nutch.crawl
A Comparator optimized for CrawlDatum.
CrawlDatum.Comparator() - Constructor for class org.apache.nutch.crawl.CrawlDatum.Comparator
 
create(File) - Static method in class org.apache.nutch.util.NutchConfiguration
 
create(List<File>) - Static method in class org.apache.nutch.util.NutchConfiguration
Creates a configuration based on a list of directories.
createMerged(List<File>) - Static method in class org.apache.nutch.util.NutchConfiguration
Creates a configuration based on a list of files/directories that are passed in.
CreativeCommons - Interface in org.apache.nutch.metadata
A collection of Creative Commons property names.
CREATOR - Static variable in interface org.apache.nutch.metadata.DublinCore
An entity primarily responsible for making the content of the resource.
CredentialsParseException - Exception in com.endeca.itl.web.auth
 
CredentialsParseException() - Constructor for exception com.endeca.itl.web.auth.CredentialsParseException
 
CredentialsParseException(String) - Constructor for exception com.endeca.itl.web.auth.CredentialsParseException
 
CredentialsParseException(Throwable) - Constructor for exception com.endeca.itl.web.auth.CredentialsParseException
 
CredentialsParseException(String, Throwable) - Constructor for exception com.endeca.itl.web.auth.CredentialsParseException
 

D

DataInputBuffer - Class in org.apache.hadoop.io
A reusable DataInput implementation that reads from an in-memory buffer.
DataInputBuffer() - Constructor for class org.apache.hadoop.io.DataInputBuffer
Constructs a new empty buffer.
DataOutputBuffer - Class in org.apache.hadoop.io
A reusable DataOutput implementation that writes to an in-memory buffer.
DataOutputBuffer() - Constructor for class org.apache.hadoop.io.DataOutputBuffer
Constructs a new empty buffer.
DATE - Static variable in interface org.apache.nutch.metadata.DublinCore
A date associated with an event in the life cycle of the resource.
decode(byte[]) - Static method in class org.apache.hadoop.io.Text
Converts the provided byte array to a String using the UTF-8 encoding.
decode(byte[], int, int) - Static method in class org.apache.hadoop.io.Text
 
decode(byte[], int, int, boolean) - Static method in class org.apache.hadoop.io.Text
Converts the provided byte array to a String using the UTF-8 encoding.
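The decode variants above convert a byte range to a String using UTF-8. The index does not say what the boolean parameter controls, so the sketch below assumes it selects between replacing malformed input and rejecting it. Utf8DecodeSketch is a hypothetical class built only on the JDK's CharsetDecoder, and wrapping the failure in IllegalArgumentException is likewise a choice made for this example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper; illustrates strict versus lenient UTF-8 decoding. */
final class Utf8DecodeSketch {
    /**
     * Decodes utf8[start .. start+length).
     * replace == true:  malformed bytes become U+FFFD (assumed meaning).
     * replace == false: malformed bytes cause an exception.
     */
    static String decode(byte[] utf8, int start, int length, boolean replace) {
        CodingErrorAction action =
            replace ? CodingErrorAction.REPLACE : CodingErrorAction.REPORT;
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
            .onMalformedInput(action)
            .onUnmappableCharacter(action);
        try {
            return decoder.decode(ByteBuffer.wrap(utf8, start, length)).toString();
        } catch (CharacterCodingException e) {
            throw new IllegalArgumentException("input is not valid UTF-8", e);
        }
    }

    public static void main(String[] args) {
        byte[] ok = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
        System.out.println(decode(ok, 0, ok.length, false));
    }
}
```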
DEFAULT - Static variable in class org.apache.nutch.util.mime.MimeTypes
The default application/octet-stream MimeType
DEFAULT_PLUGIN - Static variable in class org.apache.nutch.parse.ParserFactory
Wildcard for default plugins.
define(Class, WritableComparator) - Static method in class org.apache.hadoop.io.WritableComparator
Register an optimized comparator for a WritableComparable implementation.
depth() - Method in class org.apache.hadoop.fs.Path
Return the number of elements in this path.
DESCRIPTION - Static variable in interface org.apache.nutch.metadata.DublinCore
An account of the content of the resource.
digest(byte[]) - Static method in class org.apache.hadoop.io.MD5Hash
Construct a hash value for a byte array.
digest(byte[], int, int) - Static method in class org.apache.hadoop.io.MD5Hash
Construct a hash value for a byte array.
digest(String) - Static method in class org.apache.hadoop.io.MD5Hash
Construct a hash value for a String.
digest(Text) - Static method in class org.apache.hadoop.io.MD5Hash
Construct a hash value for a String.
DIR_NAME - Static variable in class org.apache.nutch.parse.ParseData
 
DIR_NAME - Static variable in class org.apache.nutch.parse.ParseText
 
DIR_NAME - Static variable in class org.apache.nutch.protocol.Content
 
displayByteArray(byte[]) - Static method in class org.apache.hadoop.io.WritableUtils
 
DublinCore - Interface in org.apache.nutch.metadata
A collection of Dublin Core metadata names.

E

EmptyRobotRules - Class in org.apache.nutch.protocol
 
EmptyRobotRules() - Constructor for class org.apache.nutch.protocol.EmptyRobotRules
 
encode(String) - Static method in class org.apache.hadoop.io.Text
Converts the provided String to bytes using the UTF-8 encoding.
encode(String, boolean) - Static method in class org.apache.hadoop.io.Text
Converts the provided String to bytes using the UTF-8 encoding.
EndecaRecordGenerator - Interface in com.endeca.itl.web.process
 
EndecaRecordOutputter - Interface in com.endeca.itl.web.process
 
ensureInflated() - Method in class org.apache.hadoop.io.CompressedWritable
Must be called by all methods which access fields to ensure that the data has been uncompressed.
ensureInflated() - Method in class org.apache.nutch.protocol.Content
 
entries() - Method in class org.apache.hadoop.conf.Configuration
 
equals(Object) - Method in class com.endeca.itl.web.auth.AuthenticatorConfiguration
 
equals(Object) - Method in class org.apache.hadoop.fs.Path
 
equals(Object) - Method in class org.apache.hadoop.io.BooleanWritable
 
equals(Object) - Method in class org.apache.hadoop.io.BytesWritable
Are the two byte sequences equal?
equals(Object) - Method in class org.apache.hadoop.io.FloatWritable
Returns true iff o is a FloatWritable with the same value.
equals(Object) - Method in class org.apache.hadoop.io.IntWritable
Returns true iff o is an IntWritable with the same value.
equals(Object) - Method in class org.apache.hadoop.io.LongWritable
Returns true iff o is a LongWritable with the same value.
equals(Object) - Method in class org.apache.hadoop.io.MD5Hash
Returns true iff o is an MD5Hash whose digest contains the same values.
equals(Object) - Method in class org.apache.hadoop.io.Text
Returns true iff o is a Text with the same contents.
equals(Object) - Method in class org.apache.hadoop.io.VIntWritable
Returns true iff o is a VIntWritable with the same value.
equals(Object) - Method in class org.apache.hadoop.io.VLongWritable
Returns true iff o is a VLongWritable with the same value.
equals(Object) - Method in class org.apache.nutch.crawl.CrawlDatum
 
equals(Object) - Method in class org.apache.nutch.crawl.MapWritable
 
equals(Object) - Method in class org.apache.nutch.fetcher.FetcherOutput
 
equals(Object) - Method in class org.apache.nutch.metadata.Metadata
 
equals(Object) - Method in class org.apache.nutch.parse.Outlink
 
equals(Object) - Method in class org.apache.nutch.parse.ParseData
 
equals(Object) - Method in class org.apache.nutch.parse.ParseStatus
 
equals(Object) - Method in class org.apache.nutch.parse.ParseText
 
equals(Object) - Method in class org.apache.nutch.protocol.Content
 
equals(Object) - Method in class org.apache.nutch.protocol.ProtocolStatus
 
equals(Object) - Method in class org.apache.nutch.util.mime.MimeType
Indicates if an object is equal to this mime-type.
EXCEPTION - Static variable in class org.apache.nutch.protocol.ProtocolStatus
Unspecified exception occurred.
Extension - Class in org.apache.nutch.plugin
An Extension is a kind of listener descriptor that will be installed on a concrete ExtensionPoint, which acts as a kind of publisher.
Extension(PluginDescriptor, String, String, String, Configuration, PluginRepository) - Constructor for class org.apache.nutch.plugin.Extension
 
ExtensionPoint - Class in org.apache.nutch.plugin
The ExtensionPoint provides meta information about an extension point.
ExtensionPoint(String, String, String) - Constructor for class org.apache.nutch.plugin.ExtensionPoint
Constructor

F

FAILED - Static variable in class org.apache.nutch.parse.ParseStatus
General failure.
FAILED - Static variable in class org.apache.nutch.protocol.ProtocolStatus
Content was not retrieved.
FAILED_EXCEPTION - Static variable in class org.apache.nutch.parse.ParseStatus
Parsing failed.
FAILED_INVALID_FORMAT - Static variable in class org.apache.nutch.parse.ParseStatus
Parsing failed.
FAILED_MISSING_CONTENT - Static variable in class org.apache.nutch.parse.ParseStatus
Parsing failed.
FAILED_MISSING_PARTS - Static variable in class org.apache.nutch.parse.ParseStatus
Parsing failed.
FAILED_TRUNCATED - Static variable in class org.apache.nutch.parse.ParseStatus
Parsing failed.
FETCH_DIR_NAME - Static variable in class org.apache.nutch.crawl.CrawlDatum
 
FetcherOutput - Class in org.apache.nutch.fetcher
 
FetcherOutput() - Constructor for class org.apache.nutch.fetcher.FetcherOutput
 
FetcherOutput(CrawlDatum, Content, ParseImpl) - Constructor for class org.apache.nutch.fetcher.FetcherOutput
 
filter(String) - Method in interface org.apache.nutch.net.URLFilter
 
filter(String, URL, URL) - Method in class org.apache.nutch.net.URLFilters
Run all defined filters.
filter(String, URL, URL) - Method in class org.apache.nutch.net.URLScopeFilter
 
filter(Content, Parse, HTMLMetaTags, DocumentFragment) - Method in interface org.apache.nutch.parse.HtmlParseFilter
Deprecated. Adds metadata or otherwise modifies a parse of HTML content, given the DOM tree of a page.
filter(Content, Parse, HTMLMetaTags, DocumentFragment) - Method in class org.apache.nutch.parse.HtmlParseFilters
Deprecated. Run all defined filters.
filter(Content, Parse) - Method in interface org.apache.nutch.parse.ParseFilter
Adds metadata or modifies the parse.
filter(Content, Parse) - Method in class org.apache.nutch.parse.ParseFilters
Run all defined filters.
finalize() - Method in class org.apache.nutch.plugin.Plugin
 
finalize() - Method in class org.apache.nutch.plugin.PluginRepository
 
find(String) - Method in class org.apache.hadoop.io.Text
 
find(String, int) - Method in class org.apache.hadoop.io.Text
Finds any occurrence of what in the backing buffer, starting at position start.
FloatWritable - Class in org.apache.hadoop.io
A WritableComparable for floats.
FloatWritable() - Constructor for class org.apache.hadoop.io.FloatWritable
 
FloatWritable(float) - Constructor for class org.apache.hadoop.io.FloatWritable
 
FloatWritable.Comparator - Class in org.apache.hadoop.io
A Comparator optimized for FloatWritable.
FloatWritable.Comparator() - Constructor for class org.apache.hadoop.io.FloatWritable.Comparator
 
forceInflate() - Method in class org.apache.nutch.protocol.Content
Calling this method forces the next access to any property (via getters and setters) to check whether the data really needs to be decompressed.
FORMAT - Static variable in interface org.apache.nutch.metadata.DublinCore
Typically, Format may include the media-type or dimensions of the resource.
format - Static variable in class org.apache.nutch.net.protocols.HttpDateFormat
 
formatPercent(double, int) - Static method in class org.apache.hadoop.util.StringUtils
Format a percentage for presentation to the user.
formatTimeDiff(long, long) - Static method in class org.apache.hadoop.util.StringUtils
Given a finish and start time in long milliseconds, returns a String in the format Xhrs, Ymins, Z sec, for the time difference between two times.
forName(String) - Method in class org.apache.nutch.util.mime.MimeTypes
Return a MimeType from its name.
fromHexString(String) - Static method in class org.apache.nutch.util.StringUtil
Convert a String containing consecutive (no inside whitespace) hexadecimal digits into a corresponding byte array.
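Parsing consecutive hexadecimal digits into bytes, as the fromHexString entry describes, can be sketched as below. HexParseSketch is a hypothetical class, and throwing IllegalArgumentException on odd-length or non-hex input is an assumption made for this example; the index does not say how the Nutch method reports bad input.

```java
/** Hypothetical helper; illustrates hex-to-byte parsing, not Nutch's StringUtil. */
final class HexParseSketch {
    /** Converts pairs of hex digits (no whitespace inside) into bytes. */
    static byte[] fromHexString(String text) {
        if (text.length() % 2 != 0) {
            throw new IllegalArgumentException("odd number of hex digits");
        }
        byte[] out = new byte[text.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(text.charAt(2 * i), 16);
            int lo = Character.digit(text.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("not a hex digit near index " + (2 * i));
            }
            out[i] = (byte) ((hi << 4) | lo); // high nibble first
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] bytes = fromHexString("007fff");
        System.out.println(bytes.length); // prints 3
    }
}
```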

G

generate(Content, Parse) - Method in interface com.endeca.itl.web.process.EndecaRecordGenerator
 
GENERATE_DIR_NAME - Static variable in class org.apache.nutch.crawl.CrawlDatum
 
GENERATE_TIME_KEY - Static variable in interface org.apache.nutch.metadata.Nutch
 
GenericWritable - Class in org.apache.hadoop.io
A wrapper for Writable instances.
GenericWritable() - Constructor for class org.apache.hadoop.io.GenericWritable
 
get(String, Object) - Method in class org.apache.hadoop.conf.Configuration
Returns the value of the name property.
get(String) - Method in class org.apache.hadoop.conf.Configuration
Returns the value of the name property, or null if no such property exists.
get(String, String) - Method in class org.apache.hadoop.conf.Configuration
Returns the value of the name property.
get() - Method in class org.apache.hadoop.io.ArrayWritable
 
get() - Method in class org.apache.hadoop.io.BooleanWritable
Returns the value of the BooleanWritable.
get() - Method in class org.apache.hadoop.io.BytesWritable
Get the data from the BytesWritable.
get() - Method in class org.apache.hadoop.io.FloatWritable
Return the value of this FloatWritable.
get() - Method in class org.apache.hadoop.io.GenericWritable
Return the wrapped instance.
get() - Method in class org.apache.hadoop.io.IntWritable
Return the value of this IntWritable.
get() - Method in class org.apache.hadoop.io.LongWritable
Return the value of this LongWritable.
get() - Static method in class org.apache.hadoop.io.NullWritable
Returns the single instance of this class.
get() - Method in class org.apache.hadoop.io.ObjectWritable
Return the instance, or null if none.
get() - Method in class org.apache.hadoop.io.TwoDArrayWritable
 
get() - Method in class org.apache.hadoop.io.VIntWritable
Return the value of this VIntWritable.
get() - Method in class org.apache.hadoop.io.VLongWritable
Return the value of this VLongWritable.
get(Class) - Static method in class org.apache.hadoop.io.WritableComparator
Get a comparator for a WritableComparable implementation.
get(Writable) - Method in class org.apache.nutch.crawl.MapWritable
 
get(String) - Method in class org.apache.nutch.metadata.Metadata
Get the value associated to a metadata name.
get(String) - Method in class org.apache.nutch.metadata.SpellCheckedMetadata
 
get(Configuration) - Static method in class org.apache.nutch.plugin.PluginRepository
 
get(String, Configuration) - Static method in class org.apache.nutch.util.mime.MimeTypes
Return a MimeTypes instance.
get(String, Log, Configuration) - Static method in class org.apache.nutch.util.mime.MimeTypes
Return a MimeTypes instance.
getActionUrl() - Method in class com.endeca.itl.web.auth.AuthenticatorConfiguration
 
getAnchor() - Method in class org.apache.nutch.parse.Outlink
 
getArgs() - Method in class org.apache.nutch.parse.ParseStatus
 
getArgs() - Method in class org.apache.nutch.protocol.ProtocolStatus
 
getAttribute(String) - Method in class org.apache.nutch.plugin.Extension
Returns an attribute value that is set up in the manifest file and defined by the extension point XML schema.
getAuthConf() - Method in interface com.endeca.itl.web.auth.Authenticator
 
getAuthenticator(String) - Method in class com.endeca.itl.web.auth.AuthenticatorManager
 
getBaseHref() - Method in class org.apache.nutch.parse.HTMLMetaTags
A convenience method.
getBaseUrl() - Method in class org.apache.nutch.protocol.Content
The base url for relative links contained in the content.
getBoolean(String, boolean) - Method in class org.apache.hadoop.conf.Configuration
Returns the value of the name property as a boolean.
getBytes() - Method in class org.apache.hadoop.io.Text
Returns the raw bytes.
getCapacity() - Method in class org.apache.hadoop.io.BytesWritable
Get the capacity, which is the maximum size that could be handled without resizing the backing storage.
getClass(String, Class) - Method in class org.apache.hadoop.conf.Configuration
Returns the value of the name property as a Class.
getClass(String, Class, Class) - Method in class org.apache.hadoop.conf.Configuration
Returns the value of the name property as a Class.
getClass(String, Configuration) - Static method in class org.apache.hadoop.io.WritableName
Return the class for a name.
getClassByName(String) - Method in class org.apache.hadoop.conf.Configuration
Load a class by name.
getClassLoader() - Method in class org.apache.hadoop.conf.Configuration
Get the class loader for this job.
getClassLoader() - Method in class org.apache.nutch.plugin.PluginDescriptor
Returns a cached classloader for a plugin.
getClassName() - Method in class com.endeca.itl.web.auth.AuthenticatorConfiguration
 
getClazz() - Method in class org.apache.nutch.plugin.Extension
Returns the full class name of the extension point implementation.
getCode() - Method in interface org.apache.nutch.net.protocols.Response
Returns the response code.
getCode() - Method in class org.apache.nutch.protocol.ProtocolStatus
 
getConf() - Method in interface org.apache.hadoop.conf.Configurable
Return the configuration used by this object.
getConf() - Method in class org.apache.hadoop.conf.Configured
 
getConf() - Method in class org.apache.hadoop.io.ObjectWritable
 
getConf() - Method in class org.apache.nutch.crawl.Signature
 
getConf() - Method in class org.apache.nutch.fetcher.FetcherOutput
 
getConf() - Method in class org.apache.nutch.parse.ParseData
 
getConf() - Method in class org.apache.nutch.parse.ParseImpl
 
getConfResourceAsInputStream(String) - Method in class org.apache.hadoop.conf.Configuration
Returns an input stream attached to the configuration resource with the given name.
getConfResourceAsReader(String) - Method in class org.apache.hadoop.conf.Configuration
Returns a reader attached to the configuration resource with the given name.
getContent() - Method in class org.apache.nutch.fetcher.FetcherOutput
 
getContent() - Method in interface org.apache.nutch.net.protocols.Response
Returns the full content of the response.
getContent() - Method in class org.apache.nutch.protocol.Content
The binary content retrieved.
getContent() - Method in class org.apache.nutch.protocol.ProtocolOutput
 
getContentMeta() - Method in class org.apache.nutch.parse.ParseData
The original Metadata retrieved from content
getContentType() - Method in exception org.apache.nutch.parse.ParserNotFound
 
getContentType() - Method in class org.apache.nutch.protocol.Content
The media type of the retrieved content.
getCrawlDatum() - Method in class org.apache.nutch.fetcher.FetcherOutput
 
getCrawlDelay() - Method in class org.apache.nutch.protocol.EmptyRobotRules
 
getCrawlDelay() - Method in interface org.apache.nutch.protocol.RobotRules
Get Crawl-Delay, in milliseconds.
getData() - Method in class org.apache.hadoop.io.DataOutputBuffer
Returns the current contents of the buffer.
getData() - Method in interface org.apache.nutch.parse.Parse
Other data extracted from the page.
getData() - Method in class org.apache.nutch.parse.ParseImpl
 
getDebugStream(Log) - Static method in class org.apache.nutch.util.LogUtil
 
getDeclaredClass() - Method in class org.apache.hadoop.io.ObjectWritable
Return the class this is meant to be.
getDependencies() - Method in class org.apache.nutch.plugin.PluginDescriptor
Returns an array of plugin ids.
getDescriptor() - Method in class org.apache.nutch.plugin.Extension
Returns the plugin descriptor.
getDescriptor() - Method in class org.apache.nutch.plugin.Plugin
Returns the plugin descriptor
getDigest() - Method in class org.apache.hadoop.io.MD5Hash
Returns the digest bytes.
getDOMRoot() - Method in class org.apache.nutch.parse.ParseData
Retrieve the DOM, if there is one.
getEmptyParse(Configuration) - Method in class org.apache.nutch.parse.ParseStatus
A convenience method.
getErrorStream(Log) - Static method in class org.apache.nutch.util.LogUtil
 
getExpireTime() - Method in class org.apache.nutch.protocol.EmptyRobotRules
 
getExpireTime() - Method in interface org.apache.nutch.protocol.RobotRules
Get expire time
getExportedLibUrls() - Method in class org.apache.nutch.plugin.PluginDescriptor
Returns an array of exported libraries as URLs.
getExtensionInstance() - Method in class org.apache.nutch.plugin.Extension
Return an instance of the extension implementation.
getExtensionList(Extension[]) - Method in class org.apache.nutch.parse.ParseFilters
 
getExtensionPoint(String) - Method in class org.apache.nutch.plugin.PluginRepository
Returns an extension point identified by an extension point id.
getExtensions(String) - Method in class org.apache.nutch.parse.ParserFactory
Finds the best-suited parse plugin for a given contentType.
getExtensions() - Method in class org.apache.nutch.plugin.ExtensionPoint
Returns an array of extensions that listen to this extension point.
getExtensions() - Method in class org.apache.nutch.plugin.PluginDescriptor
Returns an array of extensions.
getExtenstionPoints() - Method in class org.apache.nutch.plugin.PluginDescriptor
Returns an array of extension points.
getFactory(Class) - Static method in class org.apache.hadoop.io.WritableFactories
Define a factory for a class.
getFatalStream(Log) - Static method in class org.apache.nutch.util.LogUtil
 
getFetchInterval() - Method in class org.apache.nutch.crawl.CrawlDatum
 
getFetchTime() - Method in class org.apache.nutch.crawl.CrawlDatum
 
getFile(String, String) - Method in class org.apache.hadoop.conf.Configuration
Returns a local file name under a directory named in dirsProp with the given path.
getFloat(String, float) - Method in class org.apache.hadoop.conf.Configuration
Returns the value of the name property as a float.
getFormattedTimeWithDiff(DateFormat, long, long) - Static method in class org.apache.hadoop.util.StringUtils
Formats time in ms and appends difference (finishTime - startTime) as returned by formatTimeDiff().
getGeneralTags() - Method in class org.apache.nutch.parse.HTMLMetaTags
Returns all collected values of the general meta tags.
getHeader(String) - Method in interface org.apache.nutch.net.protocols.Response
Returns the value of a named header.
getHeaders() - Method in interface org.apache.nutch.net.protocols.Response
Returns all the headers.
getHttpEquivTags() - Method in class org.apache.nutch.parse.HTMLMetaTags
Returns all collected values of the "http-equiv" meta tags.
getId() - Method in class org.apache.nutch.plugin.Extension
Return the unique id of the extension.
getId() - Method in class org.apache.nutch.plugin.ExtensionPoint
Returns the unique id of the extension point.
getInfoStream(Log) - Static method in class org.apache.nutch.util.LogUtil
 
getInt(String, int) - Method in class org.apache.hadoop.conf.Configuration
Returns the value of the name property as an integer.
getKeyClass() - Method in class org.apache.hadoop.io.WritableComparator
Returns the WritableComparable implementation class.
getLastModified() - Method in class org.apache.nutch.protocol.ProtocolStatus
 
getLength() - Method in class org.apache.hadoop.io.DataInputBuffer
Returns the length of the input.
getLength() - Method in class org.apache.hadoop.io.DataOutputBuffer
Returns the length of the valid data currently in the buffer.
getLength() - Method in class org.apache.hadoop.io.Text
Returns the number of bytes in the byte array
getLocalPath(String, String) - Method in class org.apache.hadoop.conf.Configuration
Returns a local file under a directory named in dirsProp with the given path.
getLoginUrl() - Method in class com.endeca.itl.web.auth.AuthenticatorConfiguration
 
getLong(String, long) - Method in class org.apache.hadoop.conf.Configuration
Returns the value of the name property as a long.
getMajorCode() - Method in class org.apache.nutch.parse.ParseStatus
 
getMessage() - Method in class org.apache.nutch.parse.ParseStatus
A convenience method.
getMessage() - Method in class org.apache.nutch.protocol.ProtocolStatus
 
getMeta(String) - Method in class org.apache.nutch.metadata.MetaWrapper
Get metadata.
getMeta(String) - Method in class org.apache.nutch.parse.ParseData
Get a metadata single value.
getMetaData() - Method in class org.apache.nutch.crawl.CrawlDatum
Returns a MapWritable if it was set or read in readFields(DataInput); returns an empty map if the CrawlDatum was freshly created (lazily instantiated).
getMetadata() - Method in class org.apache.nutch.metadata.MetaWrapper
Get all metadata.
getMetadata() - Method in class org.apache.nutch.protocol.Content
Other protocol-specific data.
getMetaTag() - Method in class org.apache.nutch.parse.ParseData
Returns the HTML meta tags which are populated by parsing the meta tags in the head of an HTML document.
getMetaValues(String) - Method in class org.apache.nutch.metadata.MetaWrapper
Get multiple metadata.
getMethod() - Method in class com.endeca.itl.web.auth.AuthenticatorConfiguration
 
getMimeType(File) - Method in class org.apache.nutch.util.mime.MimeTypes
Find the Mime Content Type of a file.
getMimeType(URL) - Method in class org.apache.nutch.util.mime.MimeTypes
Find the Mime Content Type of a document from its URL.
getMimeType(String) - Method in class org.apache.nutch.util.mime.MimeTypes
Find the Mime Content Type of a document from its name.
getMimeType(byte[]) - Method in class org.apache.nutch.util.mime.MimeTypes
Find the Mime Content Type of a stream from its content.
getMimeType(String, byte[]) - Method in class org.apache.nutch.util.mime.MimeTypes
Find the Mime Content Type of a document from its name and its content.
getMinLength() - Method in class org.apache.nutch.util.mime.MimeTypes
Return the minimum length of data to provide to analyzing methods based on the document's content in order to check all the known MimeTypes.
getMinorCode() - Method in class org.apache.nutch.parse.ParseStatus
 
getMode(String) - Static method in class org.apache.nutch.net.URLScopeFilter
 
getModifiedTime() - Method in class org.apache.nutch.crawl.CrawlDatum
 
getName() - Method in class org.apache.hadoop.fs.Path
Returns the final component of this path.
getName(Class) - Static method in class org.apache.hadoop.io.WritableName
Return the name for a class.
getName() - Method in class org.apache.nutch.plugin.ExtensionPoint
Returns the name of the extension point.
getName() - Method in class org.apache.nutch.plugin.PluginDescriptor
Returns the name of the plugin.
getName() - Method in class org.apache.nutch.util.mime.MimeType
Return the name of this mime-type.
getNoCache() - Method in class org.apache.nutch.parse.HTMLMetaTags
A convenience method.
getNoFollow() - Method in class org.apache.nutch.parse.HTMLMetaTags
A convenience method.
getNoIndex() - Method in class org.apache.nutch.parse.HTMLMetaTags
A convenience method.
getNormalizedName(String) - Static method in class org.apache.nutch.metadata.SpellCheckedMetadata
Get the normalized name of metadata attribute name.
getNotExportedLibUrls() - Method in class org.apache.nutch.plugin.PluginDescriptor
Returns an array of libraries as URLs that are not exported by the plugin.
getObject(String) - Method in class org.apache.hadoop.conf.Configuration
Returns the value of the name property, or null if no such property exists.
getOutlinks(String, Configuration) - Static method in class org.apache.nutch.parse.OutlinkExtractor
Extracts Outlink from given plain text.
getOutlinks(String, String, Configuration) - Static method in class org.apache.nutch.parse.OutlinkExtractor
Extracts Outlink from given plain text and adds anchor to the extracted Outlinks
getOutlinks() - Method in class org.apache.nutch.parse.ParseData
The outlinks of the page.
getParameters() - Method in class com.endeca.itl.web.auth.AuthenticatorConfiguration
 
getParent() - Method in class org.apache.hadoop.fs.Path
Returns the parent of a path or null if at root.
getParse() - Method in class org.apache.nutch.fetcher.FetcherOutput
 
getParse(Content) - Method in interface org.apache.nutch.parse.Parser
Creates the parse for some content.
getParseMeta() - Method in class org.apache.nutch.parse.ParseData
Other content properties.
getParserById(String) - Method in class org.apache.nutch.parse.ParserFactory
Function returns a Parser instance with the specified extId, representing its extension ID.
getParsers(String, String) - Method in class org.apache.nutch.parse.ParserFactory
Function returns an array of Parsers for a given content type.
getPluginClass() - Method in class org.apache.nutch.plugin.PluginDescriptor
Returns the fully qualified name of the class which implements the abstract Plugin class.
getPluginDescriptor(String) - Method in class org.apache.nutch.plugin.PluginRepository
Returns the descriptor of one plugin identified by a plugin id.
getPluginDescriptors() - Method in class org.apache.nutch.plugin.PluginRepository
Returns all registered plugin descriptors.
getPluginFolder(String) - Method in class org.apache.nutch.plugin.PluginManifestParser
Return the named plugin folder.
getPluginId() - Method in class org.apache.nutch.plugin.PluginDescriptor
Returns the unique identifier of the plug-in or null.
getPluginInstance(PluginDescriptor) - Method in class org.apache.nutch.plugin.PluginRepository
Returns an instance of a plugin.
getPluginPath() - Method in class org.apache.nutch.plugin.PluginDescriptor
Returns the directory path of the plugin.
getPosition() - Method in class org.apache.hadoop.io.DataInputBuffer
Returns the current position in the input.
getPrimaryType() - Method in class org.apache.nutch.util.mime.MimeType
Return the primary type of this mime-type.
getProperties() - Method in class com.endeca.itl.web.auth.AuthenticatorConfiguration
 
getProtocol(String) - Method in class org.apache.nutch.protocol.ProtocolFactory
Returns the appropriate Protocol implementation for a url.
getProtocolOutput(Text, CrawlDatum) - Method in interface org.apache.nutch.protocol.Protocol
Returns the Content for a fetchlist entry.
getProviderName() - Method in class org.apache.nutch.plugin.PluginDescriptor
 
getRefresh() - Method in class org.apache.nutch.parse.HTMLMetaTags
A convenience method.
getRefreshHref() - Method in class org.apache.nutch.parse.HTMLMetaTags
A convenience method.
getRefreshTime() - Method in class org.apache.nutch.parse.HTMLMetaTags
A convenience method.
getResource(String) - Method in class org.apache.hadoop.conf.Configuration
Returns the URL for the named resource.
getResourceString(String, Locale) - Method in class org.apache.nutch.plugin.PluginDescriptor
Returns an I18N'd resource string.
getResponseCode() - Method in class org.apache.nutch.crawl.CrawlDatum
 
getResponseCode() - Method in class org.apache.nutch.protocol.ProtocolStatus
 
getRetriesSinceFetch() - Method in class org.apache.nutch.crawl.CrawlDatum
 
getRobotRules(Text, CrawlDatum) - Method in interface org.apache.nutch.protocol.Protocol
Retrieve robot rules applicable for this url.
getRobotsDelay() - Method in class org.apache.nutch.crawl.CrawlDatum
 
getSchema() - Method in class org.apache.nutch.plugin.ExtensionPoint
Returns a path to the XML schema of an extension point.
getScore() - Method in class org.apache.nutch.crawl.CrawlDatum
 
getSignature() - Method in class org.apache.nutch.crawl.CrawlDatum
 
getSignature(Configuration) - Static method in class org.apache.nutch.crawl.SignatureFactory
Return the default Signature implementation.
getSite() - Method in class com.endeca.itl.web.auth.AuthenticatorConfiguration
 
getSize() - Method in class org.apache.hadoop.io.BytesWritable
Get the current size of the buffer.
getStatus() - Method in class org.apache.nutch.crawl.CrawlDatum
 
getStatus() - Method in class org.apache.nutch.parse.ParseData
The status of parsing the page.
getStatus() - Method in class org.apache.nutch.protocol.ProtocolOutput
 
getStatusName(byte) - Static method in class org.apache.nutch.crawl.CrawlDatum
 
getStrings(String) - Method in class org.apache.hadoop.conf.Configuration
Returns the value of the name property as an array of strings.
getStrings(String) - Static method in class org.apache.hadoop.util.StringUtils
Returns an ArrayList of strings.
getSubType() - Method in class org.apache.nutch.util.mime.MimeType
Return the sub type of this mime-type.
getTargetPoint() - Method in class org.apache.nutch.plugin.Extension
Returns the id of the extension point that is implemented by this extension.
getText() - Method in interface org.apache.nutch.parse.Parse
The textual content of the page.
getText() - Method in class org.apache.nutch.parse.ParseImpl
 
getText() - Method in class org.apache.nutch.parse.ParseText
 
getTitle() - Method in class org.apache.nutch.parse.ParseData
The title of the page.
getToUrl() - Method in class org.apache.nutch.parse.Outlink
 
getTraceStream(Log) - Static method in class org.apache.nutch.util.LogUtil
 
getTypes() - Method in class org.apache.hadoop.io.GenericWritable
Return all classes that may be wrapped.
getUrl() - Method in interface org.apache.nutch.net.protocols.Response
Returns the URL used to retrieve this response.
getUrl() - Method in exception org.apache.nutch.parse.ParserNotFound
 
getUrl() - Method in class org.apache.nutch.protocol.Content
The url fetched.
getUrl() - Method in exception org.apache.nutch.protocol.ProtocolNotFound
 
getValueClass() - Method in class org.apache.hadoop.io.ArrayWritable
 
getValues(String) - Method in class org.apache.nutch.metadata.Metadata
Get the values associated to a metadata name.
getValues(String) - Method in class org.apache.nutch.metadata.SpellCheckedMetadata
 
getVersion() - Method in class org.apache.hadoop.io.VersionedWritable
Return the version number of the current implementation.
getVersion() - Method in class org.apache.nutch.parse.ParseData
 
getVersion() - Method in class org.apache.nutch.parse.ParseStatus
 
getVersion() - Method in class org.apache.nutch.parse.ParseText
 
getVersion() - Method in class org.apache.nutch.plugin.PluginDescriptor
 
getVersion() - Method in class org.apache.nutch.protocol.ProtocolStatus
 
getVIntSize(long) - Static method in class org.apache.hadoop.io.WritableUtils
Get the encoded length if an integer is stored in a variable-length format
getWarnStream(Log) - Static method in class org.apache.nutch.util.LogUtil
 
GONE - Static variable in class org.apache.nutch.protocol.ProtocolStatus
Resource is gone.
GZIPUtils - Class in org.apache.nutch.util
A collection of utility methods for working on GZIPed data.
GZIPUtils() - Constructor for class org.apache.nutch.util.GZIPUtils
 

H

halfDigest() - Method in class org.apache.hadoop.io.MD5Hash
Construct a half-sized version of this MD5.
hasDbStatus(CrawlDatum) - Static method in class org.apache.nutch.crawl.CrawlDatum
 
hasFetchStatus(CrawlDatum) - Static method in class org.apache.nutch.crawl.CrawlDatum
 
hashBytes(byte[], int) - Static method in class org.apache.hadoop.io.WritableComparator
Compute hash for binary data.
hashCode() - Method in class com.endeca.itl.web.auth.AuthenticatorConfiguration
 
hashCode() - Method in class org.apache.hadoop.fs.Path
 
hashCode() - Method in class org.apache.hadoop.io.BooleanWritable
 
hashCode() - Method in class org.apache.hadoop.io.BytesWritable
 
hashCode() - Method in class org.apache.hadoop.io.FloatWritable
 
hashCode() - Method in class org.apache.hadoop.io.IntWritable
 
hashCode() - Method in class org.apache.hadoop.io.LongWritable
 
hashCode() - Method in class org.apache.hadoop.io.MD5Hash
Returns a hash code value for this object.
hashCode() - Method in class org.apache.hadoop.io.Text
hash function
hashCode() - Method in class org.apache.hadoop.io.VIntWritable
 
hashCode() - Method in class org.apache.hadoop.io.VLongWritable
 
hashCode() - Method in class org.apache.nutch.crawl.CrawlDatum
 
hashCode() - Method in class org.apache.nutch.crawl.MapWritable
 
hashCode() - Method in class org.apache.nutch.fetcher.FetcherOutput
 
hashCode() - Method in class org.apache.nutch.metadata.Metadata
 
hashCode() - Method in class org.apache.nutch.parse.Outlink
 
hashCode() - Method in class org.apache.nutch.parse.ParseData
 
hashCode() - Method in class org.apache.nutch.parse.ParseStatus
 
hashCode() - Method in class org.apache.nutch.parse.ParseText
 
hashCode() - Method in class org.apache.nutch.protocol.Content
 
hashCode() - Method in class org.apache.nutch.protocol.ProtocolStatus
 
hashCode() - Method in class org.apache.nutch.util.mime.MimeType
 
hasMagic() - Method in class org.apache.nutch.util.mime.MimeType
 
hexStringToByte(String) - Static method in class org.apache.hadoop.util.StringUtils
Given a hexstring this will return the byte array corresponding to the string
HTMLMetaTags - Class in org.apache.nutch.parse
This class holds the information about HTML "meta" tags extracted from a page.
HTMLMetaTags() - Constructor for class org.apache.nutch.parse.HTMLMetaTags
 
HtmlParseFilter - Interface in org.apache.nutch.parse
Deprecated. Extension point for DOM-based HTML parsers. Permits one to add additional metadata to HTML parses. All plugins found which implement this extension point are run sequentially on the parse.
HtmlParseFilters - Class in org.apache.nutch.parse
Deprecated. Creates and caches HtmlParseFilter implementing plugins.
HtmlParseFilters(Configuration) - Constructor for class org.apache.nutch.parse.HtmlParseFilters
Deprecated.  
HttpDateFormat - Class in org.apache.nutch.net.protocols
Class to handle HTTP dates.
HttpDateFormat() - Constructor for class org.apache.nutch.net.protocols.HttpDateFormat
 
HttpHeaders - Interface in org.apache.nutch.metadata
A collection of HTTP header names.
humanReadableInt(long) - Static method in class org.apache.hadoop.util.StringUtils
Given an integer, return a string that is in an approximate, but human readable format.

I

IDENTIFIER - Static variable in interface org.apache.nutch.metadata.DublinCore
Recommended best practice is to identify the resource by means of a string or number conforming to a formal identification system.
IntWritable - Class in org.apache.hadoop.io
A WritableComparable for ints.
IntWritable() - Constructor for class org.apache.hadoop.io.IntWritable
 
IntWritable(int) - Constructor for class org.apache.hadoop.io.IntWritable
 
IntWritable.Comparator - Class in org.apache.hadoop.io
A Comparator optimized for IntWritable.
IntWritable.Comparator() - Constructor for class org.apache.hadoop.io.IntWritable.Comparator
 
isAbsolute() - Method in class org.apache.hadoop.fs.Path
True if the directory of this path is absolute.
isAllowed(URL) - Method in class org.apache.nutch.protocol.EmptyRobotRules
 
isAllowed(URL) - Method in interface org.apache.nutch.protocol.RobotRules
Returns false if the robots.txt file prohibits us from accessing the given url, or true otherwise.
isEmpty() - Method in class org.apache.nutch.crawl.MapWritable
 
isEmpty(String) - Static method in class org.apache.nutch.util.StringUtil
Checks if a string is empty (i.e., is null or empty).
isMultiValued(String) - Method in class org.apache.nutch.metadata.Metadata
Returns true if named value is multivalued.
isParserRegistered(String) - Method in class org.apache.nutch.parse.ParserFactory
 
isParserRegistered(String) - Method in class org.apache.nutch.parse.ParseUtil
 
isPermanentFailure() - Method in class org.apache.nutch.protocol.ProtocolStatus
 
isPreAuthenticate() - Method in class com.endeca.itl.web.auth.AuthenticatorConfiguration
 
isSuccess() - Method in class org.apache.nutch.parse.ParseStatus
A convenience method.
isSuccess() - Method in class org.apache.nutch.protocol.ProtocolStatus
 
isTransientFailure() - Method in class org.apache.nutch.protocol.ProtocolStatus
 

K

keySet() - Method in class org.apache.nutch.crawl.MapWritable
 
KEYWORDS - Static variable in interface org.apache.nutch.metadata.Office
 

L

LANGUAGE - Static variable in interface org.apache.nutch.metadata.DublinCore
A language of the intellectual content of the resource.
LAST_AUTHOR - Static variable in interface org.apache.nutch.metadata.Office
 
LAST_MODIFIED - Static variable in interface org.apache.nutch.metadata.HttpHeaders
 
LAST_PRINTED - Static variable in interface org.apache.nutch.metadata.Office
 
LAST_SAVED - Static variable in interface org.apache.nutch.metadata.Office
 
leftPad(String, int) - Static method in class org.apache.nutch.util.StringUtil
Returns a copy of s padded with leading spaces so that its length is length.
LICENSE_LOCATION - Static variable in interface org.apache.nutch.metadata.CreativeCommons
 
LICENSE_URL - Static variable in interface org.apache.nutch.metadata.CreativeCommons
 
listDefaultAndFinalResources() - Method in class org.apache.hadoop.conf.Configuration
 
LOCATION - Static variable in interface org.apache.nutch.metadata.HttpHeaders
 
LOG - Static variable in class org.apache.nutch.crawl.MapWritable
 
LOG - Static variable in class org.apache.nutch.net.URLNormalizers
 
LOG - Static variable in class org.apache.nutch.parse.ParserFactory
 
LOG - Static variable in class org.apache.nutch.plugin.PluginDescriptor
 
LOG - Static variable in class org.apache.nutch.plugin.PluginManifestParser
 
LOG - Static variable in class org.apache.nutch.plugin.PluginRepository
 
LOG - Static variable in class org.apache.nutch.protocol.ProtocolFactory
 
LogUtil - Class in org.apache.nutch.util
Utility class for logging.
LogUtil() - Constructor for class org.apache.nutch.util.LogUtil
 
longestMatch(String) - Method in class org.apache.nutch.util.PrefixStringMatcher
Returns the longest prefix of input that is matched, or null if no match exists.
longestMatch(String) - Method in class org.apache.nutch.util.SuffixStringMatcher
Returns the longest suffix of input that is matched, or null if no match exists.
longestMatch(String) - Method in class org.apache.nutch.util.TrieStringMatcher
Returns the longest substring of input that is matched by a pattern in the trie, or null if no match exists.
LongWritable - Class in org.apache.hadoop.io
A WritableComparable for longs.
LongWritable() - Constructor for class org.apache.hadoop.io.LongWritable
 
LongWritable(long) - Constructor for class org.apache.hadoop.io.LongWritable
 
LongWritable.Comparator - Class in org.apache.hadoop.io
A Comparator optimized for LongWritable.
LongWritable.Comparator() - Constructor for class org.apache.hadoop.io.LongWritable.Comparator
 
LongWritable.DecreasingComparator - Class in org.apache.hadoop.io
A decreasing Comparator optimized for LongWritable.
LongWritable.DecreasingComparator() - Constructor for class org.apache.hadoop.io.LongWritable.DecreasingComparator
 

M

main(String[]) - Static method in class org.apache.hadoop.conf.Configuration
For debugging.
main(String[]) - Static method in class org.apache.nutch.net.protocols.HttpDateFormat
 
main(String[]) - Static method in class org.apache.nutch.util.PrefixStringMatcher
 
main(String[]) - Static method in class org.apache.nutch.util.StringUtil
 
main(String[]) - Static method in class org.apache.nutch.util.SuffixStringMatcher
 
majorCodes - Static variable in class org.apache.nutch.parse.ParseStatus
 
MapWritable - Class in org.apache.nutch.crawl
A writable map, with a similar behavior as java.util.HashMap.
MapWritable() - Constructor for class org.apache.nutch.crawl.MapWritable
 
MapWritable(MapWritable) - Constructor for class org.apache.nutch.crawl.MapWritable
Copy constructor.
matchChar(TrieStringMatcher.TrieNode, String, int) - Method in class org.apache.nutch.util.TrieStringMatcher
Returns the next TrieStringMatcher.TrieNode visited, given that you are at node, and the next character in the input is the idx'th character of s.
matches(String) - Method in class org.apache.nutch.util.mime.MimeType
 
matches(byte[]) - Method in class org.apache.nutch.util.mime.MimeType
 
matches(String) - Method in class org.apache.nutch.util.PrefixStringMatcher
Returns true if the given String is matched by a prefix in the trie
matches(String) - Method in class org.apache.nutch.util.SuffixStringMatcher
Returns true if the given String is matched by a suffix in the trie
matches(String) - Method in class org.apache.nutch.util.TrieStringMatcher
Returns true if the given String is matched by a pattern in the trie
MD5_LEN - Static variable in class org.apache.hadoop.io.MD5Hash
 
MD5Hash - Class in org.apache.hadoop.io
A Writable for MD5 hash values.
MD5Hash() - Constructor for class org.apache.hadoop.io.MD5Hash
Constructs an MD5Hash.
MD5Hash(String) - Constructor for class org.apache.hadoop.io.MD5Hash
Constructs an MD5Hash from a hex string.
MD5Hash(byte[]) - Constructor for class org.apache.hadoop.io.MD5Hash
Constructs an MD5Hash with a specified value.
MD5Hash.Comparator - Class in org.apache.hadoop.io
A WritableComparator optimized for MD5Hash keys.
MD5Hash.Comparator() - Constructor for class org.apache.hadoop.io.MD5Hash.Comparator
 
MD5Signature - Class in org.apache.nutch.crawl
Default implementation of a page signature.
MD5Signature() - Constructor for class org.apache.nutch.crawl.MD5Signature
 
Metadata - Class in org.apache.nutch.metadata
A multi-valued metadata container.
Metadata() - Constructor for class org.apache.nutch.metadata.Metadata
Constructs a new, empty metadata.
MetaWrapper - Class in org.apache.nutch.metadata
This is a simple decorator that adds metadata to any Object-s that can be serialized by ObjectWritable.
MetaWrapper() - Constructor for class org.apache.nutch.metadata.MetaWrapper
 
MetaWrapper(Object, Configuration) - Constructor for class org.apache.nutch.metadata.MetaWrapper
 
MetaWrapper(Metadata, Object, Configuration) - Constructor for class org.apache.nutch.metadata.MetaWrapper
 
MimeType - Class in org.apache.nutch.util.mime
Defines a Mime Content Type.
MimeType(String) - Constructor for class org.apache.nutch.util.mime.MimeType
Creates a MimeType from a String.
MimeType(String, String) - Constructor for class org.apache.nutch.util.mime.MimeType
Creates a MimeType with the given primary type and sub type.
MimeTypeException - Exception in org.apache.nutch.util.mime
A class to encapsulate MimeType related exceptions.
MimeTypeException() - Constructor for exception org.apache.nutch.util.mime.MimeTypeException
Constructs a MimeTypeException with no specified detail message.
MimeTypeException(String) - Constructor for exception org.apache.nutch.util.mime.MimeTypeException
Constructs a MimeTypeException with the specified detail message.
MimeTypes - Class in org.apache.nutch.util.mime
This class is a MimeType repository.
MissingDependencyException - Exception in org.apache.nutch.plugin
MissingDependencyException will be thrown if a plugin dependency cannot be found.
MissingDependencyException(Throwable) - Constructor for exception org.apache.nutch.plugin.MissingDependencyException
 
MissingDependencyException(String) - Constructor for exception org.apache.nutch.plugin.MissingDependencyException
 
mLogger - Static variable in class org.apache.nutch.parse.ParseFilters
 
mLogger - Static variable in class org.apache.nutch.parse.ParseUtil
 
MODIFIED - Static variable in interface org.apache.nutch.metadata.DublinCore
Date on which the resource was changed.
MOVED - Static variable in class org.apache.nutch.protocol.ProtocolStatus
Resource has moved permanently.

N

names() - Method in class org.apache.nutch.metadata.Metadata
Returns an array of the names contained in the metadata.
newInstance(Class, Configuration) - Static method in class org.apache.hadoop.io.WritableFactories
Create a new instance of a class with a defined factory.
newInstance(Class) - Static method in class org.apache.hadoop.io.WritableFactories
Create a new instance of a class with a defined factory.
newInstance() - Method in interface org.apache.hadoop.io.WritableFactory
Return a new instance.
newInstance(Class, Configuration) - Static method in class org.apache.hadoop.util.ReflectionUtils
Create an object for the given class and initialize it from conf
newKey() - Method in class org.apache.hadoop.io.WritableComparator
Construct a new WritableComparable instance.
nodeChar - Variable in class org.apache.nutch.util.TrieStringMatcher.TrieNode
 
normalize(String, String) - Method in interface org.apache.nutch.net.URLNormalizer
 
normalize(String, String) - Method in class org.apache.nutch.net.URLNormalizers
Normalize
NOTFETCHING - Static variable in class org.apache.nutch.protocol.ProtocolStatus
Not fetching.
NOTFOUND - Static variable in class org.apache.nutch.protocol.ProtocolStatus
Resource was not found.
NOTMODIFIED - Static variable in class org.apache.nutch.protocol.ProtocolStatus
Unchanged since the last fetch.
NOTPARSED - Static variable in class org.apache.nutch.parse.ParseStatus
Parsing was not performed.
NullWritable - Class in org.apache.hadoop.io
Singleton Writable with no data.
Nutch - Interface in org.apache.nutch.metadata
A collection of Nutch internal metadata constants.
NutchConfiguration - Class in org.apache.nutch.util
Utility to create Hadoop Configurations that include Nutch-specific resources.

O

ObjectWritable - Class in org.apache.hadoop.io
A polymorphic Writable that writes an instance with its class name.
ObjectWritable() - Constructor for class org.apache.hadoop.io.ObjectWritable
 
ObjectWritable(Object) - Constructor for class org.apache.hadoop.io.ObjectWritable
 
ObjectWritable(Class, Object) - Constructor for class org.apache.hadoop.io.ObjectWritable
 
Office - Interface in org.apache.nutch.metadata
A collection of "Office" document property names.
org.apache.hadoop.conf - package org.apache.hadoop.conf
Configuration of system parameters.
org.apache.hadoop.fs - package org.apache.hadoop.fs
 
org.apache.hadoop.io - package org.apache.hadoop.io
Generic i/o code for use when reading and writing data to the network, to databases, and to files.
org.apache.hadoop.util - package org.apache.hadoop.util
Common utilities.
org.apache.nutch.crawl - package org.apache.nutch.crawl
Crawl control code.
org.apache.nutch.fetcher - package org.apache.nutch.fetcher
The Nutch robot.
org.apache.nutch.metadata - package org.apache.nutch.metadata
A Multi-valued Metadata container, and set of constant fields for Nutch Metadata.
org.apache.nutch.net - package org.apache.nutch.net
 
org.apache.nutch.net.protocols - package org.apache.nutch.net.protocols
 
org.apache.nutch.parse - package org.apache.nutch.parse
 
org.apache.nutch.plugin - package org.apache.nutch.plugin
The Nutch Plugin System.
org.apache.nutch.protocol - package org.apache.nutch.protocol
 
org.apache.nutch.util - package org.apache.nutch.util
 
org.apache.nutch.util.mime - package org.apache.nutch.util.mime
 
ORIGINAL_CHAR_ENCODING - Static variable in interface org.apache.nutch.metadata.Nutch
 
Outlink - Class in org.apache.nutch.parse
 
Outlink() - Constructor for class org.apache.nutch.parse.Outlink
 
Outlink(String, String, Configuration) - Constructor for class org.apache.nutch.parse.Outlink
An outlink with a normalized URL.
Outlink(String, String, Configuration, boolean) - Constructor for class org.apache.nutch.parse.Outlink
An outlink.
OutlinkExtractor - Class in org.apache.nutch.parse
Extracts Outlinks / URLs from plain text using regular expressions.
OutlinkExtractor() - Constructor for class org.apache.nutch.parse.OutlinkExtractor
 
outputRecords(List<Record>) - Method in interface com.endeca.itl.web.process.EndecaRecordOutputter
 

P

PAGE_COUNT - Static variable in interface org.apache.nutch.metadata.Office
 
Parse - Interface in org.apache.nutch.parse
The result of parsing a page's raw content.
parse(Content) - Method in class org.apache.nutch.parse.ParseUtil
Performs a parse by iterating through a List of preferred Parsers until a successful parse is performed and a Parse object is returned.
PARSE_DIR_NAME - Static variable in class org.apache.nutch.crawl.CrawlDatum
 
parseByExtensionId(String, Content) - Method in class org.apache.nutch.parse.ParseUtil
Method parses a Content object using the Parser specified by the parameter extId, i.e., the Parser's extension ID.
parseCharacterEncoding(String) - Static method in class org.apache.nutch.util.StringUtil
Parse the character encoding from the specified content type header.
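As an illustration of the parseCharacterEncoding entry above, here is a minimal sketch of pulling a charset name out of a Content-Type header value. The class name CharsetSketch and its rules (case-insensitive match on "charset=", optional double quotes, value ends at a semicolon, quote, or whitespace) are assumptions made for this example; the index does not describe how StringUtil implements the method.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not the Nutch implementation.
final class CharsetSketch {
    // charset=VALUE or charset="VALUE", matched case-insensitively.
    private static final Pattern CHARSET =
        Pattern.compile("charset\\s*=\\s*\"?([^\";\\s]+)\"?", Pattern.CASE_INSENSITIVE);

    /** Returns the charset named in a Content-Type value, or null if none is present. */
    static String parseCharacterEncoding(String contentType) {
        if (contentType == null) {
            return null;
        }
        Matcher m = CHARSET.matcher(contentType);
        return m.find() ? m.group(1) : null;
    }
}
```

A call such as CharsetSketch.parseCharacterEncoding("text/html; charset=UTF-8") yields the charset name, while a header without a charset parameter yields null.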
ParseData - Class in org.apache.nutch.parse
Data extracted from a page's content.
ParseData() - Constructor for class org.apache.nutch.parse.ParseData
 
ParseData(ParseStatus, String, Outlink[], Metadata) - Constructor for class org.apache.nutch.parse.ParseData
 
ParseData(ParseStatus, String, Outlink[], Metadata, Metadata) - Constructor for class org.apache.nutch.parse.ParseData
 
ParseData(ParseStatus, String, Outlink[], Metadata, Metadata, DocumentFragment, HTMLMetaTags) - Constructor for class org.apache.nutch.parse.ParseData
 
ParseException - Exception in org.apache.nutch.parse
 
ParseException() - Constructor for exception org.apache.nutch.parse.ParseException
 
ParseException(String) - Constructor for exception org.apache.nutch.parse.ParseException
 
ParseException(String, Throwable) - Constructor for exception org.apache.nutch.parse.ParseException
 
ParseException(Throwable) - Constructor for exception org.apache.nutch.parse.ParseException
 
ParseFilter - Interface in org.apache.nutch.parse
 
PARSEFILTER_ORDER - Static variable in class org.apache.nutch.parse.HtmlParseFilters
Deprecated.  
ParseFilters - Class in org.apache.nutch.parse
 
ParseFilters(Configuration) - Constructor for class org.apache.nutch.parse.ParseFilters
 
ParseImpl - Class in org.apache.nutch.parse
The result of parsing a page's raw content.
ParseImpl() - Constructor for class org.apache.nutch.parse.ParseImpl
 
ParseImpl(Parse) - Constructor for class org.apache.nutch.parse.ParseImpl
 
ParseImpl(String, ParseData) - Constructor for class org.apache.nutch.parse.ParseImpl
 
ParseImpl(ParseText, ParseData) - Constructor for class org.apache.nutch.parse.ParseImpl
 
parsePluginFolder(String[]) - Method in class org.apache.nutch.plugin.PluginManifestParser
Returns a list of all found plugin descriptors.
Parser - Interface in org.apache.nutch.parse
A parser for content generated by a Protocol implementation.
ParserFactory - Class in org.apache.nutch.parse
Creates and caches Parser plugins.
ParserFactory(Configuration) - Constructor for class org.apache.nutch.parse.ParserFactory
 
ParserNotFound - Exception in org.apache.nutch.parse
 
ParserNotFound(String) - Constructor for exception org.apache.nutch.parse.ParserNotFound
 
ParserNotFound(String, String) - Constructor for exception org.apache.nutch.parse.ParserNotFound
 
ParserNotFound(String, String, String) - Constructor for exception org.apache.nutch.parse.ParserNotFound
 
ParseStatus - Class in org.apache.nutch.parse
 
ParseStatus() - Constructor for class org.apache.nutch.parse.ParseStatus
 
ParseStatus(int, int, String[]) - Constructor for class org.apache.nutch.parse.ParseStatus
 
ParseStatus(int) - Constructor for class org.apache.nutch.parse.ParseStatus
 
ParseStatus(int, String[]) - Constructor for class org.apache.nutch.parse.ParseStatus
 
ParseStatus(int, int) - Constructor for class org.apache.nutch.parse.ParseStatus
 
ParseStatus(int, int, String) - Constructor for class org.apache.nutch.parse.ParseStatus
Simplified constructor for passing just a text message.
ParseStatus(int, String) - Constructor for class org.apache.nutch.parse.ParseStatus
Simplified constructor for passing just a text message.
ParseStatus(Throwable) - Constructor for class org.apache.nutch.parse.ParseStatus
 
ParseText - Class in org.apache.nutch.parse
 
ParseText() - Constructor for class org.apache.nutch.parse.ParseText
 
ParseText(String) - Constructor for class org.apache.nutch.parse.ParseText
 
ParseUtil - Class in org.apache.nutch.parse
A utility class containing methods that simplify parsing, such as iterating through a preferred list of Parsers to obtain Parse objects.
ParseUtil(Configuration) - Constructor for class org.apache.nutch.parse.ParseUtil
 
Path - Class in org.apache.hadoop.fs
Names a file or directory in a FileSystem.
Path(String, String) - Constructor for class org.apache.hadoop.fs.Path
Resolve a child path against a parent path.
Path(Path, String) - Constructor for class org.apache.hadoop.fs.Path
Resolve a child path against a parent path.
Path(String, Path) - Constructor for class org.apache.hadoop.fs.Path
Resolve a child path against a parent path.
Path(Path, Path) - Constructor for class org.apache.hadoop.fs.Path
Resolve a child path against a parent path.
Path(String) - Constructor for class org.apache.hadoop.fs.Path
Construct a path from a String.
Path(String, String, String) - Constructor for class org.apache.hadoop.fs.Path
Construct a Path from components.
Pluggable - Interface in org.apache.nutch.plugin
Defines the capability of a class to be plugged into Nutch.
Plugin - Class in org.apache.nutch.plugin
A nutch-plugin is a container for a set of custom logic that provides extensions to the Nutch core functionality or to another plugin that provides an API for extending.
Plugin(PluginDescriptor, Configuration) - Constructor for class org.apache.nutch.plugin.Plugin
Constructor
PluginClassLoader - Class in org.apache.nutch.plugin
The PluginClassLoader contains only classes of the runtime libraries set up in the plugin manifest file and the exported libraries of required plugins.
PluginClassLoader(URL[], ClassLoader) - Constructor for class org.apache.nutch.plugin.PluginClassLoader
Constructor
PluginDescriptor - Class in org.apache.nutch.plugin
The PluginDescriptor provides access to all meta information of a nutch-plugin, as well as to the internationalizable resources and the plugin's own classloader.
PluginDescriptor(String, String, String, String, String, String, Configuration) - Constructor for class org.apache.nutch.plugin.PluginDescriptor
Constructor
PluginManifestParser - Class in org.apache.nutch.plugin
The PluginManifestParser just parses the manifest file in all plugin directories.
PluginManifestParser(Configuration, PluginRepository) - Constructor for class org.apache.nutch.plugin.PluginManifestParser
 
PluginRepository - Class in org.apache.nutch.plugin
The plugin repository is a registry of all plugins.
PluginRepository(Configuration) - Constructor for class org.apache.nutch.plugin.PluginRepository
 
PluginRuntimeException - Exception in org.apache.nutch.plugin
PluginRuntimeException will be thrown when an exception in the plugin management occurs.
PluginRuntimeException(Throwable) - Constructor for exception org.apache.nutch.plugin.PluginRuntimeException
 
PluginRuntimeException(String) - Constructor for exception org.apache.nutch.plugin.PluginRuntimeException
 
preCrawlAuthenticate(Protocol) - Method in interface com.endeca.itl.web.auth.Authenticator
Authenticates the crawler for a particular site before the crawl starts.
preCrawlAuthenticate(Protocol) - Method in class com.endeca.itl.web.auth.AuthenticatorManager
 
PrefixStringMatcher - Class in org.apache.nutch.util
A class for efficiently matching Strings against a set of prefixes.
PrefixStringMatcher(String[]) - Constructor for class org.apache.nutch.util.PrefixStringMatcher
Creates a new PrefixStringMatcher which will match Strings with any prefix in the supplied array.
PrefixStringMatcher(Collection) - Constructor for class org.apache.nutch.util.PrefixStringMatcher
Creates a new PrefixStringMatcher which will match Strings with any prefix in the supplied Collection.
PROTO_NOT_FOUND - Static variable in class org.apache.nutch.protocol.ProtocolStatus
This protocol was not found.
PROTO_STATUS_KEY - Static variable in interface org.apache.nutch.metadata.Nutch
 
Protocol - Interface in org.apache.nutch.protocol
A retriever of URL content.
ProtocolException - Exception in org.apache.nutch.net.protocols
Deprecated. Use org.apache.nutch.protocol.ProtocolException instead.
ProtocolException() - Constructor for exception org.apache.nutch.net.protocols.ProtocolException
Deprecated.  
ProtocolException(String) - Constructor for exception org.apache.nutch.net.protocols.ProtocolException
Deprecated.  
ProtocolException(String, Throwable) - Constructor for exception org.apache.nutch.net.protocols.ProtocolException
Deprecated.  
ProtocolException(Throwable) - Constructor for exception org.apache.nutch.net.protocols.ProtocolException
Deprecated.  
ProtocolException - Exception in org.apache.nutch.protocol
 
ProtocolException() - Constructor for exception org.apache.nutch.protocol.ProtocolException
 
ProtocolException(String) - Constructor for exception org.apache.nutch.protocol.ProtocolException
 
ProtocolException(String, Throwable) - Constructor for exception org.apache.nutch.protocol.ProtocolException
 
ProtocolException(Throwable) - Constructor for exception org.apache.nutch.protocol.ProtocolException
 
ProtocolFactory - Class in org.apache.nutch.protocol
Creates and caches Protocol plugins.
ProtocolFactory(Configuration) - Constructor for class org.apache.nutch.protocol.ProtocolFactory
 
ProtocolNotFound - Exception in org.apache.nutch.protocol
 
ProtocolNotFound(String) - Constructor for exception org.apache.nutch.protocol.ProtocolNotFound
 
ProtocolNotFound(String, String) - Constructor for exception org.apache.nutch.protocol.ProtocolNotFound
 
ProtocolNotFound(String, String, Throwable) - Constructor for exception org.apache.nutch.protocol.ProtocolNotFound
 
ProtocolOutput - Class in org.apache.nutch.protocol
Simple aggregate to pass from protocol plugins both content and protocol status.
ProtocolOutput(Content, ProtocolStatus) - Constructor for class org.apache.nutch.protocol.ProtocolOutput
 
ProtocolOutput(Content) - Constructor for class org.apache.nutch.protocol.ProtocolOutput
 
ProtocolStatus - Class in org.apache.nutch.protocol
 
ProtocolStatus() - Constructor for class org.apache.nutch.protocol.ProtocolStatus
 
ProtocolStatus(int, String[]) - Constructor for class org.apache.nutch.protocol.ProtocolStatus
 
ProtocolStatus(int, String[], int) - Constructor for class org.apache.nutch.protocol.ProtocolStatus
 
ProtocolStatus(int, String[], long) - Constructor for class org.apache.nutch.protocol.ProtocolStatus
 
ProtocolStatus(int, String[], long, int) - Constructor for class org.apache.nutch.protocol.ProtocolStatus
 
ProtocolStatus(int) - Constructor for class org.apache.nutch.protocol.ProtocolStatus
 
ProtocolStatus(int, int) - Constructor for class org.apache.nutch.protocol.ProtocolStatus
 
ProtocolStatus(int, long) - Constructor for class org.apache.nutch.protocol.ProtocolStatus
 
ProtocolStatus(int, long, int) - Constructor for class org.apache.nutch.protocol.ProtocolStatus
 
ProtocolStatus(int, Object) - Constructor for class org.apache.nutch.protocol.ProtocolStatus
 
ProtocolStatus(int, Object, int) - Constructor for class org.apache.nutch.protocol.ProtocolStatus
 
ProtocolStatus(int, Object, long) - Constructor for class org.apache.nutch.protocol.ProtocolStatus
 
ProtocolStatus(int, Object, long, int) - Constructor for class org.apache.nutch.protocol.ProtocolStatus
 
ProtocolStatus(Throwable) - Constructor for class org.apache.nutch.protocol.ProtocolStatus
 
ProtocolStatus(Throwable, int) - Constructor for class org.apache.nutch.protocol.ProtocolStatus
 
PUBLISHER - Static variable in interface org.apache.nutch.metadata.DublinCore
An entity responsible for making the resource available.
put(Writable, Writable) - Method in class org.apache.nutch.crawl.MapWritable
 
putAll(MapWritable) - Method in class org.apache.nutch.crawl.MapWritable
 

R

read(DataInput) - Static method in class org.apache.hadoop.io.MD5Hash
Constructs, reads and returns an instance.
read(DataInput) - Static method in class org.apache.nutch.crawl.CrawlDatum
 
read(DataInput) - Static method in class org.apache.nutch.parse.Outlink
 
read(DataInput) - Static method in class org.apache.nutch.parse.ParseData
 
read(DataInput, Configuration) - Static method in class org.apache.nutch.parse.ParseImpl
 
read(DataInput) - Static method in class org.apache.nutch.parse.ParseStatus
 
read(DataInput) - Static method in class org.apache.nutch.parse.ParseText
 
read(DataInput) - Static method in class org.apache.nutch.protocol.Content
 
read(DataInput) - Static method in class org.apache.nutch.protocol.ProtocolStatus
 
readCompressedByteArray(DataInput) - Static method in class org.apache.hadoop.io.WritableUtils
 
readCompressedString(DataInput) - Static method in class org.apache.hadoop.io.WritableUtils
 
readCompressedStringArray(DataInput) - Static method in class org.apache.hadoop.io.WritableUtils
 
readDouble(byte[], int) - Static method in class org.apache.hadoop.io.WritableComparator
Parse a double from a byte array.
readEnum(DataInput, Class<T>) - Static method in class org.apache.hadoop.io.WritableUtils
Read an Enum value from DataInput; Enums are read and written using String values.
readFields(DataInput) - Method in class org.apache.hadoop.io.ArrayWritable
 
readFields(DataInput) - Method in class org.apache.hadoop.io.BooleanWritable
 
readFields(DataInput) - Method in class org.apache.hadoop.io.BytesWritable
 
readFields(DataInput) - Method in class org.apache.hadoop.io.CompressedWritable
 
readFields(DataInput) - Method in class org.apache.hadoop.io.FloatWritable
 
readFields(DataInput) - Method in class org.apache.hadoop.io.GenericWritable
 
readFields(DataInput) - Method in class org.apache.hadoop.io.IntWritable
 
readFields(DataInput) - Method in class org.apache.hadoop.io.LongWritable
 
readFields(DataInput) - Method in class org.apache.hadoop.io.MD5Hash
 
readFields(DataInput) - Method in class org.apache.hadoop.io.NullWritable
 
readFields(DataInput) - Method in class org.apache.hadoop.io.ObjectWritable
 
readFields(DataInput) - Method in class org.apache.hadoop.io.Text
deserialize
readFields(DataInput) - Method in class org.apache.hadoop.io.TwoDArrayWritable
 
readFields(DataInput) - Method in class org.apache.hadoop.io.VersionedWritable
 
readFields(DataInput) - Method in class org.apache.hadoop.io.VIntWritable
 
readFields(DataInput) - Method in class org.apache.hadoop.io.VLongWritable
 
readFields(DataInput) - Method in interface org.apache.hadoop.io.Writable
Reads the fields of this object from in.
readFields(DataInput) - Method in class org.apache.nutch.crawl.CrawlDatum
 
readFields(DataInput) - Method in class org.apache.nutch.crawl.MapWritable
 
readFields(DataInput) - Method in class org.apache.nutch.fetcher.FetcherOutput
 
readFields(DataInput) - Method in class org.apache.nutch.metadata.Metadata
 
readFields(DataInput) - Method in class org.apache.nutch.metadata.MetaWrapper
 
readFields(DataInput) - Method in class org.apache.nutch.parse.Outlink
 
readFields(DataInput) - Method in class org.apache.nutch.parse.ParseData
 
readFields(DataInput) - Method in class org.apache.nutch.parse.ParseImpl
 
readFields(DataInput) - Method in class org.apache.nutch.parse.ParseStatus
 
readFields(DataInput) - Method in class org.apache.nutch.parse.ParseText
 
readFields(DataInput) - Method in class org.apache.nutch.protocol.ProtocolStatus
 
readFieldsCompressed(DataInput) - Method in class org.apache.hadoop.io.CompressedWritable
Subclasses implement this instead of CompressedWritable.readFields(DataInput).
readFieldsCompressed(DataInput) - Method in class org.apache.nutch.protocol.Content
 
readFloat(byte[], int) - Static method in class org.apache.hadoop.io.WritableComparator
Parse a float from a byte array.
readInt(byte[], int) - Static method in class org.apache.hadoop.io.WritableComparator
Parse an integer from a byte array.
readLong(byte[], int) - Static method in class org.apache.hadoop.io.WritableComparator
Parse a long from a byte array.
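The readFloat, readInt, and readLong entries above decode fixed-width values from a byte array at a given offset. The sketch below shows one way to do that, assuming big-endian (network) byte order, which is what java.io.DataOutput writes. The class ByteDecodeSketch is hypothetical; the byte order used by WritableComparator is not stated in this index.

```java
// Hypothetical helper illustrating big-endian decoding from a byte array.
final class ByteDecodeSketch {
    /** Decodes 4 bytes starting at 'start' as a big-endian int. */
    static int readInt(byte[] bytes, int start) {
        return ((bytes[start] & 0xFF) << 24)
             | ((bytes[start + 1] & 0xFF) << 16)
             | ((bytes[start + 2] & 0xFF) << 8)
             |  (bytes[start + 3] & 0xFF);
    }

    /** Decodes 8 bytes starting at 'start' as a big-endian long. */
    static long readLong(byte[] bytes, int start) {
        // The low word is masked so its sign bit does not smear into the high word.
        return ((long) readInt(bytes, start) << 32)
             | (readInt(bytes, start + 4) & 0xFFFFFFFFL);
    }

    /** Decodes 4 bytes starting at 'start' as an IEEE 754 float. */
    static float readFloat(byte[] bytes, int start) {
        return Float.intBitsToFloat(readInt(bytes, start));
    }
}
```

Working directly on the byte array avoids allocating a stream or deserializing a whole object, which is useful when only a key field needs to be compared.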
readObject(DataInput, Configuration) - Static method in class org.apache.hadoop.io.ObjectWritable
Read a Writable, String, primitive type, or an array of the preceding.
readObject(DataInput, ObjectWritable, Configuration) - Static method in class org.apache.hadoop.io.ObjectWritable
Read a Writable, String, primitive type, or an array of the preceding.
readString(DataInput) - Static method in class org.apache.hadoop.io.Text
Read a UTF-8 encoded string from in.
readString(DataInput) - Static method in class org.apache.hadoop.io.WritableUtils
 
readStringArray(DataInput) - Static method in class org.apache.hadoop.io.WritableUtils
 
readUnsignedShort(byte[], int) - Static method in class org.apache.hadoop.io.WritableComparator
Parse an unsigned short from a byte array.
readVInt(byte[], int) - Static method in class org.apache.hadoop.io.WritableComparator
Reads a zero-compressed encoded integer from a byte array and returns it.
readVInt(DataInput) - Static method in class org.apache.hadoop.io.WritableUtils
Reads a zero-compressed encoded integer from input stream and returns it.
readVLong(byte[], int) - Static method in class org.apache.hadoop.io.WritableComparator
Reads a zero-compressed encoded long from a byte array and returns it.
readVLong(DataInput) - Static method in class org.apache.hadoop.io.WritableUtils
Reads a zero-compressed encoded long from input stream and returns it.
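To make "zero-compressed" concrete, the sketch below stores small values in a single byte and larger values as a marker byte followed by only the significant bytes. The layout (one byte for -112..127, marker bytes -113..-120 for positive values and -121..-128 for negative ones) is an assumption for illustration, and VLongSketch is a hypothetical class; the index does not specify the format WritableUtils uses.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical sketch of a variable-length ("zero-compressed") long encoding.
final class VLongSketch {
    static void writeVLong(DataOutput out, long value) throws IOException {
        if (value >= -112 && value <= 127) {
            out.writeByte((int) value);        // small values fit in one byte
            return;
        }
        int marker = -112;                      // base marker for positive values
        if (value < 0) {
            value = ~value;                     // store the one's complement
            marker = -120;                      // base marker for negative values
        }
        int count = 0;                          // number of significant bytes
        for (long tmp = value; tmp != 0; tmp >>>= 8) {
            count++;
        }
        out.writeByte(marker - count);          // marker encodes sign and length
        for (int i = count - 1; i >= 0; i--) {
            out.writeByte((int) (value >>> (8 * i)) & 0xFF);
        }
    }

    static long readVLong(DataInput in) throws IOException {
        byte first = in.readByte();
        if (first >= -112) {
            return first;                       // single-byte value
        }
        boolean negative = first < -120;
        int count = negative ? -120 - first : -112 - first;
        long value = 0;
        for (int i = 0; i < count; i++) {
            value = (value << 8) | (in.readByte() & 0xFF);
        }
        return negative ? ~value : value;
    }

    /** Convenience wrapper: encodes one value to a byte array. */
    static byte[] encode(long value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try {
            writeVLong(new DataOutputStream(bytes), value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Convenience wrapper: decodes one value from a byte array. */
    static long decode(byte[] data) {
        try {
            return readVLong(new DataInputStream(new ByteArrayInputStream(data)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Under this layout, values near zero cost one byte instead of eight, at the price of one extra marker byte for values that need the full width.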
RecordOutputException - Exception in com.endeca.itl.web.process
 
RecordOutputException() - Constructor for exception com.endeca.itl.web.process.RecordOutputException
 
RecordOutputException(String) - Constructor for exception com.endeca.itl.web.process.RecordOutputException
 
RecordOutputException(String, Throwable) - Constructor for exception com.endeca.itl.web.process.RecordOutputException
 
REDIR_EXCEEDED - Static variable in class org.apache.nutch.protocol.ProtocolStatus
Too many redirects.
ReflectionUtils - Class in org.apache.hadoop.util
General reflection utils
ReflectionUtils() - Constructor for class org.apache.hadoop.util.ReflectionUtils
 
RELATION - Static variable in interface org.apache.nutch.metadata.DublinCore
A reference to a related resource.
remove(Writable) - Method in class org.apache.nutch.crawl.MapWritable
 
remove(String) - Method in class org.apache.nutch.metadata.Metadata
Remove a metadata and all its associated values.
remove(String) - Method in class org.apache.nutch.metadata.SpellCheckedMetadata
 
RequestFailedException - Exception in com.endeca.itl.web.auth
 
RequestFailedException() - Constructor for exception com.endeca.itl.web.auth.RequestFailedException
 
RequestFailedException(String) - Constructor for exception com.endeca.itl.web.auth.RequestFailedException
 
RequestFailedException(Throwable) - Constructor for exception com.endeca.itl.web.auth.RequestFailedException
 
RequestFailedException(String, Throwable) - Constructor for exception com.endeca.itl.web.auth.RequestFailedException
 
reset(byte[], int) - Method in class org.apache.hadoop.io.DataInputBuffer
Resets the data that the buffer reads.
reset(byte[], int, int) - Method in class org.apache.hadoop.io.DataInputBuffer
Resets the data that the buffer reads.
reset() - Method in class org.apache.hadoop.io.DataOutputBuffer
Resets the buffer to empty.
reset() - Method in class org.apache.nutch.parse.HTMLMetaTags
Sets all boolean values to false.
resolveEncodingAlias(String) - Static method in class org.apache.nutch.util.StringUtil
 
Response - Interface in org.apache.nutch.net.protocols
A response interface.
RETRY - Static variable in class org.apache.nutch.protocol.ProtocolStatus
Temporary failure.
REVISION_NUMBER - Static variable in interface org.apache.nutch.metadata.Office
 
rightPad(String, int) - Static method in class org.apache.nutch.util.StringUtil
Returns a copy of s padded with trailing spaces so that its length is length.
RIGHTS - Static variable in interface org.apache.nutch.metadata.DublinCore
Information about rights held in and over the resource.
RobotRules - Interface in org.apache.nutch.protocol
This class holds the rules which were parsed from a robots.txt file, and can test paths against those rules.
ROBOTS_DENIED - Static variable in class org.apache.nutch.protocol.ProtocolStatus
Access denied by robots.txt rules.
root - Variable in class org.apache.nutch.util.TrieStringMatcher
 
RULES - Static variable in class org.apache.nutch.protocol.EmptyRobotRules
 

S

SCOPE_CRAWLDB - Static variable in class org.apache.nutch.net.URLNormalizers
Scope used when updating the CrawlDb with new URLs.
SCOPE_DEFAULT - Static variable in class org.apache.nutch.net.URLNormalizers
Default scope.
SCOPE_FETCHER - Static variable in class org.apache.nutch.net.URLNormalizers
Scope used by org.apache.nutch.fetcher.Fetcher when processing redirect URLs.
SCOPE_GENERATE_HOST_COUNT - Static variable in class org.apache.nutch.net.URLNormalizers
Scope used by org.apache.nutch.crawl.Generator.
SCOPE_INJECT - Static variable in class org.apache.nutch.net.URLNormalizers
Scope used by org.apache.nutch.crawl.Injector.
SCOPE_LINKDB - Static variable in class org.apache.nutch.net.URLNormalizers
Scope used when updating the LinkDb with new URLs.
SCOPE_OUTLINK - Static variable in class org.apache.nutch.net.URLNormalizers
Scope used when constructing new Outlink instances.
SCOPE_PARTITION - Static variable in class org.apache.nutch.net.URLNormalizers
Scope used by org.apache.nutch.crawl.PartitionUrlByHost.
SCORE_KEY - Static variable in interface org.apache.nutch.metadata.Nutch
 
SEGMENT_NAME_KEY - Static variable in interface org.apache.nutch.metadata.Nutch
 
SEPARATOR - Static variable in class org.apache.hadoop.fs.Path
The directory separator, a slash.
SEPARATOR_CHAR - Static variable in class org.apache.hadoop.fs.Path
 
set(String, Object) - Method in class org.apache.hadoop.conf.Configuration
Sets the value of the name property.
set(Writable[]) - Method in class org.apache.hadoop.io.ArrayWritable
 
set(boolean) - Method in class org.apache.hadoop.io.BooleanWritable
Set the value of the BooleanWritable
set(BytesWritable) - Method in class org.apache.hadoop.io.BytesWritable
Set the BytesWritable to the contents of the given newData.
set(byte[], int, int) - Method in class org.apache.hadoop.io.BytesWritable
Set the value to a copy of the given byte range
set(float) - Method in class org.apache.hadoop.io.FloatWritable
Set the value of this FloatWritable.
set(Writable) - Method in class org.apache.hadoop.io.GenericWritable
Set the instance that is wrapped.
set(int) - Method in class org.apache.hadoop.io.IntWritable
Set the value of this IntWritable.
set(long) - Method in class org.apache.hadoop.io.LongWritable
Set the value of this LongWritable.
set(MD5Hash) - Method in class org.apache.hadoop.io.MD5Hash
Copy the contents of another instance into this instance.
set(Object) - Method in class org.apache.hadoop.io.ObjectWritable
Reset the instance.
set(String) - Method in class org.apache.hadoop.io.Text
Set to contain the contents of a string.
set(byte[]) - Method in class org.apache.hadoop.io.Text
Set to a UTF-8 byte array.
set(Text) - Method in class org.apache.hadoop.io.Text
Copy a text.
set(byte[], int, int) - Method in class org.apache.hadoop.io.Text
Set the Text to range of bytes
set(Writable[][]) - Method in class org.apache.hadoop.io.TwoDArrayWritable
 
set(int) - Method in class org.apache.hadoop.io.VIntWritable
Set the value of this VIntWritable.
set(long) - Method in class org.apache.hadoop.io.VLongWritable
Set the value of this VLongWritable.
set(CrawlDatum) - Method in class org.apache.nutch.crawl.CrawlDatum
Copy the contents of another instance into this instance.
set(String, String) - Method in class org.apache.nutch.metadata.Metadata
Set metadata name/value.
set(String, String) - Method in class org.apache.nutch.metadata.SpellCheckedMetadata
 
setActionUrl(String) - Method in class com.endeca.itl.web.auth.AuthenticatorConfiguration
 
setAll(Properties) - Method in class org.apache.nutch.metadata.Metadata
Copy all key-value pairs from properties.
setArgs(String[]) - Method in class org.apache.nutch.parse.ParseStatus
 
setArgs(String[]) - Method in class org.apache.nutch.protocol.ProtocolStatus
 
setAuthConf(AuthenticatorConfiguration) - Method in interface com.endeca.itl.web.auth.Authenticator
Initializes the Authenticator with the given configuration.
setBaseHref(URL) - Method in class org.apache.nutch.parse.HTMLMetaTags
Sets the baseHref.
setBoolean(String, boolean) - Method in class org.apache.hadoop.conf.Configuration
Sets the value of the name property to a boolean.
setCapacity(int) - Method in class org.apache.hadoop.io.BytesWritable
Change the capacity of the backing storage.
setClass(String, Class, Class) - Method in class org.apache.hadoop.conf.Configuration
Sets the value of the name property to the name of a class.
setClassLoader(ClassLoader) - Method in class org.apache.hadoop.conf.Configuration
Set the class loader that will be used to load the various objects.
setClassName(String) - Method in class com.endeca.itl.web.auth.AuthenticatorConfiguration
 
setClazz(String) - Method in class org.apache.nutch.plugin.Extension
Sets the class that implements the concrete extension and is only used until model creation at system start up.
setCode(int) - Method in class org.apache.nutch.protocol.ProtocolStatus
 
setConf(Configuration) - Method in interface org.apache.hadoop.conf.Configurable
Set the configuration to be used by this object.
setConf(Configuration) - Method in class org.apache.hadoop.conf.Configured
 
setConf(Configuration) - Method in class org.apache.hadoop.io.ObjectWritable
 
setConf(Object, Configuration) - Static method in class org.apache.hadoop.util.ReflectionUtils
Check and set 'configuration' if necessary.
setConf(Configuration) - Method in class org.apache.nutch.crawl.Signature
 
setConf(Configuration) - Method in class org.apache.nutch.fetcher.FetcherOutput
 
setConf(Configuration) - Method in class org.apache.nutch.parse.ParseData
 
setConf(Configuration) - Method in class org.apache.nutch.parse.ParseImpl
 
setContent(byte[]) - Method in class org.apache.nutch.protocol.Content
 
setContent(Content) - Method in class org.apache.nutch.protocol.ProtocolOutput
 
setContentType(String) - Method in class org.apache.nutch.protocol.Content
 
setDescriptor(PluginDescriptor) - Method in class org.apache.nutch.plugin.Extension
Sets the plugin descriptor and is only used until model creation at system start up.
setDigest(String) - Method in class org.apache.hadoop.io.MD5Hash
Sets the digest value from a hex string.
setDOMRoot(DocumentFragment) - Method in class org.apache.nutch.parse.ParseData
Set the DOM.
setFactory(Class, WritableFactory) - Static method in class org.apache.hadoop.io.WritableFactories
Define a factory for a class.
setFetchInterval(float) - Method in class org.apache.nutch.crawl.CrawlDatum
 
setFetchTime(long) - Method in class org.apache.nutch.crawl.CrawlDatum
 
setId(String) - Method in class org.apache.nutch.plugin.Extension
Sets the unique extension Id and is only used until model creation at system start up.
setInt(String, int) - Method in class org.apache.hadoop.conf.Configuration
Sets the value of the name property to an integer.
setLastModified(long) - Method in class org.apache.nutch.protocol.ProtocolStatus
 
setLoginUrl(String) - Method in class com.endeca.itl.web.auth.AuthenticatorConfiguration
 
setLong(String, long) - Method in class org.apache.hadoop.conf.Configuration
Sets the value of the name property to a long.
setMajorCode(byte) - Method in class org.apache.nutch.parse.ParseStatus
 
setMessage(String) - Method in class org.apache.nutch.parse.ParseStatus
 
setMessage(String) - Method in class org.apache.nutch.protocol.ProtocolStatus
 
setMeta(String, String) - Method in class org.apache.nutch.metadata.MetaWrapper
Set metadata.
setMetaData(MapWritable) - Method in class org.apache.nutch.crawl.CrawlDatum
 
setMetadata(Metadata) - Method in class org.apache.nutch.protocol.Content
Other protocol-specific data.
setMetaTag(HTMLMetaTags) - Method in class org.apache.nutch.parse.ParseData
 
setMethod(String) - Method in class com.endeca.itl.web.auth.AuthenticatorConfiguration
 
setMinorCode(short) - Method in class org.apache.nutch.parse.ParseStatus
 
setModifiedTime(long) - Method in class org.apache.nutch.crawl.CrawlDatum
 
setName(Class, String) - Static method in class org.apache.hadoop.io.WritableName
Set the name that a class should be known as to something other than the class name.
setNextFetchTime() - Method in class org.apache.nutch.crawl.CrawlDatum
 
setNoCache() - Method in class org.apache.nutch.parse.HTMLMetaTags
Sets noCache to true.
setNoFollow() - Method in class org.apache.nutch.parse.HTMLMetaTags
Sets noFollow to true.
setNoIndex() - Method in class org.apache.nutch.parse.HTMLMetaTags
Sets noIndex to true.
setObject(String, Object) - Method in class org.apache.hadoop.conf.Configuration
Sets the value of the name property.
setParameters(List<NameValuePair>) - Method in class com.endeca.itl.web.auth.AuthenticatorConfiguration
 
setParseMeta(Metadata) - Method in class org.apache.nutch.parse.ParseData
 
setPreAuthenticate(boolean) - Method in class com.endeca.itl.web.auth.AuthenticatorConfiguration
 
setProperties(Map<String, String>) - Method in class com.endeca.itl.web.auth.AuthenticatorConfiguration
 
setQuietMode(boolean) - Method in class org.apache.hadoop.conf.Configuration
Make this class quiet.
setRefresh(boolean) - Method in class org.apache.nutch.parse.HTMLMetaTags
Sets refresh to the supplied value.
setRefreshHref(URL) - Method in class org.apache.nutch.parse.HTMLMetaTags
Sets the refreshHref.
setRefreshTime(int) - Method in class org.apache.nutch.parse.HTMLMetaTags
Sets the refreshTime.
setResponseCode(int) - Method in class org.apache.nutch.crawl.CrawlDatum
 
setRetriesSinceFetch(int) - Method in class org.apache.nutch.crawl.CrawlDatum
 
setRobotsDelay(long) - Method in class org.apache.nutch.crawl.CrawlDatum
 
setScore(float) - Method in class org.apache.nutch.crawl.CrawlDatum
 
setSignature(byte[]) - Method in class org.apache.nutch.crawl.CrawlDatum
 
setSite(String) - Method in class com.endeca.itl.web.auth.AuthenticatorConfiguration
 
setSize(int) - Method in class org.apache.hadoop.io.BytesWritable
Change the size of the buffer.
setStatus(int) - Method in class org.apache.nutch.crawl.CrawlDatum
 
setStatus(ProtocolStatus) - Method in class org.apache.nutch.protocol.ProtocolOutput
 
setValueClass(Class) - Method in class org.apache.hadoop.io.ArrayWritable
 
shortestMatch(String) - Method in class org.apache.nutch.util.PrefixStringMatcher
Returns the shortest prefix of input that is matched, or null if no match exists.
shortestMatch(String) - Method in class org.apache.nutch.util.SuffixStringMatcher
Returns the shortest suffix of input that is matched, or null if no match exists.
shortestMatch(String) - Method in class org.apache.nutch.util.TrieStringMatcher
Returns the shortest substring of input that is matched by a pattern in the trie, or null if no match exists.
shutDown() - Method in class org.apache.nutch.plugin.Plugin
Shuts down the plugin.
Signature - Class in org.apache.nutch.crawl
 
Signature() - Constructor for class org.apache.nutch.crawl.Signature
 
SIGNATURE_KEY - Static variable in interface org.apache.nutch.metadata.Nutch
 
SignatureComparator - Class in org.apache.nutch.crawl
 
SignatureComparator() - Constructor for class org.apache.nutch.crawl.SignatureComparator
 
SignatureFactory - Class in org.apache.nutch.crawl
Factory class that instantiates a Signature implementation according to the current Configuration.
simpleHostname(String) - Static method in class org.apache.hadoop.util.StringUtils
Given a full hostname, return the word up to the first dot.
size() - Method in class org.apache.nutch.crawl.MapWritable
 
size() - Method in class org.apache.nutch.metadata.Metadata
Returns the number of metadata names in this metadata.
skip(DataInput) - Static method in class org.apache.hadoop.io.Text
Skips over one Text in the input.
skip(DataInput) - Static method in class org.apache.nutch.parse.Outlink
Skips over one Outlink in the input.
skipCompressedByteArray(DataInput) - Static method in class org.apache.hadoop.io.WritableUtils
 
SOURCE - Static variable in interface org.apache.nutch.metadata.DublinCore
A reference to a resource from which the present resource is derived.
SpellCheckedMetadata - Class in org.apache.nutch.metadata
A decorator to Metadata that adds spellchecking capabilities to property names.
SpellCheckedMetadata() - Constructor for class org.apache.nutch.metadata.SpellCheckedMetadata
 
startUp() - Method in class org.apache.nutch.plugin.Plugin
Will be invoked at plugin start up.
statNames - Static variable in class org.apache.nutch.crawl.CrawlDatum
 
STATUS_BLOCKED - Static variable in class org.apache.nutch.protocol.ProtocolStatus
 
STATUS_DB_FETCHED - Static variable in class org.apache.nutch.crawl.CrawlDatum
Page was successfully fetched.
STATUS_DB_GONE - Static variable in class org.apache.nutch.crawl.CrawlDatum
Page no longer exists.
STATUS_DB_MAX - Static variable in class org.apache.nutch.crawl.CrawlDatum
Maximum value of DB-related status.
STATUS_DB_REDIR_PERM - Static variable in class org.apache.nutch.crawl.CrawlDatum
Page permanently redirects to other page.
STATUS_DB_REDIR_TEMP - Static variable in class org.apache.nutch.crawl.CrawlDatum
Page temporarily redirects to other page.
STATUS_DB_UNFETCHED - Static variable in class org.apache.nutch.crawl.CrawlDatum
Page was not fetched yet.
STATUS_FAILED - Static variable in class org.apache.nutch.protocol.ProtocolStatus
 
STATUS_FAILURE - Static variable in class org.apache.nutch.parse.ParseStatus
 
STATUS_FETCH_CONTENT_LIMIT_EXCEEDED - Static variable in class org.apache.nutch.crawl.CrawlDatum
Fetching was successful, but content was truncated.
STATUS_FETCH_GONE - Static variable in class org.apache.nutch.crawl.CrawlDatum
Fetching unsuccessful - page is gone.
STATUS_FETCH_MAX - Static variable in class org.apache.nutch.crawl.CrawlDatum
Maximum value of fetch-related status.
STATUS_FETCH_REDIR_PERM - Static variable in class org.apache.nutch.crawl.CrawlDatum
Fetching permanently redirected to other page.
STATUS_FETCH_REDIR_TEMP - Static variable in class org.apache.nutch.crawl.CrawlDatum
Fetching temporarily redirected to other page.
STATUS_FETCH_RETRY - Static variable in class org.apache.nutch.crawl.CrawlDatum
Fetching unsuccessful, needs to be retried (transient errors).
STATUS_FETCH_SUCCESS - Static variable in class org.apache.nutch.crawl.CrawlDatum
Fetching was successful.
STATUS_GONE - Static variable in class org.apache.nutch.protocol.ProtocolStatus
 
STATUS_INJECTED - Static variable in class org.apache.nutch.crawl.CrawlDatum
Page was newly injected.
STATUS_LINKED - Static variable in class org.apache.nutch.crawl.CrawlDatum
Page discovered through a link.
STATUS_NOTFETCHING - Static variable in class org.apache.nutch.protocol.ProtocolStatus
 
STATUS_NOTFOUND - Static variable in class org.apache.nutch.protocol.ProtocolStatus
 
STATUS_NOTMODIFIED - Static variable in class org.apache.nutch.protocol.ProtocolStatus
 
STATUS_NOTPARSED - Static variable in class org.apache.nutch.parse.ParseStatus
 
STATUS_REDIR_EXCEEDED - Static variable in class org.apache.nutch.protocol.ProtocolStatus
 
STATUS_RETRY - Static variable in class org.apache.nutch.protocol.ProtocolStatus
 
STATUS_ROBOTS_DENIED - Static variable in class org.apache.nutch.protocol.ProtocolStatus
 
STATUS_SIGNATURE - Static variable in class org.apache.nutch.crawl.CrawlDatum
Page signature.
STATUS_SUCCESS - Static variable in class org.apache.nutch.parse.ParseStatus
 
STATUS_SUCCESS - Static variable in class org.apache.nutch.protocol.ProtocolStatus
 
STATUS_WOULDBLOCK - Static variable in class org.apache.nutch.protocol.ProtocolStatus
 
stringifyException(Throwable) - Static method in class org.apache.hadoop.util.StringUtils
Make a string representation of the exception.
stringToPath(String[]) - Static method in class org.apache.hadoop.util.StringUtils
 
stringToURI(String[]) - Static method in class org.apache.hadoop.util.StringUtils
 
StringUtil - Class in org.apache.nutch.util
A collection of String processing utility methods.
StringUtil() - Constructor for class org.apache.nutch.util.StringUtil
 
StringUtils - Class in org.apache.hadoop.util
General string utils
StringUtils() - Constructor for class org.apache.hadoop.util.StringUtils
 
SUBJECT - Static variable in interface org.apache.nutch.metadata.DublinCore
The topic of the content of the resource.
SUCCESS - Static variable in class org.apache.nutch.parse.ParseStatus
Parsing succeeded.
SUCCESS - Static variable in class org.apache.nutch.protocol.ProtocolStatus
Content was retrieved without errors.
SUCCESS_REDIRECT - Static variable in class org.apache.nutch.parse.ParseStatus
Parsed content contains a directive to redirect to another URL.
suffix(String) - Method in class org.apache.hadoop.fs.Path
Adds a suffix to the final name in the path.
SuffixStringMatcher - Class in org.apache.nutch.util
A class for efficiently matching Strings against a set of suffixes.
SuffixStringMatcher(String[]) - Constructor for class org.apache.nutch.util.SuffixStringMatcher
Creates a new SuffixStringMatcher which will match Strings with any suffix in the supplied array.
SuffixStringMatcher(Collection) - Constructor for class org.apache.nutch.util.SuffixStringMatcher
Creates a new SuffixStringMatcher which will match Strings with any suffix in the supplied Collection.
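The shortestMatch entries above describe returning the shortest matching prefix, or null when nothing matches. The following is a minimal sketch of that contract using only the JDK. It is not the Nutch trie implementation: the class ShortestPrefixSketch and its linear scan over a list are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;

public class ShortestPrefixSketch {

    // Hypothetical helper: returns the shortest entry of `prefixes` that
    // `input` starts with, or null if no entry matches.
    static String shortestMatch(List<String> prefixes, String input) {
        String best = null;
        for (String p : prefixes) {
            if (input.startsWith(p) && (best == null || p.length() < best.length())) {
                best = p;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> prefixes = Arrays.asList("abc", "ab", "xyz");
        System.out.println(shortestMatch(prefixes, "abcdef")); // prints "ab"
        System.out.println(shortestMatch(prefixes, "hello"));  // prints "null"
    }
}
```

A trie-based matcher would presumably reach the same answer by walking the input one character at a time and stopping at the first terminal node, which avoids scanning every pattern.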

T

TEMP_MOVED - Static variable in class org.apache.nutch.protocol.ProtocolStatus
Resource has moved temporarily.
TEMPLATE - Static variable in interface org.apache.nutch.metadata.Office
 
terminal - Variable in class org.apache.nutch.util.TrieStringMatcher.TrieNode
 
Text - Class in org.apache.hadoop.io
This class stores text using standard UTF8 encoding.
Text() - Constructor for class org.apache.hadoop.io.Text
 
Text(String) - Constructor for class org.apache.hadoop.io.Text
Construct from a string.
Text(Text) - Constructor for class org.apache.hadoop.io.Text
Construct from another text.
Text(byte[]) - Constructor for class org.apache.hadoop.io.Text
Construct from a byte array.
Text.Comparator - Class in org.apache.hadoop.io
A WritableComparator optimized for Text keys.
Text.Comparator() - Constructor for class org.apache.hadoop.io.Text.Comparator
 
TextProfileSignature - Class in org.apache.nutch.crawl
An implementation of a page signature.
TextProfileSignature() - Constructor for class org.apache.nutch.crawl.TextProfileSignature
 
TITLE - Static variable in interface org.apache.nutch.metadata.DublinCore
A name given to the resource.
toArray() - Method in class org.apache.hadoop.io.ArrayWritable
 
toArray() - Method in class org.apache.hadoop.io.TwoDArrayWritable
 
toDate(String) - Static method in class org.apache.nutch.net.protocols.HttpDateFormat
 
toHexString(byte[]) - Static method in class org.apache.nutch.util.StringUtil
Convenience call for StringUtil.toHexString(byte[], String, int), where sep = null; lineLen = Integer.MAX_VALUE.
toHexString(byte[], String, int) - Static method in class org.apache.nutch.util.StringUtil
Get a text representation of a byte[] as hexadecimal String, where each pair of hexadecimal digits corresponds to consecutive bytes in the array.
toLong(String) - Static method in class org.apache.nutch.net.protocols.HttpDateFormat
 
toString() - Method in class org.apache.hadoop.conf.Configuration
 
toString() - Method in class org.apache.hadoop.fs.Path
 
toString() - Method in class org.apache.hadoop.io.BytesWritable
Generate the stream of bytes as hex pairs separated by ' '.
toString() - Method in class org.apache.hadoop.io.FloatWritable
 
toString() - Method in class org.apache.hadoop.io.IntWritable
 
toString() - Method in class org.apache.hadoop.io.LongWritable
 
toString() - Method in class org.apache.hadoop.io.MD5Hash
Returns a string representation of this object.
toString() - Method in class org.apache.hadoop.io.Text
Convert text back to string
toString() - Method in exception org.apache.hadoop.io.VersionMismatchException
Returns a string representation of this object.
toString() - Method in class org.apache.hadoop.io.VIntWritable
 
toString() - Method in class org.apache.hadoop.io.VLongWritable
 
toString() - Method in class org.apache.nutch.crawl.CrawlDatum
 
toString() - Method in class org.apache.nutch.crawl.MapWritable
 
toString() - Method in class org.apache.nutch.fetcher.FetcherOutput
 
toString() - Method in class org.apache.nutch.metadata.Metadata
 
toString(Date) - Static method in class org.apache.nutch.net.protocols.HttpDateFormat
Get the HTTP format of the specified date.
toString(Calendar) - Static method in class org.apache.nutch.net.protocols.HttpDateFormat
 
toString(long) - Static method in class org.apache.nutch.net.protocols.HttpDateFormat
 
toString() - Method in class org.apache.nutch.parse.HTMLMetaTags
 
toString() - Method in class org.apache.nutch.parse.Outlink
 
toString() - Method in class org.apache.nutch.parse.ParseData
 
toString() - Method in class org.apache.nutch.parse.ParseStatus
 
toString() - Method in class org.apache.nutch.parse.ParseText
 
toString() - Method in class org.apache.nutch.protocol.Content
 
toString() - Method in class org.apache.nutch.protocol.ProtocolStatus
 
toString() - Method in class org.apache.nutch.util.mime.MimeType
 
toStrings() - Method in class org.apache.hadoop.io.ArrayWritable
 
toUri() - Method in class org.apache.hadoop.fs.Path
Convert this to a URI.
TrieStringMatcher - Class in org.apache.nutch.util
TrieStringMatcher is a base class for simple tree-based string matching.
TrieStringMatcher() - Constructor for class org.apache.nutch.util.TrieStringMatcher
 
TrieStringMatcher.TrieNode - Class in org.apache.nutch.util
Node class for the character tree.
TwoDArrayWritable - Class in org.apache.hadoop.io
A Writable for 2D arrays containing a matrix of instances of a class.
TwoDArrayWritable(Class) - Constructor for class org.apache.hadoop.io.TwoDArrayWritable
 
TwoDArrayWritable(Class, Writable[][]) - Constructor for class org.apache.hadoop.io.TwoDArrayWritable
 
TYPE - Static variable in interface org.apache.nutch.metadata.DublinCore
The nature or genre of the content of the resource.
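The toHexString entries above describe a text form in which each pair of hexadecimal digits corresponds to one byte of the array. The sketch below shows that mapping with plain JDK calls. The class HexSketch is hypothetical, and the lowercase digits and the absence of a separator or line length limit are assumptions, not confirmed behavior of StringUtil.toHexString.

```java
public class HexSketch {

    // Hypothetical helper: renders each byte as exactly two hex digits,
    // in array order, with no separator.
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 2);
        for (byte b : data) {
            // Mask with 0xff so negative bytes are not sign-extended.
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] sample = {0x0f, (byte) 0xa0, 0x01};
        System.out.println(toHex(sample)); // prints "0fa001"
    }
}
```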

U

unzip(byte[]) - Static method in class org.apache.nutch.util.GZIPUtils
Returns a gunzipped copy of the input array.
unzipBestEffort(byte[]) - Static method in class org.apache.nutch.util.GZIPUtils
Returns a gunzipped copy of the input array.
unzipBestEffort(byte[], int) - Static method in class org.apache.nutch.util.GZIPUtils
Returns a gunzipped copy of the input array, truncated to sizeLimit bytes, if necessary.
uriToString(URI[]) - Static method in class org.apache.hadoop.util.StringUtils
 
URLFilter - Interface in org.apache.nutch.net
Interface used to limit which URLs enter Nutch.
URLFILTER_ORDER - Static variable in class org.apache.nutch.net.URLFilters
 
URLFilterChecker - Class in org.apache.nutch.net
Checks one given filter or all filters.
URLFilterChecker(Configuration) - Constructor for class org.apache.nutch.net.URLFilterChecker
 
URLFilterException - Exception in org.apache.nutch.net
 
URLFilterException() - Constructor for exception org.apache.nutch.net.URLFilterException
 
URLFilterException(String) - Constructor for exception org.apache.nutch.net.URLFilterException
 
URLFilterException(String, Throwable) - Constructor for exception org.apache.nutch.net.URLFilterException
 
URLFilterException(Throwable) - Constructor for exception org.apache.nutch.net.URLFilterException
 
URLFilters - Class in org.apache.nutch.net
Creates and caches URLFilter implementing plugins.
URLFilters(Configuration) - Constructor for class org.apache.nutch.net.URLFilters
 
URLNormalizer - Interface in org.apache.nutch.net
Interface used to convert URLs to normal form and optionally perform substitutions
URLNormalizers - Class in org.apache.nutch.net
This class uses a "chained filter" pattern to run defined normalizers.
URLNormalizers(Configuration, String) - Constructor for class org.apache.nutch.net.URLNormalizers
 
URLScopeFilter - Class in org.apache.nutch.net
Crawl scoping class. A user can filter the URLs by defining crawlscope.mode.
URLScopeFilter(Configuration) - Constructor for class org.apache.nutch.net.URLScopeFilter
 
URLScopeFilter.Mode - Enum in org.apache.nutch.net
 
utf8Length(String) - Static method in class org.apache.hadoop.io.Text
For the given string, returns the number of UTF-8 bytes required to encode the string.
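The utf8Length entry above returns the number of UTF-8 bytes needed to encode a string, which differs from String.length() whenever the text contains non-ASCII characters. The sketch below illustrates the difference with the JDK alone. The class Utf8LengthSketch is hypothetical, and encoding the whole string just to count bytes is a simplification, not how Text.utf8Length is known to work.

```java
import java.nio.charset.StandardCharsets;

public class Utf8LengthSketch {

    // Hypothetical helper: counts UTF-8 bytes by encoding the string.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(utf8Length("hello"));       // prints 5
        System.out.println(utf8Length("h\u00e9llo"));  // prints 6 (U+00E9 takes 2 bytes)
        System.out.println(utf8Length("\u20ac"));      // prints 3 (U+20AC takes 3 bytes)
    }
}
```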

V

validateUTF8(byte[]) - Static method in class org.apache.hadoop.io.Text
Check if a byte array contains valid UTF-8.
validateUTF8(byte[], int, int) - Static method in class org.apache.hadoop.io.Text
Check to see if a byte array is valid UTF-8.
valueOf(String) - Static method in enum org.apache.nutch.net.URLScopeFilter.Mode
Returns the enum constant of this type with the specified name.
values() - Method in class org.apache.nutch.crawl.MapWritable
 
values() - Static method in enum org.apache.nutch.net.URLScopeFilter.Mode
Returns an array containing the constants of this enum type, in the order they are declared.
VersionedWritable - Class in org.apache.hadoop.io
A base class for Writables that provides version checking.
VersionedWritable() - Constructor for class org.apache.hadoop.io.VersionedWritable
 
VersionMismatchException - Exception in org.apache.hadoop.io
Thrown by VersionedWritable.readFields(DataInput) when the version of an object being read does not match the current implementation version as returned by VersionedWritable.getVersion().
VersionMismatchException(byte, byte) - Constructor for exception org.apache.hadoop.io.VersionMismatchException
 
VIntWritable - Class in org.apache.hadoop.io
A WritableComparable for integer values stored in variable-length format.
VIntWritable() - Constructor for class org.apache.hadoop.io.VIntWritable
 
VIntWritable(int) - Constructor for class org.apache.hadoop.io.VIntWritable
 
VLongWritable - Class in org.apache.hadoop.io
A WritableComparable for longs in a variable-length format.
VLongWritable() - Constructor for class org.apache.hadoop.io.VLongWritable
 
VLongWritable(long) - Constructor for class org.apache.hadoop.io.VLongWritable
 

W

WORD_COUNT - Static variable in interface org.apache.nutch.metadata.Office
 
WORK_TYPE - Static variable in interface org.apache.nutch.metadata.CreativeCommons
 
WOULDBLOCK - Static variable in class org.apache.nutch.protocol.ProtocolStatus
Request was refused by protocol plugins, because it would block.
Writable - Interface in org.apache.hadoop.io
A simple, efficient, serialization protocol, based on DataInput and DataOutput.
WRITABLE_GENERATE_TIME_KEY - Static variable in interface org.apache.nutch.metadata.Nutch
 
WRITABLE_PROTO_STATUS_KEY - Static variable in interface org.apache.nutch.metadata.Nutch
 
WritableComparable - Interface in org.apache.hadoop.io
An interface which extends both Writable and Comparable.
WritableComparator - Class in org.apache.hadoop.io
A Comparator for WritableComparables.
WritableComparator(Class) - Constructor for class org.apache.hadoop.io.WritableComparator
Construct for a WritableComparable implementation.
WritableFactories - Class in org.apache.hadoop.io
Factories for non-public writables.
WritableFactory - Interface in org.apache.hadoop.io
A factory for a class of Writable.
WritableName - Class in org.apache.hadoop.io
Utility to permit renaming of Writable implementation classes without invalidating files that contain their class name.
WritableUtils - Class in org.apache.hadoop.io
 
WritableUtils() - Constructor for class org.apache.hadoop.io.WritableUtils
 
write(Document) - Method in class com.endeca.itl.web.auth.AuthenticatorConfiguration
 
write(OutputStream) - Method in class org.apache.hadoop.conf.Configuration
Writes non-default properties in this configuration.
write(DataOutput) - Method in class org.apache.hadoop.io.ArrayWritable
 
write(DataOutput) - Method in class org.apache.hadoop.io.BooleanWritable
 
write(DataOutput) - Method in class org.apache.hadoop.io.BytesWritable
 
write(DataOutput) - Method in class org.apache.hadoop.io.CompressedWritable
 
write(DataInput, int) - Method in class org.apache.hadoop.io.DataOutputBuffer
Writes bytes from a DataInput directly into the buffer.
write(DataOutput) - Method in class org.apache.hadoop.io.FloatWritable
 
write(DataOutput) - Method in class org.apache.hadoop.io.GenericWritable
 
write(DataOutput) - Method in class org.apache.hadoop.io.IntWritable
 
write(DataOutput) - Method in class org.apache.hadoop.io.LongWritable
 
write(DataOutput) - Method in class org.apache.hadoop.io.MD5Hash
 
write(DataOutput) - Method in class org.apache.hadoop.io.NullWritable
 
write(DataOutput) - Method in class org.apache.hadoop.io.ObjectWritable
 
write(DataOutput) - Method in class org.apache.hadoop.io.Text
Serializes this object to out; the length uses zero-compressed encoding.
write(DataOutput) - Method in class org.apache.hadoop.io.TwoDArrayWritable
 
write(DataOutput) - Method in class org.apache.hadoop.io.VersionedWritable
 
write(DataOutput) - Method in class org.apache.hadoop.io.VIntWritable
 
write(DataOutput) - Method in class org.apache.hadoop.io.VLongWritable
 
write(DataOutput) - Method in interface org.apache.hadoop.io.Writable
Writes the fields of this object to out.
write(DataOutput) - Method in class org.apache.nutch.crawl.CrawlDatum
 
write(DataOutput) - Method in class org.apache.nutch.crawl.MapWritable
 
write(DataOutput) - Method in class org.apache.nutch.fetcher.FetcherOutput
 
write(DataOutput) - Method in class org.apache.nutch.metadata.Metadata
 
write(DataOutput) - Method in class org.apache.nutch.metadata.MetaWrapper
 
write(DataOutput) - Method in class org.apache.nutch.parse.Outlink
 
write(DataOutput) - Method in class org.apache.nutch.parse.ParseData
 
write(DataOutput) - Method in class org.apache.nutch.parse.ParseImpl
 
write(DataOutput) - Method in class org.apache.nutch.parse.ParseStatus
 
write(DataOutput) - Method in class org.apache.nutch.parse.ParseText
 
write(DataOutput) - Method in class org.apache.nutch.protocol.ProtocolStatus
 
writeCompressed(DataOutput) - Method in class org.apache.hadoop.io.CompressedWritable
Subclasses implement this instead of CompressedWritable.write(DataOutput).
writeCompressed(DataOutput) - Method in class org.apache.nutch.protocol.Content
 
writeCompressedByteArray(DataOutput, byte[]) - Static method in class org.apache.hadoop.io.WritableUtils
 
writeCompressedString(DataOutput, String) - Static method in class org.apache.hadoop.io.WritableUtils
 
writeCompressedStringArray(DataOutput, String[]) - Static method in class org.apache.hadoop.io.WritableUtils
 
writeEnum(DataOutput, Enum) - Static method in class org.apache.hadoop.io.WritableUtils
writes String value of enum to DataOutput.
writeList(OutputStream, List<AuthenticatorConfiguration>) - Static method in class com.endeca.itl.web.auth.AuthenticatorConfiguration
 
writeNameValue(Document, String, String, String) - Method in class com.endeca.itl.web.auth.AuthenticatorConfiguration
 
writeObject(DataOutput, Object, Class, Configuration) - Static method in class org.apache.hadoop.io.ObjectWritable
Write a Writable, String, primitive type, or an array of the preceding.
writeString(DataOutput, String) - Static method in class org.apache.hadoop.io.Text
Write a UTF8 encoded string to out
writeString(DataOutput, String) - Static method in class org.apache.hadoop.io.WritableUtils
 
writeStringArray(DataOutput, String[]) - Static method in class org.apache.hadoop.io.WritableUtils
 
writeVInt(DataOutput, int) - Static method in class org.apache.hadoop.io.WritableUtils
Serializes an integer to a binary stream with zero-compressed encoding.
writeVLong(DataOutput, long) - Static method in class org.apache.hadoop.io.WritableUtils
Serializes a long to a binary stream with zero-compressed encoding.

X

X_POINT_ID - Static variable in interface com.endeca.itl.web.auth.Authenticator
 
X_POINT_ID - Static variable in interface com.endeca.itl.web.process.EndecaRecordGenerator
The name of the extension point.
X_POINT_ID - Static variable in interface com.endeca.itl.web.process.EndecaRecordOutputter
 
X_POINT_ID - Static variable in interface org.apache.nutch.net.URLFilter
The name of the extension point.
X_POINT_ID - Static variable in interface org.apache.nutch.net.URLNormalizer
 
X_POINT_ID - Static variable in interface org.apache.nutch.parse.HtmlParseFilter
Deprecated. The name of the extension point.
X_POINT_ID - Static variable in interface org.apache.nutch.parse.ParseFilter
The name of the extension point.
X_POINT_ID - Static variable in interface org.apache.nutch.parse.Parser
The name of the extension point.
X_POINT_ID - Static variable in interface org.apache.nutch.protocol.Protocol
The name of the extension point.

Z

zip(byte[]) - Static method in class org.apache.nutch.util.GZIPUtils
Returns a gzipped copy of the input array.

_

_compare(Object, Object) - Static method in class org.apache.nutch.crawl.SignatureComparator
 
_compare(byte[], int, int, byte[], int, int) - Static method in class org.apache.nutch.crawl.SignatureComparator
 

Copyright © 2007, 2012, Oracle and/or its affiliates. All rights reserved.