Glossary

administrative IP address

The virtual IP (VIP) address exported by the 5800 system for administrative access to a cell.

API

Application programming interface. A set of routines, protocols, and tools that developers use to build software applications.

attribute

An entry in the schema that associates a name with a type. For example, the name Doctor might be of type string. Metadata is stored by assigning a value of the appropriate type to an attribute name, and attributes can also be used to create virtual file system views.

authorized client

A client that is authorized to access data on the 5800 system. By default, the system allows any client on the network to access the stored data, but you can specify a list of authorized clients, which are then the only clients that have access to the data.

cell

The basic building block of the 5800 system. A full-cell configuration consists of 16 storage nodes, two gigabit Ethernet switches, and one service node.

CLI

Command-line interface. Text-based form of communication with the 5800 system. You access the CLI by issuing the command ssh admin@adminIPaddress from a host on the same network as the 5800 system.

client

An application that runs on a personal computer or workstation and relies on a server to perform some operations.

cluster

A term sometimes used to refer to the 5800 system cell or cells in a configuration.

CPU

Central processing unit. The brains of the computer, sometimes referred to simply as the processor or central processor. The CPU is where most calculations take place.

ctime

Creation time. The system metadata includes information on the creation time, data length, and data hash.

data hash

Hashes are used for accessing data or for security. A hash, also called a message digest, is a number generated from a string of text. The hash is substantially smaller than the text itself, and is generated by a formula in such a way that it is extremely unlikely that some other text will produce the same hash value.
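
The fixed-size property described above can be illustrated with a short sketch. The glossary does not say which hash algorithm the 5800 system uses; SHA-256 from the Python standard library is an assumption chosen only for illustration, and the function name data_hash is hypothetical.

```python
import hashlib


def data_hash(payload: bytes) -> str:
    """Return a hex digest of the payload.

    SHA-256 is an assumption for illustration; it is not necessarily
    the algorithm the 5800 system uses.
    """
    return hashlib.sha256(payload).hexdigest()


small = data_hash(b"Doctor")
large = data_hash(b"Doctor" * 100_000)

# The digest length does not depend on the size of the input.
print(len(small), len(large))

# A one-character change in the input produces a different digest.
print(small != data_hash(b"doctor"))
```

Because the digest is much smaller than the data, comparing two digests is a cheap way to check whether two stored objects are likely to be identical.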

data IP address

The virtual IP (VIP) exported by the 5800 system for access to the data stored on a cell.

data object

A stored file with an associated Object ID (OID).

disk mask

A current record of disk availability across the system.

DNS

Domain Name System. A service that defines naming conventions and translates domain names into IP (Internet Protocol) addresses.

DTD

Document Type Definition. Defines the legal building blocks of an XML document. The DTD defines the document structure with a list of legal elements, thus providing an application-independent way of sharing data.

emulator

Software that imitates the behavior of a 5800 system, allowing you to test applications.

extended metadata

Metadata that is added by the user of the 5800 system. Extended metadata consists of name=value pairs. Each name is defined in the system schema with a specific type (for example, string), and the value is associated with the name at the time the data is stored.
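
As a rough sketch of how typed name=value pairs behave, the following checks each pair against a small schema. This is not the 5800 system API: the SCHEMA dictionary, the PatientAge attribute, and the parse_metadata function are hypothetical, and Python types stand in for the schema types. Only the Doctor attribute of type string comes from this glossary.

```python
# Hypothetical schema: attribute name -> Python type standing in for the schema type.
SCHEMA = {"Doctor": str, "PatientAge": int}


def parse_metadata(pairs, schema=SCHEMA):
    """Parse 'name=value' strings and convert each value to its schema type."""
    metadata = {}
    for pair in pairs:
        name, sep, raw = pair.partition("=")
        if not sep:
            raise ValueError(f"not a name=value pair: {pair!r}")
        if name not in schema:
            raise KeyError(f"attribute not defined in schema: {name}")
        # Conversion raises ValueError if the value does not fit the type.
        metadata[name] = schema[name](raw)
    return metadata


print(parse_metadata(["Doctor=Smith", "PatientAge=42"]))
```

A name that is missing from the schema is rejected, which mirrors the rule that a value can be assigned only to an attribute name the schema defines.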

file system view

See virtual file system view.

fragment

A piece of a file. Files over a certain size are stored in several chunks or fragments rather than in a single contiguous sequence of bits in one place. The 5800 system stores fragments of files across multiple disks and nodes using 5+2 encoding. Thus, when an object of any type (for example, an MP3 binary or a text file) is stored in the 5800 system, it is divided into five data fragments and two corresponding parity fragments.
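
The 5+2 layout can be sketched as follows. This example only splits an object into five data fragments and computes the raw-storage overhead; it does not compute the two parity fragments, which would require a Reed-Solomon implementation. The zero-padding rule and the function names are assumptions for illustration and do not describe the system's internal format.

```python
DATA_FRAGMENTS = 5    # from the 5+2 encoding described above
PARITY_FRAGMENTS = 2


def split_into_data_fragments(payload: bytes, n: int = DATA_FRAGMENTS):
    """Split payload into n equal-length fragments, zero-padding the tail.

    Zero padding is an assumption made so that all fragments have the
    same length; the real system may handle this differently.
    """
    size = -(-len(payload) // n)  # ceiling division
    padded = payload.ljust(size * n, b"\0")
    return [padded[i * size:(i + 1) * size] for i in range(n)]


def storage_overhead(data: int = DATA_FRAGMENTS, parity: int = PARITY_FRAGMENTS) -> float:
    """Raw bytes stored per byte of user data (ignoring padding)."""
    return (data + parity) / data


fragments = split_into_data_fragments(b"abcdefghijkl")
print(fragments)
print(storage_overhead())
```

With five data fragments and two parity fragments, seven fragments are written for every five fragments of user data, so the raw capacity consumed is about 1.4 times the size of the stored object.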

FRU

Field-replaceable unit. Describes any hardware device, or more commonly a part or component of a device or system, that can easily be replaced by a skilled technician without having to send the entire device or system to be repaired. As the name implies, the unit can be replaced in the field (that is, at the user location).

fsView

Section of the metadata schema file where you specify virtual file system views. fsViews are also used to specify which indexes the system creates for responding to metadata queries.

full-cell

A 5800 system configuration that includes 16 storage nodes, two gigabit Ethernet switches, and one service node.

gateway

A router that connects the local subnet on which the 5800 system resides to the larger network. You must configure a default gateway for each 5800 system cell, to enable information about the system to be available on the network.

GB

Gigabyte. Represents 2 to the 30th power (1,073,741,824) bytes. One gigabyte is equal to 1,024 megabytes.
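
The arithmetic in this definition can be checked directly. The constant names below are chosen for illustration.

```python
MEGABYTE = 2 ** 20  # bytes
GIGABYTE = 2 ** 30  # bytes

print(GIGABYTE)              # number of bytes in one gigabyte
print(GIGABYTE // MEGABYTE)  # number of megabytes in one gigabyte
```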

GUI

Graphical user interface. A graphical form of communication with the 5800 system. You access the GUI by typing the administrative IP address and GUI port number in the URL line in a Java-enabled web browser connected to the same network as the 5800 system.

HADB

High-availability database. A highly available and scalable, always-on relational database management system used to store metadata on the 5800 system.

half-cell

A 5800 system configuration that includes eight storage nodes, two gigabit Ethernet switches, and one service node.

hive

A multicell configuration that includes at least two 5800 system full cells, each with 16 storage nodes.

HTML

HyperText Markup Language. Designed to display data and focus on how data looks. The tags you use to mark up HTML documents and the document’s structure are predefined, so you can only use tags that are defined in the HTML standard.

HTTP

HyperText Transfer Protocol. Underlying protocol used by the World Wide Web. HTTP defines how messages are formatted and transmitted, and what actions web servers and browsers should take in response to various commands.

index

A sequence of columns in the metadata database against which queries are made.

metadata

Extra information about the data object. Describes how and when and by whom a particular set of data was collected, and how the data is formatted. There are two main types of metadata in the 5800 system: system and extended.

MP3

Moving Pictures Experts Group (MPEG), audio layer 3 file. Layer 3 is one of three coding schemes (layer 1, layer 2, and layer 3) for the compression of audio signals.

multicell

A configuration that includes more than one 5800 system full cell, each with 16 storage nodes. Also called a hive.

namespace

A collection of names, identified by a uniform resource identifier (URI), that XML uses to keep names from separate sources from colliding unintentionally. You can have as many namespaces as desired in the 5800 system metadata schema. There is also no limit on the number of namespaces that can be encapsulated within a given namespace level (subnamespaces).

NDMP

Network Data Management Protocol. An open standard backup protocol implemented on the 5800 system to allow you to back up the data stored on the system to tape and restore that data in the event of catastrophic system loss.

node

A processing location. A node can be a computer or some other device, such as a printer. Every node has a unique network address.

NTP

Network Time Protocol. An Internet standard protocol (built on top of TCP/IP) that assures accurate synchronization to the millisecond of computer clock times in a network.

object

Any item that can be individually selected and manipulated. For example, in object-oriented programming, an object is a self-contained entity that consists of both data and procedures to manipulate the data.

OID

Object ID. A unique identifier for each stored object included in the system metadata.

placement algorithm

Calculation that determines where to store the data and parity chunks of an object stored on the 5800 system. When a data object comes into the system, the gigabit Ethernet switch directs the store request to a storage node, and that node fragments the object and distributes the fragments to different disks in the system according to the placement algorithm.

query

A request for information from a database.

Reed-Solomon Encoding Algorithm

An encoding algorithm that protects data stored in the 5800 system. The Reed-Solomon (RS) algorithm is part of a code family that efficiently builds redundancy into a file to guarantee reliability in the face of multiple part failures in the storage system.

SATA

Serial Advanced Technology Attachment (ATA). An evolution of the Parallel ATA physical storage interface. Serial ATA is a serial link (a single cable with a minimum of four wires) that creates a point-to-point connection between devices. Transfer rates for Serial ATA begin at 150 MBps.

schema

Defines how the 5800 system metadata is structured. The schema consists of attributes, each of which has a defined type.

SDK

Software developer’s kit. Includes sample applications and command-line routines that demonstrate the 5800 system’s capabilities as well as provide good programming examples.

service node

A Sun Microsystems Sun Fire™ X2100 M2 server with one 250-gigabyte serial ATA (SATA) disk drive. Used by the 5800 system for initial configuration and troubleshooting, and to upgrade the system software.

SMTP

Simple Mail Transfer Protocol. A protocol for sending email messages between servers. Most email systems that send mail over the Internet use SMTP to send messages from one server to another.

storage node

A node on which the 5800 system stores data. The storage node includes a single-core AMD Opteron processor, 3 GB of memory, four 500-GB disk drives, and two Ethernet ports.

string

A contiguous sequence of symbols or values, such as a character string (a sequence of characters) or a binary digit string (a sequence of binary values). One of the attribute types allowed for metadata on the 5800 system.

system metadata

Metadata that includes a unique identifier for each stored object, called the OID, as well as information on creation time (ctime), data length, and data hash. It is automatically maintained by the system.

table

Partition of the metadata schema. You partition the metadata schema into tables and specify each metadata field as a column within a particular table. You can greatly improve the performance of query and store operations by grouping metadata fields that commonly occur together in the same table and by separating metadata fields that do not commonly occur together into separate tables. Objects stored in the 5800 system become rows in one or more tables, depending on which fields are associated with that data.

virtual IP (VIP)

Virtual IP address. The 5800 system exports two public IP addresses, one to access the data and one to access administrative functions.

virtual file system view

Arrangements of the data stored in the 5800 system that allow you to use WebDAV to browse the files as though they were stored in a hierarchical path structure. A virtual file system view is defined using the metadata attributes in the metadata schema file.
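
The idea of presenting flat objects as a hierarchical path can be sketched as follows. This is not the fsView syntax of the metadata schema file, which is not shown in this glossary; the view_path function and the Year and Filename attribute names are hypothetical. The sketch only shows how an ordered list of metadata attributes could map to a browsable path.

```python
from pathlib import PurePosixPath


def view_path(metadata, view_attributes, filename_attribute):
    """Build a hierarchical path from an object's metadata values.

    Each attribute in view_attributes becomes one directory level, in order,
    and the value of filename_attribute becomes the final path component.
    """
    parts = [str(metadata[name]) for name in view_attributes]
    return str(PurePosixPath("/", *parts, str(metadata[filename_attribute])))


# Hypothetical metadata for one stored object.
record = {"Doctor": "Smith", "Year": 2007, "Filename": "scan.dcm"}
print(view_path(record, ["Doctor", "Year"], "Filename"))
```

Changing the order of the attributes in the view changes the directory layout without changing how the object itself is stored.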

WebDAV

Web-based Distributed Authoring and Versioning. A set of extensions to the HTTP/1.1 protocol that allows you to read, add, and delete files on remote web servers. Using the metadata schema file, you can set up virtual file system views in the 5800 system that allow you to use WebDAV to browse through data files on the system as though they were stored in a hierarchical path structure.

XML

Extensible Markup Language. XML offers a widely adopted standard way of representing text and data in a format that can be processed with relatively little human intervention and exchanged across diverse hardware, operating systems, and applications.