This chapter introduces Large Objects (LOBs) and discusses how LOB datatypes are used in application development. This chapter contains these topics:
Large Objects (LOBs) are a set of datatypes that are designed to hold large amounts of data. A LOB can hold up to a maximum size ranging from 8 terabytes to 128 terabytes depending on how your database is configured. Storing data in LOBs enables you to access and manipulate the data efficiently in your application.
This section introduces different types of data that you encounter when developing applications and discusses which kinds of data are suitable for large objects.
In the world today, applications must deal with the following kinds of data:
Simple structured data.
This data can be organized into simple tables that are structured based on business rules.
Complex structured data
This kind of data is complex in nature and is suited for the object-relational features of the Oracle database such as collections, references, and user-defined types.
This kind of data has a logical structure that is not typically interpreted by the database. For example, an XML document that is processed by your application or an external service, can be thought of as semi-structured data. The database provides technologies such as Oracle XML DB, Advanced Queueing, and Messages to help your application work with semi-structured data.
This kind of data is not broken down into smaller logical structures and is not typically interpreted by the database or your application. A photographic image stored as a binary file is an example of unstructured data.
Large objects are suitable for these last two kinds of data: semi-structured data and unstructured data. Large objects features allow you to store these kinds of data in the database as well as in operating system files that are accessed from the database.
With the growth of the internet and content-rich applications, it has become imperative that the database support a datatype that:
Can store unstructured and semi-structured data in an efficient manner.
Is optimized for large amounts of data.
Provides a uniform way of accessing data stored within the database or outside the database.
Examples of semi-structured data include document files such as XML documents or word processor files. These kinds of documents contain data in a logical structure that is processed or interpreted by an application, and is not broken down into smaller logical units when stored in the database.
Applications involving semi-structured data typically use large amounts of character data. The Character Large Object (
CLOB) and National Character Large Object (
NCLOB) datatypes are ideal for storing and manipulating this kind of data.
Binary File objects (
BFILE datatypes) can also store character data. You can use
BFILEs to load read-only data from operating system files into
NCLOB instances that you then manipulate in your application.
Unstructured data cannot be decomposed into standard components. For example, data about an employee can be structured into a name, which is stored as a string; an identifier, such as an ID number, a salary and so on. A photograph, on the other hand, consists of a long stream of 1s and 0s. These bits are used to switch pixels on or off so that you can see the picture on a display, but are not broken down into any finer structure for database storage.
Also, unstructured data such as text, graphic images, still video clips, full motion video, and sound waveforms tends to be large in size. A typical employee record may be a few hundred bytes, while even small amounts of multimedia data can be thousands of times larger.
SQL datatypes that are ideal for large amounts of unstructured binary data include the
BLOB datatype (Binary Large Object) and the
BFILE datatype (Binary File object).
The database supports LONG as well as LOB datatypes. When possible, change your existing applications to use LOBs instead of LONGs because of the added benefits that LOBs provide. LONG-to-LOB migration enables you to easily migrate your existing applications that access LONG columns, to use LOB columns.
Applications developed for use with Oracle Database version 7 and earlier, used the LONG or LONG RAW data type to store large amounts of unstructured data.
With the Oracle8i and later versions of the database, using LOB datatypes is recommended for storing large amounts of structured and semi-structured data. LOB datatypes have several advantages over LONG and LONG RAW types including:
LOB Capacity: LOBs can store much larger amounts of data. LOBs can store 4GB of data or more depending on you system configuration. LONG and LONG RAW types are limited to 2GB of data.
Number of LOB columns in a table: A table can have multiple LOB columns. LOB columns in a table can be of any LOB type. In Oracle Database Release 7.3 and higher, tables are limited to a single LONG or LONG RAW column.
Random piece-wise access: LOBs support random access to data, but LONGs support only sequential access.
LOBs can also be object attributes.
Different kinds of LOBs can be stored in the database or in external files.
Note:LOBs in the database are sometimes also referred to as internal LOBs or internal persistent LOBs.
LOBs in the database are stored inside database tablespaces in a way that optimizes space and provides efficient access. The following SQL datatypes are supported for declaring internal LOBs:
NCLOB. Details on these datatypes are given in "Large Object Datatypes".
Persistent and Temporary LOBs
Internal LOBs (LOBs in the database) can be either persistent or temporary. A persistent LOB is a LOB instance that exists in a table row in the database. A temporary LOB instance is created when you instantiate a LOB only within the scope of your local application.
A temporary instance becomes a persistent instance when you insert the instance into a table row.
Persistent LOBs use copy semantics and participate in database transactions. You can recover persistent LOBs in the event of transaction or media failure, and any changes to a persistent LOB value can be committed or rolled back. In other words, all the Atomicity Consistency Isolation Durability (ACID) properties that pertain to using database objects pertain to using persistent LOBs.
External LOBs are data objects stored in operating system files, outside the database tablespaces. The database accesses external LOBs using the SQL datatype
BFILE datatype is the only external LOB datatype.
BFILEs are read-only datatypes. The database allows read-only byte stream access to data stored in BFILEs. You cannot write to a BFILE from within your application.
The database uses reference semantics with
BFILE columns. Data stored in a table column of type
BFILE, is physically located in an operating system file, not in the database tablespace.
You typically use BFILEs to hold:
Binary data that does not change while your application is running, such as graphics.
Data that is loaded into other large object types, such as a BLOB or CLOB where the data can then be manipulated.
Data that is appropriate for byte-stream access, such as multimedia.
Read-only data that is relatively large in size, to avoid taking up large amounts database tablespace.
Any storage device accessed by your operating system can hold BFILE data, including hard disk drives, CD-ROMs, PhotoCDs and DVDs. The database can access
BFILEs provided the operating system supports stream-mode access to the operating system files.
Note:External LOBs do not participate in transactions. Any support for integrity and durability must be provided by the underlying file system as governed by the operating system.
When you use a LOB in an operation such as passing a LOB as a parameter, you are actually passing a LOB locator. For the most part, you can work with a LOB instance in your application without being concerned with the semantics of LOB locators. There is no need to dereference LOB locators, as is required with pointers in some programming languages.
There are some issues regarding the semantics of LOB locators and how LOB values are stored that you should be aware of. These details are covered in the context of the discussion where they apply throughout this guide.
In all programmatic environments, database semantics differ between internal LOBs and external LOBs as follows:
Internal LOBs use copy semantics.
With copy semantics, both the LOB locator and LOB value are logically copied during insert, update, or assignment operations. This ensures that each table cell or each variable containing a LOB, holds a unique LOB instance.
External LOBs use reference semantics.
With reference semantics, only the LOB locator is copied during insert operations. (Note that update operations do not apply to external LOBs as external LOBs are read-only. This is explained in more detail later in this section.)
Table 1-1 describes each large object datatype supported by the database and describes the kind of data each datatype is typically used for. The names of datatypes given here are the SQL datatypes provided by the database. In general, the descriptions given for the datatypes in this table and the rest of this book also apply to the corresponding datatypes provided for other programmatic environments. Also, note that the term "LOB" is generally used to refer to the set of all large object datatypes.
||Binary Large Object
Stores any kind of data in binary format. Typically used for multimedia data such as images, audio, and video.
||Character Large Object
Stores string data in the database character set format. Used for large strings or documents that use the database character set exclusively. Characters in the database character set are in a fixed width format.
||National Character Set Large Object
Stores string data in National Character Set format. Used for large strings or documents in the National Character Set. Supports characters of varying width format.
||External Binary File
A binary file stored outside of the database in the host operating system file system, but accessible from database tables.
Any kind of data, that is, any operating system file, can be stored in a
You can declare LOB datatypes as fields, or members, of object datatypes. For example, you can have an attribute of type
CLOB on an object type. In general, there is no difference in the usage of a LOB instance in a LOB column and the usage of a LOB instance that is a member or of an object datatype. Any difference in usage is called out when it applies. When used in this guide, the term LOB attribute refers to a LOB instance that is a member of an object datatype. Unless otherwise specified, discussions that apply to LOB columns also apply to LOB attributes.
You can use LOBs to create other user-defined datatypes or store other datatypes as LOBs. This section discusses some of the datatypes provided with the database as examples of datatypes that are stored or created with LOB types.
An instance of type
VARRAY in the database is stored as an array of LOBs when you create a table in the following scenarios:
If the VARRAY storage clause—
VARRAY varray_item STORE AS —is not specified, and the declared size of varray data is more than 4000 bytes.
If the varray column properties are specified using the STORE AS LOB clause—
VARRAY varray_item STORE AS LOB ...
A good example of how LOB datatypes can be used to store other datatypes is the
XMLType datatype. The
XMLType datatype is stored as a CLOB type. Setting up your table or column to store XMLType datatypes as CLOBs enables you to store schema-less XML documents in the database.
Oracle interMedia uses LOB datatypes to create datatypes specialized for use in multimedia application such as interMedia ORDAudio, ORDDoc, ORDImage, and ORDVideo. Oracle interMedia uses the database infrastructure to define object types, methods, and LOBs necessary to represent these specialized types of data in the database.