|Oracle 8i Data Cartridge Developer's Guide
Release 2 (8.1.6)
Part Number A76937-01
What Is a Data Cartridge?, 5 of 7
Figure 1-1 shows the standard services implemented by the Oracle8i server. This section describes some of these services, not to provide exhaustive descriptions but to highlight major Oracle8i capabilities as they relate to data cartridge development.
The Oracle8i universal data server provides both native and extensible type system services. Historically, mainstream applications have focused on accessing and modifying corporate data that is stored in tables composed of native SQL datatypes, such as
CHAR. Oracle8i adds support for new types, including:
This section discusses these new types.
An object type differs from native SQL datatypes in that it is user-defined and it specifies both the underlying persistent data (attributes) and the related behaviors (methods). Object types are used to extend the modeling capabilities provided by the native datatypes. You can use object types to make better models of complex entities in the real world by binding data attributes to semantic behaviors.
An object type can have one or more attributes. Each attribute has a name and a type. The type of an attribute can be a native SQL type, a
LOB, a collection, another object type, or a
REF type. The syntax for defining object types is discussed in Chapter 3.
A method is a procedure or a function that is part of an object type definition. Methods can access and manipulate attributes of the related object type. Methods can run within the execution environment of the Oracle8i server. Methods can also be dispatched outside the server as part of the extensible server execution environment.
Collections are SQL datatypes that contain multiple elements. Each element, or value, for a collection is the same datatype. In Oracle8i, collections of complex types can be
VARRAYs or nested tables.
A VARRAY contains a variable number of ordered elements. The
VARRAY datatype can be used for a column of a table or an attribute of an object type. The element type of a
VARRAY can be either a native datatype, such as
NUMBER, or an object type.
A nested table can be created using Oracle8i SQL to provide the semantics of an unordered collection. As with a
VARRAY, a nested table can be used to define a column of a table or an attribute of an object type.
If you create an object table in Oracle8i, you can obtain a reference that acts as a database pointer to an associated row object. References are important for navigating among object instances, particularly in client-side applications.
REF operator obtains a reference to a row object. Because
REFs rely on the underlying object identity, you can use
REF only with an object stored as a row in an object table or objects composed from an object view.
For further information about the
REF operator and examples of its use, see the chapter on object types in the PL/SQL User's Guide and Reference.
Oracle8i provides large object (
LOB) types to handle the storage demands of images, video clips, documents, and other forms of non-structured data. For an extensive coverage of Large Objects, please see Oracle8i Application Developer's Guide - Large Objects (LOBs). Large objects are stored in a way that optimizes space utilization and provides efficient access. Large objects are composed of locators and the related binary or character data. The
LOB locators are stored in-line with other table columns and, for internal
NCLOB), the data can be in a separate database storage area. For external
BFILE), the data is stored outside the database tablespaces in operating system files. A table can contain multiple
LOB columns (in contrast to the limit of one
RAW column per table). Each
LOB column can be stored in a separate tablespace, and even on different secondary storage devices.
Oracle8i SQL data definition language (DDL) extensions let you create, modify, and delete tables and object types that contain large objects (
LOBs). The Oracle8i SQL data manipulation language (DML) includes statements to insert and delete complete
LOBs. There is also an extensive set of statements for piece-wise reading, writing, and manipulating of
LOBs with PL/SQL and the Oracle Call Interface (OCI) software.
LOB types, both the locators and related data participate fully in the transactional model of the Oracle server. The data for
BFILEs does not participate in transactions; however,
BFILE locators are fully supported by Oracle server transactions. For more information about
LOBs and transactions, see the Oracle8i Application Developer's Guide - Large Objects (LOBs).
With SQL alone, the data residing within Oracle8i
LOBs is opaque and cannot be queried. However, you can use the various
LOB APIs to build modules, including methods of object types, to access and manipulate
LOB content. Further, unlike scalar quantities, a
LOB value cannot be indexed using built-in indexing schemes. The extensible indexing framework lets you define the semantics of data residing in
LOBs, and to manipulate the data using these semantics.
Oracle8i provides you a variety of interfaces and environments to access and manipulate
LOBs, which are described in great detail in Oracle8i Application Developer's Guide - Large Objects (LOBs).The use of
LOBs to store and manipulate binary and character data to represent your domain is discussed Chapter 6, "Working with Multimedia Datatypes".
The Oracle8i type system decouples the implementation of a member method for an object type from the specification of the method. Components of an Oracle8i data cartridge can be implemented using any of the popular programming languages. In Oracle8i, methods, functions, and procedures can be developed using PL/SQL, external C language routines, or Java. Thus, the database server runtime environment can be extended by user-defined methods, functions, and procedures.
In Oracle8i, Java offers data cartridge developers a powerful implementation choice for data cartridge behavior. In addition, PL/SQL offers a data cartridge developer a powerful procedural language that supports all the object extensions for SQL. With PL/SQL, program logic can execute on the server and perform traditional procedural language operations such as loops, if-then-else clauses, and array access.
While PL/SQL and Java are powerful, certain computation-intensive operations such as a Fast Fourier Transform or an image format conversion are handled more efficiently by C programs. With the Oracle8i Server, you can call C language programs from the server. Such programs are executed as in a separate address space than the server. This ensures that the database server is insulated from any program failures that might occur in external procedures and, under no circumstances, can an Oracle8i database be corrupted by such failures.
With certain reasonable restrictions, external routines can call back to the Oracle Server using OCI. Callbacks are particularly useful for processing
LOBs. For example, by using callbacks an external routine can perform piece-wise reads or writes of
LOBs stored in the database. External routines can also use callbacks to manipulate domain indexes stored as Index-Organized Tables in the database.
Typical database management systems support a few types of access methods (B+Trees, Hash Indexes) on a limited set of data types (numbers, strings, and so on). For simple data types such as integers and small strings, all aspects of indexing can be easily handled by the database system. In recent years, however, databases are being used to store different types of data such as text, spatial, image, video and audio that require content-based retrieval. This raises the need for indexing complex data types and also specialized indexing techniques.
Complex data types have application-specific formats, indexing requirements, and selection predicates. For example, there are many different means of document encoding (ODA, XML, plain text) and information retrieval techniques (keyword, full-text boolean, similarity, probabilistic, and so on). Similarly, R-trees are an efficient method of indexing spatial data. No database server can be built with support for all possible kinds of complex data and indexing. Oracle's solution is to build an extensible server which lets you define new index types as required.
The framework to develop new index types is based on the concept of cooperative indexing where a data cartridge and the Oracle server cooperate to build and maintain indexes for data types such as text, and spatial for application domains such as On-line-Analytical Processing (OLAP). The cartridge is responsible for defining the index structure, maintaining the index content during load and update operations, and searching the index during query processing. The index structure itself can either be stored in Oracle database as heap-organized, or an index-organized table, or externally as an operating system file.
To this end, Oracle8i introduces the concept of an indextype. The purpose of an indextype is to enable efficient search and retrieval functions for complex domains such as text, spatial, image, and OLAP by means of a data cartridge. An indextype is analogous to the sorted or bit-mapped index types that are built-in within the Oracle Server. The essential difference is that the implementation for an indextype is provided by the data cartridge developer, whereas the Oracle kernel implements built-in indexes. Once a new indextype has been implemented by a data cartridge developer, end users of the data cartridge can use it just as they would built-in indextypes.
With extensible indexing, the application
When the database system handles the physical storage of domain indexes, data cartridges
Typical database systems (RDBMS and ORDBMS) do not support extensible indexing. Consequently, many applications maintain file-based indexes for complex data residing in relational database tables. A considerable amount of code and effort is required to maintain consistency between external indexes and the related relational data, support compound queries (involving tabular values and external indexes), and to manage a system (backup, recovery, allocate storage, and so on) with multiple forms of persistent storage (files and databases). By supporting extensible indexes, the Oracle8i Server significantly reduces the level of effort needed to develop solutions involving high-performance access to complex datatypes.
The extensible optimizer functionality allows authors of user-defined functions and indexes to create statistics collection, selectivity, and cost functions. This information is used by the optimizer in choosing a query plan. The cost-based optimizer is thus extended to use the user-supplied information; the rule-based optimizer is unchanged.
The optimizer generates an execution plan for a SQL statement. An execution plan includes an access method for each table in the
FROM clause, and an ordering (called the join order) of the tables in the
FROM clause. System-defined access methods include indexes, hash clusters, and table scans. The optimizer chooses a plan by generating a set of join orders or permutations, computing the cost of each, and selecting the one with the lowest cost. For each table in the join order, the optimizer computes the cost of each possible access method and join method and chooses the one with the lowest cost. The cost of the join order is the sum of the access method and join method costs. The costs are calculated using algorithms which together compose the cost model. A cost model can include varying level of detail about the physical environment in which the query is executed. Our present cost model includes only the number of disk accesses with minor adjustments to compensate for the lack of detail. The optimizer uses statistics about the objects referenced in the query to compute the costs. The statistics are gathered using the
ANALYZE command. The optimizer uses these statistics to calculate cost and selectivity. The selectivity of a predicate is the fraction of rows in a table that will be chosen by the predicate. It is a number between 0 and 100 (expressed as percentage).
Extensible indexing functionality allows users to define new operators, index types, and domain indexes. For such user-defined operators and domain indexes, the extensible optimizer functionality will allow users to control the three main components used by the optimizer to select an execution plan: statistics, selectivity, and cost.