|Oracle® Database Concepts
10g Release 2 (10.2)
|PDF · Mobi · ePub|
This chapter provides an overview of the Oracle database server. The topics include:
An Oracle database is a collection of data treated as a unit. The purpose of a database is to store and retrieve related information. A database server is the key to solving the problems of information management. In general, a server reliably manages a large amount of data in a multiuser environment so that many users can concurrently access the same data. All this is accomplished while delivering high performance. A database server also prevents unauthorized access and provides efficient solutions for failure recovery.
Oracle Database is the first database designed for enterprise grid computing, the most flexible and cost effective way to manage information and applications. Enterprise grid computing creates large pools of industry-standard, modular storage and servers. With this architecture, each new system can be rapidly provisioned from the pool of components. There is no need for peak workloads, because capacity can be easily added or reallocated from the resource pools as needed.
The database has logical structures and physical structures. Because the physical and logical structures are separate, the physical storage of data can be managed without affecting the access to logical storage structures.
The section contains the following topics:
Grid computing is a new IT architecture that produces more resilient and lower cost enterprise information systems. With grid computing, groups of independent, modular hardware and software components can be connected and rejoined on demand to meet the changing needs of businesses.
The grid style of computing aims to solve some common problems with enterprise IT: the problem of application silos that lead to under utilized, dedicated hardware resources, the problem of monolithic, unwieldy systems that are expensive to maintain and difficult to change, and the problem of fragmented and disintegrated information that cannot be fully exploited by the enterprise as a whole.
Benefits of Grid Computing Compared to other models of computing, IT systems designed and implemented in the grid style deliver higher quality of service, lower cost, and greater flexibility. Higher quality of service results from having no single point of failure, a robust security infrastructure, and centralized, policy-driven management. Lower costs derive from increasing the utilization of resources and dramatically reducing management and maintenance costs. Rather than dedicating a stack of software and hardware to a specific task, all resources are pooled and allocated on demand, thus eliminating under utilized capacity and redundant capabilities. Grid computing also enables the use of smaller individual hardware components, thus reducing the cost of each individual component and providing more flexibility to devote resources in accordance with changing needs.
The grid style of computing treats collections of similar IT resources holistically as a single pool, while exploiting the distinct nature of individual resources within the pool. To address simultaneously the problems of monolithic systems and fragmented resources, grid computing achieves a balance between the benefits of holistic resource management and flexible independent resource control. IT resources managed in a grid include:
Infrastructure: the hardware and software that create a data storage and program execution environment
Applications: the program logic and flow that define specific business processes
Information: the meanings inherent in all different types of data used to conduct business
With virtualization, individual resources (e.g. computers, disks, application components and information sources) are pooled together by type then made available to consumers (e.g. people or software programs) through an abstraction. Virtualization means breaking hard-coded connections between providers and consumers of resources, and preparing a resource to serve a particular need without the consumer caring how that is accomplished.
With provisioning, when consumers request resources through a virtualization layer, behind the scenes a specific resource is identified to fulfill the request and then it is allocated to the consumer. Provisioning as part of grid computing means that the system determines how to meet the specific need of the consumer, while optimizing operation of the system as a whole.
The specific ways in which information, application or infrastructure resources are virtualized and provisioned are specific to the type of resource, but the concepts apply universally. Similarly, the specific benefits derived from grid computing are particular to each type of resource, but all share the characteristics of better quality, lower costs and increased flexibility.
Infrastructure Grid Infrastructure grid resources include hardware resources such as storage, processors, memory, and networks as well as software designed to manage this hardware, such as databases, storage management, system management, application servers, and operating systems.
Virtualization and provisioning of infrastructure resources mean pooling resources together and allocating to the appropriate consumers based on policies. For example, one policy might be to dedicate enough processing power to a web server that it can always provide sub-second response time. That rule could be fulfilled in different ways by the provisioning software in order to balance the requests of all consumers.
Treating infrastructure resources as a single pool and allocating those resources on demand saves money by eliminating under utilized capacity and redundant capabilities. Managing hardware and software resources holistically reduces the cost of labor and the opportunity for human error.
Spreading computing capacity among many different computers and spreading storage capacity across multiple disks and disk groups removes single points of failure so that if any individual component fails, the system as a whole remains available. Furthermore, grid computing affords the option to use smaller individual hardware components, such as blade servers and low cost storage, which enables incremental scaling and reduces the cost of each individual component, thereby giving companies more flexibility and lower cost.
Infrastructure is the dimension of grid computing that is most familiar and easy to understand, but the same concepts apply to applications and information.
Applications Grid Application resources in the grid are the encodings of business logic and process flow within application software. These may be packaged applications or custom applications, written in any programming language, reflecting any level of complexity. For example, the software that takes an order from a customer and sends an acknowledgement, the process that prints payroll checks, and the logic that routes a particular customer call to a particular agent are all application resources.
Historically, application logic has been intertwined with user interface code, data management code, and process or page flow and has lacked well-defined interfaces, which has resulted in monolithic applications that are difficult to change and difficult to integrate.
Service oriented architecture has emerged as a superior model for building applications, and service oriented architecture concepts align exactly with the core tenets of grid computing. Virtualization and provisioning of application resources involves publishing application components as services for use by multiple consumers, which may be people or processes, then orchestrating those services into more powerful business flows.
In the same way that grid computing enables better reuse and more flexibility of IT infrastructure resources, grid computing also treats bits of application logic as a resource, and enables greater reuse of application functionality and more flexibility in changing and building new composite applications.
Furthermore, applications that are orchestrated from published services are able to view activities in a business as a single whole, so that processes are standardized across geography and business units and processes are automated end-to-end. This generates more reliable business processes and lowers cost through increased automation and reduced variability.
Information Grid The third dimension to grid computing, after infrastructure and applications, is information. Today, information tends to be fragmented across a company, making it difficult to see the business as a whole or answer basic questions.about customers. Without information about who the customer is, and what they want to buy, information assets go underexploited.
In contrast, grid computing treats information holistically as a resource, similar to infrastructure and applications resources, and thus extracts more of its latent value. Information grid resources include all data in the enterprise and all metadata required to make that data meaningful. This data may be structured, semi-structured, or unstructured, stored in any location, such as databases, local file systems, or e-mail servers, and created by any application.
The core tenets of grid computing apply similarly to information as they do to infrastructure and applications. The infrastructure grid exploits the power of the network to allow multiple servers or storage devices to be combined toward a single task, then easily reconfigured as needs change. A service oriented architecture, or an applications grid, enables independently developed services, or application resources, to be combined into larger business processes, then adapted as needs change without breaking other parts of the composite application. Similarly, the information grid provides a way for information resources to be joined with related information resources to greater exploit the value of the inherent relationships among information, then for new connections to be made as situations change.
The relational database, for example, was an early information virtualization technology. Unlike its predecessors, the network database and hierarchical database models, in which all relationships between data had to be predetermined, relational database enabled flexible access to a general-purpose information resource. Today, XML furthers information virtualization by providing a standard way to represent information along with metadata, which breaks the hard link between information and a specific application used to create and view that information.
Information provisioning technologies include message queuing, data propagation, replication, extract-transform-load, as well as mapping and cleansing tools to ensure data quality. Data hubs, in which a central operational data store continually syncs with multiple live data sources, are emerging as a preferred model for establishing a single source of truth while maintaining the flexibility of distributed control.
Grid Resources Work Well Independently and Best Together By managing any single IT resource – infrastructure, applications, or information - using grid computing, regardless of how the other resources are treated, enterprises can realize higher quality, more flexibility, and lower costs. For example, there is no need to rewrite applications to benefit from an infrastructure grid. It is also possible to deploy an applications grid, or a service oriented architecture, without changing the way information is managed or the way hardware is configured.
It is possible, however, to derive even greater benefit by using grid computing for all resources. For example, the applications grid becomes even more valuable when you can set policies regarding resource requirements at the level of individual services and have execution of different services in the same composite application handled differently by the infrastructure - something that can only be done by an application grid in combination with an infrastructure grid. In addition, building an information grid by integrating more information into a single source of truth becomes tenable only when the infrastructure is configured as a grid, so it can scale beyond the boundary of a single computer.
On the path toward this grand vision of grid computing, companies need real solutions to support their incremental moves toward a more flexible and more productive IT architecture. The Oracle Database 10g family of software products implements much of the core grid technology to get companies started. And Oracle delivers this grid computing functionality in the context of holistic enterprise architecture, providing a robust security infrastructure, centralized management, intuitive, powerful development tools, and universal access. Oracle Database 10g includes:
Oracle Database 10g
Oracle Application Server 10g
Oracle Enterprise Manager 10g
Oracle Collaboration Suite 10g
Although the grid features of Oracle 10g span all of the products listed above, this discussion will focus on the grid computing capabilities of Oracle Database 10g.
Server Virtualization. Oracle Real Application Clusters 10g (RAC) enable a single database to run across multiple clustered nodes in a grid, pooling the processing resources of several standard machines. Oracle is uniquely flexible in its ability to provision workload across machines because it is the only database technology that does not require data to be partitioned and distributed along with the work. Oracle 10g Release 2 software includes enhancements for balancing connections across RAC instances, based on policies.
Storage Virtualization. The Oracle Automatic Storage Management (ASM) feature of Oracle Database 10g provides a virtualization layer between the database and storage so that multiple disks can be treated as a single disk group and disks can be dynamically added or removed while keeping databases online. Existing data will automatically be spread across available disks for performance and utilization optimization. In Oracle 10g Release 2, ASM supports multiple databases, which could be at different software version levels, accessing the same storage pool.
Grid Management. Because grid computing pools together multiple servers and disks and allocates them to multiple purposes, it becomes more important that individual resources are largely self-managing and that other management functions are centralized.
The Grid Control feature of Oracle Enterprise Manager 10g provides a single console to manage multiple systems together as a logical group. Grid Control manages provisioning of nodes in the grid with the appropriate full stack of software and enables configurations and security settings to be maintained centrally for groups of systems.
Another aspect to grid management is managing user identities in a way that is both highly secure and easy to maintain. Oracle Identity Management 10g includes an LDAP-compliant directory with delegated administration and now, in Release 2, federated identity management so that single sign-on capabilities can be securely shared across security domains. Oracle Identity Management 10g closely adheres to grid principles by utilizing a central point for applications to authenticate users - the single sign-on server - while, behind the scenes, distributing control of identities via delegation and federation to optimize maintainability and overall operation of the system.
Standard Web Services Support. In addition to the robust web services support in Oracle Application Server 10g, Oracle database 10g can publish and consume web services. DML and DDL operations can be exposed as web services, and functions within the database can make a web service appear as a SQL row source, enabling use of powerful SQL tools to analyze web service data in conjunction with relational and non-relational data.
Oracle Enterprise Manager 10g enhances Oracle's support for service oriented architectures by monitoring and managing web services and any other administrator-defined services, tracking end-to-end performance and performing root cause analysis of problems encountered.
Data Provisioning. Information starts with data, which must be provisioned wherever consumers need it. For example, users may be geographically distributed, and fast data access may be more important for these users than access to an identical resource. In these cases, data must be shared between systems, either in bulk or near real time. Oracle's bulk data movement technologies include Transportable Tablespaces and Data Pump.
For more fine-grained data sharing, the Oracle Streams feature of Oracle Database 10g captures database transaction changes and propagates them, thus keeping two or more database copies in sync as updates are applied. It also unifies traditionally distinct data sharing mechanisms, such as message queuing, replication, events, data warehouse loading, notifications and publish/subscribe, into a single technology.
Centralized Data Management. Oracle Database 10g manages all types of structured, semi-structured and unstructured information, representing, maintaining and querying each in its own optimal way while providing common access to all via SQL and XML Query. Along with traditional relational database structures, Oracle natively implements OLAP cubes, standard XML structures, geographic spatial data and unlimited sized file management, thus virtualizing information representation. Combining these information types enables connections between disparate types of information to be made as readily as new connections are made with traditional relational data.
Metadata Management. Oracle Warehouse Builder is more than a traditional batch ETL tool for creating warehouses. It enforces rules to achieve data quality, does fuzzy matching to automatically overcome data inconsistency, and uses statistical analysis to infer data profiles. With Oracle 10g Release 2, its metadata management capabilities are extended from scheduled data pulls to handle a transaction-time data push from an Oracle database implementing the Oracle Streams feature.
Oracle's series of enterprise data hub products (for example, Oracle Customer Data Hub) provide real-time synchronization of operational information sources so that companies can have a single source of truth while retaining separate systems and separate applications, which may include a combination of packaged, legacy and custom applications. In addition to the data cleansing and scheduling mechanisms, Oracle also provides a well-formed schema, established from years of experience building enterprise applications, for certain common types of information, such as customer, financial, and product information.
Metadata Inference. Joining the Oracle 10g software family is the new Oracle Enterprise Search product. Oracle Enterprise Search 10g crawls all information sources in the enterprise, whether public or secure, including e-mail servers, document management servers, file systems, web sites, databases and applications, then returns information from all of the most relevant sources for a given search query. This crawl and index process uses a series of heuristics specific to each data source to infer metadata about all enterprise information that is used to return the most relevant results to any query.
There are two common ways to architect a database: client/server or multitier. As internet computing becomes more prevalent in computing environments, many database management systems are moving to a multitier environment.
Multiprocessing uses more than one processor for a set of related jobs. Distributed processing reduces the load on a single processor by allowing different processors to concentrate on a subset of related tasks, thus improving the performance and capabilities of the system as a whole.
An Oracle database system can easily take advantage of distributed processing by using its client/server architecture. In this architecture, the database system is divided into two parts: a front-end or a client, and a back-end or a server.
The client is a database application that initiates a request for an operation to be performed on the database server. It requests, processes, and presents data managed by the server. The client workstation can be optimized for its job. For example, it might not need large disk capacity, or it might benefit from graphic capabilities.
Often, the client runs on a different computer than the database server, generally on a PC. Many clients can simultaneously run against one server.
The server runs Oracle software and handles the functions required for concurrent, shared data access. The server receives and processes the SQL and PL/SQL statements that originate from client applications. The computer that manages the server can be optimized for its duties. For example, it can have large disk capacity and fast processors.
A multitier architecture has the following components:
A client or initiator process that starts an operation
One or more application servers that perform parts of the operation. An application server provides access to the data for the client and performs some of the query processing, thus removing some of the load from the database server. It can serve as an interface between clients and multiple database servers, including providing an additional level of security.
An end or database server that stores most of the data used in the operation
This architecture enables use of an application server to do the following:
Validate the credentials of a client, such as a Web browser
Connect to an Oracle database server
Perform the requested operation on behalf of the client
If proxy authentication is being used, then the identity of the client is maintained throughout all tiers of the connection.
The following sections explain the physical database structures of an Oracle database, including datafiles, redo log files, and control files.
Every Oracle database has one or more physical datafiles. The datafiles contain all the database data. The data of logical database structures, such as tables and indexes, is physically stored in the datafiles allocated for a database.
The characteristics of datafiles are:
A datafile can be associated with only one database.
Datafiles can have certain characteristics set to let them automatically extend when the database runs out of space.
One or more datafiles form a logical unit of database storage called a tablespace.
Data in a datafile is read, as needed, during normal database operation and stored in the memory cache of Oracle. For example, assume that a user wants to access some data in a table of a database. If the requested information is not already in the memory cache for the database, then it is read from the appropriate datafiles and stored in memory.
Modified or new data is not necessarily written to a datafile immediately. To reduce the amount of disk access and to increase performance, data is pooled in memory and written to the appropriate datafiles all at once, as determined by the database writer process (DBWn) background process.
See Also:"Overview of the Oracle Instance" for more information about Oracle's memory and process structures
Every Oracle database has a control file. A control file contains entries that specify the physical structure of the database. For example, it contains the following information:
Names and locations of datafiles and redo log files
Time stamp of database creation
Oracle can multiplex the control file, that is, simultaneously maintain a number of identical control file copies, to protect against a failure involving the control file.
Every time an instance of an Oracle database is started, its control file identifies the database and redo log files that must be opened for database operation to proceed. If the physical makeup of the database is altered (for example, if a new datafile or redo log file is created), then the control file is automatically modified by Oracle to reflect the change. A control file is also used in database recovery.
Every Oracle database has a set of two or more redo log files. The set of redo log files is collectively known as the redo log for the database. A redo log is made up of redo entries (also called redo records).
The primary function of the redo log is to record all changes made to data. If a failure prevents modified data from being permanently written to the datafiles, then the changes can be obtained from the redo log, so work is never lost.
The information in a redo log file is used only to recover the database from a system or media failure that prevents database data from being written to the datafiles. For example, if an unexpected power outage terminates database operation, then data in memory cannot be written to the datafiles, and the data is lost. However, lost data can be recovered when the database is opened, after power is restored. By applying the information in the most recent redo log files to the database datafiles, Oracle restores the database to the time at which the power failure occurred.
The process of applying the redo log during a recovery operation is called rolling forward.
Oracle recommends that you create a server parameter file (SPFILE) as a dynamic means of maintaining initialization parameters. A server parameter file lets you store and manage your initialization parameters persistently in a server-side disk file.
Oracle Database Administrator's Guide for information on creating and changing parameter files
Each server and background process can write to an associated trace file. When an internal error is detected by a process, it dumps information about the error to its trace file. Some of the information written to a trace file is intended for the database administrator, while other information is for Oracle Support Services. Trace file information is also used to tune applications and instances.
See Also:Oracle Database Administrator's Guide
To restore a file is to replace it with a backup file. Typically, you restore a file when a media failure or user error has damaged or deleted the original file.
User-managed backup and recovery requires you to actually restore backup files before you can perform a trial recovery of the backups.
Server-managed backup and recovery manages the backup process, such as scheduling of backups, as well as the recovery process, such as applying the correct backup file when recovery is needed.
A database is divided into logical storage units called tablespaces, which group related logical structures together. For example, tablespaces commonly group together all application objects to simplify some administrative operations.
Each database is logically divided into one or more tablespaces. One or more datafiles are explicitly created for each tablespace to physically store the data of all logical structures in a tablespace. The combined size of the datafiles in a tablespace is the total storage capacity of the tablespace.
Every Oracle database contains a
SYSTEM tablespace and a
SYSAUX tablespace. Oracle creates them automatically when the database is created. The system default is to create a smallfile tablespace, which is the traditional type of Oracle tablespace. The
SYSAUX tablespaces are created as smallfile tablespaces.
Oracle also lets you create bigfile tablespaces. This allows Oracle Database to contain tablespaces made up of single large files rather than numerous smaller ones. This lets Oracle Database utilize the ability of 64-bit systems to create and manage ultralarge files. The consequence of this is that Oracle Database can now scale up to 8 exabytes in size. With Oracle-managed files, bigfile tablespaces make datafiles completely transparent for users. In other words, you can perform operations on tablespaces, rather than the underlying datafiles.
See Also:"Overview of Tablespaces"
A tablespace can be online (accessible) or offline (not accessible). A tablespace is generally online, so that users can access the information in the tablespace. However, sometimes a tablespace is taken offline to make a portion of the database unavailable while allowing normal access to the remainder of the database. This makes many administrative tasks easier to perform.
At the finest level of granularity, Oracle database data is stored in data blocks. One data block corresponds to a specific number of bytes of physical database space on disk. The standard block size is specified by the
DB_BLOCK_SIZE initialization parameter. In addition, you can specify up to five other block sizes. A database uses and allocates free database space in Oracle data blocks.
The next level of logical database space is an extent. An extent is a specific number of contiguous data blocks, obtained in a single allocation, used to store a specific type of information.
Above extents, the level of logical database storage is a segment. A segment is a set of extents allocated for a certain logical structure. The following table describes the different types of segments.
|Data segment||Each nonclustered table has a data segment. All table data is stored in the extents of the data segment.
For a partitioned table, each partition has a data segment.
Each cluster has a data segment. The data of every table in the cluster is stored in the cluster's data segment.
|Index segment||Each index has an index segment that stores all of its data.
For a partitioned index, each partition has an index segment.
|Temporary segment||Temporary segments are created by Oracle when a SQL statement needs a temporary database area to complete execution. When the statement finishes execution, the extents in the temporary segment are returned to the system for future use.|
|Rollback segment||If you are operating in automatic undo management mode, then the database server manages undo space using tablespaces. Oracle recommends that you use automatic undo management.
Earlier releases of Oracle used rollback segments to store undo information. The information in a rollback segment was used during database recovery for generating read-consistent database information and for rolling back uncommitted transactions for users.
Space management for these rollback segments was complex, and Oracle has deprecated that method. This book discusses the undo tablespace method of managing undo; this eliminates the complexities of managing rollback segment space, and lets you exert control over how long undo is retained before being overwritten.
Oracle does use a
Oracle dynamically allocates space when the existing extents of a segment become full. In other words, when the extents of a segment are full, Oracle allocates another extent for that segment. Because extents are allocated as needed, the extents of a segment may or may not be contiguous on disk.
A schema is a collection of database objects. A schema is owned by a database user and has the same name as that user. Schema objects are the logical structures that directly refer to the database's data. Schema objects include structures like tables, views, and indexes. (There is no relationship between a tablespace and a schema. Objects in the same schema can be in different tablespaces, and a tablespace can hold objects from different schemas.)
Some of the most common schema objects are defined in the following section.
Tables are the basic unit of data storage in an Oracle database. Database tables hold all user-accessible data. Each table has columns and rows. A table that has an employee database, for example, can have a column called employee number, and each row in that column is an employee's number.
Indexes are optional structures associated with tables. Indexes can be created to increase the performance of data retrieval. Just as the index in this manual helps you quickly locate specific information, an Oracle index provides an access path to table data.
When processing a request, Oracle can use some or all of the available indexes to locate the requested rows efficiently. Indexes are useful when applications frequently query a table for a range of rows (for example, all employees with a salary greater than 1000 dollars) or a specific row.
Indexes are created on one or more columns of a table. After it is created, an index is automatically maintained and used by Oracle. Changes to table data (such as adding new rows, updating rows, or deleting rows) are automatically incorporated into all relevant indexes with complete transparency to the users.
Views are customized presentations of data in one or more tables or other views. A view can also be considered a stored query. Views do not actually contain data. Rather, they derive their data from the tables on which they are based, referred to as the base tables of the views.
Like tables, views can be queried, updated, inserted into, and deleted from, with some restrictions. All operations performed on a view actually affect the base tables of the view.
Views provide an additional level of table security by restricting access to a predetermined set of rows and columns of a table. They also hide data complexity and store complex queries.
Clusters are groups of one or more tables physically stored together because they share common columns and are often used together. Because related rows are physically stored together, disk access time improves.
Like indexes, clusters do not affect application design. Whether a table is part of a cluster is transparent to users and to applications. Data stored in a clustered table is accessed by SQL in the same way as data stored in a nonclustered table.
A synonym is an alias for any table, view, materialized view, sequence, procedure, function, package, type, Java class schema object, user-defined object type, or another synonym. Because a synonym is simply an alias, it requires no storage other than its definition in the data dictionary.
See Also:Chapter 5, "Schema Objects" for more information on these and other schema objects
Each Oracle database has a data dictionary. An Oracle data dictionary is a set of tables and views that are used as a read-only reference about the database. For example, a data dictionary stores information about both the logical and physical structure of the database. A data dictionary also stores the following information:
The valid users of an Oracle database
Information about integrity constraints defined for tables in the database
The amount of space allocated for a schema object and how much of it is in use
A data dictionary is created when a database is created. To accurately reflect the status of the database at all times, the data dictionary is automatically updated by Oracle in response to specific actions, such as when the structure of the database is altered. The database relies on the data dictionary to record, verify, and conduct ongoing work. For example, during database operation, Oracle reads the data dictionary to verify that schema objects exist and that users have proper access to them.
See Also:Chapter 7, "The Data Dictionary"
An Oracle database server consists of an Oracle database and an Oracle instance. Every time a database is started, a system global area (SGA) is allocated and Oracle background processes are started. The combination of the background processes and memory buffers is called an Oracle instance.
Some hardware architectures (for example, shared disk systems) enable multiple computers to share access to data, software, or peripheral devices. Real Application Clusters (RAC) takes advantage of such architecture by running multiple instances that share a single physical database. In most applications, RAC enables access to a single database by users on multiple computers with increased performance.
An Oracle database server uses memory structures and processes to manage and access the database. All memory structures exist in the main memory of the computers that constitute the database system. Processes are jobs that work in the memory of these computers.
Oracle creates and uses memory structures to complete several jobs. For example, memory stores program code being run and data shared among users. Two basic memory structures are associated with Oracle: the system global area and the program global area. The following subsections explain each in detail.
The System Global Area (SGA) is a shared memory region that contains data and control information for one Oracle instance. Oracle allocates the SGA when an instance starts and deallocates it when the instance shuts down. Each instance has its own SGA.
Users currently connected to an Oracle database share the data in the SGA. For optimal performance, the entire SGA should be as large as possible (while still fitting in real memory) to store as much data in memory as possible and to minimize disk I/O.
Database buffers store the most recently used blocks of data. The set of database buffers in an instance is the database buffer cache. The buffer cache contains modified as well as unmodified blocks. Because the most recently (and often, the most frequently) used data is kept in memory, less disk I/O is necessary, and performance is improved.
The redo log buffer stores redo entries—a log of changes made to the database. The redo entries stored in the redo log buffers are written to an online redo log, which is used if database recovery is necessary. The size of the redo log is static.
The shared pool contains shared memory constructs, such as shared SQL areas. A shared SQL area is required to process every unique SQL statement submitted to a database. A shared SQL area contains information such as the parse tree and execution plan for the corresponding statement. A single shared SQL area is used by multiple applications that issue the same statement, leaving more shared memory for other uses.
See Also:"SQL Statements" for more information about shared SQL areas
A cursor is a handle or name for a private SQL area in which a parsed statement and other information for processing the statement are kept. (Oracle Call Interface, OCI, refers to these as statement handles.) Although most Oracle users rely on automatic cursor handling of Oracle utilities, the programmatic interfaces offer application designers more control over cursors.
For example, in precompiler application development, a cursor is a named resource available to a program and can be used specifically to parse SQL statements embedded within the application. Application developers can code an application so it controls the phases of SQL statement execution and thus improves application performance.
The Program Global Area (PGA) is a memory buffer that contains data and control information for a server process. A PGA is created by Oracle when a server process is started. The information in a PGA depends on the Oracle configuration.
See Also:Chapter 8, "Memory Architecture"
An Oracle database uses memory structures and processes to manage and access the database. All memory structures exist in the main memory of the computers that constitute the database system. Processes are jobs that work in the memory of these computers.
The architectural features discussed in this section enable the Oracle database to support:
Many users concurrently accessing a single database
The high performance required by concurrent multiuser, multiapplication database systems
Oracle creates a set of background processes for each instance. The background processes consolidate functions that would otherwise be handled by multiple Oracle programs running for each user process. They asynchronously perform I/O and monitor other Oracle process to provide increased parallelism for better performance and reliability.
There are numerous background processes, and each Oracle instance can use several background processes.
See Also:"Background Processes" for more information on some of the most common background processes
A process is a "thread of control" or a mechanism in an operating system that can run a series of steps. Some operating systems use the terms job or task. A process generally has its own private memory area in which it runs.
An Oracle database server has two general types of processes: user processes and Oracle processes.
User processes are created and maintained to run the software code of an application program (such as an OCI or OCCI program) or an Oracle tool (such as Enterprise Manager). User processes also manage communication with the server process through the program interface, which is described in a later section.
Oracle processes are invoked by other processes to perform functions on behalf of the invoking process.
Oracle creates server processes to handle requests from connected user processes. A server process communicates with the user process and interacts with Oracle to carry out requests from the associated user process. For example, if a user queries some data not already in the database buffers of the SGA, then the associated server process reads the proper data blocks from the datafiles into the SGA.
Oracle can be configured to vary the number of user processes for each server process. In a dedicated server configuration, a server process handles requests for a single user process. A shared server configuration lets many user processes share a small number of server processes, minimizing the number of server processes and maximizing the use of available system resources.
On some systems, the user and server processes are separate, while on others they are combined into a single process. If a system uses the shared server or if the user and server processes run on different computers, then the user and server processes must be separate. Client/server systems separate the user and server processes and run them on different computers.
See Also:Chapter 9, "Process Architecture"
This section describes Oracle Net Services, as well as how to start up the database.
Communication protocols define the way that data is transmitted and received on a network. Oracle Net Services supports communications on all major network protocols, including TCP/IP, HTTP, FTP, and WebDAV.
Using Oracle Net Services, application developers do not need to be concerned with supporting network communications in a database application. If a new protocol is used, then the database administrator makes some minor changes, while the application requires no modifications and continues to function.
Oracle Net, a component of Oracle Net Services, enables a network session from a client application to an Oracle database server. Once a network session is established, Oracle Net acts as the data courier for both the client application and the database server. It establishes and maintains the connection between the client application and database server, as well as exchanges messages between them. Oracle Net can perform these jobs because it is located on each computer in the network.
Start an instance.
Mount the database.
Open the database.
A database administrator can perform these steps using the SQL*Plus
STARTUP statement or Enterprise Manager. When Oracle starts an instance, it reads the server parameter file (SPFILE) or initialization parameter file to determine the values of initialization parameters. Then, it allocates an SGA and creates background processes.
The following example describes the most basic level of operations that Oracle performs. This illustrates an Oracle configuration where the user and associated server process are on separate computers (connected through a network).
An instance has started on the computer running Oracle (often called the host or database server).
A computer running an application (a local computer or client workstation) runs the application in a user process. The client application attempts to establish a connection to the server using the proper Oracle Net Services driver.
The server is running the proper Oracle Net Services driver. The server detects the connection request from the application and creates a dedicated server process on behalf of the user process.
The user runs a SQL statement and commits the transaction. For example, the user changes a name in a row of a table.
The server process receives the statement and checks the shared pool for any shared SQL area that contains a similar SQL statement. If a shared SQL area is found, then the server process checks the user's access privileges to the requested data, and the previously existing shared SQL area is used to process the statement. If not, then a new shared SQL area is allocated for the statement, so it can be parsed and processed.
The server process retrieves any necessary data values from the actual datafile (table) or those stored in the SGA.
The server process modifies data in the system global area. The DBWn process writes modified blocks permanently to disk when doing so is efficient. Because the transaction is committed, the LGWR process immediately records the transaction in the redo log file.
If the transaction is successful, then the server process sends a message across the network to the application. If it is not successful, then an error message is transmitted.
Throughout this entire procedure, the other background processes run, watching for conditions that require intervention. In addition, the database server manages other users' transactions and prevents contention between transactions that request the same data.
See Also:Chapter 9, "Process Architecture" for more information about Oracle configuration
Oracle provides several utilities for data transfer, data maintenance, and database administration, including Data Pump Export and Import, SQL*Loader, and LogMiner.
See Also:Chapter 11, "Oracle Utilities"
This section contains the following topics:
Oracle includes several software mechanisms to fulfill the following important requirements of an information management system:
Data concurrency of a multiuser system must be maximized.
Data must be read and modified in a consistent fashion. The data a user is viewing or changing is not changed (by other users) until the user is finished with the data.
High performance is required for maximum productivity from the many users of the database system.
This contains the following sections:
A primary concern of a multiuser database management system is how to control concurrency, which is the simultaneous access of the same data by many users. Without adequate concurrency controls, data could be updated or changed improperly, compromising data integrity.
One way to manage data concurrency is to make each user wait for a turn. The goal of a database management system is to reduce that wait so it is either nonexistent or negligible to each user. All data manipulation language statements should proceed with as little interference as possible, and destructive interactions between concurrent transactions must be prevented. Destructive interaction is any interaction that incorrectly updates data or incorrectly alters underlying data structures. Neither performance nor data integrity can be sacrificed.
Oracle resolves such issues by using various types of locks and a multiversion consistency model. These features are based on the concept of a transaction. It is the application designer's responsibility to ensure that transactions fully exploit these concurrency and consistency features.
Guarantees that the set of data seen by a statement is consistent with respect to a single point in time and does not change during statement execution (statement-level read consistency)
Ensures that readers of database data do not wait for writers or other readers of the same data
Ensures that writers of database data do not wait for readers of the same data
Ensures that writers only wait for other writers if they attempt to update identical rows in concurrent transactions
The simplest way to think of Oracle's implementation of read consistency is to imagine each user operating a private copy of the database, hence the multiversion consistency model.
To manage the multiversion consistency model, Oracle must create a read-consistent set of data when a table is queried (read) and simultaneously updated (written). When an update occurs, the original data values changed by the update are recorded in the database undo records. As long as this update remains part of an uncommitted transaction, any user that later queries the modified data views the original data values. Oracle uses current information in the system global area and information in the undo records to construct a read-consistent view of a table's data for a query.
Only when a transaction is committed are the changes of the transaction made permanent. Statements that start after the user's transaction is committed only see the changes made by the committed transaction.
The transaction is key to Oracle's strategy for providing read consistency. This unit of committed (or uncommitted) SQL statements:
Dictates the start point for read-consistent views generated on behalf of readers
Controls when modified data can be seen by other transactions of the database for reading or updating
By default, Oracle guarantees statement-level read consistency. The set of data returned by a single query is consistent with respect to a single point in time. However, in some situations, you might also require transaction-level read consistency. This is the ability to run multiple queries within a single transaction, all of which are read-consistent with respect to the same point in time, so that queries in this transaction do not see the effects of intervening committed transactions. If you want to run a number of queries against multiple tables and if you are not doing any updating, you prefer a read-only transaction.
Oracle also uses locks to control concurrent access to data. When updating information, the data server holds that information with a lock until the update is submitted or committed. Until that happens, no one else can make changes to the locked information. This ensures the data integrity of the system.
Oracle provides unique non-escalating row-level locking. Unlike other data servers that ÒescalateÓ locks to cover entire groups of rows or even the entire table, Oracle always locks only the row of information being updated. Because Oracle includes the locking information with the actual rows themselves, Oracle can lock an unlimited number of rows so users can work concurrently without unnecessary delays.
Oracle locking is performed automatically and requires no user action. Implicit locking occurs for SQL statements as necessary, depending on the action requested. Oracle's lock manager automatically locks table data at the row level. By locking table data at the row level, contention for the same data is minimized.
Oracle's lock manager maintains several different types of row locks, depending on what type of operation established the lock. The two general types of locks are exclusive locks and share locks. Only one exclusive lock can be placed on a resource (such as a row or a table); however, many share locks can be placed on a single resource. Both exclusive and share locks always allow queries on the locked resource but prohibit other activity on the resource (such as updates and deletes).
Database administrators occasionally need isolation from concurrent non-database administrator actions, that is, isolation from concurrent non-database administrator transactions, queries, or PL/SQL statements. One way to provide such isolation is to shut down the database and reopen it in restricted mode. You could also put the system into quiesced state without disrupting users. In quiesced state, the database administrator can safely perform certain actions whose executions require isolation from concurrent non-DBA users.
Real Application Clusters (RAC) comprises several Oracle instances running on multiple clustered computers, which communicate with each other by means of a so-called interconnect. RAC uses cluster software to access a shared database that resides on shared disk. RAC combines the processing power of these multiple interconnected computers to provide system redundancy, near linear scalability, and high availability. RAC also offers significant advantages for both OLTP and data warehouse systems and all systems and applications can efficiently exploit clustered environments.
You can scale applications in RAC environments to meet increasing data processing demands without changing the application code. As you add resources such as nodes or storage, RAC extends the processing powers of these resources beyond the limits of the individual components.
Oracle provides unique portability across all major platforms and ensures that your applications run without modification after changing platforms. This is because the Oracle code base is identical across platforms, so you have identical feature functionality across all platforms, for complete application transparency. Because of this portability, you can easily upgrade to a more powerful server as your requirements change.
People who administer the operation of an Oracle database system, known as database administrators (DBAs), are responsible for creating Oracle databases, ensuring their smooth operation, and monitoring their use. In addition to the many alerts and advisors Oracle provides, Oracle also offers the following features:
Oracle Database provides a high degree of self-management - automating routine DBA tasks and reducing complexity of space, memory, and resource administration. Oracle self-managing database features include the following: automatic undo management, dynamic memory management, Oracle-managed files, mean time to recover, free space management, multiple block sizes, and Recovery Manager (RMAN).
Enterprise Manager is a system management tool that provides an integrated solution for centrally managing your heterogeneous environment. Combining a graphical console, Oracle Management Servers, Oracle Intelligent Agents, common services, and administrative tools, Enterprise Manager provides a comprehensive systems management platform for managing Oracle products.
From the client interface, the Enterprise Manager Console, you can perform the following tasks:
Administer the complete Oracle environment, including databases, iAS servers, applications, and services
Diagnose, modify, and tune multiple databases
Schedule tasks on multiple systems at varying time intervals
Monitor database conditions throughout the network
Administer multiple network nodes and services from many locations
Share tasks with other administrators
Group related targets together to facilitate administration tasks
Launch integrated Oracle and third-party tools
Customize the display of an Enterprise Manager administrator
See Also:SQL*Plus User's Guide and Reference
Automatic Storage Management automates and simplifies the layout of datafiles, control files, and log files. Database files are automatically distributed across all available disks, and database storage is rebalanced whenever the storage configuration changes. It provides redundancy through the mirroring of database files, and it improves performance by automatically distributing database files across all available disks. Rebalancing of the database's storage automatically occurs whenever the storage configuration changes.
To help simplify management tasks, as well as providing a rich set of functionality for complex scheduling needs, Oracle provides a collection of functions and procedures in the
DBMS_SCHEDULER package. Collectively, these functions are called the Scheduler, and they are callable from any PL/SQL program.
The Scheduler lets database administrators and application developers control when and where various tasks take place in the database environment. For example, database administrators can schedule and monitor database maintenance jobs such as backups or nightly data warehousing loads and extracts.
Traditionally, the operating systems regulated resource management among the various applications running on a system, including Oracle databases. The Database Resource Manager controls the distribution of resources among various sessions by controlling the execution schedule inside the database. By controlling which sessions run and for how long, the Database Resource Manager can ensure that resource distribution matches the plan directive and hence, the business objectives.
See Also:Chapter 14, "Manageability"
In every database system, the possibility of a system or hardware failure always exists. If a failure occurs and affects the database, then the database must be recovered. The goals after a failure are to ensure that the effects of all committed transactions are reflected in the recovered database and to return to normal operation as quickly as possible while insulating users from problems caused by the failure.
Oracle provides various mechanisms for the following:
Database recovery required by different types of failures
Flexible recovery operations to suit any situation
Availability of data during backup and recovery operations so users of the system can continue to work
Several circumstances can halt the operation of an Oracle database. The most common types of failure are described in the following table.
|User error||Requires a database to be recovered to a point in time before the error occurred. For example, a user could accidentally drop a table. To enable recovery from user errors and accommodate other unique recovery requirements, Oracle provides exact point-in-time recovery. For example, if a user accidentally drops a table, the database can be recovered to the instant in time before the table was dropped.|
|Statement failure||Occurs when there is a logical failure in the handling of a statement in an Oracle program. When statement failure occurs, any effects of the statement are automatically undone by Oracle and control is returned to the user.|
|Process failure||Results from a failure in a user process accessing Oracle, such as an abnormal disconnection or process termination. The background process PMON automatically detects the failed user process, rolls back the uncommitted transaction of the user process, and releases any resources that the process was using.|
|Instance failure||Occurs when a problem arises that prevents an instance from continuing work. Instance failure can result from a hardware problem such as a power outage, or a software problem such as an operating system failure. When an instance failure occurs, the data in the buffers of the system global area is not written to the datafiles.
After an instance failure, Oracle automatically performs instance recovery. If one instance in a RAC environment fails, then another instance recovers the redo for the failed instance. In a single-instance database, or in a RAC database in which all instances fail, Oracle automatically applies all redo when you restart the database.
|Media (disk) failure||An error can occur when trying to write or read a file on disk that is required to operate the database. A common example is a disk head failure, which causes the loss of all files on a disk drive.
Different files can be affected by this type of disk failure, including the datafiles, the redo log files, and the control files. Also, because the database instance cannot continue to function properly, the data in the database buffers of the system global area cannot be permanently written to the datafiles.
A disk failure requires you to restore lost files and then perform media recovery. Unlike instance recovery, media recovery must be initiated by the user. Media recovery updates restored datafiles so the information in them corresponds to the most recent time point before the disk failure, including the committed data in memory that was lost because of the failure.
Oracle provides for complete media recovery from all possible types of hardware failures, including disk failures. Options are provided so that a database can be completely recovered or partially recovered to a specific point in time.
If some datafiles are damaged in a disk failure but most of the database is intact and operational, the database can remain open while the required tablespaces are individually recovered. Therefore, undamaged portions of a database are available for normal use while damaged portions are being recovered.
Oracle uses several structures to provide complete recovery from an instance or disk failure: the redo log, undo records, a control file, and database backups.
Note:Because the online redo log is always online, as opposed to an archived copy of a redo log, thus it is usually referred to as simply "the redo log".
The online redo log is a set of two or more online redo log files that record all changes made to the database, including uncommitted and committed changes. Redo entries are temporarily stored in redo log buffers of the system global area, and the background process LGWR writes the redo entries sequentially to an online redo log file. LGWR writes redo entries continually, and it also writes a commit record every time a user process commits a transaction.
Optionally, filled online redo files can be manually or automatically archived before being reused, creating archived redo logs. To enable or disable archiving, set the database in one of the following modes:
ARCHIVELOG: The filled online redo log files are archived before they are reused in the cycle.
NOARCHIVELOG: The filled online redo log files are not archived.
ARCHIVELOG mode, the database can be completely recovered from both instance and disk failure. The database can also be backed up while it is open and available for use. However, additional administrative operations are required to maintain the archived redo log.
If the database redo log operates in
NOARCHIVELOG mode, then the database can be completely recovered from instance failure, but not from disk failure. Also, the database can be backed up only while it is completely closed. Because no archived redo log is created, no extra work is required by the database administrator.
Undo records are stored in undo tablespaces. Oracle uses the undo data for a variety of purposes, including accessing before-images of blocks changed in uncommitted transactions. During database recovery, Oracle applies all changes recorded in the redo log and then uses undo information to roll back any uncommitted transactions.
The control files include information about the file structure of the database and the current log sequence number being written by LGWR. During normal recovery procedures, the information in a control file guides the automatic progression of the recovery operation.
Because one or more files can be physically damaged as the result of a disk failure, media recovery requires the restoration of the damaged files from the most recent operating system backup of a database. You can either back up the database files with Recovery Manager (RMAN), or use operating system utilities. RMAN is an Oracle utility that manages backup and recovery operations, creates backups of database files (datafiles, control files, and archived redo log files), and restores or recovers a database from backups.
Computing environments configured to provide nearly full-time availability are known as high availability systems. Such systems typically have redundant hardware and software that makes the system available despite failures. Well-designed high availability systems avoid having single points-of-failure.
When failures occur, the fail over process moves processing performed by the failed component to the backup component. This process remasters systemwide resources, recovers partial or failed transactions, and restores the system to normal, preferably within a matter of microseconds. The more transparent that fail over is to users, the higher the availability of the system.
Oracle has a number of products and features that provide high availability in cases of unplanned downtime or planned downtime. These include Fast-Start Fault Recovery, Real Application Clusters, Recovery Manager (RMAN), backup and recovery solutions, Oracle Flashback, partitioning, Oracle Data Guard, LogMiner, multiplexed redo log files, online reorganization. These can be used in various combinations to meet specific high availability needs.
See Also:Chapter 17, "High Availability"
This section describes several business intelligence features.
A data warehouse is a relational database designed for query and analysis rather than for transaction processing. It usually contains historical data derived from transaction data, but it can include data from other sources. It separates analysis workload from transaction workload and enables an organization to consolidate data from several sources.
In addition to a relational database, a data warehouse environment includes an extraction, transportation, transformation, and loading (ETL) solution, an online analytical processing (OLAP) engine, client analysis tools, and other applications that manage the process of gathering data and delivering it to business users.
You must load your data warehouse regularly so that it can serve its purpose of facilitating business analysis. To do this, data from one or more operational systems must be extracted and copied into the warehouse. The process of extracting data from source systems and bringing it into the data warehouse is commonly called ETL, which stands for extraction, transformation, and loading.
A materialized view provides access to table data by storing the results of a query in a separate schema object. Unlike an ordinary view, which does not take up any storage space or contain any data, a materialized view contains the rows resulting from a query against one or more base tables or views. A materialized view can be stored in the same database as its base tables or in a different database.
Materialized views stored in the same database as their base tables can improve query performance through query rewrites. Query rewrite is a mechanism where Oracle or applications from the end user or database transparently improve query response time, by automatically rewriting the SQL query to use the materialized view instead of accessing the original tables. Query rewrites are particularly useful in a data warehouse environment.
Data warehousing environments typically have large amounts of data and ad hoc queries, but a low level of concurrent database manipulation language (DML) transactions. For such applications, bitmap indexing provides:
Reduced response time for large classes of ad hoc queries
Reduced storage requirements compared to other indexing techniques
Dramatic performance gains even on hardware with a relatively small number of CPUs or a small amount of memory
Efficient maintenance during parallel DML and loads
Fully indexing a large table with a traditional B-tree index can be prohibitively expensive in terms of space because the indexes can be several times larger than the data in the table. Bitmap indexes are typically only a fraction of the size of the indexed data in the table.
To reduce disk use and memory use (specifically, the buffer cache), you can store tables and partitioned tables in a compressed format inside the database. This often leads to a better scaleup for read-only operations. Table compression can also speed up query execution. There is, however, a slight cost in CPU overhead.
When Oracle runs SQL statements in parallel, multiple processes work together simultaneously to run a single SQL statement. By dividing the work necessary to run a statement among multiple processes, Oracle can run the statement more quickly than if only a single process ran it. This is called parallel execution or parallel processing.
Parallel execution dramatically reduces response time for data-intensive operations on large databases, because statement processing can be split up among many CPUs on a single Oracle system.
Oracle has many SQL operations for performing analytic operations in the database. These include ranking, moving averages, cumulative sums, ratio-to-reports, and period-over-period comparisons.
Application developers can use SQL online analytical processing (OLAP) functions for standard and ad-hoc reporting. For additional analytic functionality, Oracle OLAP provides multidimensional calculations, forecasting, modeling, and what-if scenarios. This enables developers to build sophisticated analytic and planning applications such as sales and marketing analysis, enterprise budgeting and financial analysis, and demand planning systems. Data can be stored in either relational tables or multidimensional objects.
Oracle OLAP provides the query performance and calculation capability previously found only in multidimensional databases to Oracle's relational platform. In addition, it provides a Java OLAP API that is appropriate for the development of internet-ready analytical applications. Unlike other combinations of OLAP and RDBMS technology, Oracle OLAP is not a multidimensional database using bridges to move data from the relational data store to a multidimensional data store. Instead, it is truly an OLAP-enabled relational database. As a result, Oracle provides the benefits of a multidimensional database along with the scalability, accessibility, security, manageability, and high availability of the Oracle database. The Java OLAP API, which is specifically designed for internet-based analytical applications, offers productive data access.
With Oracle Data Mining, data never leaves the database — the data, data preparation, model building, and model scoring results all remain in the database. This enables Oracle to provide an infrastructure for application developers to integrate data mining seamlessly with database applications. Some typical examples of the applications that data mining are used in are call centers, ATMs, ERM, and business planning applications. Data mining functions such as model building, testing, and scoring are provided through a Java API.
See Also:Chapter 16, "Business Intelligence"
Partitioning addresses key issues in supporting very large tables and indexes by letting you decompose them into smaller and more manageable pieces called partitions. SQL queries and DML statements do not need to be modified in order to access partitioned tables. However, after partitions are defined, DDL statements can access and manipulate individuals partitions rather than entire tables or indexes. This is how partitioning can simplify the manageability of large database objects. Also, partitioning is entirely transparent to applications.
Partitioning is useful for many different types of applications, particularly applications that manage large volumes of data. OLTP systems often benefit from improvements in manageability and availability, while data warehousing systems benefit from performance and manageability.
Oracle includes datatypes to handle all the types of rich Internet content such as relational data, object-relational data, XML, text, audio, video, image, and spatial. These datatypes appear as native types in the database. They can all be queried using SQL. A single SQL statement can include data belonging to any or all of these datatypes.
XML, eXtensible Markup Language, is the standard way to identify and describe data on the Web. Oracle XML DB treats XML as a native datatype in the database. Oracle XML DB offers a number of easy ways to create XML documents from relational tables. The result of any SQL query can be automatically converted into an XML document. Oracle also includes a set of utilities, available in Java and C++, to simplify the task of creating XML documents.
The LOB datatypes
BFILE enable you to store and manipulate large blocks of unstructured data (such as text, graphic images, video clips, and sound waveforms) in binary or character format. They provide efficient, random, piece-wise access to the data.
Oracle Text indexes any document or textual content to add fast, accurate retrieval of information. Oracle Text allows text searches to be combined with regular database searches in a single SQL statement. The ability to find documents based on their textual content, metadata, or attributes, makes the Oracle Database the single point of integration for all data management.
The Oracle Text SQL API makes it simple and intuitive for application developers and DBAs to create and maintain Text indexes and run Text searches.
Oracle Ultra Search lets you index and search Web sites, database tables, files, mailing lists, Oracle Application Server Portals, and user-defined data sources. As such, you can use Oracle Ultra Search to build different kinds of search applications.
Oracle interMedia provides an array of services to develop and deploy traditional, Web, and wireless applications that include image, audio, and video in an integrated fashion. Multimedia content can be stored and managed directly in Oracle, or Oracle can store and index metadata together with external references that enable efficient access to media content stored outside the database.
Oracle includes built-in spatial features that let you store, index, and manage location content (assets, buildings, roads, land parcels, sales regions, and so on.) and query location relationships using the power of the database. The Oracle Spatial Option adds advanced spatial features such as linear reference support and coordinate systems.
See Also:Chapter 19, "Content Management"
Oracle includes security features that control how a database is accessed and used. For example, security mechanisms:
Prevent unauthorized database access
Prevent unauthorized access to schema objects
Audit user actions
Associated with each database user is a schema by the same name. By default, each database user creates and has access to all objects in the corresponding schema.
Database security can be classified into two categories: system security and data security.
Valid user name/password combinations
The amount of disk space available to a user's schema objects
The resource limits for a user
System security mechanisms check whether a user is authorized to connect to the database, whether database auditing is active, and which system operations a user can perform.
Which users have access to a specific schema object and the specific types of actions allowed for each user on the schema object (for example, user
SCOTT can issue
INSERT statements but not
DELETE statements using the
The actions, if any, that are audited for each schema object
Data encryption to prevent unauthorized users from bypassing Oracle and accessing data
The Oracle database provides discretionary access control, which is a means of restricting access to information based on privileges. The appropriate privilege must be assigned to a user in order for that user to access a schema object. Appropriately privileged users can grant other users privileges at their discretion.
Oracle manages database security using several different facilities:
Authentication to validate the identity of the entities using your networks, databases, and applications
Authorization processes to limit access and actions, limits that are linked to user's identities and roles.
Access restrictions on objects, like tables or rows.
See Also:Chapter 20, "Database Security"
Data must adhere to certain business rules, as determined by the database administrator or application developer. For example, assume that a business rule says that no row in the
inventory table can contain a numeric value greater than nine in the
sale_discount column. If an
UPDATE statement attempts to violate this integrity rule, then Oracle must undo the invalid statement and return an error to the application. Oracle provides integrity constraints and database triggers to manage data integrity rules.
Note:Database triggers let you define and enforce integrity rules, but a database trigger is not the same as an integrity constraint. Among other things, a database trigger does not check data already loaded into a table. Therefore, it is strongly recommended that you use database triggers only when the integrity rule cannot be enforced by integrity constraints.
An integrity constraint is a declarative way to define a business rule for a column of a table. An integrity constraint is a statement about table data that is always true and that follows these rules:
If an integrity constraint is created for a table and some existing table data does not satisfy the constraint, then the constraint cannot be enforced.
After a constraint is defined, if any of the results of a DML statement violate the integrity constraint, then the statement is rolled back, and an error is returned.
Integrity constraints are defined with a table and are stored as part of the table's definition in the data dictionary, so that all database applications adhere to the same set of rules. When a rule changes, it only needs be changed once at the database level and not many times for each application.
NULL: Disallows nulls (empty entries) in a table's column.
KEY: Disallows duplicate values in a column or set of columns.
KEY: Disallows duplicate values and nulls in a column or set of columns.
KEY: Requires each value in a column or set of columns to match a value in a related table's
KEY integrity constraints also define referential integrity actions that dictate what Oracle should do with dependent data if the data it references is altered.
CHECK: Disallows values that do not satisfy the logical expression of the constraint.
Key is used in the definitions of several types of integrity constraints. A key is the column or set of columns included in the definition of certain types of integrity constraints. Keys describe the relationships between the different tables and columns of a relational database. Individual values in a key are called key values.
The different types of keys include:
Primary key: The column or set of columns included in the definition of a table's
KEY constraint. A primary key's values uniquely identify the rows in a table. Only one primary key can be defined for each table.
Unique key: The column or set of columns included in the definition of a
Foreign key: The column or set of columns included in the definition of a referential integrity constraint.
Referenced key: The unique key or primary key of the same or a different table referenced by a foreign key.
See Also:Chapter 21, "Data Integrity"
Triggers are procedures written in PL/SQL, Java, or C that run (fire) implicitly whenever a table or view is modified or when some user actions or database system actions occur.
Triggers supplement the standard capabilities of Oracle to provide a highly customized database management system. For example, a trigger can restrict DML operations against a table to those issued during regular business hours.
See Also:Chapter 22, "Triggers"
A distributed environment is a network of disparate systems that seamlessly communicate with each other. Each system in the distributed environment is called a node. The system to which a user is directly connected is called the local system. Any additional systems accessed by this user are called remote systems. A distributed environment allows applications to access and exchange data from the local and remote systems. All the data can be simultaneously accessed and modified.
A homogeneous distributed database system is a network of two or more Oracle databases that reside on one or more computers. Distributed SQL enables applications and users to simultaneously access or modify the data in several databases as easily as they access or modify a single database.
An Oracle distributed database system can be transparent to users, making it appear as though it is a single Oracle database. Companies can use this distributed SQL feature to make all its Oracle databases look like one and thus reduce some of the complexity of the distributed system.
Oracle uses database links to enable users on one database to access objects in a remote database. A local user can access a link to a remote database without having to be a user on the remote database.
Location transparency occurs when the physical location of data is transparent to the applications and users. For example, a view that joins table data from several databases provides location transparency because the user of the view does not need to know from where the data originates.
Oracle's provides query, update, and transaction transparency. For example, standard SQL statements like
DELETE work just as they do in a non-distributed database environment. Additionally, applications control transactions using the standard SQL statements
ROLLBACK. Oracle ensures the integrity of data in a distributed transaction using the two-phase commit mechanism.
Oracle Streams enables the propagation and management of data, transactions, and events in a data stream either within a database, or from one database to another. The stream routes published information to subscribed destinations. As users' needs change, they can simply implement a new capability of Oracle Streams, without sacrificing existing capabilities.
Oracle Streams provides a set of elements that lets users control what information is put into a stream, how the stream flows or is routed from node to node, what happens to events in the stream as they flow into each node, and how the stream terminates. By specifying the configuration of the elements acting on the stream, a user can address specific requirements, such as message queuing or data replication.
Oracle Streams implicitly and explicitly captures events and places them in the staging area. Database events, such as DML and DDL, are implicitly captured by mining the redo log files. Sophisticated subscription rules can determine what events should be captured.
The staging area is a queue that provides a service to store and manage captured events. Changes to database tables are formatted as logical change records (LCR), and stored in a staging area until subscribers consume them. LCR staging provides a holding area with security, as well as auditing and tracking of LCR data.
Messages in a staging area are consumed by the apply engine, where changes are applied to a database or consumed by an application. A flexible apply engine allows use of a standard or custom apply function. Support for explicit dequeue lets application developers use Oracle Streams to reliably exchange messages. They can also notify applications of changes to data, by still leveraging the change capture and propagation features of Oracle Streams.
Oracle Streams Advanced Queuing is built on top of the flexible Oracle Streams infrastructure. It provides a unified framework for processing events. Events generated in applications, in workflow, or implicitly captured from redo logs or database triggers can be captured in a queue. These events can be consumed in a variety of ways. They can be automatically applied with a user-defined function or database table operation, can be explicitly dequeued, or a notification can be sent to the consuming application. These events can be transformed at any stage. If the consuming application is on a different database, then the events are automatically propagated to the appropriate database. Operations on these events can be automatically audited, and the history can be retained for the user-specified duration.
Replication is the maintenance of database objects in two or more databases. Oracle Streams provides powerful replication features that can be used to keep multiple copies of distributed objects synchronized.
Oracle Streams automatically determines what information is relevant and shares that information with those who need it. This active sharing of information includes capturing and managing events in the database including data changes with DML and propagating those events to other databases and applications. Data changes can be applied directly to the replica database, or can call a user-defined procedure to perform alternative work at the destination database, for example, populate a staging table used to load a data warehouse.
Oracle Streams is an open information sharing solution, supporting heterogeneous replication between Oracle and non-Oracle systems. Using a transparent gateway, DML changes initiated at Oracle databases can be applied to non-Oracle platforms.
Oracle Streams is fully inter-operational with materialized views, or snapshots, which can maintain updatable or read-only, point-in-time copies of data. They can contain a full copy of a table or a defined subset of the rows in the master table that satisfy a value-based selection criterion. There can be multitier materialized views as well, where one materialized view is a subset of another materialized view. Materialized views are periodically updated, or refreshed, from their associated master tables through transactionally consistent batch updates.
Oracle Transparent Gateways and Generic Connectivity extend Oracle distributed features to non-Oracle systems. Oracle can work with non-Oracle data sources, non-Oracle message queuing systems, and non-SQL applications, ensuring interoperability with other vendor's products and technologies.
They translate third party SQL dialects, data dictionaries, and datatypes into Oracle formats, thus making the non-Oracle data store appear as a remote Oracle database. These technologies enable companies to seamlessly integrate the different systems and provide a consolidated view of the company as a whole.
Oracle Transparent Gateways and Generic Connectivity can be used for synchronous access, using distributed SQL, and for asynchronous access, using Oracle Streams. Introducing a Transparent Gateway into an Oracle Streams environment enables replication of data from an Oracle database to a non-Oracle database.
Generic Connectivity is a generic solution, while Oracle Transparent Gateways are tailored solutions, specifically coded for the non-Oracle system.
See Also:Chapter 23, "Information Integration"
SQL and PL/SQL form the core of Oracle's application development stack. Not only do most enterprise back-ends run SQL, but Web applications accessing databases do so using SQL (wrappered by Java classes as JDBC), Enterprise Application Integration applications generate XML from SQL queries, and content-repositories are built on top of SQL tables. It is a simple, widely understood, unified data model. It is used standalone in many applications, but it is also invoked directly from Java (JDBC), Oracle Call Interface (OCI), Oracle C++ Call Interface (OCCI), or XSU (XML SQL Utility). Stored packages, procedures, and triggers can all be written in PL/SQL or in Java.
This section contains the following topics:
SQL (pronounced SEQUEL) is the programming language that defines and manipulates the database. SQL databases are relational databases, which means that data is stored in a set of simple relations.
All operations on the information in an Oracle database are performed using SQL statements. A SQL statement is a string of SQL text. A statement must be the equivalent of a complete SQL sentence, as in:
SELECT last_name, department_id FROM employees;
Only a complete SQL statement can run successfully. A sentence fragment, such as the following, generates an error indicating that more text is required:
A SQL statement can be thought of as a very simple, but powerful, computer program or instruction. SQL statements are divided into the following categories:
These statements create, alter, maintain, and drop schema objects. DDL statements also include statements that permit a user to grant other users the privileges to access the database and specific objects within the database.
These statements manipulate data. For example, querying, inserting, updating, and deleting rows of a table are all DML operations. The most common SQL statement is the
SELECT statement, which retrieves data from the database. Locking a table or view and examining the execution plan of a SQL statement are also DML operations.
These statements manage the changes made by DML statements. They enable a user to group changes into logical transactions. Examples include
These statements let a user control the properties of the current session, including enabling and disabling roles and changing language settings. The two session control statements are
These statements change the properties of the Oracle database instance. The only system control statement is
SYSTEM. It lets users change settings, such as the minimum number of shared servers, kill a session, and perform other tasks.
These statements incorporate DDL, DML, and transaction control statements in a procedural language program, such as those used with the Oracle precompilers. Examples include
PL/SQL is Oracle's procedural language extension to SQL. PL/SQL combines the ease and flexibility of SQL with the procedural functionality of a structured programming language, such as
When designing a database application, consider the following advantages of using stored PL/SQL:
PL/SQL code can be stored centrally in a database. Network traffic between applications and the database is reduced, so application and system performance increases. Even when PL/SQL is not stored in the database, applications can send blocks of PL/SQL to the database rather than individual SQL statements, thereby reducing network traffic.
Data access can be controlled by stored PL/SQL code. In this case, PL/SQL users can access data only as intended by application developers, unless another access route is granted.
PL/SQL blocks can be sent by an application to a database, running complex operations without excessive network traffic.
Oracle supports PL/SQL Server Pages, so your application logic can be invoked directly from your Web pages.
The following sections describe the PL/SQL program units that can be defined and stored centrally in a database.
Program units are stored procedures, functions, packages, triggers, and autonomous transactions.
Procedures and functions are sets of SQL and PL/SQL statements grouped together as a unit to solve a specific problem or to perform a set of related tasks. They are created and stored in compiled form in the database and can be run by a user or a database application.
Procedures and functions are identical, except that functions always return a single value to the user. Procedures do not return values.
Packages encapsulate and store related procedures, functions, variables, and other constructs together as a unit in the database. They offer increased functionality (for example, global package variables can be declared and used by any procedure in the package). They also improve performance (for example, all objects of the package are parsed, compiled, and loaded into memory once).
Java is an object-oriented programming language efficient for application-level programs. Oracle provides all types of JDBC drivers and enhances database access from Java applications. Java Stored Procedures are portable and secure in terms of access control, and allow non-Java and legacy applications to transparently invoke Java.
Oracle Database developers have a choice of languages for developing applications—C, C++, Java, COBOL, PL/SQL, and Visual Basic. The entire functionality of the database is available in all the languages. All language-specific standards are supported. Developers can choose the languages in which they are most proficient or one that is most suitable for a specific task. For example an application might use Java on the server side to create dynamic Web pages, PL/SQL to implement stored procedures in the database, and C++ to implement computationally intensive logic in the middle tier.
The Oracle Call Interface (OCI) is a C data access API for Oracle Database. It supports the entire Oracle Database feature set. Many data access APIs, such as OCCI, ODBC, Oracle JDBC Type2 drivers, and so on, are built on top of OCI. OCI provides powerful functionality to build high performance, secure, scalable, and fault-tolerant applications. OCI is also used within the server for the data access needs of database kernel components, along with distributed database access. OCI lets an application developer use C function calls to access the Oracle data server and control all phases of business logic execution. OCI is exposed as a library of standard database access and retrieval functions in the form of a dynamic runtime library that can be linked in by the application.
The Oracle C++ Call Interface (OCCI) is a C++ API that lets you use the object-oriented features, native classes, and methods of the C++ programing language to access the Oracle database. The OCCI interface is modeled on the JDBC interface. OCCI is built on top of OCI and provides the power and performance of OCI using an object-oriented paradigm.
Open database connectivity (ODBC), is a database access API that lets you connect to a database and then prepare and run SQL statements against the database. In conjunction with an ODBC driver, an application can access any data source including data stored in spreadsheets, like Excel.
Oracle offers a variety of data access methods from COM-based programming languages, such as Visual Basic and Active Server Pages. These include Oracle Objects for OLE (OO40) and the Oracle Provider for OLE DB. Oracle also provides .NET data access support through the Oracle Data Provider for .NET. Oracle also support OLE DB .NET and ODBC .NET.
Oracle also provides the Pro* series of precompilers, which allow you to embed SQL and PL/SQL in your C, C++, or COBOL applications.
A transaction is a logical unit of work that comprises one or more SQL statements run by a single user. According to the ANSI/ISO SQL standard, with which Oracle is compatible, a transaction begins with the user's first executable SQL statement. A transaction ends when it is explicitly committed or rolled back by that user.
Note:Oracle is broadly compatible with the SQL-99 Core specification.
Transactions let users guarantee consistent changes to data, as long as the SQL statements within a transaction are grouped logically. A transaction should consist of all of the necessary parts for one logical unit of work—no more and no less. Data in all referenced tables are in a consistent state before the transaction begins and after it ends. Transactions should consist of only the SQL statements that make one consistent change to the data.
Consider a banking database. When a bank customer transfers money from a savings account to a checking account, the transaction can consist of three separate operations: decrease the savings account, increase the checking account, and record the transaction in the transaction journal.
The transfer of funds (the transaction) includes increasing one account (one SQL statement), decreasing another account (one SQL statement), and recording the transaction in the journal (one SQL statement). All actions should either fail or succeed together; the credit should not be committed without the debit. Other nonrelated actions, such as a new deposit to one account, should not be included in the transfer of funds transaction. Such statements should be in other transactions.
Oracle must guarantee that all three SQL statements are performed to maintain the accounts in proper balance. When something prevents one of the statements in the transaction from running (such as a hardware failure), then the other statements of the transaction must be undone. This is called rolling back. If an error occurs in making any of the updates, then no updates are made.
See Also:Oracle Database SQL Reference for information about Oracle's compliance with ANSI/ISO standards
The changes made by the SQL statements that constitute a transaction can be either committed or rolled back. After a transaction is committed or rolled back, the next transaction begins with the next SQL statement.
To commit a transaction makes permanent the changes resulting from all DML statements in the transaction. The changes made by the SQL statements of a transaction become visible to any other user's statements whose execution starts after the transaction is committed.
To undo a transaction retracts any of the changes resulting from the SQL statements in the transaction. After a transaction is rolled back, the affected data is left unchanged, as if the SQL statements in the transaction were never run.
Savepoints divide a long transaction with many SQL statements into smaller parts. With savepoints, you can arbitrarily mark your work at any point within a long transaction. This gives you the option of later rolling back all work performed from the current point in the transaction to a declared savepoint within the transaction.
See Also:Chapter 4, "Transaction Management"
Each column value and constant in a SQL statement has a datatype, which is associated with a specific storage format, constraints, and a valid range of values. When you create a table, you must specify a datatype for each of its columns.
New object types can be created from any built-in database types or any previously created object types, object references, and collection types. Metadata for user-defined types is stored in a schema available to SQL, PL/SQL, Java, and other published interfaces.
An object type differs from native SQL datatypes in that it is user-defined, and it specifies both the underlying persistent data (attributes) and the related behaviors (methods). Object types are abstractions of the real-world entities, for example, purchase orders.
Object types and related object-oriented features, such as variable-length arrays and nested tables, provide higher-level ways to organize and access data in the database. Underneath the object layer, data is still stored in columns and tables, but you can work with the data in terms of the real-world entities--customers and purchase orders, for example--that make the data meaningful. Instead of thinking in terms of columns and tables when you query the database, you can simply select a customer.
Oracle databases can be deployed anywhere in the world, and a single instance of an Oracle database can be accessed by users across the globe. Information is presented to each user in the language and format specific to his or her location.
The Globalization Development Kit (GDK) simplifies the development process and reduces the cost of developing internet applications for a multilingual market. GDK lets a single program work with text in any language from anywhere in the world.