Oracle9i Application Developer's Guide - Large Objects (LOBs)
Release 1 (9.0.1)

Part Number A88879-01
Go To Documentation Library
Home
Go To Product List
Book List
Go To Table Of Contents
Contents
Go To Index
Index

Master Index

Feedback

Go to previous page Go to beginning of chapter Go to next page

LOBs Case Studies, 2 of 3


Building a Multimedia Repository

This description has been extracted from an article by Samir S. Shah in Java Developer's Journal. Reprinted by permission of Java Developer's Journal.

Toolset Used

Today building an information repository is essential for businesses. the information respository helps establish a paperless office and allows data to be shared in or outside an enterprise.

With the toolset shown above, you can build an enterprise-class, scalable web-enabled multimedia-rich information repository that incorporates various forms of media. This repository includes unstructured data, such as document files, video clips, photographs, ... and sound files. It uses Java and Oracle's Large Objects (LOBs).

This section describes how you can build such an information repository for storing and searching documents such as Microsoft Word, HTML, and XML files which are stored in a LOB column of a database table.

The example used here populates the repository with Microsoft Word resumes, indexes it using Oracle Text (interMedia Text), and reads the repository using Java streams from a servlet.

See Figure 14-1.

Figure 14-1 Data Repository Using Oracle and Java


Text description of samshah.gif follows.
Text description of the illustration samshah.gif

Building repositories using Java and Oracle8i/9i has several benefits. The documents can inherently take advantage of the transaction management and ACID (Atomocity, Concurrency, Integrity, and Durability) properties of the relational database. This means that changes to an internal LOB can be committed or rolled-back. Moreover, because the unstructured data is stored by the database, you applications can seamlessly take advantage of database features such as backup and recovery. This helps Administrators who would no longer have to perform separate database and file system backups for relational information and documents.

All data in the database, including structured (relational) and unstructured (document files), can be written, searched, and accessed using SQL. The SQL statements can be executed from Java using JDBC.

How this Application Uses LOBs

Oracle8i and Oracle9i support several types of LOB columns. One type, BLOBs, can house binary information such as audio, video, images, and couments internally in the database. Each rrow can store up to 4 gigabytes of data. The application described here uses a BLOB data type to store Micorsoft Word resumes.

The Oracle database stores a locator in-line with the data. The locator is a pointer to the actual location of the data (LOB value). The LOB data can be stored in the same or a separate table. the advantage of using the locator is that the database will not have to scan the LOB data each time it reads multiple rows because only the LOB locator value is read. The actual LOB data is read only when required.

When working with Java and LOBs, first execute the SELECT statement to get the LOB locator, then read or write LOBs using JDBC.


Note:

The JDBC driver's Oracle type extension package, oracle.sql, is used to read and write from an oracle database. 


The actual LOB data is materialized as a java stream from the database, where the locator represents the data in the table. The following code reads the resume of an employee whose employee number is 7900. Employee number is stored in a LOB column called "resume" in table, sam_emp.

Statement st = cn.createStatement();
     ResultSet rs = st.executeQuery
     ("Select resume from sam_emp where empno=7900");
     rs.next();     
oracle.sql.BLOB blob=((OracleResultSet)rs).getBLOB(1);
InputStream is=blob.getBinaryStream();

Populating the Repository

The documents can be written to LOB columns using Java, PL/SQL, or a bulk loading utility called Oracle SQL*Loader. To insert a new row, perform the following:

  1. Execute the SQL insert statement with an empty blob.

  2. Query the same row to get the locator object.Use this to write your document to the LOB column.


    Note:

    Java streams are employed to write the documents to the LOB column. 


  3. Create the Java output stream using the getBinaryOutputStream() method of this object to write your document or any binary information to that column. For example, to insert information about a new employee whose employee number is 9001 in table sam_emp, first insert all the structured information along with an empty BLOB using JDBC. next select the LOB column, resume, of the same row to get the oracle.sql.BLOB object (the locator).

  4. Finally, create the Java output stream from this object. For example, to insert infomration about a new employee whoe employee number is 9001 in the sam_emp table, first insert all the structured data along with an empty BLOB, using JDBC. Next, select the LOB column column, resume, from the same row to get the oracle.sql.BLOB object (the locator). Finally, create the Java output stream from this object:

    st.execute("INSERT INTO sam_emp(empno, resume) 
           VALUES(9001,empty_blob())");
    ResultSet rs = st.executeQuery(
       "select resume from sam_emp where empno=9001 for update");
       rs.next();
    oracle.sql.BLOB blob = ((OracleResultSet)rs).getBLOB(1);
    OutputStream os = blob.getBinaryOutputStream();
    
    
    

    Optionally, you can use java.awt.FileDialog class and java.io package to dynamically select and read a file from your PC. Then, load it to a LOB column using the above code.

The way you search and retrieve documents does not depend on how you load the documents. For example, you can store the documents using PL/SQL or SQL*Loader, then search and retrieve using Java servlets.

The following example loads an employee's resumé using PL/SQL, to the resume column of the sam_emp table.

Example 1: Inserting a Word document into a BLOB Column using PL/SQL

The code below (steps 2-5) inserts MyResume.doc in the resume column of sam_emp table.

  1. Create a directory object in Oracle. Here is how to create a directory object called MY_FILES which represents C:\MY_DATA directory.

    You must have CREATE DIRECTORY privilege in Oracle.

    create or replace directory MY_FILES as 'C:\MY_DATA';
    
    
    
  2. Insert a row with empty BLOB in your table and return the locater.

  3. Point to the Word file to be loaded from the directory created in Step 1, using the BFILE data type.

  4. Open the file and use the locater from step 2 to insert the file.

  5. Close the file and commit the transaction.

    
    declare
        f_lob   bfile;
        b_lob   blob;
    
    begin
    
        insert into sam_emp(empno,ename,resume) 
     values ( 9001, 'Samir',empty_blob() )
     return documents into b_lob;
    
        f_lob := bfilename( 'MY_FILES', 'MyResume.doc' );
        dbms_lob.fileopen(f_lob, dbms_lob.file_readonly);
        dbms_lob.loadfromfile
       ( b_lob, f_lob, dbms_lob.getlength(f_lob) );
        dbms_lob.fileclose(f_lob);
    
        commit;
    
    end;
    /
    

Searching the Repository

Documents stored in the LOB columns can be indexed using Oracle9i Text (interMedia Text). Oracle9i Text provides you with advanced search capabilities such as fuzzy, stemming, proxy, phrases, and more. It can also generate thematic searches and gist. The documents can be indexed using `create index' database command.

See Also:

 

How the Index Was Built on Table sam_emp, resume Column

The following code shows you how the index was built on the resume column of the sam_emp table. Once the index is created, the Java applications can search the repository by simply submitting SELECT statements.

The steps listed below index all the Microsoft Word formated resumes stored in the resume column to the sam_emp table.The resumes can then be searched using SQL.

  1. Add primary key to your table if it does not exist. To make empno primary key of the sam_emp table execute following command:

    alter table sam_emp add constraint 
    pk_sam_emp primary key(empno);
    
    
    
  2. Get the privileges (ctxapp role) to create text indexes from your administrators.

  3. Create the index with appropriate filter object. Filters determine how to extract text for document indexing from word processor, formatted documents as well as plain text.

    create index ctx_doc_idx on sam_emp(resume)
    indextype is ctxsys.context parameters
    ('filter CTXSYS.INSO_FILTER');
    
    

    See Also:

    for a complete list of filters. 

MyServletCtx Servlet

The following code lists the servlet `MyServletCtx'. It searches the term passed to it as a parameter in the resume column of table, sam_emp. The servlet returns the rows matching the search criteria in HTML table format. The employee names in the HTML table are hyperlinked to another servlet, `MyServlet`, which reads the entire resumé from the database, in its original format.

MyServletCtx.java

1234567890123456789012345678901234567890123456789012
package package1;
import javax.servlet.*;
import javax.servlet.http.*;
import java.io.*;
import java.sql.*;

/**
* This servlet searches documents stored in Oracle8i
* database repository using SQL and JDBC. The hit
* list is displayed in html table with hyper links.
* JDK 1.1.7 and Oracle Thin JDBC 1.22 complient 
* driver is used.
*
* @author Samir Shah
* @version 1.0
**/
public class MyServletCtx extends HttpServlet{
   Connection cn;

   public void init(ServletConfig parm1) 
   throws ServletException {
    super.init( parm1);
    try{
    DriverManager.registerDriver(
 (new oracle.jdbc.driver.OracleDriver()));
    cn =DriverManager.getConnection
 ("jdbc:oracle:thin:@sshah:1521:o8i",
          "scott", "tiger");
    }
    catch (SQLException se){se.printStackTrace();}
  }

  public void doGet(HttpServletRequest req,
     HttpServletResponse res) throws IOException{

      doPost(req,res);
  }
  
  public void doPost(HttpServletRequest req,
    HttpServletResponse res) throws IOException{

    PrintWriter out = res.getWriter();
    res.setContentType("text/html");

    //The term to search in resume column
    String term = req.getParameter("term");
    if (term == null)
       term="security";

    out.print("<html>");
    out.print("<body>");
    out.print("<H1>Search Result</H1>");
    out.print("<table border=1 bgcolor=lightblue>");
    out.print("<tr><th>ID#</th><th>Name</th></tr>");
    out.print("<tr>");
    try{
     Statement st = cn.createStatement();

     //search the term in resume column using SQL
     String query =
         "Select empno,ename from sam_emp" +
         " where contains(resume,'"+term+"')>0";

     ResultSet rs = st.executeQuery(query);

      while (rs.next()){
        out.print("<td>"+ rs.getInt(1)+"</td>");
        out.print("<td>" +
          "<A HREF=http://sshah:8080/" +
          "servlet/MyServlet?term=" +
          rs.getString(1) +
          " target=Document>" + 
          rs.getString(2) +
          "</A></td>");
        out.print("</tr>");
      }

      out.print("</table>");
      out.print("</body>");
      out.print("</html>");
     }//try
     catch (SQLException se){se.printStackTrace();}

  }
} 

Retrieving Data from the Repository

The document retrieval using Java is similar to writing documents to the repository. The section, "How this Application Uses LOBs" describes how to read LOBs from the database.

The following code in `MyServlet' reads a Microsoft Word resumé from the table, sam_emp. It sets the content type, then streams it out to the browser using an output stream.

MyServlet.java

1234567890123456789012345678901234567890123456789012
package package1;

import javax.servlet.*;
import javax.servlet.http.*;
import java.sql.*;
import java.io.*;
import oracle.jdbc.driver.*;
import oracle.sql.*;  //for oracle.sql.BLOB

/**
* This class reads the entire document from the 
* resume LOB column. It takes one parameter,term,
* to search a specific employee from the sam_emp
* table and returns the doucement stored in that
* row.
*
* JDK 1.1.7, Oracle Thin JDBC 1.22 complient driver
* Use Oracle JDBC Type extends package oracle.sql.
*
* @author Samir Shah
* @version 1.0
**/
public class MyServlet extends HttpServlet{
  Connection cn;

  public void doGet(HttpServletRequest req,
  HttpServletResponse res)
  {
   try{
    doPost(req,res);
   }catch (IOException ie){ie.printStackTrace();}
  }

  public void init(ServletConfig parm1) 
  throws ServletException 
  {

    super.init( parm1);
    try{
    DriverManager.registerDriver(
    (new oracle.jdbc.driver.OracleDriver()));
    cn =DriverManager.getConnection(
        "jdbc:oracle:thin:@sshah:1521:o8i",
        "scott", "tiger");
    }
    catch (SQLException se){se.printStackTrace();}
  }

  public void doPost(HttpServletRequest req,
    HttpServletResponse res) throws IOException
  {
     InputStream is=null;
     oracle.sql.BLOB blob=null;

     res.setContentType("application/msword");
     OutputStream os = res.getOutputStream();
     String term = req.getParameter("term");

     if (term==null)
        term="9001";

     try{
     Statement st = cn.createStatement();
     ResultSet rs = st.executeQuery
          ("Select resume from sam_emp"+
           " where empno="+term);
          
      while (rs.next()){
          blob=((OracleResultSet)rs).getBLOB(1);
          is=blob.getBinaryStream();
      }

      int pos=0;
      int length=0;
      byte[] b = new byte[blob.getChunkSize()];

      while((length=is.read(b))!= -1){
       pos+=length;
       os.write(b);
      }
     }//try
     catch (SQLException se)
{
     se.printStackTrace();
        }
     finally {
                is.close();
     }

  }

}

Summary

This section showed you how to store, search and retrieve Word documents using LOB data types and Java.

You can also store, index, parse and transform XML documents using the Oracle9i database. By storing XML documents in the database, there is no need to administer and manage multiple repositories for relational and XML data. Oracle9i and Oracl9i Application Server are XML-enabled whereby you can run the Oracle XML Parser for Java and parse and transform XML files in the database before outputting to an application server.


Go to previous page Go to beginning of chapter Go to next page
Oracle
Copyright © 1996-2001, Oracle Corporation.

All Rights Reserved.
Go To Documentation Library
Home
Go To Product List
Book List
Go To Table Of Contents
Contents
Go To Index
Index

Master Index

Feedback