21 Overview of WebCenter Sites: Site Capture

This chapter presents an overview of the Site Capture application and the installation process.

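This chapter contains the following topics:

  • Section 21.1, "About WebCenter Sites: Site Capture"

  • Section 21.2, "Site Capture Installation Summary"

  • Section 21.3, "Before You Begin"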

21.1 About WebCenter Sites: Site Capture

Oracle WebCenter Sites: Site Capture is a web application that integrates with Oracle WebCenter Sites through the Oracle WebCenter Sites: Web Experience Management (WEM) Framework. It captures dynamically published websites for evaluation, compliance, high availability, and other scenarios.

Crawls can be initiated manually from the Site Capture interface, or they can be triggered by the completion of a WebCenter Sites RealTime publishing session. In each scenario, the crawler captures the site in one of the following modes, depending on the user's selections:

  • Static mode: The site is stored as files that are ready to be served. Only the latest capture is kept.

  • Archive mode: The site is stored in a zip file. A pointer in the Site Capture database enables archive management from the Site Capture interface.
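The difference between the two modes can be pictured with a minimal sketch. It is not Site Capture's implementation: the function names `store_static` and `store_archive`, the zip file naming scheme, and the in-memory `index` list (standing in for the pointer kept in the Site Capture database) are all assumptions made for illustration.

```python
import shutil
import zipfile
from datetime import datetime, timezone
from pathlib import Path


def store_static(crawled: Path, static_root: Path) -> Path:
    """Static mode (illustrative): keep only the latest capture as plain files."""
    if static_root.exists():
        shutil.rmtree(static_root)  # discard the previous capture
    shutil.copytree(crawled, static_root)
    return static_root


def store_archive(crawled: Path, archive_root: Path, index: list) -> Path:
    """Archive mode (illustrative): zip the capture and record a pointer to it."""
    archive_root.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme: one timestamped zip file per capture.
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S%fZ")
    zip_path = archive_root / f"capture-{stamp}.zip"
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for f in sorted(crawled.rglob("*")):
            if f.is_file():
                zf.write(f, f.relative_to(crawled).as_posix())
    # Stand-in for the database pointer that makes the archive manageable.
    index.append({"archive": str(zip_path), "captured_at": stamp})
    return zip_path
```

In this sketch, repeated static captures overwrite one another, while each archive capture adds a new zip file and a new index entry.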

21.2 Site Capture Installation Summary

This guide contains procedures for installing and configuring Site Capture to support:

  • Static and archive capture initiated manually, from the Site Capture interface.

  • Static and archive capture triggered by the completion of a WebCenter Sites RealTime publishing session. Setting up publishing-triggered site capture is optional.

  • Administrative users and developers. The Site Capture application is designed for general administrators of the WebCenter Sites system on which the Site Capture application will be running. Developers will write advanced crawler configuration code – for example, code that triggers Site Capture to execute a post-crawl command such as copying statically captured sites to a web server's doc base.
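As a rough illustration of the kind of post-crawl command mentioned above, the following sketch copies a statically captured site into a web server's doc base. It is a standalone script, not Site Capture crawler configuration code; the function name `publish_static_capture` and both directory parameters are hypothetical, and the paths must be supplied by whoever invokes it. It assumes Python 3.8 or later for the `dirs_exist_ok` argument.

```python
import shutil
from pathlib import Path


def publish_static_capture(capture_dir: Path, doc_base: Path) -> int:
    """Copy a statically captured site into a web server's doc base.

    Both paths are assumptions supplied by the caller. Existing files in
    doc_base with the same names are overwritten; other files are left alone.
    Returns the number of files in the capture that were copied.
    """
    if not capture_dir.is_dir():
        raise FileNotFoundError(f"capture directory not found: {capture_dir}")
    shutil.copytree(capture_dir, doc_base, dirs_exist_ok=True)
    return sum(1 for p in capture_dir.rglob("*") if p.is_file())
```

A script like this could be wrapped in a small command-line entry point so that it can be run as an external command once a crawl finishes.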

21.3 Before You Begin

Users of this guide must have experience installing and configuring enterprise-level software, such as application servers and databases. Also required is a general administrator's experience with WebCenter Sites and the WEM Admin interface.

  • To complete the procedures in this guide, you must be a WebCenter Sites general administrator who belongs to the RESTAdmin security group.

  • Download the Oracle WebCenter Sites 11g Release 1 (11.1.1.x) Certification Matrix for information about supported operating systems, application servers, databases, and browsers. Also, download the Oracle WebCenter Sites release notes for information about Site Capture.

  • Read this guide to acquire an understanding of the Site Capture installation process. The basic steps are configuring the Site Capture application server, running the installer to create the Site Capture WAR file, deploying Site Capture, completing post-installation and verification steps, and, if necessary, setting up publishing-triggered site capture.

  • On all systems, set the JAVA_HOME variable to point to a valid installation of a JDK version that is certified for use with WebCenter Sites, as noted in the certification matrix.

  • Prepare the Site Capture installation components:

    • Ensure you have a dedicated and fully functional Oracle WebCenter Sites installation running in development or content management mode. Site Capture must communicate with this WebCenter Sites system in order to run.

    • Ensure you have a dedicated application server on which to install the Site Capture application server and the application itself. During the installation process, you will configure Site Capture to communicate with the WebCenter Sites system (described above), which runs on its own host machine.

    • Decide whether your Site Capture application will be running as a single instance or in clustered mode. For diagrams of single-server and clustered installations, see Figure 21-1.

    • Install or reuse the following components:

      • Install a supported Site Capture application server on the dedicated application server. For clustered installations, install the application server for each Site Capture instance.

      • For a clustered installation, install a load balancer on a host machine of your choice. The Site Capture installation directory must be a shared directory, accessible to all cluster members.

      • To store archived sites, you can either reuse WebCenter Sites' database or install a dedicated, supported Site Capture database on a host machine of your choice.

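  • Decide whether you will be running publishing-triggered site capture. If so, you will need a WebCenter Sites source system and target system. For diagrams of installations enabled for publishing-triggered site capture, see Figure 21-2 and Figure 21-3.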

  • Decide whether to install sample crawlers (recommended). For more information about the sample crawlers, see the note in step 1 in Section 23.1, "Installation Steps."

  • You can install Site Capture silently or graphically. The silent installer provides help and sample values for every setting that must be supplied.
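The JAVA_HOME prerequisite in the list above can be sanity-checked before the installer is run. The following is a minimal sketch, assuming the usual JDK layout with a `bin/java` (or `bin/java.exe`) executable; the helper name `find_java` is hypothetical. It only confirms that JAVA_HOME points at something that looks like a Java installation. Whether that JDK version is certified still has to be checked against the certification matrix.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_java(env: Mapping[str, str] = os.environ) -> Optional[Path]:
    """Return the java executable under JAVA_HOME, or None if it is missing.

    Assumes the conventional layout JAVA_HOME/bin/java (java.exe on Windows).
    This does not check the JDK version or its certification status.
    """
    java_home = env.get("JAVA_HOME")
    if not java_home:
        return None
    for name in ("java", "java.exe"):
        candidate = Path(java_home) / "bin" / name
        if candidate.is_file():
            return candidate
    return None
```

Calling `find_java()` with no arguments inspects the current environment; a return value of `None` suggests that JAVA_HOME is unset or does not point to a JDK directory.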

Figure 21-1 Single-Server and Cluster Installation


Figure 21-2 Single-Server Installation Enabled for Publishing-Triggered Site Capture


Figure 21-3 Clustered Installation Enabled for Publishing-Triggered Site Capture
