|
BulkLoader is a command-line application that loads content and metadata from a filesystem into a BEA Virtual Content Repository.
BulkLoader scans a directory structure containing content and loads it into a specified content repository. In addition to loading content, BulkLoader reads prepared metadata files and associates the metadata with each loaded content item. Metadata files can be prepared for each specific content item, or more broadly for directories and subdirectories of items.
| Note: | BulkLoader supports uploading only simple metadata and binary files. It does not support all BEA repository features, such as nested types and link properties. |
If you use BulkLoader to load content into a database repository, then both the metadata and binary files are transferred to the repository. If you load into a filesystem repository, then only the metadata is transferred to the database--the actual content files remain in place on the filesystem.
If you want to load or transfer content that uses advanced repository features available in WebLogic Portal 9.2, you can use the Propagation Tool.
| Note: | You cannot use BulkLoader to update existing content (or its metadata) within a filesystem repository. BulkLoader can only add new content to the repository. As with other types of BEA repositories, you can add new content and modify existing content using the Portal Administration Console. |
This document explains how to use BulkLoader and includes these topics:
Before running BulkLoader, you need to create a repository, create appropriate content types, populate a directory structure with content, and prepare metadata files.
This section includes the following topics:
BulkLoader loads content and metadata into a pre-established content repository. For information on creating a repository, see the WebLogic Portal Content Management Guide.
Each piece of content stored in a repository is associated with a type. A type is a definition that includes specific metadata fields that can be used to identify and describe content items associated with that type. The BEA Repository contains several predefined, default types. For example, the predefined image type contains three metadata fields:
You can create your own types or use the ones provided. For information on creating types, see the WebLogic Portal Content Management Guide.
| Note: | A type associated with a content item must be created with binary as its primary property. |
BulkLoader loads all the content from a specified directory (and, by default, subdirectories) into the content repository. Directories are automatically recreated as hierarchy nodes (folders) in the content repository. The directory structure you load into the repository should only contain the content you wish to add to the repository--BulkLoader loads all files within this directory structure.
Tip: You can configure BulkLoader, using command line flags, to ignore or include particular files or folders based on filename pattern matching.
Each piece of content in the repository is mapped to a specific type. A type includes default and user-defined properties. These properties, also known as metadata, allow content items in the repository to be identified and searched.
BulkLoader allows you to automatically associate individual files and/or directories of files with specific types. This section describes both of these associations. In addition, this section describes how to add metadata when Library Services are enabled for your repository and how to name and store metadata files properly.
If you know that an entire directory (and, by default, its subdirectories) contains files of the same type, you can specify that type to be associated with all of those files when BulkLoader stores them in the repository. To do this, place a file called dir.md.properties in the root directory containing the related content. This file must contain a single line:
nodeType=type
where type is the name of the type to associate with the content. For example:
nodeType=image
By default, all content in the directory and its subdirectories will be associated with the type. If a subdirectory contains another dir.md.properties file, then the type defined in that file overrides the original one for that directory and any of its subdirectories. Furthermore, if a filename.md.properties file is encountered, it also overrides the dir.md.properties file for that specific file. The filename.md.properties file is described next.
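The override rule can be illustrated with two directory-level files. This is a minimal sketch: the banners subdirectory is a hypothetical name, and the lines beginning with # are annotations that mark which file each entry belongs to, not content to place in the files themselves.

```properties
# Annotation only: contents of D:\media\dir.md.properties
nodeType=image

# Annotation only: contents of D:\media\banners\dir.md.properties (hypothetical subdirectory)
nodeType=Ad
```

Under these assumptions, content in D:\media and in any other subdirectory would be associated with the image type, while content in D:\media\banners and below would be associated with the Ad type.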
You can also define metadata for specific files loaded by BulkLoader. To do this, create a file called filename.md.properties for each piece of content, where filename is the name of the file with which the metadata is associated. This file must contain all of the name/value pairs associated with a type. For example, the following entries are associated with the Ad type:
nodeType=Ad
height=65
width=115
adTargetUrl=
adTargetContent=
adWinClose=
adWinTarget=
adWinTitle=
adClickTarget=
adUseXhtml=
adAltText=BEA Logo
adMapName=
adMap=
adBorder=
audience=internal
You can then add values for some or all of these properties and save the file. Place the saved file in the same directory as the content item with which it is associated. When BulkLoader runs, the metadata will be stored and permanently associated with the specified content item.
Note: Date values in a filename.md.properties file use the format MM/DD/YY HH:MM AM/PM. The order of the day and month in the date depends on the locale of the JVM.
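As an illustration of the MM/DD/YY HH:MM AM/PM format, a date-valued entry might look like the following. The property name publishDate is hypothetical and does not belong to any predefined type, and the example assumes a JVM locale in which the month precedes the day. The line beginning with # is an annotation, not content for the file.

```properties
# Annotation only: publishDate is a hypothetical date property used to show the format
publishDate=01/31/07 09:15 AM
```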
If you are storing content in a Library Services enabled repository, you must include the lifecyclestatus key in the filename.md.properties file for each content item. The lifecyclestatus key takes the following integer values that indicate the status of the content item:
For example, the following md.properties entries are associated with the Ad type, and include the lifecyclestatus entry, where the status value is set to 2, or "ready".
nodeType=Ad
height=65
width=115
adTargetUrl=
adTargetContent=
adWinClose=
adWinTarget=
adWinTitle=
adClickTarget=
lifecyclestatus=2
adUseXhtml=
adAltText=BEA Logo
adMapName=
adMap=
adBorder=
audience=internal
You can then add values for some or all of the other properties and save the file. Place the saved file in the same directory as the content item with which it is associated.
| Note: | If you are bulkloading content into a library services-enabled repository, you can only ADD content. You cannot update existing content with the BulkLoader when using library services. |
When BulkLoader encounters a directory to process, it tries to load metadata property files. First, BulkLoader looks for a file called dir.md.properties in the directory. If there are no overriding metadata files, these properties are applied to all content items in the directory and, unless overridden, its subdirectories. Metadata files can be associated with specific content files, and these metadata files override the directory level file. Metadata files associated with specific content files must be named according to the following convention:
filename.md.properties
where filename is the name of the associated content item file. For example:
logo.gif.md.properties
In this case, the metadata file is associated with an image file called logo.gif.
Note: You can change the default extension from md.properties to anything you like, using BulkLoader's -mdext parameter.
Tip: By default, BulkLoader recurses into subdirectories, and properties in a dir.md.properties file are inherited by content in subdirectories. You can override this behavior by specifying the +recurse flag (to turn off recursion) and the +inheritProps flag (to turn off metadata property inheritance in subdirectories).
In summary, BulkLoader gathers content metadata from the following sources, in the order shown:
1. dir.md.properties file in a parent folder.
2. dir.md.properties file in a subfolder.
3. filename.md.properties file (applied to a specific file).
4. <meta> tags in an HTML file. For more information, see the description of the htmlPat flag in the section BulkLoader Parameter Reference.
5. The source described under the filter flag in the section BulkLoader Parameter Reference.
Typically, you run BulkLoader from a script. You need to edit this script to run in your environment, and to customize parameters that are passed to the BulkLoader program itself. This section includes the following sections:
Note: If BulkLoader fails with an out of memory error, increase the maximum Java heap size. You can do this in the BulkLoader script by passing -Xmx<xxx>m as an option to the java command that runs BulkLoader, where <xxx> is the number of megabytes. For example, -Xmx1000m.
The following script is provided with WebLogic Portal:
This section includes the following sections:
Note: WebLogic Server must be running when you use BulkLoader.
BulkLoader scripts require access to two JAR files: content.jar and content_system.jar. By default, these files are located in the wlp-services-app-lib.ear file. To use BulkLoader, extract the content.jar and content_system.jar files from the wlp-services-app-lib.ear file, then add their location to your classpath. You can delete the extracted JAR files after you finish using BulkLoader.
1. Locate the wlp-services-app-lib.ear file, for example c:\bea\weblogic92\portal\lib\modules\wlp-services-app-lib.ear.
2. Extract the content.jar and the content_system.jar files from the wlp-services-app-lib.ear. Extract the JAR files to a temporary location.
3. Add the content.jar and the content_system.jar files to the classpath on the system where you intend to use BulkLoader.
You need to modify the default script to match your needs. The following script is provided with WebLogic Portal:
1. Set the PLATFORM_HOME variable to point to your WebLogic Server installation. For example:
PLATFORM_HOME=C:\bea\weblogic92
2. Set the CM_DATA variable to point to the parent directory of the directory containing the content you wish to load into the content repository. For example, if the content you want to store is in a directory called images, located in D:\myContent\images, then set CM_DATA to:
CM_DATA=D:\myContent
3. Modify the parameters passed to BulkLoader as needed:
%JAVA_HOME%\bin\java -classpath %CLASSPATH% com.bea.content.loader.bulk.BulkLoader -verbose -repository "MyRepository" -application portalApp -d %CM_DATA% file1 file2 filen
The parameters shown in bold type are described in Table 2.
For a description of all BulkLoader parameters, see BulkLoader Parameter Reference.
| Tip: | You can run the BulkLoader script from the command line or by double-clicking the file icon. |
Note: The BulkLoader command does not support wildcards or regular expressions in its parameter list.
The following command recursively loads all files in the directories Images, Audio, and Doc in D:\media. Note that Images, Audio, and Doc must each contain a dir.md.properties file, or there must be a filename.md.properties file defined for each content item in those directories.
%JAVA_HOME%\bin\java -classpath %CLASSPATH% com.bea.content.loader.bulk.BulkLoader -verbose -repository "MyRepository" -application portalApp -d D:\media Images Audio Doc
The command shown in Listing 1-3 loads all files in D:\media\images. The command does not recurse into subdirectories. Metadata files with a *.info.properties naming convention are recognized.
%JAVA_HOME%\bin\java -classpath %CLASSPATH% com.bea.content.loader.bulk.BulkLoader -verbose -repository "MyRepository" -application portalApp -mdext info.properties +recurse -d D:\media images
Listing 1-4 configures the appropriate paths and runs the BulkLoader program. You can modify this script as you wish, to suit your specific environment and needs.
@ECHO OFF
REM #########################################################################
REM # (c) BEA SYSTEMS INC. All rights reserved
REM #
REM ##########################################################################
SETLOCAL
SET PLATFORM_HOME=C:\bea\weblogic92
FOR %%i IN ("%PLATFORM_HOME%") DO SET PLATFORM_HOME=%%~fsi
SET PORTAL_HOME=%PLATFORM_HOME%\portal
SET P13N_HOME=%PLATFORM_HOME%\p13n
CALL %PLATFORM_HOME%\common\bin\commEnv.cmd
@rem **************************************************************************
@rem Set any additional CLASSPATH information below
@rem **************************************************************************
SET CLASSPATH=%POINTBASE_CLASSPATH%;%WEBLOGIC_CLASSPATH%;%P13N_HOME%\lib\p13n_system.jar;%PORTAL_HOME%\lib\content.jar;%PORTAL_HOME%\lib\content_system.jar;%CLASSPATH%
REM Set some defaults
if "%CM_DATA%"=="" set CM_DATA=..\db\data\sample\cm_data
%JAVA_HOME%\bin\java -classpath %CLASSPATH% com.bea.content.loader.bulk.BulkLoader -verbose -repository "BEA Repository" -application portalApp -d %CM_DATA% Ads
ENDLOCAL
|