com.bea.content.loader.bulk
Class BulkLoader

java.lang.Object
  extended by com.bea.content.loader.bulk.BulkLoader
All Implemented Interfaces
FilenameFilter
Direct Known Subclasses:
FileBulkLoader

public class BulkLoader
extends Object
implements FilenameFilter

The Content Manager bulk loader application.

This class will scan the local file system for files to load via the content manager.

BulkLoader has limited use when loading against a Library Services Enabled Repository. If Library Services is enabled and the content is new (not currently loaded into the WLP Repository), the content may be loaded. In this case, the workflowstatus property must be specified in the md.properties file; it defines the workflow status id used when the content is checked in. All lifecycle actions operate as if the bulkloader user were a user in the admin tools. For example, if a content item is checked in as "Ready", the assignments will occur. Please review the Workflow javadoc for more information.

The status ids for the default workflow are as follows:

  • 1 = DRAFT
  • 2 = READY
  • 3 = REJECTED
  • 4 = PUBLISHED
  • 5 = RETIRED
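
To make the id-to-status mapping above concrete, the following sketch models the default workflow ids as a Java enum and reads the workflowstatus property from a Properties object. The enum and its method names are hypothetical and are not part of the WLP API; only the ids, the status names, and the workflowstatus property name come from this page.

```java
import java.util.Properties;

/** Hypothetical helper; only the ids 1-5 and their names come from the list above. */
enum DefaultWorkflowStatus {
    DRAFT(1), READY(2), REJECTED(3), PUBLISHED(4), RETIRED(5);

    private final int id;

    DefaultWorkflowStatus(int id) {
        this.id = id;
    }

    int id() {
        return id;
    }

    /** Look up a status by its numeric id; unknown ids are rejected. */
    static DefaultWorkflowStatus fromId(int id) {
        for (DefaultWorkflowStatus s : values()) {
            if (s.id == id) {
                return s;
            }
        }
        throw new IllegalArgumentException("Unknown workflow status id: " + id);
    }

    /** Read the workflowstatus property from already-loaded metadata properties. */
    static DefaultWorkflowStatus fromMetadata(Properties md) {
        String raw = md.getProperty("workflowstatus");
        if (raw == null) {
            throw new IllegalArgumentException("workflowstatus is not set");
        }
        return fromId(Integer.parseInt(raw.trim()));
    }
}
```

A metadata file that sets workflowstatus to 2 would map to READY under this sketch.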

    The type (ObjectClass) for the file, along with the values of any required properties, must be specified in the metadata properties file. Thus, a type must be defined in the content repository before the bulkloader can load files from the file system.

    A folder will be loaded as a Hierarchy Node and a file will be loaded as a Content Node.

    The actual bytes will be loaded into the primary property (must be defined in the type) and must be of type Binary.

    In order for bulkloader to run, the repository and the application must be passed in as arguments and the server must be running.

    This class is mainly designed to run as a command-line application, via a "java com.bea.content.loader.bulk.BulkLoader" command line. To see usage information, pass the -h flag or read Usage.txt in this package.

    Additionally, BulkLoader objects can be created and used to provide the functionality in other places. The lifecycle of a BulkLoader is described in the paragraphs below.

    The base directory to load may be passed in using the -d parameter. If it is not specified, the current directory "." is used. Any additional argument is treated as a file or folder to load, relative to the base directory; if an absolute path is specified, it is used as given.
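
The resolution rule described above can be sketched as follows. This is a minimal illustration, not the BulkLoader implementation; the class and method names are hypothetical.

```java
import java.io.File;

/** Hypothetical sketch of resolving load arguments against the base directory. */
final class LoadPathSketch {
    /** Resolve one command-line path argument against the base directory. */
    static File resolve(String baseDirectory, String arg) {
        // Fall back to the current directory when no base directory was given.
        String base = (baseDirectory == null) ? "." : baseDirectory;
        File candidate = new File(arg);
        // Absolute paths are used as given; everything else is relative to the base.
        return candidate.isAbsolute() ? candidate : new File(base, arg);
    }

    public static void main(String[] args) {
        System.out.println(resolve(null, "docs/index.html"));
        System.out.println(resolve("content", "images"));
    }
}
```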

    For FileSystemRepositories, the "-d" parameter must be the same as the "cm_fileSystem_dir" property in content-config.xml unless the repository is managed.

    Folders (directories) are automatically assigned the ObjectClass type ObjectClass.FOLDER.

    If parseArgs() and validateArgs() are not called, in that order, the BulkLoader object will most likely fail ungracefully. Once those methods have been invoked, doLoad() can be invoked.

    If manually constructing and utilizing a BulkLoader object, be certain to synchronize all access to the object. Since the command-line program is single-threaded, BulkLoader objects are not thread-safe by design.
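
The call order and the synchronization advice above can be sketched as follows. So that the example compiles without the WLP libraries, it uses a hypothetical stand-in interface, LoaderSteps, that mirrors only the three documented method names parseArgs(), validateArgs(), and doLoad(); in real code the BulkLoader object would take its place. Treat it as an illustration under those assumptions.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the three documented methods, so the sketch compiles alone. */
interface LoaderSteps {
    void parseArgs(String[] args);
    void validateArgs();
    void doLoad() throws Exception;
}

final class LoaderLifecycleSketch {
    /** Run the documented sequence while holding the loader's monitor. */
    static void runOnce(LoaderSteps loader, String[] args) throws Exception {
        synchronized (loader) {      // only one thread may drive this loader at a time
            loader.parseArgs(args);  // 1. parse the arguments
            loader.validateArgs();   // 2. validate them
            loader.doLoad();         // 3. only now perform the load
        }
    }

    /** Drive a recording stand-in and return the order in which its methods ran. */
    static List<String> demo() {
        final List<String> calls = new ArrayList<>();
        LoaderSteps recording = new LoaderSteps() {
            @Override public void parseArgs(String[] a) { calls.add("parseArgs"); }
            @Override public void validateArgs() { calls.add("validateArgs"); }
            @Override public void doLoad() { calls.add("doLoad"); }
        };
        try {
            runOnce(recording, new String[0]);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Locking on the loader object protects it only if every caller that touches the same object uses the same lock.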

    To load the default LoaderFilters, the BulkLoader looks for the com/bea/content/loader/bulk/loader.properties file in the CLASSPATH. From that it reads the list of default LoaderFilter class names from the "loader.defFilters" property. To not use any of the default filters, specify +filters in the command-line args.
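
The lookup described above might look roughly like the sketch below. The resource path and the "loader.defFilters" property name come from this page; the class name, the comma-or-whitespace delimiter, and the sample filter class names are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Hypothetical sketch of reading the default LoaderFilter class names. */
final class DefaultFiltersSketch {
    static final String RESOURCE = "com/bea/content/loader/bulk/loader.properties";

    /** Load loader.properties from the CLASSPATH; returns empty properties if absent. */
    static Properties loadFromClasspath() throws IOException {
        Properties props = new Properties();
        try (InputStream in =
                 DefaultFiltersSketch.class.getClassLoader().getResourceAsStream(RESOURCE)) {
            if (in != null) {
                props.load(in);
            }
        }
        return props;
    }

    /** Split loader.defFilters into class names (the delimiter is an assumption). */
    static List<String> filterClassNames(Properties props) {
        List<String> names = new ArrayList<>();
        String raw = props.getProperty("loader.defFilters", "");
        for (String part : raw.split("[,\\s]+")) {
            if (!part.isEmpty()) {
                names.add(part);
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        Properties sample = new Properties();
        // The class names below are made up for the demo.
        sample.load(new StringReader("loader.defFilters=example.FilterA, example.FilterB"));
        System.out.println(filterClassNames(sample));
    }
}
```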

    Since:
    2.0
    See Also
    MetaParser, FileCache

    Nested Class Summary
    static class BulkLoader.ShowUsageException
              Inner exception thrown by parseArgs() to signal that a usage report should be printed.
     
    Field Summary
    protected  String application
              The application name, which is prepended to the jndiName.
     String baseDirectory
              The contentBase.
    protected  String batchFileName
              The batch properties file with the user name and password properties
    static String BEA_BINARY_CHECKSUM
              Deprecated. Use WLP_BINARY_CHECKSUM
    static String BEA_BINARY_SIZE
              Deprecated. Use WLP_BINARY_SIZE
    protected  String DEF_ENCODE_PREFIX
              Encode batch property prefix
    static String DEF_MD_FILE_EXT
              The default file extension for metadata property files.
    static String DEF_MIME_TYPE
              The default mime type.
    static String DEF_WLS_PROPS_PATH
              The default weblogic properties file path.
    protected  String deletePath
              The path of the hierarchy to delete, starting with a "/".
     boolean doMetaParse
              Are we supposed to parse '*.htm' and '*.html' files for META tags?
     String fileEncoding
              The file encoding (null for VM default).
     List fileList
              The list of files/directories to scan over.
     List htmlMatchList
              The list of patterns that represent HTML file names.
     boolean ignoreErrors
              Do we ignore errors?
     List ignoreList
              The list of file name patterns to ignore.
     boolean includeHidden
              Are we supposed to include hidden files and directories?
     boolean inheritProps
              Are we supposed to inherit metadata properties when recursing directories?
    protected  boolean isFileSystem
              Flag indicating whether or not this repository is a filesystem repository
    static String JNDI_FACTORY
              Defines the JNDI context factory.
    protected  String jndiName
              The JNDI home for the remote Loader session bean.
     List loaderFilters
              The list of LoaderFilters to try.
     List matchList
              The list of file name patterns to include.
     String mdFileExt
              The file extension of metadata property files.
    protected  Collection metadataNames
              The metadata properties we find along the way.
    protected  long numDocsLoaded
              The number of nodes we've loaded so far.
    protected  String password
              The password for the user of this resource.
     boolean recurse
              Do we recurse over directories?
    protected  String repository
              The Repository to run the BulkLoader against.
    protected  String url
              The WLS instance host where the content manager is running.
    protected  String user
              The user of this resource.
     boolean verbose
              Do we spew out messages?
    static String WLP_BINARY_CHECKSUM
               
    static String WLP_BINARY_SIZE
               
     
    Constructor Summary
    BulkLoader()
              Construct a BulkLoader without command-line arguments.
    BulkLoader(String[] args)
              Construct a BulkLoader from the given command-line arguments.
     
    Method Summary
     boolean accept(File dir, String name)
              Implement the FilenameFilter interface method to use our match and ignore lists.
     boolean checkFileAttributes(File f)
              A helper method to check file attributes.
     void debug(String mesg)
              Output a debug message.
     void doDelete()
              Do the actual delete of the hierarchy at deletePath.
     void doLoad()
              Do the actual bulk load logic on the file list.
     void doLoad(File baseDir, String path, Properties mdProperties)
              Load the given path into the database.
     void error(String mesg)
              Output an error message.
     void error(String mesg, Throwable ex)
              Output an error message.
     void finished()
              Once you are done, remove the bean for cleanup.
     String fixPath(String path)
              Fix up a path to be forward-slash style and to not have empty path parts.
     Properties getLoaderFilterProperties(File f, Properties p)
              Get the properties from the BulkLoader's LoaderFilters for the given file.
     Properties getMetadataProperties(File base, Properties p)
              Get the metadata properties for the given file or directory.
     RepositoryConfig getRepositoryConfig()
               
    protected  void initRepoFileDir()
               
     void inspectCurrentDirectory(File f, String path, Properties mdProperties)
              A helper method for directory inspection.
    static boolean isHidden(File f)
              Check if the specified file is a hidden file.
     boolean isHtmlFile(String name)
              Tell if the specified file name is an HTML file to the loader.
    static boolean isReadableDirectory(String name)
              Check if the specified file name is a directory that we can get into.
     void loadIndividualFile(File f, String path, Properties mdProperties)
              Load a file.
    static int main(BulkLoader loader, String[] args)
              The main method invoked on a BulkLoader instance.
    static void main(String[] args)
              Command-line entry point.
     void parseArgs(String[] args)
              Parse the given input arguments.
     void printArgs()
              Prints the arguments as debug statements.
     void processBatchProperties()
              Read the user name and password from the batch properties file (CR201221).
     boolean shouldIgnore(String name)
              Tell if the loader should ignore the specified file name.
     boolean shouldInclude(String name)
              Tell if the loader should include the specified file name.
     void usage()
              Print the usage of the application.
     void usage(PrintWriter out)
              Print the usage of the application.
     void validateArgs()
              Validate that we have been passed correct arguments.
     void warning(String mesg)
              Output a warning message.
     void warning(String mesg, Throwable ex)
              Output a warning message.
     
    Methods inherited from class java.lang.Object
    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
     

    Field Detail

    JNDI_FACTORY

    public static final String JNDI_FACTORY
    Defines the JNDI context factory.

    See Also
    Constants Summary

    DEF_MD_FILE_EXT

    public static final String DEF_MD_FILE_EXT
    The default file extension for metadata property files.

    See Also
    Constants Summary

    DEF_WLS_PROPS_PATH

    public static final String DEF_WLS_PROPS_PATH
    The default weblogic properties file path.

    See Also
    Constants Summary

    DEF_MIME_TYPE

    public static final String DEF_MIME_TYPE
    The default mime type.

    See Also
    Constants Summary

    BEA_BINARY_CHECKSUM

    public static final String BEA_BINARY_CHECKSUM
    Deprecated. Use WLP_BINARY_CHECKSUM
    See Also
    Constants Summary

    WLP_BINARY_CHECKSUM

    public static final String WLP_BINARY_CHECKSUM
    See Also
    Constants Summary

    BEA_BINARY_SIZE

    public static final String BEA_BINARY_SIZE
    Deprecated. Use WLP_BINARY_SIZE
    See Also
    Constants Summary

    WLP_BINARY_SIZE

    public static final String WLP_BINARY_SIZE
    See Also
    Constants Summary

    verbose

    public boolean verbose
    Do we spew out messages?


    recurse

    public boolean recurse
    Do we recurse over directories?


    doMetaParse

    public boolean doMetaParse
    Are we supposed to parse '*.htm' and '*.html' files for META tags?


    includeHidden

    public boolean includeHidden
    Are we supposed to include hidden files and directories?


    inheritProps

    public boolean inheritProps
    Are we supposed to inherit metadata properties when recursing directories?


    ignoreErrors

    public boolean ignoreErrors
    Do we ignore errors?


    baseDirectory

    public String baseDirectory
    The contentBase.


    mdFileExt

    public String mdFileExt
    The file extension of metadata property files.

    This should start with a ".".


    fileEncoding

    public String fileEncoding
    The file encoding (null for VM default).


    matchList

    public List matchList
    The list of file name patterns to include.

    Empty to include all.


    ignoreList

    public List ignoreList
    The list of file name patterns to ignore.


    htmlMatchList

    public List htmlMatchList
    The list of patterns that represent HTML file names.


    fileList

    public List fileList
    The list of files/directories to scan over.


    loaderFilters

    public List loaderFilters
    The list of LoaderFilters to try.


    metadataNames

    protected Collection metadataNames
    The metadata properties we find along the way.


    numDocsLoaded

    protected long numDocsLoaded
    The number of nodes we've loaded so far.


    repository

    protected String repository
    The Repository to run the BulkLoader against.


    batchFileName

    protected String batchFileName
    The batch properties file with the user name and password properties


    DEF_ENCODE_PREFIX

    protected String DEF_ENCODE_PREFIX
    Encode batch property prefix


    url

    protected String url
    The WLS instance host where the content manager is running. Defaults to "t3://localhost:7001"


    jndiName

    protected String jndiName
    The JNDI home for the remote Loader session bean. TODO: this can't be rebranded to 'WLP_content.LoaderHome' until/unless the EJB descriptor files (in the vcr package) are renamed as well.


    application

    protected String application
    The application name, which is prepended to the jndiName.


    user

    protected String user
    The user of this resource.


    password

    protected String password
    The password for the user of this resource.


    deletePath

    protected String deletePath
    The path of the hierarchy to delete, starting with a "/". The repository must be specified in the -repository argument.


    isFileSystem

    protected boolean isFileSystem
    Flag indicating whether or not this repository is a filesystem repository

    Constructor Detail

    BulkLoader

    public BulkLoader()
    Construct a BulkLoader without command-line arguments.


    BulkLoader

    public BulkLoader(String[] args)
               throws IllegalArgumentException,
                      RepositoryException
    Construct a BulkLoader from the given command-line arguments.

    Throws
    IllegalArgumentException - thrown on invalid args
    RepositoryException
    See Also
    parseArgs(java.lang.String[])
    Method Detail

    accept

    public boolean accept(File dir,
                          String name)
    Implement the FilenameFilter interface method to use our match and ignore lists.

    Specified by:
    accept in interface FilenameFilter
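
One way a FilenameFilter could combine a match list and an ignore list is sketched below. This page does not specify the pattern syntax or the precedence between the two lists, so the sketch assumes simple '*' and '?' wildcards and assumes that a name is accepted when it is included and not ignored. The class name is hypothetical; the empty-match-list rule follows the matchList field description.

```java
import java.io.File;
import java.io.FilenameFilter;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical filter sketch; wildcard syntax and precedence are assumptions. */
final class MatchIgnoreFilterSketch implements FilenameFilter {
    private final List<String> matchList;
    private final List<String> ignoreList;

    MatchIgnoreFilterSketch(List<String> matchList, List<String> ignoreList) {
        this.matchList = matchList;
        this.ignoreList = ignoreList;
    }

    /** Translate a '*' / '?' wildcard pattern into a regular expression. */
    static Pattern toRegex(String glob) {
        StringBuilder sb = new StringBuilder();
        for (char c : glob.toCharArray()) {
            if (c == '*') {
                sb.append(".*");
            } else if (c == '?') {
                sb.append('.');
            } else {
                sb.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(sb.toString());
    }

    private static boolean matchesAny(List<String> patterns, String name) {
        for (String p : patterns) {
            if (toRegex(p).matcher(name).matches()) {
                return true;
            }
        }
        return false;
    }

    /** An empty match list includes everything. */
    boolean shouldInclude(String name) {
        return matchList.isEmpty() || matchesAny(matchList, name);
    }

    boolean shouldIgnore(String name) {
        return matchesAny(ignoreList, name);
    }

    @Override
    public boolean accept(File dir, String name) {
        return shouldInclude(name) && !shouldIgnore(name);
    }
}
```

Such a filter can be passed to File.list(FilenameFilter) to restrict a directory listing.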

    parseArgs

    public void parseArgs(String[] args)
                   throws IllegalArgumentException
    Parse the given input arguments.

    Parameters
    args - the input arguments.
    Throws
    BulkLoader.ShowUsageException - thrown if the caller should show a usage report.
    IllegalArgumentException - thrown on bad arguments.

    initRepoFileDir

    protected void initRepoFileDir()

    getRepositoryConfig

    public RepositoryConfig getRepositoryConfig()
                                         throws RepositoryException
    Throws
    RepositoryException

    usage

    public void usage()
    Print the usage of the application.


    usage

    public void usage(PrintWriter out)
    Print the usage of the application.


    printArgs

    public void printArgs()
    Prints the arguments as debug statements.


    validateArgs

    public void validateArgs()
                      throws IllegalStateException
    Validate that we have been passed correct arguments.

    This checks only that the expected arguments were passed; it does not check that their values are valid. That will be done in initialize().

    Throws
    IllegalStateException

    finished

    public void finished()
                  throws RemoteException,
                         javax.ejb.RemoveException,
                         Exception
    Once you are done, remove the bean for cleanup.

    Throws
    RemoteException
    javax.ejb.RemoveException
    Exception

    doDelete

    public void doDelete()
                  throws Exception
    Do the actual delete of the hierarchy at deletePath.

    Throws
    Exception

    processBatchProperties

    public void processBatchProperties()
                                throws Exception
    Read the user name and password from the batch properties file (CR201221).

    Throws
    BulkLoader.ShowUsageException
    Exception

    doLoad

    public void doLoad()
                throws Exception
    Do the actual bulk load logic on the file list.

    Throws
    Exception

    doLoad

    public void doLoad(File baseDir,
                       String path,
                       Properties mdProperties)
                throws Exception
    Load the given path into the database.

    If path is a directory, all files underneath it that match our patterns will be included. If path is a file, it will be loaded.

    Parameters
    baseDir - the base directory (can be used to get absolute file paths).
    path - the path to the file or directory (this can be multi-part, not just name).
    mdProperties - the base md properties for the file (this should be a clone that this method can modify as needed).
    Throws
    SQLException - thrown on a database error.
    Exception

    loadIndividualFile

    public void loadIndividualFile(File f,
                                   String path,
                                   Properties mdProperties)
                            throws Exception
    Load a file.

    Throws
    Exception

    getMetadataProperties

    public Properties getMetadataProperties(File base,
                                            Properties p)
                                     throws IOException
    Get the metadata properties for the given file or directory.

    This does not do a META data parse.

    Parameters
    base - the file or directory base path.
    p - the properties to load into (null to create new).
    Returns
    the properties (p if p was not null).
    Throws
    IOException - on an error reading the properties file.

    checkFileAttributes

    public boolean checkFileAttributes(File f)
    A helper method to check file attributes.


    inspectCurrentDirectory

    public void inspectCurrentDirectory(File f,
                                        String path,
                                        Properties mdProperties)
                                 throws Exception
    A helper method for directory inspection.

    Throws
    Exception

    getLoaderFilterProperties

    public Properties getLoaderFilterProperties(File f,
                                                Properties p)
    Get the properties from the BulkLoader's LoaderFilters for the given file.

    Parameters
    f - the file.
    p - the properties object to add to (null to create new one).
    Returns
    p.

    shouldInclude

    public boolean shouldInclude(String name)
    Tell if the loader should include the specified file name.


    shouldIgnore

    public boolean shouldIgnore(String name)
    Tell if the loader should ignore the specified file name.


    isHtmlFile

    public boolean isHtmlFile(String name)
    Tell if the specified file name is an HTML file to the loader.


    fixPath

    public String fixPath(String path)
    Fix up a path to be forward-slash style and to not have empty path parts.
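
The normalization described above could be approximated as in the sketch below. It is not the actual fixPath() implementation: the class name is hypothetical, and keeping a leading "/" on absolute paths is an assumption, since the description does not say how a leading slash is treated.

```java
/** Hypothetical sketch of forward-slash normalization without empty path parts. */
final class FixPathSketch {
    static String fixPath(String path) {
        // Convert backslashes to forward slashes.
        String slashed = path.replace('\\', '/');
        // Assumption: a leading slash is preserved.
        boolean absolute = slashed.startsWith("/");
        StringBuilder out = new StringBuilder();
        for (String part : slashed.split("/")) {
            if (part.isEmpty()) {
                continue; // drop empty parts produced by doubled or trailing slashes
            }
            if (out.length() > 0) {
                out.append('/');
            }
            out.append(part);
        }
        return absolute ? "/" + out : out.toString();
    }

    public static void main(String[] args) {
        System.out.println(fixPath("docs\\\\img//logo.png")); // prints docs/img/logo.png
        System.out.println(fixPath("/a//b/"));                // prints /a/b
    }
}
```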


    debug

    public void debug(String mesg)
    Output a debug message.

    Subclasses can override this method to change where messages go.


    warning

    public void warning(String mesg,
                        Throwable ex)
    Output a warning message.

    Subclasses can override this method to change where messages go.


    warning

    public void warning(String mesg)
    Output a warning message.


    error

    public void error(String mesg,
                      Throwable ex)
    Output an error message.

    Subclasses can override this method to change where messages go.


    error

    public void error(String mesg)
    Output an error message.


    isReadableDirectory

    public static boolean isReadableDirectory(String name)
    Check if the specified file name is a directory that we can get into.


    isHidden

    public static boolean isHidden(File f)
    Check if the specified file is a hidden file.

    Under UNIX, File.isHidden() reports that "/weblogicCommerce/dmsBase/." is a hidden file, which it is not. This method works around that problem by getting canonical paths for directories before calling isHidden(). That seems to do the trick.
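
The workaround described above can be sketched with the standard java.io.File API. This is an illustration of the canonical-path approach, not the actual method body; the class name, the demo() helper, and the fallback used when canonicalization fails are assumptions.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;

/** Hypothetical sketch of the canonical-path workaround for directories. */
final class HiddenCheckSketch {
    static boolean isHidden(File f) {
        File target = f;
        if (f.isDirectory()) {
            try {
                // A path ending in "/." resolves to the directory's real name.
                target = f.getCanonicalFile();
            } catch (IOException e) {
                // Assumption: fall back to the path as given if it cannot be resolved.
            }
        }
        return target.isHidden();
    }

    /** Check a "<tempdir>/." path, the kind of path the note above describes. */
    static boolean demo() {
        try {
            File dir = Files.createTempDirectory("bulk").toFile();
            dir.deleteOnExit();
            return isHidden(new File(dir, "."));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("hidden after canonicalization: " + demo());
    }
}
```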


    main

    public static int main(BulkLoader loader,
                           String[] args)
    The main method invoked on a BulkLoader instance.

    This will take a BulkLoader through the bulk loading steps. Output will be sent via the BulkLoader's debug(), warning(), and error() methods.

    This will not call System.exit().

    Parameters
    args - the command-line args.
    Returns
    the exit code (0 for success, non-zero for failure).
    See Also
    parseArgs(java.lang.String[]), validateArgs()

    main

    public static void main(String[] args)
    Command-line entry point.

    This will call System.exit() on invalid args or error. To invoke a bulk load from your own code, create and manipulate a BulkLoader object. You can use the other main method, which does not exit.

    Parameters
    args - the command-line args.
    See Also
    main(com.bea.content.loader.bulk.BulkLoader,java.lang.String[])


    Copyright © 2000, 2008, Oracle and/or its affiliates. All rights reserved.
    Oracle is a registered trademark of Oracle Corporation and/or its affiliates.
    Other names may be trademarks of their respective owners.