Optional settings

The second part of bdd.conf contains optional properties. You can update these if you want, but the default values will work for most installations.

General settings

This section configures settings relevant to all components and the installation process itself.

Configuration property Description
FORCE Determines whether the installer will remove files and directories left over from previous installations when it runs.

Use FALSE if this is your first time installing BDD. Use TRUE if you're reinstalling after either a failed installation or an uninstallation.

Note that this property only accepts UPPERCASE values.

ENABLE_AUTOSTART Determines whether the BDD cluster will automatically restart after its servers are rebooted:
  • TRUE: WebLogic (including Studio and the Dgraph Gateway), the Dgraph, and the HDFS Agent will automatically restart after their host servers are rebooted.
  • FALSE: WebLogic, the Dgraph, and the HDFS Agent must be restarted manually.

Note that this property only accepts UPPERCASE values.
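The uppercase-only rule for FORCE and ENABLE_AUTOSTART can be checked before you run the installer. The following is a minimal sketch, not part of BDD: the function name parse_upper_bool is hypothetical, and it assumes you have already read the property's value from bdd.conf as a string.

```python
def parse_upper_bool(name: str, raw: str) -> bool:
    """Accept only the exact strings TRUE or FALSE.

    Hypothetical helper: mirrors the documented rule that FORCE and
    ENABLE_AUTOSTART only accept UPPERCASE values.
    """
    if raw == "TRUE":
        return True
    if raw == "FALSE":
        return False
    raise ValueError(f"{name} must be TRUE or FALSE (uppercase), got {raw!r}")


if __name__ == "__main__":
    print(parse_upper_bool("FORCE", "FALSE"))
    print(parse_upper_bool("ENABLE_AUTOSTART", "TRUE"))
```

A value such as true or True raises ValueError instead of being silently accepted, which is the behavior you would want from a pre-installation check.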

WebLogic settings

This section configures the WebLogic Server, including the Admin Server and all Managed Servers. It doesn't configure Studio or the Dgraph Gateway.

Configuration property Description and possible settings
WLS_START_MODE Defines the mode WebLogic Server will start in:
  • If set to prod, the WebLogic Server starts in production mode, which requires a username and password when it starts.
  • If set to dev, it starts in development mode, which doesn't require a username or password. The installer will still prompt you for a username and password at runtime, but these will not be required when starting WebLogic Server.

Note that this property only accepts lowercase values.

WLS_NO_SWAP Controls whether the installer will check for the required amount of free swap space (512 MB) on the Admin Server and all Managed Servers before installing WebLogic Server.

If set to TRUE, the installer won't perform the swap space check. Use this value if you're installing WebLogic Server on nodes that don't meet the swap space requirement.

For more information, see Physical memory and disk space requirements.

WEBLOGIC_DOMAIN_NAME The name of the WebLogic domain, which Studio and the Dgraph Gateway run in.
ADMIN_SERVER_PORT The Admin Server's port number. This number must be unique.
MANAGED_SERVER_PORT The port used by the Managed Server (i.e., Studio). This number must be unique.

This property is still required even if you are installing on a single server.

WLS_SECURE_MODE Enables or disables SSL for Studio's outward-facing ports.

This can be set to TRUE or FALSE. When set to TRUE, the Studio instances on the Admin Server and the Managed Servers listen for requests on the ADMIN_SERVER_SECURE_PORT and MANAGED_SERVER_SECURE_PORT, respectively.

Note that this property doesn't enable SSL for any other BDD components.

ADMIN_SERVER_SECURE_PORT The secure port on the Admin Server that Studio listens on when WLS_SECURE_MODE is set to TRUE.

Note that when SSL is enabled, Studio still listens on the non-secure ADMIN_SERVER_PORT for requests from the Dgraph Gateway.

MANAGED_SERVER_SECURE_PORT The secure port on the Managed Server that Studio listens on when WLS_SECURE_MODE is set to TRUE.

Note that when SSL is enabled, Studio still listens on the non-secure MANAGED_SERVER_PORT for requests from the Dgraph Gateway.

ENDECA_SERVER_LOG_LEVEL The log level used by the Dgraph Gateway:
  • INCIDENT_ERROR
  • ERROR
  • WARNING
  • NOTIFICATION
  • TRACE

More information on Dgraph Gateway log levels is available in the Administrator's Guide.

SERVER_TIMEOUT The timeout value (in milliseconds) used when responding to requests sent to all Dgraph Gateway web services except the Data Ingest Web Service. A value of 0 means there is no timeout.
SERVER_INGEST_TIMEOUT The timeout value (in milliseconds) used when responding to requests sent to the Data Ingest Web Service. A value of 0 means there is no timeout.
SERVER_HEALTHCHECK_TIMEOUT The timeout value (in milliseconds) used when checking data source availability when connections are initialized. A value of 0 means there is no timeout.
STUDIO_ADMIN_SCREEN_NAME The Studio admin's screen name.
STUDIO_ADMIN_EMAIL_ADDRESS The email address of the Studio admin, which will be their username. This must be a full email address and can't begin with root@ or postmaster@.
Note: If you set the BDD_STUDIO_ADMIN_USERNAME environment variable for a silent installation, you don't need to set this property. If you do, the installer will overwrite this value with the value of BDD_STUDIO_ADMIN_USERNAME.
STUDIO_ADMIN_PASSWORD_RESET_REQUIRED Determines whether the Studio admin will be asked to reset their password the first time they log in.
STUDIO_ADMIN_FIRST_NAME The Studio admin's first name.
STUDIO_ADMIN_MIDDLE_NAME The Studio admin's middle name.
STUDIO_ADMIN_LAST_NAME The Studio admin's last name.
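The port rules in this section (ADMIN_SERVER_PORT and MANAGED_SERVER_PORT must be unique, and the two secure ports only come into play when WLS_SECURE_MODE is TRUE) can be sketched as a small pre-installation check. This is an illustration under stated assumptions, not installer behavior: find_port_problems is a hypothetical name, "unique" is interpreted here as "no two of these properties share a value", the check cannot see ports already used by other processes on the host, and the port numbers in the example are illustrative values rather than documented defaults.

```python
from collections import Counter

PLAIN_PORTS = ("ADMIN_SERVER_PORT", "MANAGED_SERVER_PORT")
SECURE_PORTS = ("ADMIN_SERVER_SECURE_PORT", "MANAGED_SERVER_SECURE_PORT")


def find_port_problems(conf: dict) -> list:
    """Return human-readable problems found in the WebLogic port properties."""
    names = list(PLAIN_PORTS)
    # The secure ports are only used when SSL is enabled for Studio.
    if conf.get("WLS_SECURE_MODE") == "TRUE":
        names += SECURE_PORTS

    problems = []
    ports = {}
    for name in names:
        raw = conf.get(name, "")
        is_number = raw.isascii() and raw.isdigit()
        if not is_number or not 1 <= int(raw) <= 65535:
            problems.append(f"{name}: {raw!r} is not a valid port number")
            continue
        ports[name] = int(raw)

    # Assumption: "unique" means no two of these properties share a value.
    counts = Counter(ports.values())
    for name, port in ports.items():
        if counts[port] > 1:
            problems.append(f"{name}: port {port} is used more than once")
    return problems


if __name__ == "__main__":
    example = {
        "WLS_SECURE_MODE": "TRUE",
        "ADMIN_SERVER_PORT": "7001",           # illustrative value
        "MANAGED_SERVER_PORT": "7003",         # illustrative value
        "ADMIN_SERVER_SECURE_PORT": "7002",    # illustrative value
        "MANAGED_SERVER_SECURE_PORT": "7003",  # clashes with MANAGED_SERVER_PORT
    }
    for problem in find_port_problems(example):
        print(problem)
```

Because Studio keeps listening on the non-secure ports when SSL is enabled, the sketch checks all four ports together in secure mode rather than treating the secure pair as a replacement.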

Dgraph and HDFS Agent settings

This section configures the Dgraph and the HDFS Agent.

Configuration property Description and possible settings
DGRAPH_WS_PORT The port the Dgraph listens on for requests.
DGRAPH_BULKLOAD_PORT The port that the Dgraph listens on for bulk load ingest requests.
DGRAPH_OUT_FILE The path to the Dgraph's stdout/stderr file.
DGRAPH_LOG_LEVEL Optional. Defines the log levels for the Dgraph's out log subsystems. This must be in the format "subsystem1 level1|subsystem2,subsystem3 level2|subsystemN levelN" (including quotes).

You can include as many subsystems as you want. Any you don't include will be set to NOTIFICATION.

For more information on the Dgraph's out log subsystems and their supported levels, see the Administrator's Guide.

DGRAPH_ADDITIONAL_ARG Optional. Defines one or more flags to start the Dgraph with. More information on Dgraph flags is available in the Administrator's Guide.
Note: This property is only intended for use by Oracle Support. Don't provide a value for this property when installing BDD.
AGENT_PORT The port that the HDFS Agent listens on for HTTP requests.
AGENT_EXPORT_PORT The port that the HDFS Agent listens on for requests from the Dgraph.
AGENT_OUT_FILE The path to the HDFS Agent's stdout/stderr file.
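The DGRAPH_LOG_LEVEL format described above ("subsystem1 level1|subsystem2,subsystem3 level2|subsystemN levelN") can be read as pipe-separated groups, each holding a comma-separated list of subsystems followed by one level. The following sketch parses that shape and applies the documented NOTIFICATION fallback for subsystems that are not listed. The function names are hypothetical, and the sketch does not check subsystem or level names against the lists in the Administrator's Guide.

```python
DEFAULT_LEVEL = "NOTIFICATION"  # documented fallback for unlisted subsystems


def parse_dgraph_log_level(value: str) -> dict:
    """Map each listed subsystem to its level (hypothetical helper)."""
    text = value.strip().strip('"')  # the property value includes quotes
    levels = {}
    if not text:
        return levels
    for group in text.split("|"):
        # Each group is "<subsystem>[,<subsystem>...] <level>".
        parts = group.strip().rsplit(None, 1)
        if len(parts) != 2:
            raise ValueError(f"malformed group: {group!r}")
        subsystems, level = parts
        for subsystem in subsystems.split(","):
            subsystem = subsystem.strip()
            if not subsystem:
                raise ValueError(f"empty subsystem name in group: {group!r}")
            levels[subsystem] = level
    return levels


def level_for(levels: dict, subsystem: str) -> str:
    """Return the configured level, or NOTIFICATION if none was given."""
    return levels.get(subsystem, DEFAULT_LEVEL)


if __name__ == "__main__":
    raw = '"subsystem1 level1|subsystem2,subsystem3 level2|subsystemN levelN"'
    parsed = parse_dgraph_log_level(raw)
    print(parsed)
    print(level_for(parsed, "some_other_subsystem"))
```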

Data Processing settings

This section configures Data Processing and the Hive Table Detector.

Configuration property Description and possible settings
ENABLE_HIVE_TABLE_DETECTOR Enables the DP CLI to automatically run the Hive Table Detector according to the schedule defined by the subsequent properties.

When set to TRUE, the Hive Table Detector runs automatically on the server defined by DETECTOR_SERVER. When it runs, the default behavior performs these two steps:
  • Provisions any new Hive table in the "default" database, if that table passes the whitelist and blacklist.
  • Deletes any BDD data set that does not have a corresponding source Hive table. You cannot prevent this action.

When set to FALSE, the Hive Table Detector does not run.

DETECTOR_SERVER The hostname of the server the Hive Table Detector runs on. This must be one of the WebLogic Managed Servers.
DETECTOR_HIVE_DATABASE The name of the Hive database that the Hive Table Detector monitors.

The default value is default. This is the same as the default value of HIVE_DATABASE_NAME, which is used by Studio and the CLI. You can use a different database for each of these properties, but Oracle recommends you start with one for a first-time installation.

This value can't contain semicolons (;).

DETECTOR_MAXIMUM_WAIT_TIME The maximum amount of time (in seconds) that the Hive Table Detector waits between update jobs.
DETECTOR_SCHEDULE A cron-format schedule that specifies how often the Hive Table Detector runs. This must be enclosed in quotes. The default value is "0 0 * * *", which means the Hive Table Detector runs at midnight, every day of every month.
ENABLE_ENRICHMENTS Determines whether data enrichments are run during the sampling phase of data processing. This setting controls the Language Detection, Term Extraction, Geocoding Address, Geocoding IP, and Reverse Geotagger modules.

When set to true, all of the data enrichments run. When set to false, none of them run.

For more information on data enrichments, see the Data Processing Guide.

MAX_RECORDS The maximum number of records included in a data set. For example, if a Hive table has 1,000,000 records, you could restrict the total number of sampled records to 100,000.

Note that the actual number of records in each data set may be slightly higher or lower than this value.

SANDBOX_PATH The path to the HDFS directory that stores the Avro files created when users export data from BDD.
LANGUAGE Specifies either a supported ISO-639 language code (en, de, fr, etc.) or a value of unknown to set the language property for all attributes in the data set. This controls whether Oracle Language Technology (OLT) libraries are invoked during indexing.

Specifying a language code requires more processing time but produces better processing and indexing results, because the OLT libraries for that language are used. If the value is unknown, processing is faster, but OLT is not invoked and the processing and indexing results are more generic.

For a complete list of the languages BDD supports, see the Data Processing Guide.
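To make the DETECTOR_SCHEDULE default easier to read, the sketch below splits a quoted schedule into named fields. It assumes the standard five-field cron layout (minute, hour, day of month, month, day of week), which matches the shape of the default "0 0 * * *". The function name split_schedule is hypothetical, and the sketch only checks the number of fields, not whether each field is a legal cron expression.

```python
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def split_schedule(value: str) -> dict:
    """Split a quoted five-field cron schedule into named fields."""
    fields = value.strip().strip('"').split()
    if len(fields) != len(CRON_FIELDS):
        raise ValueError(
            f"expected {len(CRON_FIELDS)} cron fields, got {len(fields)}: {value!r}"
        )
    return dict(zip(CRON_FIELDS, fields))


if __name__ == "__main__":
    # The documented default: minute 0 of hour 0, every day of every month.
    for name, field in split_schedule('"0 0 * * *"').items():
        print(f"{name}: {field}")
```

Read this way, the default has minute 0 and hour 0 with wildcards everywhere else, which is the "midnight, every day of every month" behavior described for DETECTOR_SCHEDULE.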