Bulk Loading Data into a Database
TimesTen Scaleout enables you to load data into a database from various sources. You
can load data into a specific table either from a file by using the
ttBulkCp
utility or an Oracle database table by using the
ttLoadFromOracle
built-in procedure.
Both the ttBulkCp
utility and ttLoadFromOracle
built-in procedure support in TimesTen Scaleout a localOnly
filter option that enables you to load only the rows that are hashed to the local element and its replicas. If you use the localOnly
filter option, the ttBulkCp
utility and ttLoadFromOracle
built-in procedure ignore rows that are hashed to remote elements that are not a replica of the local element. Regardless of the options you specify, the ttBulkCp
utility and ttLoadFromOracle
built-in procedure do not copy duplicate rows into a table.
With the localOnly
filter option enabled and depending of the distribution scheme of the table, the ttBulkCp
utility and ttLoadFromOracle
built-in procedure behave as follows:
-
Hash: Retain and insert rows that have hash key values that are hashed to the elements of the local data instance and its replicas. They ignore rows that are hashed to the remaining elements.
-
Reference: Retain and insert rows whose reference key value references to a hash or reference key value that is hashed to the local element and its replicas. They ignore rows that are hashed to the remaining elements.
-
Duplicate: Ignore the
localOnly
option. They insert rows into the elements of all data instances.
The advantages of using the localOnly
filter option are:
-
It requires less network bandwidth to distribute the data during the bulk loading operation.
-
It allows a failed bulk loading operation to be retried independent of other elements.
The disadvantages of using the localOnly
filter option are:
-
The source file must be available to all hosts, or at least to one host for each replica set of the grid. This only applies for bulk loading operations with the
ttBulkCp
utility. -
You must run a bulk loading operation on an element of every replica set.
-
Every bulk loading operation must process the entire data set, even though it ignores any rows hashed to a different replica set.
These topics describe how to load data into a table in TimesTen Scaleout.
Note:
The following examples consider the same grid scenario as the one described in Define the Distribution Map of the Database.
Populating a Table with the ttBulkCp Utility
The ttBulkCp
utility with the -i
option enables you to load data from a file. This option uses standard INSERT
SQL statements to load data into a specific table of a database. The ttBulkCp
utility inserts each row into its corresponding element based on the distribution scheme of the table.
Note:
-
Unlike operations performed with the
ttGridAdmin
utility, thettBulkCp
utility (and thettBulkCpCS
utility) must be run on a data instance instead of the active management instance, and its use is not limited to the instance administrator. -
Ensure that the user running the command or the one specified in the connection string has
INSERT
privileges on the specified table.
These topics describe the options for loading data into a database while using the
ttBulkCp
utility.
Populate a Table from a Single Location
If the source file is only available to a single data instance, run the ttBulkCp
utility with the -i
option to insert the rows of the specified database into their corresponding element based on the distribution scheme of the specified database.
From the data instance with access to the source file, insert all rows in the file into the customers
table of the database1
database.
% ttBulkCp -i -connStr "DSN=database1;UID=terry" customers /mydir/customers_data.dmp
Enter password for 'terry':
/mydir/customers_data.dmp:
1000 rows inserted
1000 rows total
For more information on using the ttBulkCp
utility, see Bulk Copy Data
Using the ttBulkCp Utility in Oracle TimesTen In-Memory Database Operations
Guide and ttBulkCp
in Oracle TimesTen In-Memory Database
Reference.
Populate a Table from Several Locations
If the source file is available to any given host in the grid, run the ttBulkCp
utility with the -i
and -localOnly
options on one data instance of each replica set of the database to insert the rows hashed to the local element and its replicas from a file into a table.
Use the ttGridAdmin dbStatus -replicaset
command from the active management instance (in this example the active management instance is host1.instance1
) to help you determine the data instances associated with each replica set.
% ttGridAdmin dbStatus database1 -replicaset
Database database1 Replica Set status as of Mon Aug 16 14:05:15 PDT 2021
RS DS Elem Host Instance Status Cache Agent Date/Time of Event Message
-- -- ---- ----- --------- ------ ----------- ------------------- -------
1 1 1 host3 instance1 opened stopped 2021-08-12 14:49:08
1 2 2 host4 instance1 opened stopped 2021-08-12 14:49:09
1 3 3 host5 instance1 opened stopped 2021-08-12 14:49:08
2 1 4 host6 instance1 opened stopped 2021-08-12 14:49:08
2 2 5 host7 instance1 opened stopped 2021-08-12 14:49:08
2 3 6 host8 instance1 opened stopped 2021-08-12 14:49:09
Insert the rows hashed to the local element and its replica from the source file into the customers
table of the database1
database. Ensure you run the ttBulkCp
utility on one data instance of each replica set available, the host3.instance1
, host5.instance1
, and host7.instance1
data instances for example.
On the host3.instance1
data instance:
% ttBulkCp -i -localOnly -connStr "DSN=database1;UID=terry" customers /mydir/customers_data.dmp
Enter password for 'terry':
/mydir/customers_data.dmp:
501 rows inserted
449 rows not inserted (ignored)
1000 rows total
Note:
For this example where the element of the host4.instance1
and host5.instance1
data instances are defined as the replicas of the element of the host3.instance1
data instance. The same rows inserted into the customers
table in the element of the host3.instance1
data instance are inserted into the customers
table in the element of the host4.instance1
and host5.instance1
data instance.
On the host6.instance1
data instance:
% ttBulkCp -i -localOnly -connStr "DSN=database1;UID=terry" customers /mydir/customers_data.dmp
Enter password for 'terry':
/mydir/customers_data.dmp:
449 rows inserted
501 rows not inserted (ignored)
1000 rows total
Note:
For this example where the element of the host7.instance1
and host8.instance1
data instances are defined as the replicas of the element of the host6.instance1
data instance. The same rows inserted into the customers
table in the element of the host6.instance1
data instance are inserted into the customers
table in the element of the host7.instance1
and host8.instance1
data instance.
For more information on the ttGridAdmin dbStatus
command, see Monitor the
Status of a Database (dbStatus) in Oracle TimesTen In-Memory Database
Reference.
For more information on using the ttBulkCp
utility, see Bulk Copy Data
Using the ttBulkCp Utility in Oracle TimesTen In-Memory Database Operations
Guide and ttBulkCp in Oracle TimesTen In-Memory Database
Reference.
Populating a Table with the ttLoadFromOracle Built-in Procedure
The ttLoadFromOracle
built-in procedure enables you to load data from an Oracle database.
These topics describe how to load data from a Oracle database into a database while
using the ttLoadFromOracle
built-in procedure.
Enable Communication to an Oracle Database
For the ttLoadFromOracle
built-in procedure to be able to import data from an Oracle database table into a database table, TimesTen Scaleout must be able to recognize and communicate with the Oracle database. For this to happen, you need to:
Populate a Table from a Single Location
The following example connects with the ttIsql
utility to the database1
database to copy the rows from the terry.customers
table of an Oracle database into the terry.customers
table of the database1
database.
Note:
Ensure that the database user has the INSERT
privilege on the table the built-in procedure copies data into.
From a connection to the element of any data instance:
Command> call ttLoadFromOracle('terry', 'customers', 'SELECT * FROM terry.customers');
< 1000 >
1 row found.
For more information on the ttLoadFromOracle
built-in procedure,
see ttLoadFromOracle in Oracle TimesTen In-Memory Database
Reference.
Populate a Table from Several Locations
Call the ttLoadFromOracle
built-in procedure with the localOnly=Y
parameter to copy the rows hashed to a local element and its replicas from an Oracle database table into a TimesTen Scaleout database table. If you use the localOnly=Y
parameter, the ttLoadFromOracle
built-in procedure ignores rows that are hashed to remote elements that are not a replicas of the local element.
The following example connects with the ttIsql
utility to the database1
database to copy the rows hashed to the local element and its replicas from the terry.customers
table of an Oracle database into the terry.customers
table of the database1
database. If necessary, use the ttGridAdmin dbStatus -replicaset
command from the active management instance (in this example the active management instance is host1.instance1
) to help you determine the data instances associated with each replica set.
% ttGridAdmin dbStatus database1 -replicaset
Database database1 Replica Set status as of Mon Aug 16 14:05:15 PDT 2021
RS DS Elem Host Instance Status Cache Agent Date/Time of Event Message
-- -- ---- ----- --------- ------ ----------- ------------------- -------
1 1 1 host3 instance1 opened stopped 2021-08-12 14:49:08
1 2 2 host4 instance1 opened stopped 2021-08-12 14:49:09
1 3 3 host5 instance1 opened stopped 2021-08-12 14:49:08
2 1 4 host6 instance1 opened stopped 2021-08-12 14:49:08
2 2 5 host7 instance1 opened stopped 2021-08-12 14:49:08
2 3 6 host8 instance1 opened stopped 2021-08-12 14:49:09
Ensure you call the ttLoadFromOracle
built-in procedure on one replica of each replica set available, the host3.instance1
and host6.instance1
data instances for example.
Note:
Ensure that the database user has the INSERT
privilege on the table the built-in procedure copies data into.
From a connection to the element of the host3.instance1
data instance:
Command> call ttLoadFromOracle('terry', 'customers', 'SELECT * FROM terry.customers', 4, 'localOnly=Y');
< 501 >
1 row found.
Note:
For this example where the element of the host4.instance1
and host5.instance1
data instances are defined as the replicas of the element of the host3.instance1
data instance. The same rows inserted into the customers
table in the element of the host3.instance1
data instance are inserted into the customers
table in the element of the host4.instance1
and host5.instance1
data instance.
From a connection to the element of the host6.instance1
data instance:
Command> call ttLoadFromOracle('terry', 'customers', 'SELECT * FROM terry.customers', 4, 'localOnly=Y');
< 449 >
1 row found.
Note:
For this example where the element of the host7.instance1
and host8.instance1
data instances are defined as the replicas of the element of the host6.instance1
data instance. The same rows inserted into the customers
table in the element of the host6.instance1
data instance are inserted into the customers
table in the element of the host7.instance1
and host8.instance1
data instance.
For more information on the ttGridAdmin dbStatus
command or
ttLoadFromOracle
built-in procedure, see Monitor the Status of a Database (dbStatus)
or ttLoadFromOracle, respectively, in Oracle TimesTen In-Memory Database
Reference.