8.2.5 Amazon S3
Learn how to use the S3 Event Handler, which provides the interface to Amazon S3 web services.
8.2.5.1 Overview
Amazon S3 is object storage hosted in the Amazon cloud; see https://aws.amazon.com/s3/. The purpose of the S3 Event Handler is to load data files generated by the File Writer Handler into Amazon S3.
You can use any format that the File Writer Handler supports; see Flat Files.
8.2.5.2 Detailing Functionality
The S3 Event Handler requires the Amazon Web Services (AWS) Java SDK to transfer files to S3 object storage. Oracle GoldenGate for Distributed Applications and Analytics (GG for DAA) does not include the AWS Java SDK. You have to download and install the AWS Java SDK from:
https://aws.amazon.com/sdk-for-java/
Then configure the gg.classpath variable to include the JAR files of the AWS Java SDK, which are divided into two directories. Both directories must be in gg.classpath, for example:
gg.classpath=/usr/var/aws-java-sdk-1.11.240/lib/*:/usr/var/aws-java-sdk-1.11.240/third-party/lib/*
8.2.5.2.1 Resolving AWS Credentials
8.2.5.2.1.1 Amazon Web Services Simple Storage Service Client Authentication
The S3 Event Handler is a client connection to the Amazon Web Services (AWS) Simple Storage Service (S3) cloud service. The AWS cloud must be able to authenticate the AWS client before the handler can interface with S3.
8.2.5.2.1.1.1 Explicit Configuration of the Client ID and Secret
A client ID and secret are generally the required credentials for the S3 Event Handler to interact with Amazon S3. You generate them on the AWS website and set them with the following properties:
gg.eventhandler.name.accessKeyId=
gg.eventhandler.name.secretKey=
Furthermore, the Oracle Wallet functionality can be used to encrypt these credentials.
8.2.5.2.1.1.2 Use of the AWS Default Credentials Provider Chain
If gg.eventhandler.name.accessKeyId and gg.eventhandler.name.secretKey are unset, then credentials resolution reverts to the AWS default credentials provider chain. The AWS default credentials provider chain provides various ways by which the AWS credentials can be resolved.
When Oracle GoldenGate for Distributed Applications and Analytics (GG for DAA) runs on an AWS Elastic Compute Cloud (EC2) instance, the general use case is to resolve the credentials from the EC2 metadata service. The AWS default credentials provider chain provides resolution of credentials from the EC2 metadata service as one of the options.
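As an illustration, a handler that relies on the default credentials provider chain simply leaves out the access key and secret key. The following is a minimal sketch, not a complete configuration: the handler name s3 is an assumption, and the region, bucket, and path values are placeholders.

```properties
# Sketch only: the handler name "s3" and all values are placeholders.
gg.eventhandler.s3.type=s3
gg.eventhandler.s3.region=us-west-2
gg.eventhandler.s3.bucketMappingTemplate=yourbucketname
gg.eventhandler.s3.pathMappingTemplate=thepath
# accessKeyId and secretKey are intentionally not set, so credentials
# resolution falls back to the AWS default credentials provider chain
# (for example, the EC2 metadata service when running on an EC2 instance).
```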
8.2.5.2.1.1.3 AWS Federated Login
This use case applies when your on-premises system login is integrated with AWS, so that logging in to an on-premises machine also logs you in to AWS. In this use case:
- You may not want to generate client IDs and secrets. (Some users disable this feature in the AWS portal.)
- The client AWS applications need to interact with the AWS Security Token Service (STS) to obtain an authentication token for programmatic calls made to S3.
To enable STS, set the following property:
gg.eventhandler.name.enableSTS=true
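A fuller STS setup might combine the STS-related properties listed in Table 8-6. This is a sketch under stated assumptions: the handler name s3, the account ID, the role name, and the session name are hypothetical, and only enableSTS is needed to turn the feature on.

```properties
# Sketch only: handler name "s3", account ID, role name, and session name are hypothetical.
gg.eventhandler.s3.type=s3
gg.eventhandler.s3.region=us-west-2
# Obtain credentials from the AWS Security Token Service.
gg.eventhandler.s3.enableSTS=true
# Optional: assume a different role, using the {user arn}:role/{role name} format.
gg.eventhandler.s3.STSAssumeRole=arn:aws:iam::123456789012:role/ogg-s3-writer
# Optional: session name used for session logging; any string is accepted.
gg.eventhandler.s3.STSAssumeRoleSessionName=ogg-s3-session
# Optional: region used for the STS call.
gg.eventhandler.s3.STSRegion=us-west-2
```

With enableSTS set to true, the accessKeyId and secretKey properties have no effect, so they are omitted here.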
8.2.5.2.2 About the AWS S3 Buckets
AWS divides S3 storage into separate file systems called buckets. The S3 Event Handler can write to pre-created buckets. If the specified S3 bucket does not exist, the S3 Event Handler attempts to create it. AWS requires that S3 bucket names are lowercase and globally unique. If the handler attempts to create an S3 bucket whose name is already in use in another Amazon account, the S3 Event Handler abends.
8.2.5.2.3 Troubleshooting
Connectivity Issues
If the S3 Event Handler is unable to connect to the S3 object storage when running on premises, it is likely that your connectivity to the public internet is protected by a proxy server. Proxy servers act as a gateway between the private network of a company and the public internet. Contact your network administrator to get the URL of your proxy server.
Oracle GoldenGate can be used with a proxy server by setting the following parameters:
gg.eventhandler.name.proxyServer=
gg.eventhandler.name.proxyPort=80
gg.eventhandler.name.proxyUsername=username
gg.eventhandler.name.proxyPassword=password
Sample configuration:
gg.eventhandler.s3.type=s3
gg.eventhandler.s3.region=us-west-2
gg.eventhandler.s3.proxyServer=www-proxy.us.oracle.com
gg.eventhandler.s3.proxyPort=80
gg.eventhandler.s3.proxyProtocol=HTTP
gg.eventhandler.s3.bucketMappingTemplate=yourbucketname
gg.eventhandler.s3.pathMappingTemplate=thepath
gg.eventhandler.s3.finalizeAction=none
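If the proxy server requires credentials, the proxy part of the sample configuration could be extended along these lines. This is only a sketch: the handler name s3 follows the sample, and the host, port, user name, and password are placeholders rather than real values.

```properties
# Sketch only: host, port, user name, and password are placeholders.
gg.eventhandler.s3.proxyServer=proxy.example.com
gg.eventhandler.s3.proxyPort=8080
gg.eventhandler.s3.proxyProtocol=HTTP
# Needed only when the proxy server itself requires credentials.
gg.eventhandler.s3.proxyUsername=username
gg.eventhandler.s3.proxyPassword=password
```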
8.2.5.3 Configuring the S3 Event Handler
You can configure the S3 Event Handler operation using the properties file. These properties are located in the Java Adapter properties file (not in the Replicat properties file).
To enable the selection of the S3 Event Handler, you must first configure the handler type by specifying gg.eventhandler.name.type=s3 and the other S3 Event Handler properties as follows:
Table 8-6 S3 Event Handler Configuration Properties
| Properties | Required/Optional | Legal Values | Default | Explanation |
|---|---|---|---|---|
| gg.eventhandler.name.type | Required | s3 | None | Selects the S3 Event Handler for use with Replicat. |
| gg.eventhandler.name.region | Required | The AWS region name that is hosting your S3 instance. | None | Setting the legal AWS region name is required. |
| gg.eventhandler.name.cannedACL | Optional | One of the canned ACL values predefined by Amazon S3. | None | Amazon S3 supports a set of predefined grants, known as canned Access Control Lists. Each canned ACL has a predefined set of grantees and permissions. For more information, see Managing access with ACLs. |
| gg.eventhandler.name.proxyServer | Optional | The host name of your proxy server. | None | Sets the host name of your proxy server if connectivity to AWS requires a proxy server. |
| gg.eventhandler.name.proxyPort | Optional | The port number of the proxy server. | None | Sets the port number of the proxy server if connectivity to AWS requires a proxy server. |
| gg.eventhandler.name.proxyUsername | Optional | The user name for the proxy server. | None | Sets the user name for the proxy server if connectivity to AWS requires a proxy server and the proxy server requires credentials. |
| gg.eventhandler.name.proxyPassword | Optional | The password for the proxy server. | None | Sets the password for the user name of the proxy server if connectivity to AWS requires a proxy server and the proxy server requires credentials. |
| gg.eventhandler.name.bucketMappingTemplate | Required | A string with resolvable keywords and constants used to dynamically generate the S3 bucket name. | None | Use resolvable keywords and constants to dynamically generate the S3 bucket name at runtime. The handler attempts to create the S3 bucket if it does not exist. AWS requires bucket names to be all lowercase. A bucket name with uppercase characters results in a runtime exception. See Template Keywords. |
| gg.eventhandler.name.pathMappingTemplate | Required | A string with resolvable keywords and constants used to dynamically generate the path in the S3 bucket to write the file. | None | Use keywords interlaced with constants to dynamically generate unique S3 path names at runtime. Typically, path names follow the format, |
|  | Optional | A string with resolvable keywords and constants used to dynamically generate the S3 file name at runtime. | None | Use resolvable keywords and constants to dynamically generate the S3 data file name at runtime. If not set, the upstream file name is used. See Template Keywords. |
|  | Optional |  | None | Set to |
|  | Optional | A unique string identifier cross referencing a child event handler. | No event handler configured. | Sets the event handler that is invoked on the file roll event. Event handlers can do file roll event actions like loading files to S3, converting to Parquet or ORC format, or loading files to HDFS. |
|  | Optional (unless Dell ECS, then required) | A legal URL to connect to cloud storage. | None | Not required for Amazon AWS S3. Required for Dell ECS. Sets the URL to connect to cloud storage. |
| gg.eventhandler.name.proxyProtocol | Optional |  |  | Sets the protocol of the connection to the proxy server for an additional level of security. The client first performs an SSL handshake with the proxy server, and then an SSL handshake with Amazon AWS. This feature was added to the Amazon SDK in version 1.11.396, so you must use at least that version to use this property. |
|  | Optional |  | Empty | Set only if you are enabling S3 server side encryption. Use this parameter to set the algorithm for server side encryption in S3. |
|  | Optional | A legal AWS key management system server side management key or the alias that represents that key. | Empty | Set only if you are enabling S3 server side encryption and the S3 algorithm is |
| gg.eventhandler.name.enableSTS | Optional | true or false |  | Set to true to have the S3 Event Handler obtain credentials from the AWS Security Token Service (STS). |
| gg.eventhandler.name.STSAssumeRole | Optional | AWS user and role in the following format: {user arn}:role/{role name} | None | Set this property if you want to assume a different user/role. Only valid with STS enabled. |
| gg.eventhandler.name.STSAssumeRoleSessionName | Optional | Any string. | AssumeRoleSession1 | The assumed role requires a session name for session logging; this can be any value. Only valid if both gg.eventhandler.name.enableSTS=true and gg.eventhandler.name.STSAssumeRole are configured. |
| gg.eventhandler.name.STSRegion | Optional | Any legal AWS region specifier. | The region is obtained from the gg.eventhandler.name.region property. | Use to resolve the region for the STS call. It is only valid if gg.eventhandler.name.enableSTS is set to true. |
| gg.eventhandler.name.enableBucketAdmin | Optional |  |  | Set to |
| gg.eventhandler.name.accessKeyId | Optional | A valid AWS access key. | None | Set this parameter to explicitly set the access key for AWS. This parameter has no effect if gg.eventhandler.name.enableSTS is set to true. If this property is not set, then credentials resolution falls back to the AWS default credentials provider chain. |
| gg.eventhandler.name.secretKey | Optional | A valid AWS secret key. | None | Set this parameter to explicitly set the secret key for AWS. This parameter has no effect if gg.eventhandler.name.enableSTS is set to true. If this property is not set, then credentials resolution falls back to the AWS default credentials provider chain. |
| gg.eventhandler.name.enableAccelerateMode | Optional | true or false | false | Enables or disables Amazon S3 Transfer Acceleration, which transfers files quickly and securely over long distances between your client and an S3 bucket. |
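To show how several of these properties fit together, the following sketch configures a handler with explicit credentials and Transfer Acceleration. The handler name s3 and every value are placeholders chosen for illustration; this is not a recommended or complete production configuration.

```properties
# Sketch only: handler name "s3" and all values are placeholders.
gg.eventhandler.s3.type=s3
gg.eventhandler.s3.region=us-west-2
gg.eventhandler.s3.bucketMappingTemplate=yourbucketname
gg.eventhandler.s3.pathMappingTemplate=thepath
# Explicit credentials. Leave both unset to fall back to the
# AWS default credentials provider chain.
gg.eventhandler.s3.accessKeyId=your-access-key-id
gg.eventhandler.s3.secretKey=your-secret-key
# Optional: Amazon S3 Transfer Acceleration (the default is false).
gg.eventhandler.s3.enableAccelerateMode=true
```

Transfer Acceleration generally also has to be enabled on the bucket itself on the AWS side before this setting is useful.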