8.2.11 Azure Blob Storage

8.2.11.1 Overview

Azure Blob Storage (ABS) is the object storage service of the Microsoft Azure cloud. It is highly scalable and secure, and is suited to cloud-native workloads, archives, data lakes, high-performance computing, and machine learning. You can use the Azure Blob Storage Event Handler to load files generated by the File Writer Handler into ABS.

8.2.11.2 Prerequisites

Ensure that the following are in place:
  • An Azure cloud account.
  • The Java Software Development Kit (SDK) for Azure Blob Storage.

8.2.11.3 Storage Account, Container, and Objects

  • Storage Account: An Azure storage account contains all of your Azure Storage data objects: blobs, file shares, queues, tables, and disks.
  • Container: A container organizes a set of blobs, similar to a directory in a file system. A storage account can include an unlimited number of containers, and a container can store an unlimited number of blobs.
  • Objects/blobs: Objects or blobs are the individual pieces of data that you store in a storage account container.
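
As a rough illustration of how the three levels fit together, the following sketch composes the URL of a blob from a storage account name, a container name, and a blob path. It assumes the default public-cloud endpoint form https://<accountName>.blob.core.windows.net. The class name, method name, and sample values are hypothetical and are not part of the handler.

```java
import java.net.URI;

// Illustrative only: shows how account, container, and blob path
// combine into a blob URL on the default public-cloud endpoint.
final class BlobAddress {

    static URI blobUri(String accountName, String containerName, String blobPath) {
        // <account>.blob.core.windows.net / <container> / <blob path>
        return URI.create("https://" + accountName + ".blob.core.windows.net/"
                + containerName + "/" + blobPath);
    }

    public static void main(String[] args) {
        // Hypothetical sample values.
        System.out.println(blobUri("mystorageacct", "ogg-data",
                "ogg/data/RABS/QASOURCE.TCUSTMER/file.avro"));
    }
}
```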

8.2.11.4 Configuration

To select the ABS Event Handler, set gg.eventhandler.name.type=abs and configure the following ABS properties:

| Property | Required/Optional | Legal Values | Default | Explanation |
|---|---|---|---|---|
| gg.eventhandler.name.type | Required | abs | None | Selects the ABS Event Handler for use with the File Writer Handler. |
| gg.eventhandler.name.bucketMappingTemplate | Required | A string with resolvable keywords and constants used to dynamically generate an Azure storage account container name. | None | If a container with this name does not exist, the ABS Event Handler creates it. See https://docs.microsoft.com/en-us/rest/api/storageservices/naming-and-referencing-containers--blobs--and-metadata#container-names. For supported keywords, see Template Keywords. |
| gg.eventhandler.name.pathMappingTemplate | Required | A string with resolvable keywords and constants used to dynamically generate the path in the Azure storage account container to write the file. | None | Use keywords interlaced with constants to dynamically generate unique Azure storage account container path names at runtime. Sample path name: ogg/data/${groupName}/${fullyQualifiedTableName}. For supported keywords, see Template Keywords. |
| gg.eventhandler.name.fileNameMappingTemplate | Optional | A string with resolvable keywords and constants used to dynamically generate a file name for the Azure Blob object. | None | Use resolvable keywords and constants to dynamically generate the Azure Blob object file name. If not set, the upstream file name is used. For supported keywords, see Template Keywords. |
| gg.eventhandler.name.finalizeAction | Optional | none or delete | none | Set to none to leave the Azure Blob data file in place on the finalize action. Set to delete to delete the Azure Blob data file with the finalize action. |
| gg.eventhandler.name.eventHandler | Optional | A unique string identifier cross-referencing a child event handler. | No event handler configured. | Sets the downstream event handler that is invoked on the file roll event. |
| gg.eventhandler.name.accountName | Required | String | None | Azure storage account name. |
| gg.eventhandler.name.accountKey | Optional | String | None | Azure storage account key. |
| gg.eventhandler.name.sasToken | Optional | String | None | Sets a credential that uses a shared access signature (SAS) to authenticate to an Azure service. |
| gg.eventhandler.name.tenantId | Optional | String | None | Sets the Azure tenant ID of the application. |
| gg.eventhandler.name.clientId | Optional | String | None | Sets the Azure client ID of the application. |
| gg.eventhandler.name.clientSecret | Optional | String | None | Sets the Azure client secret for the authentication. |
| gg.eventhandler.name.accessTier | Optional | Hot, Cool, or Archive | None | Sets the tier on an Azure blob/object. Azure storage offers different access tiers, allowing you to store blob object data in the most cost-effective manner. Available access tiers include Hot, Cool, and Archive. For more information, see https://docs.microsoft.com/en-us/azure/storage/blobs/storage-blob-storage-tiers. |
| gg.eventhandler.name.endpoint | Optional | String | https://<accountName>.blob.core.windows.net | Sets the Azure Storage service endpoint. See Azure Government Cloud Configuration. |
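
To make the mapping templates more concrete, the following sketch substitutes ${keyword} placeholders with runtime values and checks the result against the Azure container naming rules (3 to 63 characters; lowercase letters, digits, and hyphens; every hyphen surrounded by a letter or digit). It only illustrates the idea and is not the handler's actual resolution logic. The class name, the method names, and the sample values are assumptions made for the example.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: mimics keyword substitution in a mapping template.
final class TemplateSketch {

    // Matches placeholders of the form ${keyword}.
    private static final Pattern KEYWORD = Pattern.compile("\\$\\{([A-Za-z]+)\\}");

    // Azure container names: 3-63 chars, lowercase letters, digits, hyphens,
    // starting and ending with a letter or digit, no consecutive hyphens.
    private static final Pattern CONTAINER_NAME =
            Pattern.compile("^(?=.{3,63}$)[a-z0-9]+(-[a-z0-9]+)*$");

    static String resolve(String template, Map<String, String> values) {
        Matcher m = KEYWORD.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = values.get(m.group(1));
            if (value == null) {
                throw new IllegalArgumentException("Unresolved keyword: " + m.group(1));
            }
            // quoteReplacement keeps '$' and '\' in values literal.
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    static boolean isValidContainerName(String name) {
        return CONTAINER_NAME.matcher(name).matches();
    }

    public static void main(String[] args) {
        // Hypothetical runtime values for the two keywords used in this section.
        Map<String, String> values = Map.of(
                "groupName", "RABS",
                "fullyQualifiedTableName", "QASOURCE.TCUSTMER");
        System.out.println(resolve("ogg/data/${groupName}/${fullyQualifiedTableName}", values));
        System.out.println(isValidContainerName("ogg-data"));
    }
}
```

Checking a resolved container name before starting the process can catch names that Azure would reject, for example names containing uppercase letters or underscores.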

8.2.11.4.1 Classpath Configuration

The ABS Event handler uses the Java SDK for Azure Blob Storage.

Note:

Ensure that the classpath includes the path to the Azure Blob Storage Java SDK.

8.2.11.4.2 Dependencies

Download the SDK using the following Maven coordinates:
<dependencies>
    <dependency>
      <groupId>com.azure</groupId>
      <artifactId>azure-storage-blob</artifactId>
      <version>12.13.0</version>
    </dependency>
    <dependency>
      <groupId>com.azure</groupId>
      <artifactId>azure-identity</artifactId>
      <version>1.3.3</version>
    </dependency>
</dependencies>

8.2.11.4.3 Authentication

You can authenticate to Azure Storage by configuring one of the following:
  • accountKey
  • sasToken
  • tenantId, clientId, and clientSecret

accountKey has the highest precedence, followed by sasToken. If accountKey and sasToken are not set, then the tuple tenantId, clientId, and clientSecret is used.
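
The precedence rule can be sketched as a simple selection function. This is a minimal illustration of the ordering described above, not the handler's code. The class, enum, and method names are hypothetical, and throwing an exception when no complete credential is present is an assumption made for the example.

```java
// Illustrative only: models the documented credential precedence.
final class AuthSelection {

    enum Method { ACCOUNT_KEY, SAS_TOKEN, CLIENT_SECRET }

    static Method select(String accountKey, String sasToken,
                         String tenantId, String clientId, String clientSecret) {
        if (isSet(accountKey)) {
            return Method.ACCOUNT_KEY;          // highest precedence
        }
        if (isSet(sasToken)) {
            return Method.SAS_TOKEN;            // used only when no account key is set
        }
        if (isSet(tenantId) && isSet(clientId) && isSet(clientSecret)) {
            return Method.CLIENT_SECRET;        // all three values are needed together
        }
        // Assumption for this sketch: an incomplete configuration is an error.
        throw new IllegalStateException("No complete credential configured");
    }

    private static boolean isSet(String value) {
        return value != null && !value.trim().isEmpty();
    }

    public static void main(String[] args) {
        // Both an account key and a SAS token are set: the account key wins.
        System.out.println(select("key", "sas", null, null, null));
    }
}
```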

8.2.11.4.3.1 Azure Tenant ID, Client ID, and Client Secret
To obtain your Azure tenant ID:
  1. Go to the Microsoft Azure portal.
  2. Select Azure Active Directory from the list on the left to view the Azure Active Directory panel.
  3. Select Properties in the Azure Active Directory panel to view the Azure Active Directory properties.
The Azure tenant ID is the field marked as Directory ID.
To obtain your Azure client ID and client secret:
  1. Go to the Microsoft Azure portal.
  2. Select All Services from the list on the left to view the Azure Services Listing.
  3. Enter App into the filter command box and select App Registrations from the listed services.
  4. Select the App Registration you created to access Azure Storage.
The Application ID displayed for the App Registration is the client ID. The client secret is the key string generated when a new key is added. This string is displayed only once, when the key is created. If you did not record it, create another key and capture the generated key string.

8.2.11.4.4 Proxy Configuration

When the process is run behind a proxy server, the jvm.bootoptions property can be used to set proxy server configuration using well-known Java proxy properties.

For example:

jvm.bootoptions=-Dhttps.proxyHost=some-proxy-address.com -Dhttps.proxyPort=80 -Djava.net.useSystemProxies=true
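
Options passed with -D become JVM system properties, so a small check like the following can confirm what the JVM actually sees. This is a diagnostic sketch only. The class and method names are hypothetical, and the fallback port of 443 reflects the documented Java default for https.proxyPort.

```java
// Illustrative only: reports the HTTPS proxy settings visible to the JVM.
final class ProxyCheck {

    static String describeHttpsProxy() {
        String host = System.getProperty("https.proxyHost");
        if (host == null || host.isEmpty()) {
            return "no HTTPS proxy configured";
        }
        // Java uses port 443 when https.proxyPort is not set.
        String port = System.getProperty("https.proxyPort", "443");
        return host + ":" + port;
    }

    public static void main(String[] args) {
        // Run with, for example:
        //   java -Dhttps.proxyHost=some-proxy-address.com -Dhttps.proxyPort=80 ProxyCheck.java
        System.out.println(describeHttpsProxy());
    }
}
```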

8.2.11.4.5 Sample Configuration

    #The ABS Event Handler
    gg.eventhandler.abs.type=abs
    gg.eventhandler.abs.pathMappingTemplate=${fullyQualifiedTableName}
    #TODO: Edit the Azure Blob Storage container name
    gg.eventhandler.abs.bucketMappingTemplate=<abs-container-name>
    gg.eventhandler.abs.finalizeAction=none
    #TODO: Edit the Azure storage account name.
    gg.eventhandler.abs.accountName=<storage-account-name>
    #TODO: Edit the Azure storage account key.
    #gg.eventhandler.abs.accountKey=<storage-account-key>
    #TODO: Edit the Azure shared access signature (SAS) token used to authenticate to an Azure service.
    #gg.eventhandler.abs.sasToken=<sas-token>
    #TODO: Edit the tenant ID of the application.
    gg.eventhandler.abs.tenantId=<azure-tenant-id>
    #TODO: Edit the client ID of the application.
    gg.eventhandler.abs.clientId=<azure-client-id>
    #TODO: Edit the client secret for the authentication.
    gg.eventhandler.abs.clientSecret=<azure-client-secret>
    gg.classpath=/path/to/abs-deps/*
    #TODO: Edit the proxy configuration.
    #jvm.bootoptions=-Dhttps.proxyHost=some-proxy-address.com -Dhttps.proxyPort=80 -Djava.net.useSystemProxies=true
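
Before starting the process, it can help to confirm that every property marked Required in the Configuration table is present. The following sketch loads a properties text and lists the required ABS properties that are missing for a given handler name. The class and method names are hypothetical, and the check is a convenience for illustration only, not something the product provides.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Illustrative only: checks a configuration for the required ABS properties.
final class AbsConfigCheck {

    // Property suffixes marked Required in the Configuration table.
    private static final String[] REQUIRED = {
        "type", "bucketMappingTemplate", "pathMappingTemplate", "accountName"
    };

    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text)); // lines starting with '#' are comments
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    static List<String> missingRequired(Properties props, String handlerName) {
        String prefix = "gg.eventhandler." + handlerName + ".";
        List<String> missing = new ArrayList<>();
        for (String suffix : REQUIRED) {
            String value = props.getProperty(prefix + suffix);
            if (value == null || value.trim().isEmpty()) {
                missing.add(prefix + suffix);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        String text = String.join("\n",
                "gg.eventhandler.abs.type=abs",
                "gg.eventhandler.abs.pathMappingTemplate=${fullyQualifiedTableName}",
                "gg.eventhandler.abs.bucketMappingTemplate=ogg-data",
                "#gg.eventhandler.abs.accountName=commented-out");
        // Lists the required properties that are absent from the text above.
        System.out.println(missingRequired(parse(text), "abs"));
    }
}
```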

8.2.11.4.6 Azure Government Cloud Configuration

Additional configuration is required if Oracle GoldenGate for Big Data has to replicate data to storage accounts that reside in an Azure Government cloud.

Set the environment variable AZURE_AUTHORITY_HOST and the property gg.eventhandler.{name}.endpoint as per the following table:

| Government cloud | AZURE_AUTHORITY_HOST | gg.eventhandler.{name}.endpoint |
|---|---|---|
| Azure US Government Cloud | https://login.microsoftonline.us | https://<storage-account-name>.blob.core.usgovcloudapi.net |
| Azure German Cloud | https://login.microsoftonline.de | https://<storage-account-name>.blob.core.cloudapi.de |
| Azure China Cloud | https://login.chinacloudapi.cn | https://<storage-account-name>.blob.core.chinacloudapi.cn |

The environment variable can be set in the Replicat parameter (.prm) file using the Oracle GoldenGate setenv parameter.

Example:

setenv (AZURE_AUTHORITY_HOST = "https://login.microsoftonline.us")
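
The pairing of authority host and endpoint per cloud can be captured in a small lookup, sketched below. The values come from the table in this section. The class and enum names, the handler name abs, and the storage account name are hypothetical and serve only the example.

```java
// Illustrative only: pairs each government cloud with its authority host
// and blob endpoint suffix, as listed in the table above.
final class AzureCloudEndpoints {

    enum Cloud {
        US_GOVERNMENT("https://login.microsoftonline.us", "blob.core.usgovcloudapi.net"),
        GERMAN("https://login.microsoftonline.de", "blob.core.cloudapi.de"),
        CHINA("https://login.chinacloudapi.cn", "blob.core.chinacloudapi.cn");

        final String authorityHost;
        final String blobSuffix;

        Cloud(String authorityHost, String blobSuffix) {
            this.authorityHost = authorityHost;
            this.blobSuffix = blobSuffix;
        }

        // Value for gg.eventhandler.{name}.endpoint.
        String endpoint(String storageAccountName) {
            return "https://" + storageAccountName + "." + blobSuffix;
        }
    }

    public static void main(String[] args) {
        String account = "mystorageacct"; // hypothetical storage account name
        for (Cloud cloud : Cloud.values()) {
            // Line for the Replicat parameter file.
            System.out.println("setenv (AZURE_AUTHORITY_HOST = \"" + cloud.authorityHost + "\")");
            // Line for the handler properties file.
            System.out.println("gg.eventhandler.abs.endpoint=" + cloud.endpoint(account));
        }
    }
}
```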

8.2.11.5 Troubleshooting and Diagnostics

  • Error: Confidential Client is not supported in Cross Cloud request.

    This indicates that the target Azure storage account resides in one of the Azure Government clouds. Set the required configuration as per Azure Government Cloud Configuration.