Oracle by Example: Register a GitHub Repository as a Storage Provider with Oracle Big Data Manager

Before You Begin

In this 10-minute tutorial, you learn how to register a GitHub repository as a storage provider with Oracle Big Data Manager.

Background

This is the first tutorial in the Integrate GitHub and Oracle Database with Oracle Big Data Manager series. Read the tutorials in the order listed.

What Do You Need?

  • Access to either an instance of Oracle Big Data Cloud Service or to an Oracle Big Data Appliance, and the required login credentials.
  • Access to Oracle Big Data Manager, on either an instance of Oracle Big Data Cloud Service or an Oracle Big Data Appliance, and the required sign-in credentials. A port must be open to permit access to Oracle Big Data Manager, as described in Enabling Oracle Big Data Manager.
  • Basic familiarity with HDFS, Spark, and optionally, Apache Zeppelin.

Register a GitHub Repository as a Storage Provider with Oracle Big Data Manager

In this section, you register the bdm-notebook-demo GitHub repository as a storage provider with Oracle Big Data Manager. The master branch in this repository contains the Notebook.json template that you will use in the third tutorial in this series.

  1. Sign in to Oracle Big Data Manager. See Access Oracle Big Data Manager.
  2. On the Oracle Big Data Manager page, click the Administration tab to display the Storage providers page. In our Oracle Big Data Manager instance, four storage providers are already registered: Apache Hive, HDFS, Oracle Cloud Infrastructure Object Storage, and Oracle Cloud Infrastructure Object Storage Classic.
    Description of the illustration initial-storage.png
  3. To register GitHub as a storage provider with Oracle Big Data Manager, click Register a new storage provider. The Register storage providers wizard is displayed. It has three pages: General, Storage Details, and Confirmation.
  4. On the General wizard page, enter github in the Name field and GitHub Repository in the Description field. Select Github from the Storage type drop-down list, and then click Next.
  5. On the Storage Details wizard page, copy the access token from the access-token.txt file, and then paste it into the Access token field. This access token enables you to access the bdm-notebook-demo repository. Click Test access to storage to verify that you can access the GitHub repository storage. If the storage details that you provided are correct, the message "Successful, storage details are correct" is displayed, and a Preview of storage content section appears on the page. Click Next.
    Description of the illustration github-storage-details-page.png

    Note: You can click Help to learn how to create access tokens. In this tutorial, you use the provided bdm-demo GitHub account, the bdm-notebook-demo GitHub repository, and the provided access token.

  6. On the Confirmation wizard page, review the settings. If you need to make a correction, click the back arrow (Go back). If you are satisfied with the settings, click Register. The Storage providers page is redisplayed, and the newly registered GitHub repository appears in the list of available storage providers.
    Description of the illustration github-registered.png

    Note: You can click Manage this provider to edit the GitHub storage provider properties (name, description, and access token), disable the storage, or remove the storage.
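
The Test access to storage check in the wizard can be approximated outside Oracle Big Data Manager. The following Python sketch is not part of the tutorial and does not show how Oracle Big Data Manager performs the check. It builds a request to the public GitHub REST API repository endpoint with the token sent as a Bearer credential. It assumes that the bdm-notebook-demo repository is owned by the bdm-demo account and that the token is stored in access-token.txt. The function names and the User-Agent value are hypothetical.

```python
import urllib.error
import urllib.request

GITHUB_API = "https://api.github.com"


def build_repo_request(owner: str, repo: str, token: str) -> urllib.request.Request:
    """Build a GET request for the GitHub repository metadata endpoint."""
    url = f"{GITHUB_API}/repos/{owner}/{repo}"
    headers = {
        "Authorization": f"Bearer {token}",       # personal access token
        "Accept": "application/vnd.github+json",  # GitHub REST API media type
        "User-Agent": "bdm-token-check",          # hypothetical client name
    }
    return urllib.request.Request(url, headers=headers, method="GET")


def can_access_repo(owner: str, repo: str, token: str) -> bool:
    """Return True if GitHub answers 200 for the repository with this token.

    HTTP errors such as 401 or 404 are reported as False. Network failures
    (urllib.error.URLError) are not caught and propagate to the caller.
    """
    request = build_repo_request(owner, repo, token)
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status == 200
    except urllib.error.HTTPError:
        return False


# Example usage (requires network access and a valid token):
#   with open("access-token.txt") as f:
#       token = f.read().strip()
#   # Assumption: the repository owner is the bdm-demo account.
#   print(can_access_repo("bdm-demo", "bdm-notebook-demo", token))
```

A False result indicates only that GitHub rejected the request. It does not distinguish an invalid token from a repository that the token cannot see.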


Explore the Registered GitHub Repository

In this section, you explore the contents of the registered bdm-notebook-demo GitHub repository.

  1. On the Oracle Big Data Manager page, click the Data tab.
  2. In the Data explorer section, select Github (github) from the Storage drop-down list. The bdm-notebook-demo GitHub repository is displayed in the Name column. Double-click the repository name and navigate to the master branch. This branch contains the Notebook.json Zeppelin note that you will import into Oracle Big Data Manager Notebook in the third tutorial in this series.
    Description of the illustration github-repository.png
  3. You can display the content of text files in a registered GitHub repository from within Oracle Big Data Manager. To do so, double-click the file name, or right-click the file name and select Show file content from the context menu. For example, double-click README.md to display its content, and then click Close to close the window.
    Description of the illustration readme-file.png
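
For comparison, the same files can be read directly through the GitHub REST API contents endpoint, where the branch is passed as the ref query parameter rather than appearing as a folder under the repository. The Python sketch below is illustrative only and is independent of Oracle Big Data Manager. It assumes the repository owner is bdm-demo, and the helper names are hypothetical. It builds the contents URL for a path on the master branch and decodes the base64 content field that GitHub returns for a file.

```python
import base64
from urllib.parse import quote, urlencode

GITHUB_API = "https://api.github.com"


def contents_url(owner: str, repo: str, path: str = "", ref: str = "master") -> str:
    """Build the GitHub contents endpoint URL for a path on a given branch."""
    base = f"{GITHUB_API}/repos/{owner}/{repo}/contents"
    if path:
        # Percent-encode the path but keep "/" separators intact.
        base = f"{base}/{quote(path)}"
    return f"{base}?{urlencode({'ref': ref})}"


def decode_file_content(payload: dict) -> str:
    """Decode the base64 'content' field of a file response into text."""
    if payload.get("encoding") != "base64":
        raise ValueError("expected a base64-encoded file payload")
    # b64decode ignores the line breaks that GitHub inserts into 'content'.
    return base64.b64decode(payload["content"]).decode("utf-8")


# Example usage (assumption: the repository owner is the bdm-demo account):
#   url = contents_url("bdm-demo", "bdm-notebook-demo", "README.md")
#   Send a GET request to url with the access token, parse the JSON body,
#   and pass the resulting dict to decode_file_content().
```

Calling contents_url with an empty path addresses the root of the branch, for which GitHub returns a JSON list of entries instead of a single file object.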

Next Tutorial

Register an Oracle Database as a Storage Provider with Oracle Big Data Manager