4 OML Notebooks

Oracle Machine Learning Notebooks is an enhanced web-based notebook platform for data analysts and data scientists. You can write code and text, create visualizations, and perform data analytics, including machine learning. Notebooks work with interpreters in the back end. In the Oracle Machine Learning user interface, notebooks are available in a project, where you can create, edit, delete, copy, move, and even save notebooks as templates.

4.1 Enable GPU Compute Capabilities in a Notebook through the Python Interpreter

This topic demonstrates how to enable GPU compute capabilities in a notebook through the Python interpreter. It also shows how to get information about the current GPU on which the notebook is running, and other details.

Prerequisites:
  • Paid Oracle Autonomous AI Database Serverless database instances.
  • The GPU feature is enabled for Oracle Autonomous AI Lakehouse Serverless or Oracle Autonomous AI Transaction Processing Serverless instances with 16 or more ECPUs specified for the OML application. For cost details, refer to the Oracle PaaS and IaaS Universal Credits Service Descriptions document available on the Oracle Cloud Services contracts page.
  • While basic NVIDIA libraries are included with the base environment, you are expected to create a custom Conda environment with the GPU-enabled third-party libraries required for your project. Only GPU-enabled packages benefit from GPUs in Python paragraphs.

    Note:

    By default, pre-installed and pre-configured NVIDIA libraries are provided to the GPU interpreter container in the host VM. However, third-party Python packages that use GPUs typically require specific versions of NVIDIA CUDA libraries as dependencies, which may override the included libraries.
  • Third-party GPU-enabled Python packages. In this example, we use PyTorch.

Note:

There is an expected delay in starting a notebook with GPU compute capabilities due to reserving and starting the GPU resources, which may take a few minutes.
You can generate embeddings with transformer models in Python memory using the Python interpreter in OML Notebooks. With GPUs, transformer models, for example those that generate sentence and image embeddings, can process larger volumes of data more quickly and efficiently.
To use the GPU compute capability in OML Notebooks:
  • Create a Conda environment with the desired third-party GPU-enabled Python packages (ADMIN role required).
  • Download and activate the Conda environment in OML Notebooks to use the GPU compute capabilities (OML_DEVELOPER role required).
  • In Oracle Machine Learning Notebooks, select the notebook type gpu from the Update Notebook Type drop-down menu in the notebook editor (OML_DEVELOPER role required). This setting is persisted in the notebook until you change it to another type.
To enable GPU compute capabilities in a notebook, and to view information on GPU:
  1. Create a notebook and open it in the notebook editor. Click on the Update Notebook Type icon and select gpu. By selecting gpu, you enable the notebook with GPU compute capabilities.
  2. Run the following command to download and activate a Conda environment. To know more about how to create Conda environments for Python and R, see the Related Links section below.
    %conda
    
    download gpuenv 
    activate gpuenv
  3. In this Conda environment, import the third-party GPU-enabled Python library PyTorch (the torch module). Add a Python paragraph to the notebook, type the directive %python, and run the following commands:
    %python
    import torch
    import torch.nn as nn
    import torch.optim as optim
    import time
  4. Add another Python paragraph to the notebook and run the following command to check whether a GPU is available and, if so, use it:
    %python
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    print(f"Device in use is: {device}")
    The command returns the following information:
  5. In this step, we will create another Python paragraph and run the following to:
    • Set the device to CPU or GPU if available
    • Create tensors on the specified device
    • Measure time for basic operations, and
    • Print the operations and results
    %python
    
    import torch
    import time
    import matplotlib.pyplot as plt
    
    # Set the device to CPU or GPU if available
    device = 'cuda' if torch.cuda.is_available() else 'cpu'
    print(f"Running on device: {device}")
    
    # Create tensors on the specified device
    x = torch.randn(2, 3, device=device)
    y = torch.tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], device=device)
    
    # Measure time for basic operations
    start_time = time.time()
    z = x + y  # Element-wise addition
    w = torch.mm(x, y.T)  # Matrix multiplication
    v = torch.matmul(x, y.T)  # Another way to perform matrix multiplication
    end_time = time.time()
    
    # Print the operations and results
    print(f"x:\n{x}")
    print(f"y:\n{y}")
    print(f"Element-wise addition (z = x + y):\n{z}")
    print(f"Matrix multiplication (w = x * y^T):\n{w}")
    print(f"Matrix multiplication using matmul (v = x @ y^T):\n{v}")
    print(f"Tensor operations time: {end_time - start_time:.6f} seconds\n")
    The command returns the following information:
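
A caveat on the timing in step 5: CUDA operations run asynchronously, so a wall-clock measurement can stop before the GPU has finished its work. The following is a minimal sketch, not part of the original example, that waits for the GPU with torch.cuda.synchronize() before reading the clock and also collects basic details about the current GPU. The helper names gpu_summary and timed_matmul are hypothetical, and the try/except guard is an assumption added only so that the code still runs where torch is not installed. In a notebook, place it in a %python paragraph.

```python
import time

try:
    import torch  # expected from the Conda environment activated in step 2
except ImportError:  # assumption: guard so the sketch still runs without torch
    torch = None


def gpu_summary():
    """Describe the device PyTorch would use (hypothetical helper)."""
    if torch is None or not torch.cuda.is_available():
        return {"device": "cpu"}
    index = torch.cuda.current_device()
    props = torch.cuda.get_device_properties(index)
    return {
        "device": "cuda",
        "name": torch.cuda.get_device_name(index),
        "total_memory_gib": round(props.total_memory / 1024**3, 2),
        "cuda_version": torch.version.cuda,
    }


def timed_matmul(size=256):
    """Time one matrix multiplication, waiting for the GPU to finish."""
    if torch is None:
        return None
    on_gpu = torch.cuda.is_available()
    device = "cuda" if on_gpu else "cpu"
    a = torch.randn(size, size, device=device)
    b = torch.randn(size, size, device=device)
    if on_gpu:
        torch.cuda.synchronize()  # finish tensor creation before timing
    start = time.perf_counter()
    _ = a @ b
    if on_gpu:
        torch.cuda.synchronize()  # wait for the multiplication itself
    return time.perf_counter() - start


print(gpu_summary())
print(f"matmul seconds: {timed_matmul()}")
```

On a CPU-only session the summary reports only the device, and the synchronization calls are skipped.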

4.2 Export a Notebook

You can export a notebook as a Native format (.dsnb) file, a Zeppelin format (.json) file, or a Jupyter format (.ipynb) file, and later import it into the same or a different environment.

To export a Notebook:
  1. On the Notebooks page, select the notebooks that you want to export. You can export one, several, or all of the notebooks.
  2. On the top panel of the notebook editor, click Export and then click any one of the following options:


    • Notebooks to Export: To export notebooks, click:
      • All: To export all the notebooks listed on the Notebooks page.
      • Selected: To export the selected notebooks.
    • Format: Click the format in which you want to export your notebook:
      • Native: Exports the notebook as a .dsnb (Data Studio Notebook) file.
      • Jupyter: Exports the notebook as a .ipynb file.
      • Zeppelin: Exports the notebook as a .json (JavaScript Object Notation) file.
    The exported notebooks are saved either as .dsnb files, .json files or .ipynb files in a zipped folder.
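
After downloading the zipped folder, you may want to confirm which notebooks it contains and in which format. The following is a minimal sketch using only the Python standard library; the function name notebooks_by_format and the archive name used in the usage comment are hypothetical, and the extension-to-format mapping is taken from the list above.

```python
import zipfile
from collections import defaultdict
from pathlib import PurePosixPath

# Mapping of file extension to export format, as listed in this section.
FORMATS = {".dsnb": "Native", ".ipynb": "Jupyter", ".json": "Zeppelin"}


def notebooks_by_format(archive_path):
    """Group the notebook files inside an exported zip by export format."""
    grouped = defaultdict(list)
    with zipfile.ZipFile(archive_path) as archive:
        for name in archive.namelist():
            suffix = PurePosixPath(name).suffix.lower()
            if suffix in FORMATS:  # ignore anything that is not a notebook
                grouped[FORMATS[suffix]].append(name)
    return dict(grouped)


# Usage (hypothetical file name):
# print(notebooks_by_format("exported_notebooks.zip"))
```

Files with other extensions are ignored, so the result lists only notebooks in the three export formats.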

4.3 Import a Notebook

You can import notebooks into your Oracle Machine Learning UI projects. Oracle Machine Learning UI supports the import of notebooks in the native format (.dsnb), Zeppelin (.json) and Jupyter (.ipynb) notebooks. Oracle Machine Learning UI also supports importing notebooks from your GitHub repository.

To import a notebook, go to the Notebooks page.

Import Notebook from File

To import a notebook from file:
  1. On the Notebooks page, click Import. Here, you have two options—File and Git.

    Figure 4-1 Notebook Import Options



  2. Click File. This opens the File Upload dialog.
  3. In the File Upload dialog, browse and select the notebook to import.

    Note:

    You must have the notebook saved as a .json file or .dsnb file to import it. You can import notebooks exported from non-Oracle Apache Zeppelin environments, but only paragraph types that are supported can be run.
  4. Click Open.

    This completes the task of importing a notebook file into your project.
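
Because only supported paragraph types can be run after import, it can help to check which interpreter directives a Zeppelin .json export uses before you import it. The following is a minimal sketch under stated assumptions: it relies on the Zeppelin export layout in which the notebook is a JSON object with a "paragraphs" list whose entries carry a "text" field, and KNOWN_DIRECTIVES contains only the directives that appear in this chapter, not an authoritative list of what OML Notebooks supports. The function names are hypothetical.

```python
import json

# Assumption: directives mentioned in this chapter, for illustration only.
KNOWN_DIRECTIVES = {"%sql", "%script", "%python", "%r", "%md", "%conda"}


def load_notebook(path):
    """Read a Zeppelin-style .json export into a dictionary."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def paragraph_directives(notebook):
    """Return the leading %directive of each paragraph that has one."""
    found = []
    for paragraph in notebook.get("paragraphs", []):
        text = (paragraph.get("text") or "").lstrip()
        if text.startswith("%"):
            found.append(text.split()[0])
    return found


def unknown_directives(notebook):
    """List directives that are not in the illustrative known set."""
    return sorted(set(paragraph_directives(notebook)) - KNOWN_DIRECTIVES)


# Small in-memory example instead of a real export file.
sample = {
    "name": "demo",
    "paragraphs": [
        {"text": "%sql\nSELECT * FROM SH.SALES"},
        {"text": "%spark\nprintln(1)"},
        {"text": "%md\nHello World!"},
    ],
}
print(unknown_directives(sample))
```

For a real file, pass the result of load_notebook to unknown_directives and review any directive it reports.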

Import/Clone Notebooks from GitHub

  1. On the Notebooks page, click Import. Here, you have two options—File and Git.
  2. Click Git. This opens the GitHub Checkout page.

    Figure 4-2 Notebook Import Options



  3. On the GitHub Checkout page, enter these details:

    Figure 4-3 GitHub Checkout page



    1. Repository URL: Enter the URL of the GitHub repository you want to access. You have the following options to provide the GitHub repository URL:
      • Minimum valid URL: You can provide the GitHub URL containing only the repository name and the owner. For example, https://github.com/RepoOwner/RepoName. This loads the branches found in the remote repository.
      • Base URL and branch: Enter the GitHub repository URL along with the branch you want to clone. For example, https://github.com/RepoOwner/RepoName/tree/BranchName/ or https://github.com/RepoOwner/RepoName/blob/BranchName/. This automatically loads the Branches field. Select the branch name defined in the URL; this triggers the load of the directory structure found in the repository.
      • Base URL, branch and directory: You can also provide the GitHub URL containing one or several sub-directories in the remote repository. For example, https://github.com/RepoOwner/RepoName/tree/Branch/Dir/Dir2. This loads the directory structure and pre-selects the directory specified in the URL.
      • Complete file path: You can also enter the complete file path in the remote GitHub repository. For example, https://github.com/RepoOwner/RepoName/blob/Branch/Dir/file.dsnb. This loads the branches, directories and the file in the Selected notebooks field at the bottom of the dialog.
    2. Select a credential: Click the down arrow and select a credential. If you do not have a credential created, click the + icon to create one. See Create GitHub Credentials for more information.
    3. Branch: The drop-down menu displays the branches available in the remote GitHub repository based on the specified repository owner, repository name, and credential combination. Select a branch. The notebooks available in the branch are listed. You can also filter the notebooks you are looking for by typing in the notebook name in the Filter field.
    4. Select the notebooks you want to clone and click Add. The notebooks you selected are now listed in the Selected notebooks section.
    5. Click Checkout. This starts cloning all the GitHub notebooks you selected. Once completed, it displays the message "Notebooks successfully cloned". Click Open Notebook listing on the message box to go to the Notebooks listing page.

    This completes the task of cloning and importing a notebook from your GitHub repository.
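
The four URL forms accepted in the Repository URL field differ only in how much of the path they supply. The following is a minimal sketch that tells them apart; it illustrates the URL shapes and is not how the GitHub Checkout page itself parses URLs. The function name classify_repo_url is hypothetical, and the sketch assumes that the branch name is the single path segment after tree or blob, which does not hold for branch names that contain a slash.

```python
from urllib.parse import urlparse


def classify_repo_url(url):
    """Split a GitHub URL into owner, repository, branch, and path."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) < 2:
        raise ValueError("expected at least /owner/repo in the URL path")
    info = {
        "owner": parts[0],
        "repo": parts[1],
        "branch": None,
        "path": None,
        "kind": "repository",  # minimum valid URL
    }
    if len(parts) >= 4 and parts[2] in ("tree", "blob"):
        info["branch"] = parts[3]  # assumption: branch has no slash
        rest = "/".join(parts[4:])
        if not rest:
            info["kind"] = "branch"
        else:
            info["path"] = rest
            info["kind"] = "file" if parts[2] == "blob" else "directory"
    return info


for example in (
    "https://github.com/RepoOwner/RepoName",
    "https://github.com/RepoOwner/RepoName/tree/BranchName/",
    "https://github.com/RepoOwner/RepoName/tree/Branch/Dir/Dir2",
    "https://github.com/RepoOwner/RepoName/blob/Branch/Dir/file.dsnb",
):
    print(classify_repo_url(example)["kind"], example)
```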

4.4 Use the Scratchpad

The Scratchpad provides convenient one-click access to a notebook for running SQL statements and PL/SQL, R, and Python scripts. The notebook can be renamed. The Scratchpad is available on the Oracle Machine Learning User Interface (UI) home page.

Note:

The Scratchpad is a regular notebook that is prepopulated with four paragraphs: %sql, %script, %python, and %r.
After you run your scripts, the Scratchpad is automatically saved as a notebook by the default name Scratchpad on the Notebooks page. You can access it later on the Notebooks page. You can run all the paragraphs together or one paragraph at a time.
  1. To open and use the scratchpad, click Scratchpad on the Oracle Machine Learning UI home page under Quick Actions. The Scratchpad opens. The Scratchpad has four paragraphs each with the following directives:
    • %sql: Allows you to run SQL statements.
    • %script: Allows you to run PL/SQL scripts.
    • %python: Allows you to run Python scripts.
    • %r: Allows you to run R scripts.
  2. To run a SQL script:
    1. Go to the paragraph with the %sql directive.
    2. Type the following command and click the Run icon. Alternatively, you can press Shift+Enter keys to run the paragraph.
      SELECT * FROM SH.SALES;
    In this example, the SQL statement fetches all of the data about product sales from the table SALES. Here, SH is the schema name, and SALES is the table name. Oracle Machine Learning UI fetches the relevant data from the database and displays it in a tabular format.

    Figure 4-5 SQL Statement in Scratchpad



  3. To run a PL/SQL script:
    1. Go to the paragraph with the %script directive.
    2. Enter the following PL/SQL script and click the Run icon. Alternatively, you can press Shift+Enter keys to run the paragraph.
      CREATE TABLE small_table
      (
        NAME VARCHAR(200),
        ID1 INTEGER,
        ID2 VARCHAR(200),
        ID3 VARCHAR(200),
        ID4 VARCHAR(200),
        TEXT VARCHAR(200)
      );

      BEGIN
        FOR i IN 1..100 LOOP
          INSERT INTO small_table VALUES ('Name_'||i, i, 'ID2_'||i, 'ID3_'||i, 'ID4_'||i, 'TEXT_'||i);
        END LOOP;
        COMMIT;
      END;
      The PL/SQL script successfully creates the table SMALL_TABLE. The PL/SQL script in this example contains two parts:
      • The first part of the script contains the SQL statement CREATE TABLE to create a table named small_table. It defines the table name, the table columns, and their data types and sizes. In this example, the column names are NAME, ID1, ID2, ID3, ID4, and TEXT.
      • The second part of the script begins with the keyword BEGIN. It inserts 100 rows into the table small_table.

      Note:

      When using the CREATE statement with a primary key, it fails and displays the error message Insufficient privileges. This error occurs due to lockdown profiles in the database. If you encounter this error, contact your database administrator or the designated security administrator to grant the required privileges.

      Figure 4-6 PL/SQL Script in Scratchpad



  4. To run a Python script:
    1. To use OML4Py, you must first import the oml module. oml is the OML4Py module that allows you to manipulate Oracle AI Database objects such as tables and views, call user-defined Python functions using embedded execution, and use the database machine learning algorithms. Go to the paragraph with %python directive. To import the oml module, type the following command and click the Run icon. Alternatively, you can press Shift+Enter keys to run the paragraph.
      import oml
    2. To check if the oml module is connected to Oracle AI Database, type oml.isconnected() and click the Run icon. Alternatively, you can press Shift+Enter keys to run the paragraph.
      oml.isconnected()
    3. You are now ready to run your Python script. Type the following Python code and click the Run icon. Alternatively, you can press Shift+Enter keys to run the paragraph.
      import matplotlib.pyplot as plt
      import numpy as np
      
      list1 = np.random.rand(10)*2.1
      list2 = np.random.rand(10)*3.0
      
      plt.subplot(1, 2, 1) # 1 row, 2 columns, index 1 (first position in the subplot grid)
      plt.hist(list1)
      plt.subplot(1, 2, 2) # 1 row, 2 columns, index 2 (second position in the subplot grid)
      plt.hist(list2)
      plt.show()
      In this example, the commands import two Python packages to compute and render the data in two histograms for list1 and list2. The Python packages are:
      • Matplotlib: Python package to render graphs.
      • NumPy: Python package for computations.

      Figure 4-7 Python script in Scratchpad



      The two graphs for list1 and list2 are generated by the Python engine, as shown in the screenshot here.

  5. After you have created and run your scripts in the Scratchpad, the Scratchpad is automatically saved as a notebook with the default name Scratchpad on the Notebooks page. You can edit the name of the notebook and save it with the new name by clicking Edit.
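
To see what the PL/SQL loop in step 3 inserts without querying the table, the same 100 rows can be reproduced in plain Python. The following is a minimal sketch for illustration only: it does not connect to the database, and the function name small_table_rows is hypothetical. The value patterns are copied from the INSERT statement in the script.

```python
def small_table_rows(count=100):
    """Build the tuples that the PL/SQL loop inserts into small_table."""
    return [
        # Column order: NAME, ID1, ID2, ID3, ID4, TEXT
        (f"Name_{i}", i, f"ID2_{i}", f"ID3_{i}", f"ID4_{i}", f"TEXT_{i}")
        for i in range(1, count + 1)  # PL/SQL 1..100 is inclusive
    ]


rows = small_table_rows()
print(len(rows))   # number of rows the loop inserts
print(rows[0])     # first row, for i = 1
print(rows[-1])    # last row, for i = 100
```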

4.5 Use the Markdown Interpreter and Generate Static HTML from Markdown Plain Text

Use the Markdown interpreter and generate static HTML from Markdown plain text.

To call the Markdown interpreter and generate static HTML from Markdown plain text:
  1. In your notebook, type %md and press Enter.
  2. Type "Hello World!" and click Run. The static HTML text is generated, as seen in the screenshot below.
    Static HTML text
  3. You can format the text in bold. To display the text in bold, enclose the same text in a pair of double asterisks (**) and click Run.
    Text in bold
  4. To display the text in italics, write the same text inside an asterisk pair or underscore pair as shown in the screenshot, and click Run.
    Text in italics
  5. To display the text in a bulleted list, prefix each item with an asterisk (*), as shown in the screenshot below:
    Text in bulleted points
  6. To display the text as heading 1, heading 2, or heading 3, prefix the text with one, two, or three hash characters (#) respectively, and click Run.
    Headings
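
To make the mapping from Markdown plain text to static HTML concrete, the following toy renderer handles only the constructs used in the steps above: plain text, bold, italics, bulleted items, and headings 1 to 3. It is a minimal sketch for illustration and is not the Markdown interpreter that OML Notebooks uses; the function names render_line and render_inline are hypothetical, and the exact HTML that the notebook produces may differ.

```python
import html
import re


def render_inline(text):
    """Convert **bold**, *italic*, and _italic_ spans to HTML tags."""
    text = re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", text)
    text = re.sub(r"\*(.+?)\*", r"<em>\1</em>", text)
    text = re.sub(r"_(.+?)_", r"<em>\1</em>", text)
    return text


def render_line(line):
    """Convert one line of a small Markdown subset to an HTML fragment."""
    text = html.escape(line.strip())
    heading = re.match(r"(#{1,3})\s+(.*)", text)
    if heading:
        level = len(heading.group(1))  # number of leading hashes
        return f"<h{level}>{render_inline(heading.group(2))}</h{level}>"
    bullet = re.match(r"\*\s+(.*)", text)
    if bullet:  # asterisk followed by a space marks a list item
        return f"<li>{render_inline(bullet.group(1))}</li>"
    return f"<p>{render_inline(text)}</p>"


for source in (
    "Hello World!",
    "**Hello World!**",
    "*Hello World!*",
    "_Hello World!_",
    "* Hello World!",
    "# Hello World!",
    "## Hello World!",
    "### Hello World!",
):
    print(f"{source!r:22} -> {render_line(source)}")
```

In the notebook itself, you type only the Markdown source after the %md directive; the interpreter performs the conversion.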