2.3 QuickStart: Run Graph Analytics Using the Python Shell

This tutorial shows how to get started with property graph data using the Python shell.

As a prerequisite for this quick start, you must have completed the installations that the steps below rely on: the graph server (PGX) and the Python client that provides the opg4py shell.
  1. Start the Python shell as shown:
    ./bin/opg4py --base_url https://localhost:7007

    You are prompted to enter your username and password.

  2. Verify that the Python client is connected to a remote graph server (PGX) instance as shown:
    Oracle Graph Server Shell 22.2.0
    >>> instance
    ServerInstance(embedded: False, base_url: https://localhost:7007, version: <oracle.pgx.common.VersionInfo at 0x7fb71a1b2f68 jclass=oracle/pgx/common/VersionInfo jself=<LocalRef obj=0xadd938 at 0x7fb71a1808f0>>)
    
  3. Create the graph using the graph builder Python API.
    >>> graph = session.create_graph_builder().add_edge(1, 2).add_edge(2, 3).build("my_graph")
  4. Execute any built-in algorithm on the graph. For example:
    >>> analyst.pagerank(graph)
    VertexProperty(name: pagerank, type: double, graph: my_graph)
  5. Execute any PGQL queries and print the PGQL result set as shown:
    >>> rs = session.query_pgql("SELECT id(x), x.pagerank FROM MATCH (x) ON my_graph")
    >>> rs.print()
    +-----------------------------+
    | id(x) | pagerank            |
    +-----------------------------+
    | 1     | 0.05000000000000001 |
    | 2     | 0.09250000000000003 |
    | 3     | 0.12862500000000002 |
    +-----------------------------+
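
The pagerank values in the table above can be reproduced by hand. The following is a minimal plain-Python sketch, not the graph server's implementation. It assumes a damping factor of 0.85 and the basic update PR(v) = (1 - d)/N + d * sum(PR(u)/outdeg(u)) over the in-neighbors u of v, with no redistribution of rank from vertices that have no outgoing edges. The function name pagerank and its parameters are hypothetical and exist only for this illustration.

```python
def pagerank(edges, damping=0.85, max_iter=100, tol=1e-9):
    """Iterate PR(v) = (1 - d)/N + d * sum(PR(u)/outdeg(u)) until it stabilizes."""
    vertices = sorted({v for edge in edges for v in edge})
    n = len(vertices)

    # Count outgoing edges per vertex; a source splits its rank evenly among them.
    out_degree = {v: 0 for v in vertices}
    for src, _ in edges:
        out_degree[src] += 1

    # Start from a uniform distribution.
    rank = {v: 1.0 / n for v in vertices}
    for _ in range(max_iter):
        # Every vertex receives the base share (1 - d)/N.
        new_rank = {v: (1.0 - damping) / n for v in vertices}
        # Each edge passes a damped share of the source's rank to the destination.
        for src, dst in edges:
            new_rank[dst] += damping * rank[src] / out_degree[src]
        delta = sum(abs(new_rank[v] - rank[v]) for v in vertices)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Same edges as the graph built in step 3: 1 -> 2 -> 3.
ranks = pagerank([(1, 2), (2, 3)])
for vertex, score in ranks.items():
    print(vertex, round(score, 6))
```

Under these assumptions, vertex 1 has no incoming edges and keeps only the base share 0.15/3 = 0.05, vertex 2 receives 0.05 + 0.85 * 0.05 = 0.0925, and vertex 3 receives 0.05 + 0.85 * 0.0925 = 0.128625, which agrees with the query output after rounding.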

    Converting a PGQL Result Set into a pandas DataFrame

    You can also convert the PGQL result set to a pandas.DataFrame object using the to_pandas() method. This makes it easier to filter the result set, and the resulting DataFrame can be processed with lambda functions. For example:
    example_query = (
        "SELECT n.name AS name, n.age AS age "
        "FROM MATCH (n)"
    )
    result_set = sample_graph.query_pgql(example_query)
    result_df = result_set.to_pandas()
    
    # Floor division groups ages into 20-year bins (0-19 -> 0, 20-39 -> 1, and so on)
    result_df['age_bin'] = result_df['age'].apply(lambda x: int(x) // 20)
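
To try the filtering and lambda operations without a running graph server, the sketch below builds a small DataFrame by hand in place of the one returned by to_pandas(). The names and ages are made-up sample data, and the variables in_bin_1 and bin_counts are hypothetical. Only standard pandas calls are used.

```python
import pandas as pd

# Hypothetical stand-in for result_set.to_pandas(); the rows are made-up sample data.
result_df = pd.DataFrame(
    {"name": ["Alice", "Bob", "Carol", "Dave"], "age": [23, 41, 37, 65]}
)

# Floor division maps ages 0-19 to bin 0, 20-39 to bin 1, and so on.
result_df["age_bin"] = result_df["age"].apply(lambda x: int(x) // 20)

# Boolean-mask filtering: keep only the rows that fall in the 20-39 bin.
in_bin_1 = result_df[result_df["age_bin"] == 1]

# Count how many rows land in each bin, ordered by bin number.
bin_counts = result_df["age_bin"].value_counts().sort_index()

print(in_bin_1)
print(bin_counts)
```

Floor division (//) is used rather than true division (/) because true division returns fractional values such as 1.15 instead of whole-number bin labels.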

    See Also:

    Python API Reference for the complete set of available Python APIs