PGX 20.1.1
Documentation

Changelog

20.1.1

Possibly Breaking Changes

  • When using getGraph(String) in remote server mode PGX will not search in the public namespace if the user does not have SESSION_GET_PUBLISHED_GRAPH permission
  • When running in remote server mode PGX will reject users that do not have any permissions set in PGX - either directly or via their role
  • The PGX Webapp will now return code 401 on authentication errors and 403 on authorization / permission related errors
  • The names of flat permissions in the Pgx config and the user, role and realm fields are now prefixed with pgx_
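
A client that talks to the PGX Webapp over HTTP may want to tell the two failure classes apart. The following is a minimal sketch and not part of PGX: the function name and the returned labels are hypothetical, and only the 401 / 403 mapping is taken from the entry above.

```python
def classify_auth_failure(status_code):
    """Map an HTTP status code to the failure class named in the changelog.

    Hypothetical helper, not a PGX API. Only the 401/403 mapping comes
    from the changelog entry; everything else is an assumption.
    """
    if status_code == 401:
        # Authentication error: the caller could not be identified
        return "authentication"
    if status_code == 403:
        # Authorization / permission error: the caller is known but not allowed
        return "authorization"
    # Any other status code is outside the scope of this sketch
    return None
```

A caller could, for example, prompt for new credentials on "authentication" and report a missing permission on "authorization".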

Bug Fixes

  • Fixed a bug which caused the isFresh method to always return true for graphs loaded from RDBMS
  • Fixed a bug that could cause excessive memory consumption when partitioning a graph while loading with a lot of distinct vertex or edge labels and string properties (backport)
  • Fixed a bug where graph modifications using the UPDATE PGQL clause would fail on graphs that are optimized for updates
  • Fixed a bug where updating an edge through UPDATE in PGQL would fail if the query also has an INSERT or DELETE clause
  • Fixed a bug that caused a PGQL parsing error when accessing a property with a name that has a space while using redaction rules
  • Fixed a bug that could cause the reverse edge index to be incorrect after applying changes containing both vertex additions and removals on a graph optimized for updates (backport)

Deprecations

  • Deprecated the allow_local_filesystem and datasource_dir_whitelist config fields; use permissions on file-locations instead

20.1.0

Features and Improvements

  • PGX shared-memory and distributed execution mode

    • Added support for authentication and authorization and fine grained data access control. See security page.
    • Added support for arbitrary expressions in PGQL's IN predicate. Previously, only positive numbers and other literals were supported.
    • Include the demo realm by default in the internal distribution package
  • PGX shared-memory execution mode

    • Throw a meaningful exception in the case of auto-generated vertex IDs for partitioned graphs
    • Added APIs to trigger graph garbage collection
      • ServerInstance.freeCachedMemory()
      • ServerInstance.freeCachedMemoryAsync()
      • ServerInstance.freeCachedMemory(double threshold)
      • ServerInstance.freeCachedMemoryAsync(double threshold)
    • Added graphs optimized for updates for faster and more memory efficient graph mutations.
    • Added the alterGraph mutation to add or remove vertex and edge providers in partitioned graphs
    • Added support for parenthesized graph patterns in PGQL (e.g. MATCH ( (n) -> (m), (n) -> (o) ) ON myGraph).
    • Added graph creation timestamp to ServerInstance.getServerState() API.
    • Allow loading from a realm-provided data source
  • PGX distributed execution mode

    • Added support for pre-loading graphs when the server initializes.
    • Added support for storing graphs to CSV files.
    • Added support for LIMIT and OFFSET without ORDER BY in PGQL.
    • Added support for distinct in projections, i.e., SELECT DISTINCT in PGQL.
    • Implemented simple CASE expression performance optimizations in PGQL.
    • Added support for HAVING in PGQL.
    • Added support for creating subgraphs from filters backed by vertex/edge collections.
    • Improved PGQL performance by refactoring result-set column manipulations (e.g., rename).
    • Improved PGQL performance by caching graph edge chunking information across queries.
    • Added support for constants in GROUP BY, HAVING, and ORDER BY in PGQL.
    • Added support for creating vertex/edge set from expression-based subgraph filters and from collection-backed subgraph filters.
    • Improved the accuracy of graph memory usage accounting.
    • Improved performance of casting from/to temporal types in PGQL queries.
    • Added support for case insensitive identifiers in PGQL 1.3.
    • Added support for graph versioning and publishing with snapshots.
    • Added support for explainPgql().
    • Improved memory consumption of string properties.
  • Green-Marl / PGX Algorithm

    • Improved the performance of the Whom To Follow algorithm for graphs with high average out-degree
    • Added new Triangle Counting version for directed graphs.
    • Added Louvain algorithm to Analyst.
  • New: PGX Python client

    • Added Python package pypgx, a Python client for PGX. This package is in a beta release state. The API may change in the next releases.
    • Added PGX Python shell.
    • Added PgxGraph and BipartiteGraph, with support for reading from and writing to files.
    • Added GraphBuilder.
    • Added EdgeSequence, EdgeSet, VertexSequence and VertexSet.
    • Added MLLib models Pg2Vec and DeepWalk and a PgxFrame class.
    • Added CompiledProgram.
    • Added support for PGQL queries and a PgqlResultSet class.
    • Added support for authentication and authorization.

Bug Fixes

  • PGX shared-memory execution mode

    • When a PATH macro with bind variables in PGQL was invoked multiple times, then the internal bind variable count was incorrectly multiplied such that more bind variables had to be set by the user than there were bind variables in the query.
    • Fixed an issue where ARRAY_AGG in PGQL was not formatted properly for Time, Timestamp and LocalDate.
    • SELECT DISTINCT in combination with ORDER BY and unquoted (case insensitive) identifiers was not working in PGQL.
    • The unary minus of an integer property in PGQL returned a long value instead of an integer value.
    • Fixed an issue that could cause an exception when multiple sessions were concurrently creating graphs.
    • Fixed an issue that could allow a session that was not the original creator of a graph to create new snapshots for a graph that was published without snapshots
    • PGQL queries with reachability patterns inside subqueries may return incorrect results for large graphs.
    • PGQL queries with any-directed edge patterns may return incorrect results for large graphs.
    • Fixed an issue that caused some partitions to be skipped during loading when the number of IO threads was low
    • Fixed a bug that could cause a preloaded published graph to be removed by the memory cleanup task
    • Fixed a bug where closing the most recent snapshot of a graph would free its memory but it would still be accessible to sessions
    • Removed limitation requiring RDBMS column name to be less than 30 characters.
    • Added a method in the Partition class to retrieve the vertex property used to store component IDs
  • PGX distributed execution mode

    • Fixed a bug where labels were accepted in arithmetic and comparison operations with other types.
    • Fixed an issue where Timestamps with Timezone properties on edges were not properly returned by a PGQL query.
  • PGX shared-memory and distributed execution mode

    • Make PGQL use tmp_dir if set.
    • SELECT DISTINCT followed by a unary minus expression (e.g. SELECT DISTINCT -n.prop ...) threw an exception in PGQL.
    • Avoid creating empty pgx_vfs_cache* directories in the default temporary directory.
  • Fixed a bug where loading a graph with labels using session.readGraphFile() throws a MemoryMapperOutOfBoundsException.

  • Fixed the documentation and operation id of the get graph names endpoint.
  • Make Filter Expression use tmp_dir if set.
  • Fixed a bug where duplicate edge removals on homogeneous graphs would cause an IllegalStateException.
  • Fixed a bug that could lead to passwords being logged
  • Fixed a bug that would ignore some fields for graph config equality calculation
  • Fixed a bug that would prevent the application of changes on partitioned graphs with unstable generated vertex IDs if there were no changes in some vertex providers
  • Fixed a bug where providing the wrong vertex ID type might cause duplicate entries in the GraphChangeSet
  • LONG_SET and SPARSE were removed from the PropertyType enumeration. Those types were not usable in any manner.
  • Fixed a bug where loading edges with missing source/destination vertices ignored error handling configuration.
  • Fixed a bug that could cause exceptions when running an algorithm on a partitioned graph

  • Deprecated the Analyst.filteredBfs and Analyst.filteredDfs methods that accept a filter, as the filter has no effect.

Possibly Breaking Changes

  • oracle.pgx.vfs.VirtualFileManager can no longer be used with the tmp scheme such that any temporary directories will have to be set up manually and resolved via the file scheme.
  • Keywords DISTINCT and NOT in PGQL are now reserved and can no longer be used as identifiers in unquoted form; work around this by using the double-quoted identifier form (e.g. "DISTINCT" or "NOT").

  • The DATE property type support was removed. Graphs that were using DATE properties must now use either LOCAL_DATE or TIMESTAMP properties instead. Please refer to Using Datetime Data Types to determine which type is correct for your needs. For graphs stored in the PGB format, DATE properties will be loaded as TIMESTAMP properties automatically (that change should be reflected in the graph configuration if one is provided).

  • When loading edges with missing source/destination vertices, the default behavior is changed to throw an error. This behavior can be changed in the graph config.

  • Replaced graphs section in ServerInstance#getServerState() with cached_graphs and published_graphs and fixed issues in ServerInstance#getServerState to not count published graphs and properties more than once

  • The vertex_labels and edge_label graph config fields to configure whether the graph has the vertex / edge label or not were removed. For graphs that were using vertex_labels or edge_label it is now necessary to use loading.load_vertex_labels and loading.load_edge_label instead
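
The field rename in the last entry can be applied mechanically to existing graph configurations. The sketch below is not a PGX tool. It assumes the graph config is a JSON object already parsed into a Python dict, that loading.load_vertex_labels denotes a key inside a nested "loading" object, and that the old values carry over unchanged. The function name is hypothetical.

```python
import copy

# Removed top-level field -> replacement field inside the "loading" section.
# The field names come from the changelog; the nesting is an assumption.
RENAMED_FIELDS = {
    "vertex_labels": "load_vertex_labels",
    "edge_label": "load_edge_label",
}


def migrate_label_fields(config):
    """Return a copy of a graph config dict with the removed label fields moved."""
    migrated = copy.deepcopy(config)
    for old_name, new_name in RENAMED_FIELDS.items():
        if old_name in migrated:
            value = migrated.pop(old_name)
            # Keep a value that is already present in the new location
            migrated.setdefault("loading", {}).setdefault(new_name, value)
    return migrated
```

The input dict is left untouched, so the old and new configurations can be compared before the new one is written back to disk.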

Miscellaneous

  • Upgraded Tomcat to 9.0.33
  • Upgraded GraalVM to 20.0.1
  • Upgraded log4cxx to 0.11.0-400f15c

20.0.3

Bug Fixes

  • Fixed an issue that could allow a session that was not the original creator of a graph to create new snapshots for a graph that was published without snapshots
  • Fixed a bug that could cause a preloaded published graph to be removed by the memory cleanup task
  • Fixed a bug where closing the most recent snapshot of a graph would free its memory but it would still be accessible to sessions

20.0.2

Bug Fixes

  • Fixed a bug that could throw a graph config validation exception when loading a graph remotely
  • Fixed an issue that could allow a session that was not the original creator of a graph to create new snapshots for a graph that was published without snapshots

20.0.1

Bug Fixes

  • Fixed a bug that could lead to passwords being logged
  • Fixed a bug that would ignore some fields for graph config equality calculation

20.0.0

Bug Fixes

  • Throw meaningful exception instead of NPE when filtering by non boolean expressions

Features and Improvements

  • Mllib

    • Added seed parameters to DeepWalk and PG2Vec
    • Added authenticated encryption to all model loading and storing
  • Partitioned graphs

    • Added support for 'PgxGraph.hasVertex()' and 'PgxGraph.hasEdge()'.
  • PGQL plan generator

    • Use key_column property from graph config to improve cost/cardinality estimation
  • PGX distributed execution mode
    • Added support for temporal types in graph storers.
    • Added support for multiple vertex labels in CSV format.
    • Added support for field extraction from temporal data types (EXTRACT) in PGQL.
    • Added support for numeric functions (ABS, FLOOR, CEIL/CEILING and ROUND) in PGQL.
    • Improved error reporting when the distributed server is not initialized properly.
    • Added support for graphs with string vertex ids in PGQL.
    • Improved concurrency and cancellation for commands related to collections, components proxy & paths.
    • Fixed loading from Apache HBase and Oracle NoSQL
    • Added support for loading from HDFS
    • Added support for CASE statements in PGQL.
    • Added support for labels() for edges and label() for vertices in PGQL queries.
  • PGQL 1.3 in shared-memory execution mode

    • Released a new version of PGQL: http://pgql-lang.org/spec/1.3/
    • Support subqueries after GROUP BY.
    • Case insensitive matching of unquoted identifiers:
      • Graph names
      • Labels
      • Property names
      • Aliases in SELECT and GROUP BY
      • Vertex and edge variable names
      • PATH macro names
    • Quoted identifiers throughout the language:
      • Graph names (previously already supported)
      • Labels (previously supported)
      • Property names (previously supported)
      • Aliases in SELECT and GROUP BY
      • Vertex and edge variable names
      • PATH macro names
    • Prepared statement support for UPDATE queries
  • PGX Algorithm

    • Improved documentation for built-in algorithm Modularity.
  • REST requests are now by default limited to not exceed 10 MB. This can be configured in the pgx configuration using max_http_client_request_size

  • Added VertexCollectionFilters and EdgeCollectionFilters that make it possible to extract a subgraph from vertex and edge collections

  • Added the possibility to create graph snapshots from ChangeSets; see Configuring the Snapshots Source

  • Deprecated PgxSession.getAvailableSnapshots(GraphConfig) in favour of PgxSession.getAvailableSnapshots(PgxGraph)

  • Partitioned graphs

    • It is now possible to use auto-refresh with partitioned graphs.
    • Added a label entry in the provider configuration to allow setting a label different than the provider name for partitioned graphs; more info in the documentation.
    • Added Subgraph support for partitioned graphs (from filter expressions, PGQL result sets and collections)
    • AUTO_GENERATED edge IDs are now supported when using GraphChangeSet with partitioned graphs
  • The embedded Tomcat server now shuts down when the PGX webapp fails to start

  • The administrator can now set the minimum refresh intervals for auto-refresh and delta-refresh via the min_update_interval_sec and min_fetch_interval_sec options of the PGX engine configuration

Miscellaneous

  • Updated dependency RTS to version 2.7
  • Enabled out-of-bound checks for Java unsafe arrays

Bug Fixes

  • PGX configuration

    • Fixed a bug with configuration validation when programmatically setting a database password.
  • PGX distributed execution mode

    • Fixed a bug in PGQL where valid expressions containing label() were rejected.
    • Fixed a bug in PGQL where MIN aggregations over strings produced the wrong result.
    • Fixed a bug in PGQL where queries containing EXTRACT in ORDER BY clauses were rejected.
  • PGX Algorithm

    • Fixed a bug that caused an error when executing an algorithm that takes a string argument.
    • Fixed a bug that caused an error when using string literals.
    • Fixed a bug where string-concatenation would not work in PGX Algorithm.
    • Fixed a bug where a ConcurrentModificationException is thrown if a vertex property is created while invoking an analysis
  • Fixed a bug in PGQL on partitioned graphs where queries matching a neighbor of another vertex and specifying a constraint on the label and ID of the neighbor would fail

  • Fixed a bug with the loading of PGB files of homogeneous graphs in which there was either a single vertex label set or a single edge label
  • Added more default date formats to use when parsing properties with temporal datatypes.
  • Fixed a bug that could cause the application of a changeset to fail on graphs with vertex labels
  • Fixed a bug that would cause the first delta refresh to fail when loading from RDBMS with create vertex mode

Possibly Breaking Changes

  • Removed the REST endpoint for the exists method with tableName as argument from the client part.
  • The PGX server now checks that the private key file (see server_private_key in the Server Configuration guide) is readable and writable only by the current user (permissions 600 in a POSIX filesystem) and refuses to load it otherwise
  • Adding invalid entities with a ChangeSet will throw an IllegalArgumentException instead of an IllegalStateException.
  • The PGX REST API no longer passes parameters as part of URL paths. Parameters are now sent via HTTP headers instead.
  • All PGX ML storing and loading is now required to specify a key to use for encryption (use a null key to disable this feature).
  • DeepWalk and Pg2Vec models stored in previous versions of PGX cannot be loaded anymore.
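
The private key file requirement above can be verified before starting the server. The following is a minimal sketch, assuming a POSIX filesystem. The function name is hypothetical, and the check only compares the permission bits against 600. It does not verify file ownership and does not claim to reproduce the check that the PGX server performs.

```python
import os
import stat


def has_owner_only_rw(path):
    """Return True if the file's permission bits are exactly 600 (rw-------)."""
    # S_IMODE strips the file-type bits and keeps only the permission bits
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode == 0o600
```

If the function returns False, running chmod 600 on the key file brings it in line with the requirement.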

19.4.0

Features and Improvements

  • Mllib

    • Introducing SupervisedGraphWise: a Graph Convolutional Neural Network algorithm that allows users to compute vector representations of vertices based on the topology and features of a graph in an inductive manner.
  • PGX Shell

    • Fixed a bug that prevented worker threads from being stopped when a computation is aborted in the jshell-based PGX shell (e.g. by pressing Ctrl + C).
  • In PGX single-machine mode, substantially improved the speed of creating new graphs or graph snapshots from a ChangeSet when it contains relatively few changes

  • Exceptions are no longer wrapped when parsing configurations, so that root causes can be found easily.

  • Added the possibility to publish a graph with its current and future snapshots; see Publishing a Graph.

  • We removed the limitation of graph names being unique across PGX and introduced session-private namespaces as well as a namespace for published graphs.

  • Partitioned graphs

    • Support ERROR when vertex is missing for partitioned graphs.
    • Fixed a bug that allowed adding a vertex with an ID that was already used in a different provider
    • Allow storing partitioned-while-loading graphs without having to provide a config.
    • Added partitioned graph support for Analyst.communitiesLabelPropagation and Analyst.whomToFollow
  • PGX distributed execution mode

    • Improved concurrency and cancellation for commands.
    • hostnames option accepts IPv6 addresses.
    • Added support for temporal types in all loaders and PGQL queries.
    • Extended PGQL support.
      • Added support for expressions in projections.
      • Added support for expressions in GROUP BY and ORDER BY.
      • Added support for explicit casting (CAST)
  • Green-Marl / PGX Algorithm

    • Programs that use local procedures / private methods are now supported for partitioned graphs.
  • PgxSession

    • Added LATEST_SNAPSHOT constant to easily check out the latest graph snapshot.
  • Extended APIs in PgxSession, PgxGraph and MlLib classes, where some methods were incomplete.

  • Specific schemes (https, ftps, hdfs) can now be allowed for remote loading/storing, via the allowed_remote_loading_locations configuration option of the PGX engine.

  • Added several new algorithms to the Analyst API

Possible Breaking Changes

  • Removed the Groovy-based PGX shell. Users should use the new JShell-based PGX shell.
  • Removed graph config from instance.getGraphInfo(), instance.getGraphInfos() and instance.getServerState().
  • Removed support for loading from Spark.
  • Removed createListener endpoint.
  • The ftp and http protocols are not supported anymore for loading/storing data as they are unencrypted and thus insecure.
  • The REST API now identifies frames, collections, graphs, and properties by UUID instead of name.
  • With the introduction of session-private namespaces for graphs and properties, some operations that would previously throw an exception due to a naming conflict could succeed now.
  • PgxGraph.publish() could throw an exception now, if a graph with the given name has been published before.
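
Since ftp and http locations are no longer supported for loading and storing, it can help to scan existing location strings for those schemes before upgrading. The sketch below is an illustration only: the function name is hypothetical, and it inspects only the URL scheme. It makes no assumption about how PGX itself validates locations.

```python
from urllib.parse import urlparse

# Schemes the changelog names as no longer supported because they are unencrypted
UNENCRYPTED_SCHEMES = {"ftp", "http"}


def uses_unencrypted_scheme(location):
    """Return True if the location string uses the ftp or http scheme."""
    scheme = urlparse(location).scheme.lower()
    return scheme in UNENCRYPTED_SCHEMES
```

Locations without a scheme, such as plain local paths, are reported as False by this check.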

Bug Fixes

  • Fixed a bug in the compilation of (do)while statements that contain a reduction expression.
  • Fixed a race condition that could provoke a crash when performing batched inferences with pg2vec.
  • Give meaningful error message (instead of NPE) when there is an error for storing partitioned graphs.
  • Fixed a bug where graph metadata was incorrect after running a DML query.
  • Fail early if an attempt is made to modify a graph with NO_IDS for either vertices or edges.
  • Enable ID strategy auto-detection for config-less loading of partitioned graphs from CSV files.
  • Fixed bug where PGX Algorithm API / Green-Marl compilation could fail when using edge() in certain conditions.
  • Give a meaningful error message (instead of a constraints violation) when the nodes table has a duplicated key when loading a graph from RDBMS.
  • Fixed a bug where MAX value in PGQL queries was wrong.
  • Fixed a bug where ignoring global vertex/edge idType throws an IllegalArgumentException.
  • Fixed a bug where binary operations with scalar subqueries threw a NullPointerException.
  • Fixed a bug where boolean subqueries were not handled properly.
  • Fixed ConcurrentModificationException when running PGQL queries and adding new properties to the (same) graph from a different session.
  • Fixed a bug where the autogenerated name of a graph could be invalid

Security Bug Fixes

  • X-Content-Type-Options header added on every response with nosniff value.

  • Server PGX config files cannot be loaded from remote. For example, the following command now throws an exception:

    ```
    JAVA_OPTS='-Dpgx_conf=http://remote.host:8000/pgx.conf' ./pgx ...
    ```

19.3.3

Bug Fixes

  • Fixed a bug that prevented delta updates when changes included edge removals.

19.3.2

Security Bug Fixes

  • Fixed a bug that prevented loading graphs via secure NoSQL.

19.3.1

Security Bug Fixes

  • Add server configuration option allowed_remote_loading_locations to allow loading graphs from remote locations; by default, no remote loading is allowed.

19.3.0

Features and Improvements

  • PGX Shell based on jshell

  • Reduced substantially the memory consumption when loading vertex labels with PGX shared memory.

  • PgxFrame

    • Added PgxFrame.union() and PgxFrame.join() operations.
  • MlLib

    • The DeepWalk and Pg2vec algorithms can now be run not only on undirected graphs, but also directed graphs and partitioned graphs.
  • Heterogeneous graphs

    • "Heterogeneous graphs" are now renamed to "Partitioned graphs".
  • Partitioned graphs

    • PgxGraph.clone() now works on partitioned graphs.
    • The Property APIs are now supported to retrieve values of properties of partitioned graphs
    • Added support for PgxGraph.getVertex() and PgxGraph.getEdge()
    • The DeepWalk and Pg2vec MlLib algorithms are now supported on partitioned graphs
    • Fixed bugs that prevented the use of partitioned graphs made of one vertex table and one edge table
  • PGX distributed execution mode

    • Added support for graph loading cancellation.
    • Added CSV files graph loader.
    • Fixed potentially high latency of graph deletion command.
    • Added support for prepared statements in PGQL queries.
    • Re-structured JSON query execution plan.
    • Added support for AS in GROUP BY.
    • Added pass-through of columns not needed for the GROUP BY operation.
    • Added support for regular expressions in PGQL queries.
    • Added support for the get session information control endpoint.
    • Added support for IS_DIRECTED, IS_SOURCE_OF and IS_DESTINATION_OF in projections in PGQL queries.
    • Added support for graph mutation via Change Sets.
    • Added support for algorithms metadata endpoints: /core/v1/analyses, /core/v1/analyses/{AID} and /core/v1/availableAnalysesIds.
    • Added enable_secure_handshake option which enables TLS-PSK based secured handshaking among PGX machines in a cluster.
  • PGX ISO Query Planner

    • The Query Planner now avoids the creation of InspectionStages when generating plans for distributed execution mode.
      • More heuristics for planning have been added, to improve query plan quality, especially for distributed execution mode.
    • The Query Planner now avoids exploration of sub-optimal plans involving Cartesian product.
    • The Query Planner now uses better heuristics to enable usage of common neighbor match operator.
  • Improved isCancelled check in remote mode to check the status of the future on the server.

  • Improved GraphChangeSet to not allow adding edges if vertexIdGenerationStrategy is set to AUTO_GENERATED.
  • PGQL

    • Added support for TOP k CHEAPEST path queries.
  • Green-Marl / PGX Algorithm

    • PGX Algorithm is taken out of beta (package name that contains the API is renamed from oracle.pgx.api.beta to oracle.pgx.algorithm).
    • Programs that use random vertices / random neighbors are now supported for partitioned graphs.
    • Added new built-in methods for getting random in and in-out neighbors.
      • Green-Marl: pickRandomInNbr / pickRandomInOutNbr
      • PGX Algorithm API: getRandomInNeighbor / getRandomInOutNeighbor

Miscellaneous

  • Split 'pgx-common-19.3.0.jar' into 'pgx-common_core-19.3.0.jar', 'pgx-config-19.3.0.jar' and 'pgx-common_rest-19.3.0.jar'.
  • Improved the loader error message when a wrong number of properties is given.
  • GraphMetaData.getDataSourceVersion() documentation changed: implementation details are not present anymore.

Bug Fixes

  • Fixed a bug that prevented using UDFs and built-ins in left hand side of IN predicate.
  • Fixed a bug in shared-memory execution mode that caused NullPointerException when running PGQL group by empty sub-queries.
  • Fixed a bug that prevented loading graph configuration files from various archive formats (e.g. .tar and .bz2).
  • Fixed a bug that was causing changes to graph data files to not be detected in some cases when auto-refresh is used.
  • Fixed a bug that caused Green-Marl algorithms to incorrectly treat a partitioned graph as a non-partitioned graph when the graph only has one vertex/edge provider.
  • Fixed a bug where graph validation was not performed after creating a graph from a changeset.
  • Fixed a performance bug that was causing calls to the degree methods (in_degree, out_degree, degree) to be slow in PGQL and Green-Marl algorithms on graphs obtained by application of a ChangeSet.
  • Fixed a bug that caused some tasks to record a wrong execution time.
  • Fixed a bug where a (vertex/edge)filter that accesses a label always returns false when used in an algorithm that does not use labels.
  • Fixed a potential memory leak due to string properties.
  • In the Query Planner, NeighborMatch cost estimation now does not ignore filters any more.
  • In the Query Planner, query vertices with multiple potential labels are now handled correctly.
  • Fixed a bug where loading a graph from Two Tables RDBMS with edge keys but no vertex labels would result in a crash in distributed execution mode.
  • Fixed a bug where grouping by labels in a PGQL query would give a wrong result in distributed execution mode.
  • Rejecting ORDER BY label-set columns in distributed execution mode.
  • Fixed a bug that prevented ORDER BY on elements not in the projection or GROUP BY in PGX.D.
  • Consider all GraphConfig fields for the equality check.
  • Fixed too many columns being collected for COUNT(*) with GROUP BY in PGX.D.
  • Provide graceful error message to indicate that GROUP BY in PGQL is not supported on null values.
  • Fixed a bug where removing an edge from the GraphChangeSet might throw an IllegalArgumentException.
  • Fixed a bug where cancelling a running task can cause an infinite loop in the server.
  • Fixed a bug where cancellation is not propagated correctly when cancelling the execution of a PGQL query.
  • Fixed a bug where a suboptimal query plan caused an out-of-memory error.
  • Fixed a bug where adding an edge to the GraphChangeSet picks the wrong source and destination vertices if vertexIdGenerationStrategy is set to AUTO_GENERATED.
  • Fixed a bug where loading from Spark could lead to incorrect destination vertices.
  • Fixed a bug in Green-Marl that caused invalid Java code to be generated for programs that use a map with boolean values.
  • Fixed a bug where changeset application on empty changeset fails after a non-empty changeset application in some scenarios.
  • Fixed a bug where serverInstance.getMemoryInfo() returns zeros in remote mode.
  • Improve cardinality estimation for edge labels.
  • Fix a bug where partial results were shown.

Security Bug Fixes

  • Deprecated the internal_communication_port pgxd configuration option in distributed execution mode. The TCP channel has been replaced with a domain socket.
  • Command line arguments --attach and --access_token are removed from the PGX shell running on Groovysh.
  • The command line argument --password,-p is not accepted anymore in the PGX shell running on Groovysh: if it is used, the shell shows an error message and terminates immediately; if needed, the shell automatically prompts for the password; the same behavior occurs in the new PGX shell based on jshell.
  • When allow_local_filesystem is used, it is now required to provide a list of approved directories with datasource_dir_whitelist.
  • Credentials needed during graph loading are now passed via Java keystores, by specifying the keystore_alias in the graph configuration instead. During server or shell startup, the keystore can be configured by appending --secret-store path/to/vault.jks, which tells PGX the keystore path to use. If the keystore is password protected, the corresponding password is prompted for during startup.
  • Deprecated GraphConfigBuilder.setPassword in favor of encrypted java keystores for graph loading.

Possible Breaking Changes

  • Replaced 'org.apache.commons.lang3.NotImplementedException' with internal 'oracle.pgx.common.NotImplementedException', defined in 'pgx-common_core-19.3.0.jar'.
  • Source locations for UDFs only support the local file system, classpath and HDFS.
  • Property arguments for built-in Analyst methods and compiled programs need to have values defined for all vertices / edges.
  • When allow_local_filesystem is used, it is now required to provide a list of approved directories with datasource_dir_whitelist.
  • Command line arguments --attach and --access_token are removed from the PGX shell running on Groovysh.
  • The command line argument --password,-p is not accepted anymore in the PGX shell running on Groovysh: if it is used, the shell shows an error message and terminates immediately; if needed, the shell automatically prompts for the password; the same behavior occurs in the new PGX shell based on jshell.
  • PGX no longer compiles PGX Algorithm or Green-Marl programs that are larger than 1MB.
  • Credentials needed during graph loading are no longer passed directly via graph configuration, but via encrypted java keystores.

19.2.1

Bug Fixes

  • Fixed a bug where a (vertex/edge)filter that accesses a label always returns false when used in an algorithm that does not use labels.

Possible Breaking Changes

  • Using the legacy Green-Marl compiler is no longer supported.
  • Compiler.GM_LEGACY has been deprecated and using it will lead to an exception at runtime.

19.2.0

Features and Improvements

  • PGX distributed execution mode
    • Text file loaders accept separator field in a graph config
    • Added support for Personalized Pagerank from Set and Personalized Weighted Pagerank from Set algorithms.
    • Added support for graph exporting. The available formats are adjacency list, edge list, PGB as well as two tables spark. For more information, see the corresponding page.
    • A session can now load the same graph multiple times and give a different name to each instance. User-provided graph names are respected; the filename is used to infer the graph name if the user does not provide one.
    • Added support for the IN filter expression in PGQL queries.
    • Added support for the FROM clause in PGQL queries.
    • Added support for DISTINCT aggregations without GROUP BY in PGQL queries.
    • Added support for literals in projections
    • Removed logging from the webserver that could display sensitive information
  • PGX Shell, Classpath

    • Added support for using CLASSPATH and CLASSPATH_APPEND to prepend and/or append additional dependencies to the PGX classpath to facilitate use of PGX with connectors.
  • It is now possible to load homogeneous graphs in the CSV, FLAT_FILE, TWO_TABLES RDBMS and PG formats as heterogeneous graphs based on the vertex labels and edge labels

  • The following Analyst algorithms can now run on heterogeneous graphs:
    • Vertex betweenness centrality, approximate vertex betweenness centrality from seeds, closeness centrality unit length, hits, eigenvector centrality, Adamic Adar, communities conductance minimization, conductance, partition conductance, partition modularity, SCC Kosaraju, SCC Tarjan, WCC, shortest path hop dist
  • Added field in runtime configuration to control the string pooling factor for string properties.
  • More information is logged on PGX startup.
  • There must now be an exact, case-sensitive match between columns and property names to load from two-tables RDBMSs

  • PGQL

    • Now, aliases in SELECT may not only be referenced in HAVING and ORDER BY but also in WHERE and GROUP BY.
    • Shared memory execution mode (Beta features). Note that the following features are beta features, their syntax and semantics might change in a future version.
      • Modified the syntax for updating graph modify queries
      • Extended beta support of graph modify queries with inserting new entities into session private graphs.
      • Extended beta support of graph modify queries with deleting entities from session private graphs.
  • Green-Marl / PGX Algorithm

    • Programs that use maps with vertices / edges as key or value-type are now supported for heterogeneous graphs.
    • Programs that use vertex- or edge-filters are now supported for heterogeneous graphs.
    • Added PGX Algorithm code for the built-in algorithms to the documentation.

Miscellaneous

  • Updated dependency RTS to version 2.6

Bug Fixes

  • Fixed a bug where running algorithms using read-only collections would fail
  • Fixed a bug where a memory leak could happen if applying a changeset to a heterogeneous graph failed
  • Fixed a bug where heterogeneous graphs built by applying a changeset on a heterogeneous graph could contain invalid edge data
  • Fixed a bug where building a changeset on a heterogeneous graph that is itself built from a changeset failed with an exception
  • Fixed a bug where running cycle detection would crash in some rare cases if there was no cycle in the graph
  • Fixed a bug where storing large graphs into file partitions behaved wrongly
  • Fixed a bug where graph loading cancellation was not propagated
  • Fixed a bug where the JDBC url would be wrongfully detected as invalid in distributed mode
  • Fixed a bug where building a changeset ignored the transient properties of the graph
  • Fixed a bug where modifications on non-existing properties added to a heterogeneous changeset were silently ignored instead of raising an exception
  • Fixed a bug where a memory leak could happen in distributed execution mode when loading a graph with string properties
  • Fixed a bug where encoded characters in flat file graphs were not processed in distributed execution mode
  • Fixed server state REST endpoint in distributed mode
  • Fixed a bug where entity provider configurations would use a field's default value instead of the corresponding graph configuration value
  • Fixed a bug where attributes field was being ignored when describing a graph as heterogeneous
  • Fixed a bug where using SQL keywords as column names made the loading from two-tables RDBMS fail
  • Fixed a bug in the PGX Groovy shell where timing results could be for the previous command
  • Fixed a bug where loading in LOB format via datasource would fail
  • Updated to jackson-databind-2.9.9 to address a CVE
  • Fixed a bug where PG RDBMS loader would read incorrect destination vertices
  • Improved PG RDBMS partitioned view loading performance

  • PGQL

    • Fixed potential source of query plan non-determinism
    • Shared-memory execution mode
      • Fixed a bug that caused wrong results when selecting a subset of group by keys after an order by
      • Fixed a bug that caused wrong results when filtering on SHORTEST PATH aggregation
      • Fixed a bug that caused SHORTEST PATH not to always be displayed in the same order
      • Fixed a bug that caused error when evaluating id() function as a float or as a double
      • Fixed a bug that caused a multithreading issue when using SHORTEST in combination with a label predicate or other filter predicate
      • Fixed a bug that caused an unexpected PgqlException when using ORDER BY in a subquery
      • Fixed a bug that caused error when evaluating DISTINCT after SHORTEST in combination with a path filter
      • Fixed a bug that caused ARRAY_AGG to include null values in the output array; null values are now ignored
      • Fixed a bug that caused non-equality filters (e.g. n <> m) and ALL_DIFFERENT filters (e.g. ALL_DIFFERENT(n, m)) to be ignored when using SHORTEST
      • Fixed a bug that caused wrong results when specifying a filter predicate that includes the source or destination of a SHORTEST path
    • Distributed execution mode
      • Fixed graph statistics so that they are utilized in query planning.
      • Fixed potentially incorrect results when ghost nodes combined with inspection stage.
  • Green-Marl / PGX Algorithm

    • Fixed a bug that made the JVM crash when getting a default value from a map of vector.
    • Fixed a bug that caused a map to always return the same vector (instead of a new vector) as default value.
    • Fixed a bug that made it impossible to compile PGX Algorithm programs when connecting to a PGX server in remote mode.

Possible Breaking Changes

  • The oracle.pgx.api.AllPaths.getPath API will not work correctly in remote mode when using a PGX client version 19.1.0 with a PGX server version 19.2.0 or later.
  • Restructured the PGX documentation, old links to https://docs.oracle.com/cd/E56133_01/latest/ might be broken because some pages have been moved. Links to specific versions of the documentation will keep working.

19.1.0

Features and Improvements

Miscellaneous

  • Dependency on Apache Blueprints removed from OTN distribution.
  • Dropped dependency on jackson-1.9.13
  • Updated dependency Jansson to version 2.12
  • Introduced dependency Elasticsearch Java Low Level REST Client version 6.5.3
  • Updated dependency RTS to version 2.5
  • A temporary directory GM-GENERATED*** is no longer generated for each PGQL or Filter Expression execution.
  • A temporary directory vfs_cache*** is no longer left behind upon shutdown of PGX.

Bug Fixes

  • Fixed a bug where memory leak and corruption could happen when sorting by degree an undirected graph with edge properties
  • Fixed a bug where on-heap memory for string properties was not cleaned properly in remote mode
  • Fixed a bug where loading a graph with edge keys from PGB with PgxSession.readGraphFile() was throwing an exception EDGE_KEY_MAPPING_NOT_SUPPORTED
  • Fixed a bug where loading a graph from rdbms views throws a LoaderException
  • Fixed a bug due to the usage of thread local variables in PGX shared memory
  • Fixed a bug where adding a new property using the GraphChangeSet API throws an exception
  • Fixed a bug where formatting and parsing local date, timestamp and timestamp with timezone values handled negative year values incorrectly
  • Fixed a bug where on-heap memory is not cleaned properly
  • Fixed a bug where using a newly created String property causes a NullPointerException
  • Fixed a bug where setting a vertex/edge property to null, when adding or updating it, throws an IllegalArgumentException
  • Fixed a bug where date and timestamp values overflowed when assigned too small or too large values
  • Fixed a bug where on cancellation a segmentation fault happened and the JVM was killed
  • Fixed a bug where certain operations on a published graph raised an exception
  • Fixed a bug where graph loading failed when string pool was configured to be used and the number of distinct vertex labels was small (< 1000)
  • Fixed a bug where the number of dimensions of multidimensional properties was not stored when they were saved
  • Fixed a bug that would prevent session private properties from being cleaned up when the underlying graph was shared
  • Fixed a bug that made it possible to destroy a session-private graph with PgxGraph.Retention.KEEP_GRAPH, resulting in a memory leak
  • Fixed a bug where resources could leak, when loading graphs using HBase or NoSQL formats

  • PGQL (in shared-memory execution mode):

    • Added support for bind variable in java_regexp_like
    • Show a more descriptive error message instead of throwing NullPointerException when bind variables are used in queryPgql
    • Match self-edges only once in case of any-directional query edge
    • Give better error message for duplicated vertex in path query
    • Fixed ClassCastException when ordering by a vertex or edge property while there exists an expression in the SELECT that has the same variable name as the vertex or edge
    • Various queries with SELECT * in combination with ORDER BY either returned results in a wrong order or gave an unexpected error message
    • Fixed a bug that caused wrong results when selecting a subset of the group-by keys
    • Fixed a bug that caused IndexOutOfBoundsException when using group by in a subquery
    • Semantic of built-in functions label and labels is now aligned with PGQL 1.1
    • Fixed a bug that caused path filters to be ignored when using TOP k SHORTEST PATH

Possible Breaking Changes

  • Column indices for CSV files are now starting from 1 instead of 0.
  • If the default values of vertex or edge properties in two graph configurations differ, they are no longer considered the same graph. Previously, default property values were ignored during equality checks. Note that values are taken literally. For example, "1.0e0" and "1.0" will not be treated as equal.
  • Groovy installation is required to run the PGX shell. See PGX shell setup guide.
  • Removed support for spark 1.6.2.
  • Updated the default value of tmp_dir configuration field from system temporary directory to null.
  • Made some methods in VersionInfo private, which were public previously by mistake.
  • PGX releases on OTN do not include the Zeppelin interpreter package any more.
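
The literal comparison of default property values described above can be illustrated with plain Java strings. This is only a sketch of the difference between literal and numeric equality, not the comparison code PGX actually uses; the class and method names are hypothetical.

```java
public class LiteralDefaultComparison {
    // Literal comparison: the two spellings differ as text.
    static boolean literallyEqual(String a, String b) {
        return a.equals(b);
    }

    // Numeric comparison: both spellings parse to the same double value.
    static boolean numericallyEqual(String a, String b) {
        return Double.parseDouble(a) == Double.parseDouble(b);
    }

    public static void main(String[] args) {
        System.out.println(literallyEqual("1.0e0", "1.0"));   // false
        System.out.println(numericallyEqual("1.0e0", "1.0")); // true
    }
}
```

In practice this suggests writing default values with the same spelling in every graph configuration that is meant to describe the same graph.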

Deprecations

  • Deprecated constructor GraphMetaData(); use GraphMetaData(IdType vertexIdType, IdType edgeIdType) instead.
  • Deprecated constructor GraphMetaData(GraphMetaData other, URI baseUri); use GraphMetaData(GraphMetaData other) instead.
  • Deprecated GraphMetaData.setVertexIdType(), GraphMetaData.setEdgeIdType(). The IdType fields will be made final and the setters removed in a future version.

3.2.0

Features and Improvements

  • Introducing the PgxML library as a beta functionality for Graph Learning with two algorithms:
    • Deepwalk to compute vector representations of vertices based on the topology of a graph.
    • Pg2vec to compute vector representations of graphlets based on the topology and properties of those graphlets.
    • Trained models from these algorithms can be stored and reloaded for inference.
    • The PgxML library employs PgxFrames to communicate the output of some operations.
  • Introducing PgxFrames as a beta functionality to load/store and manipulate tabular data:
    • Tabular data can be loaded/stored from/to CSV data (comma separated or any other separator) or from/to PGB.
    • PgxFrames can be manipulated with operations such as select(), renameColumns, print(), head(), tail(), flatten().
    • PgqlResultSets can be converted to PgxFrames with PgqlResultSet.toFrame() for postprocessing or storing PGQL query output.
  • Introducing PGX Algorithm as a beta functionality to write graph algorithms in Java and have it automatically compiled to an efficient parallel implementation.
    • Enabled by setting the graph_algorithm_language configuration option to JAVA.
    • After starting the PGX shell, you can call myAlgorithm = session.compileProgram('/path/to/MyAlgorithm.java') to compile a PGX Algorithm.
  • Added support for loading and storing vertex and edge vector properties for the adjacency list, edge list, two tables text and PGB formats. Refer to the documentation of the text formats and the PGB format to know more.
  • Added support for session maps (e.g., session.createMap(PropertyType.INTEGER, PropertyType.DOUBLE, "myMap"))
  • Added support for loading two-table graph from csv files.
  • The close() and destroyAsync() methods on Destroyable API objects are no longer final.
  • Support creation of sub-graph from PGQL result set.
  • PGQL:
    • PgqlResultSet.print(...) now returns PgqlResultSet so it can be closed using rs.print().close().
    • Added ABS, CEIL, FLOOR and ROUND built-in functions.
    • Added EXTRACT function which allows extracting values from date, time or timestamp expressions.
    • Added support for TOP k SHORTEST path.
    • Added support for path returning queries via the ARRAY_AGG construct.
    • Added IN predicate which tests a value for membership in an array of values. Both array literals and array bind values are supported.
    • Added support for IS NULL and IS NOT NULL.
  • OAAgraph:
    • Added oaa.edge.sequence function to create an edge sequence object.
    • Added nodeSeq and edgeSeq parameters in findCycle.oaa.graph algorithm.
    • Added cursor and columnName parameters to oaa.subGraph function to allow filtering by cursor.
  • PGX interpreter now supports Apache Zeppelin 0.8.0
  • Shell Client:
    • Added :version command to print version info from client / server.

Miscellaneous

  • Updated Express.js version to 4.16.3
  • Removed legacy Green-Marl compiler
  • Better error messages upon property type mismatch during loading: the error will now report the line number (if available), along with the mismatched property types.
  • Updated Apache Commons VFS to version: 2.2
  • Updated Fastutil to version: 8.1.1
  • Updated Guava to version: 26.0-jre
  • Dependency on Apache Blueprints removed from OTN distribution.
  • Updated system requirements in the installation instructions of the documentation.
  • Improved the error message when trying to store invalid file formats by specifying path and format.
  • The PGX Client now logs future polling requests on TRACE log level.

Bug Fixes

  • Fixed a bug where memory leak and corruption could happen when sorting by degree an undirected graph with edge properties
  • Fixed a bug where on-heap memory for string properties was not cleaned properly in remote mode (backported)
  • Fixed a bug where on-heap memory is not cleaned properly
  • Fixed a bug due to the usage of thread local variables in PGX shared memory (backported)
  • Fixed a bug that would prevent session private properties from being cleaned up when the underlying graph was shared
  • Fixed a bug that made it possible to destroy a session-private graph with PgxGraph.Retention.KEEP_GRAPH, resulting in a memory leak
  • Fixed a bug where loading a graph with edge keys from PGB with PgxSession.readGraphFile() was throwing an exception EDGE_KEY_MAPPING_NOT_SUPPORTED (backported)
  • Fixed a bug where graph loading failed when string pool was configured to be used and the number of distinct vertex labels was small (< 1000) (backported)
  • Fixed bug where Green-Marl compilation might fail when not providing a temporary directory path
  • Fixed a bug where loading a heterogeneous graph with the field create_label_histogram throws an IllegalArgumentException exception
  • Fixed a bug where the RDF loader doesn't properly handle literals of CPL, CPLL, CTL, CTLL, CPL@, CPLL@ types
  • Fixed a bug where, if a property was defined multiple times in a graph config, PGX would not throw an error during config validation and would then leak memory after the graph was loaded
  • Fixed a bug where the shell history file was not saved on disk
  • PGQL:
    • When arithmetic operations with a CAST function on one side also involved a different numeric type on the other side of the expression (e.g. in the operation CAST(2 AS INTEGER) + 2.0 we have INTEGER on the left side and DOUBLE on the right side), a "cannot be evaluated as" error was raised. The same situation occurred for the in_degree function (e.g. in_degree(n) * 2.0) and the out_degree function (e.g. out_degree(n) * 3.0). This has been fixed.
    • For comparisons of a TIME WITH TIME ZONE property against a TIME expression, or of a TIME property against a TIME WITH TIME ZONE expression (e.g. TIME '23:00:00' = n.time_with_tz), a "cannot be evaluated as" error was raised. The same thing happened when comparing a TIMESTAMP property against a TIMESTAMP WITH TIME ZONE expression and a TIMESTAMP WITH TIME ZONE property against a TIMESTAMP expression. This has been fixed.
    • Fixed a bug where memory was leaked in sub-queries
    • Fixed a bug that prevented cancellation of running PGQL queries
  • Fixed a potential memory-leak which could happen when PGX is shutdown while it is loading or refreshing a graph
  • Fixed a bug where default value of string properties was not pooled.
  • Fixed a bug where default value of deferred vertex string properties was null.
  • Fixed a bug where adding a new edge in the graphBuilder might throw an IllegalArgumentException
  • Fixed a bug where calling getRuntimeConfig() on a PgxConfig object built with PgxConfigBuilder returns null
  • Fixed "Could not replicate" exception when trying to use filter expressions inside Spring Boot applications.
  • Fixed a bug where NULL values in relational tables would cause a NullPointerException if loading edge labels in the TWO_TABLES format from database.
  • Fixed a bug where reading null temporal values from the database while loading a graph would provoke a null pointer exception
  • Fixed a bug preventing mutated graphs from being stored in PGB format

Possible Breaking Changes

  • Due to breaking changes in the Zeppelin API, the PGX interpreter will no longer work with Zeppelin installations 0.7.3 or older.
  • The legacy Green-Marl compiler is no longer supported. The UNSAFE_use_legacy_compiler flag in the Engine config has been replaced by compiler.
  • Yarn, HDFS, Spark, and HBASE support are no longer included in the OTN distribution of PGX. These features are available via Oracle licensed products such as Big Data Spatial and Graph.
  • Updated the default value of tmp_dir configuration field from system temporary directory to null; users need to set tmp_dir to a valid path in the configuration file in order for PGX to be able to start properly.
  • Removed support for spark 1.6.2
  • Executing the PGX server in distributed mode on multiple machines is not supported in this version; distributed support will be reintroduced in an upcoming PGX version.
  • Due to licensing reasons, we do not publish the PGX client to maven any more.

Deprecations

  • Method PgxFuture<Void> setAsync(K key, V value) in oracle.pgx.api.PgxMap is deprecated, use PgxFuture<Void> putAsync(K key, V value) instead
  • Method void set(K key, V value) in oracle.pgx.api.PgxMap is deprecated, use void put(K key, V value) instead
  • Deprecated PGX config fields related to native analysis (cc, cflags and lflags)
  • Deprecated path_to_gm_compiler field in PGX config

3.1.5

Miscellaneous

  • Added support for Cloudera Distribution Hadoop (CDH) version 6 and dropped support for CDH version 5

3.1.2

Bug Fixes

  • Fixed a bug where some SQL exceptions could lead to full reload when using delta updates

3.1.1

Features and Improvements

  • Enabled loading of temporal properties with timezone support from PG-rdbms

Bug Fixes

  • Fixed a bug which prevented PGX from starting on Windows.
  • Fixed a bug where the error handling configuration was not being applied for delta updates (Oracle Bug #28378645).
  • Fixed NullPointerException when trying to execute PGQL queries on Windows

3.1.0

Features and Improvements

  • Added possibility to refresh a graph in-place instead of creating a new snapshot.

  • Added property convenience APIs getOrCreateVertexProperty and destroyVertexPropertyIfExists to PgxGraph including their variants for edge properties, vector properties and asynchronous execution.

  • Added new APIs PgxSession.createGraphBuilder() and PgxSession.createGraphBuilder(IdType idType) to create new GraphBuilder.

  • Added support for adding vertex/edge in GraphBuilder and GraphChangeSet using implicit ID.

  • PGQL:

    • Added support for scalar subqueries.
    • Added support for HAVING clause.
    • Improved the performance of recursive reachability queries under some scenarios
  • OAAgraph:

    • Improved oaa.graph.data.frame function performance.
    • Added columnNames argument in oaa.graph.ore.frame and oaa.create.oaa.graph functions to change table column names used to read a graph from two database tables or to write a graph into two database tables.
    • Added oaa.getNodeDegree function.
  • Client is now logging caught exceptions on ERROR log level instead of WARN level.
  • Improved the performance of loading graphs with a large amount of edges for some graph formats (among which two-tables, flat-file, edge list)
  • Improved integer and long vertex key mappings, making it possible to read up to 2B vertex graphs into PGX shared memory
  • Reduced the peak memory consumption during graph loading for graphs that have fewer than 2 billion edges.
  • Reduced the peak memory consumption when loading the vertex keys of a graph
  • Added Green-Marl's bipartite_check.gm to the supported algorithms in DIST.
  • Added Green-Marl's eigenvector_centrality.gm to the supported algorithms in DIST.
  • Added new built-in filteredDfs algorithm. See documentation for more details.
  • Added new maxDepth parameter to built-in filteredBfs algorithm. See documentation for more details.
  • The PGX Zeppelin interpreter now shows the output of print and println statements in paragraphs. This behavior can be turned off by setting the interpreter property pgx.printStdout to false. By default it is set to true.
  • Whether the PGX Zeppelin interpreter shows the value of the last expression of a paragraph is now configurable via the pgx.printLastExpression interpreter property. By default it is set to true.
  • Added exponential backoff when polling for the result of a remote future.
    • Added configuration field in PgxConfig interval_to_poll_max that sets the maximum value of the polling interval used by the PGX Client when requesting the future status.
  • Improved the embedded-mode performance of PgxGraph#getInEdges, PgxGraph#getOutEdges, PgxGraph#getInNeighbors, PgxGraph#getOutNeighbors
  • Added client_server_interaction_mode to ClientConfig and as an argument in the PGX Shell. If the field is set to async_polling, the PGX client would poll the status of the future until it's completed, else if it's set to blocking the PGX client would send a request to directly get the value of the future and the server would block until the future result is ready.
  • Exception messages on the server are now logged with error severity
  • Enabled loading of temporal properties from PG-rdbms when they're stored as timestamps in the db tables
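
The exponential backoff with an upper bound mentioned above (the bound being what interval_to_poll_max configures) can be sketched in plain Java. This is an illustration only: the class and method names are hypothetical, and the doubling factor is an assumption, since the changelog does not state how the PGX client grows the interval.

```java
import java.util.ArrayList;
import java.util.List;

public class PollingBackoff {
    // Computes the wait (in ms) before each of the first `polls` status requests.
    // Assumption: the interval doubles after every poll; the growth factor
    // actually used by the PGX client is not stated in the changelog.
    static List<Long> intervals(long initialMs, long maxMs, int polls) {
        List<Long> waits = new ArrayList<>();
        long current = Math.min(initialMs, maxMs);
        for (int i = 0; i < polls; i++) {
            waits.add(current);
            // Grow the interval, but never beyond the configured maximum.
            current = Math.min(current * 2, maxMs);
        }
        return waits;
    }

    public static void main(String[] args) {
        // Prints [100, 200, 400, 800, 1000, 1000]
        System.out.println(intervals(100, 1000, 6));
    }
}
```

Capping the interval keeps long-running futures from being polled too rarely, while the growth keeps short-lived ones from flooding the server with requests.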

Miscellaneous

  • Removed ANTLR 3.5.2 dependency
  • Updated Apache Tomcat to version 9.0.8
  • Updated Groovy to version 2.4.15
  • Updated Apache Fluent HC to version 4.5.5.

Bug Fixes

  • PGQL:
    • The PgqlResultSet methods getInteger(..), getLong(..), getFloat(..) and getDouble(..) previously threw an exception when numbers should have been coerced. Similarly, getTime(..) and getTimeWithTimezone(..) threw an exception when times should have been coerced, and, getTimestamp(..) and getTimestampWithTimezone(..) threw an exception when timestamps should have been coerced. Coercion is now properly supported.
    • The CAST function now applies truncation instead of rounding when casting floats or doubles to integers or longs, as is the generally more accepted behavior for binary numeric data types.
    • Variable names that start with distinct such as DISTINCTabc or distinctxyz no longer give an error when they are used as the first element in the SELECT clause (e.g. SELECT DISTINCTabc ...).
    • Previously, when an expression in the SELECT got aliased to the same name as a vertex or edge variable in the MATCH, and, the alias or the expression itself was used in the ORDER BY, then, an error was raised with message Cannot order by vertex or Cannot order by edge. This has been fixed now.
    • A variable passed from an outer query to an inner query should result in a duplicate definition when the graph pattern of the inner query has a vertex or edge with the same name as the variable that is passed (unless both variables are vertices in which case they refer to the same vertex and there is no duplicate definition). However, the duplicate definitions were previously not detected and incorrect results were returned. Now, instead of returning incorrect results, the following exception is raised: Duplicate variable (variable with same name is passed from an outer query).
    • Variables in the SELECT clause should shadow variables in the MATCH clause. However, such shadowing was not correctly supported such that references in the ORDER BY clause in some cases wrongly referenced variables in the MATCH clause instead of the SELECT clause. This was for example the case for the query SELECT id(n) AS n MATCH (n) ORDER BY id(n). Now, such queries result in exceptions being raised instead of incorrect result being returned.
    • Fixed bug where sub-queries return wrong results when undirected edge is used on directed graph
  • Fixed bug where double values were wrongfully serialized into scientific notation in remote deployments
  • Fixed PgxGraph#createChangeSet generating very long graph names when invoked multiple times.
  • Fixed a bug in the shared memory runtime that caused an exception when applying a mutation to graphs having the same vertex label on all vertices
  • Fixed a bug where PgxGraph.undirect() throws a validation exception if the graph contains self edges.
  • Fixed NPE in MapResource#extractTopKFromMap endpoint caused by PgxFutureWrapper trying to get the resource ID of a Void result
  • Added missing method for specifying the edge label column name in a TwoTablesRdbmsGraphConfig.
  • Fixed StringIndexOutOfBoundsException in Zeppelin interpreter when trying to render certain PGQL query error messages
  • PGX server is now picking up conf/log4j2.xml by default.
  • Fixed a bug where incompatible maps were passed into Green-Marl procedures
  • Fixed a bug where Green-Marl compilation might fail when using Filter types
  • Fixed a performance issue when using hasLabel in Green-Marl
  • Fixed a bug where DB connections could be leaked when enabling delta updates
  • Fixed a bug that prevented auto-refresh tasks from running if a previous auto-refresh failed because of an exception
  • Fixed a bug where updates received from the database could be applied incorrectly. The change sets received by the database are now sanitized before application on a graph to avoid such issues.

  • Shell Client:

    • Fixed duplicate delete session request sent during shutdown.
    • Fixed a bug where hitting Ctrl+C in the PGX shell would lead to undefined behavior.
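
The difference between truncation and rounding that the PGQL CAST item above refers to can be shown with ordinary Java conversions. This is a sketch of the arithmetic only, assuming truncation toward zero as in Java's narrowing cast; it does not call PGX or PGQL, and the class and method names are hypothetical.

```java
public class CastTruncation {
    // Truncation: drop the fractional part (toward zero).
    static long truncate(double value) {
        return (long) value;
    }

    // Rounding: move to the nearest integer.
    static long round(double value) {
        return Math.round(value);
    }

    public static void main(String[] args) {
        System.out.println(truncate(2.7) + " vs " + round(2.7));   // 2 vs 3
        System.out.println(truncate(-2.7) + " vs " + round(-2.7)); // -2 vs -3
    }
}
```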

Possible Breaking Changes

  • PGQL:
    • The CAST function now applies truncation instead of rounding when casting floats or doubles to integers or longs. Also see the item under bug fixes (above).
    • Keywords (e.g. DISTINCT, MATCH, ORDER) are no longer allowed to be followed directly by an alphanumeric character. For example, ORDERBY is no longer allowed and should be ORDER BY. Similarly, WHEREx.prop > 3 is no longer allowed and should be WHERE x.prop > 3.
    • SELECT * is no longer allowed when there are no vertex or edge variables in the MATCH clause. This prevents zero-column result sets from being returned.
    • Order by vertex or edge type was previously already disallowed (for PGQL 1.1; not for PGQL 1.0) and errors are raised for such queries. However, it appeared that in some cases the error was not raised. To be more specific, this happened when the vertex or edge that is ordered by is a group key (e.g. .. GROUP BY v AS v2 ORDER BY v2). This is fixed now such that errors are raised in all cases.
  • Moved Changes, ElementsChanges, EdgeChanges and VertexChanges from oracle.pgx.api.graphbuilder to an internal package.
  • PgxGraph.createChangeSet() by default uses auto generated Ids for edges. To use user explicit Ids use PgxGraph.createChangeSet(IdGenerationStrategy vertexIdGenerationStrategy, IdGenerationStrategy edgeIdGenerationStrategy) instead.
  • Removed the parallelization_strategy runtime configuration field (deprecated since 2.3.0).
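
The stricter keyword rule above (a keyword may no longer be followed directly by an alphanumeric character) can be sketched as a small check. This is a hypothetical illustration using a regular expression, not the PGQL parser's implementation; the class and method names are made up for the example.

```java
import java.util.regex.Pattern;

public class KeywordBoundary {
    // Returns true if `text` starts with `keyword` and the keyword is not
    // immediately followed by a letter or digit (case-insensitive match).
    static boolean startsWithKeyword(String text, String keyword) {
        Pattern p = Pattern.compile(
                "^" + Pattern.quote(keyword) + "(?![A-Za-z0-9])",
                Pattern.CASE_INSENSITIVE);
        return p.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(startsWithKeyword("ORDER BY n.age", "ORDER"));   // true
        System.out.println(startsWithKeyword("ORDERBY n.age", "ORDER"));    // false
        System.out.println(startsWithKeyword("WHEREx.prop > 3", "WHERE"));  // false
        System.out.println(startsWithKeyword("WHERE x.prop > 3", "WHERE")); // true
    }
}
```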

Deprecations

  • PgxSession.newGraphBuilder() and PgxSession.newGraphBuilder(IdType idType) are now deprecated; use PgxSession.createGraphBuilder() and PgxSession.createGraphBuilder(IdType idType) instead.

3.0.0

Features and Improvements

  • Updated REST API to be compliant with the Oracle REST standards:
    • Updated Asynchronous implementation to support job status polling.
    • Redesigned URIs and methods for different PGX specific REST resources.
    • Introduced HyperMedia in REST responses.
    • Added PATCH method for partial updates and added support for POST tunneling.
    • Added support for REST API versioning.

Bug Fixes

  • Fixed bug in PgxEntity#getProperty() to return external Ids instead of internal ones in case of VERTEX/EDGE typed properties.
  • Fixed bug in Two Tables DB loader which was not supporting Temporal properties properly.
  • Fixed bug in Analyst#countTriangles when sortVerticesByDegree is true.

Deprecations

  • Deprecated obsolete REST endpoints:
    • Removed queryPgql endpoint using a HTTP GET method
    • Removed getRandomNode endpoint. Call getRandomEntity endpoint instead
    • Deprecated obsolete Control endpoints:
      • demotePinnedGraph
      • resizePool
      • addPinnedGraph
      • getSessionInfos
      • getGraphInfos
      • getSessionInfo
      • getGraphInfo
      • getMemoryInfo
      • getThreadPoolInfo

Possible Breaking Changes

  • Updated some APIs to throw NotFoundException (mapped to a 404 Not Found HTTP error code) instead of IllegalArgumentException.
  • Removed public API CompiledProgram#runAsync(PoolType targetPool, Object... args). Use CompiledProgram#runAsync(WorkloadCharacteristicPreset characteristics, Object... args) instead.

2.7.2

Features and Improvements

  • Added client configuration field: tls_version.
  • Added server configuration fields: ciphers and tls_version.

2.7.1

Miscellaneous

  • Changes to third-party libraries bundled with PGX
    • Updated Apache Tomcat to version 8.5.30
    • Updated Netty to version 4.0.56.Final

Bug Fixes

  • Fixed XSS risk in PGX API getting vertex labels
  • Fixed cookie security by setting the HttpOnly flag

2.7.0

Features and Improvements

  • Improved estimation of multiple filters selectivity in order to avoid false optimal plans.
  • Added support for Scalar collections. PGX now has two different types of collections:
    • Graph entity collections hold vertices or edges, and they are subclasses of the new oracle.pgx.api.GraphEntityCollection API instead of oracle.pgx.api.PgxCollection.
    • Session scalar collections are subclasses of the new oracle.pgx.api.ScalarCollection API and don't depend on graphs. The types of scalar collections we currently offer are sets and sequences. They can hold primitive data types like Integer, Long, Float and Double and can be created using the oracle.pgx.api.PgxSession API.
  • Added new PGX data types tutorial.
  • PGX Distributed now supports 13 more built-in algorithms:
    • Weighted Pagerank
    • Personalized Weighted Pagerank
    • Vertex Betweenness Centrality
    • Closeness Centrality (unit length)
    • HITS
    • Out-Degree Centrality
    • In-Degree Centrality
    • Degree Centrality
    • Conductance
    • Partition Modularity
    • Partition Conductance
    • Diameter
    • Periphery
  • PGX shared memory now supports two more built-in algorithms:
    • Topological Sort
    • Topological Schedule
  • Performance improvement for built-in algorithms findCycle and whomToFollow in PGX shared memory
  • Fixed potential memory leaks in the PGX runtime.
  • Moved some edge indexing and vertex key-mapping data structures off-heap in PGX.SM, therefore making it possible to load bigger graphs with the shared memory runtime
  • Added API to share graphs by name:
    • Graphs can now be shared with other sessions by calling graph.publish()
    • Other sessions can reference published graphs via session.getGraph("<name>")
    • The name can now also be modified by the owner of the graph after it is already loaded into memory by calling graph.rename("<name>")
    • Individual property types of graphs can be shared as well by calling property.publish()
    • Other sessions can see which graphs are available both in their session and globally by calling session.getGraphs()
  • PGQL:
    • Added support for EXISTS and NOT EXISTS subqueries
    • Added support for DISTINCT inside SELECT and aggregations.
    • Added support for GROUP BY inside subqueries.
    • Added support for subqueries inside PATH filters.
    • Identifiers like graph names, property names, and labels can now be delimited with double quotes to allow for encoding of special characters (e.g. the space in the property access n."my prop").
    • SQL-like escaping rules have been added for use in single-quoted string literals as well as in double-quoted identifiers. For example, n.prop = 'string''value' is now an alternative for n.prop = 'string\'value' and FROM "my""graph" is now an alternative for FROM "my\"graph".
    • Interruption of queries is now also supported during the query planning phase, which is meaningful for more complex queries that take longer to plan.
  • Exceptions logged on the server side are now associated with unique codes, which can be used to map a client-side exception to the corresponding error in the server-side logs
  • Added parameter to specify a compression scheme to use when storing a graph, see documentation for more details
  • Javascript Client:
    • Added rename, publish and isPublished functions into Graph class.
    • Added publish and isPublished functions into Property class.
    • Added getGraph and getGraphs functions into Session class.
  • OAAgraph:
    • Added getName, rename, oaa.publish, oaa.isPublished functions for oaa.graph objects.
    • Added oaa.getGraph and oaa.getGraphs functions.
    • Added oaa.publish and oaa.isPublished functions for oaa.property objects.
    • Added file and format parameters in oaa.graph function to support pgb and graphml formats to load graphs from file.
    • Added cursor and columnName parameters to oaa.node.set function which can be used to create a node set out of a PGQL result set.
    • Added bindVars parameter to oaa.cursor function to prepare PGQL query string parameters.
  • New enterprise scheduler internal configuration flag for enabling the debug signal, which outputs a dump of internal state upon receiving the SIGQUIT signal.
  • Added preload_graphs configuration field to register graphs at start-up and publish them if needed. It overrides the graphs configuration field.
  • The database BLOB filesystem implementation now supports J2EE data sources
  • Added support for creating vertex/edge sets out of a PGQL result set.
  • Added new API ServerInstance#getServerState() to retrieve the current server state (sessions, graphs, memory, threadPools and tasks)
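
The SQL-like escaping rules listed under PGQL above (doubling the quote character inside single-quoted literals and double-quoted identifiers) can be sketched as a small helper. This is a minimal illustration, not part of the PGX API: the class and method names are hypothetical, and the sketch does not treat backslashes, which PGQL also accepts as an escape character.

```java
// Hypothetical helper (not part of PGX): applies the SQL-like escaping
// rules by doubling the delimiter character inside the quoted text.
public final class PgqlQuoting {
    private PgqlQuoting() {}

    /** Wraps a value in single quotes, doubling embedded single quotes. */
    public static String quoteLiteral(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    /** Wraps an identifier in double quotes, doubling embedded double quotes. */
    public static String quoteIdentifier(String name) {
        return "\"" + name.replace("\"", "\"\"") + "\"";
    }

    public static void main(String[] args) {
        // prints: n.prop = 'string''value'
        System.out.println("n.prop = " + quoteLiteral("string'value"));
        // prints: FROM "my""graph"
        System.out.println("FROM " + quoteIdentifier("my\"graph"));
    }
}
```

For values that come from user input, the prepared statements introduced in 2.5.0 are likely the better fit, since they avoid building the query text by hand.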

Miscellaneous

  • Clarified in the installation requirements that target machines need to have installed the C++ standard libraries built upon 3.4.19 of the GNU C++ ABI for both PGX.SM and PGX.DIST.
  • Exceptions resulting from erroneous user input are no longer logged as ERROR but instead as INFO. Only system errors may be logged as ERROR.
  • Changes to third-party libraries bundled with PGX
    • Updated Apache Tomcat to version 8.5.28
    • Updated Apache Commons Codec to version 1.11
    • Updated Apache Commons Lang to version 3.7
    • Updated Apache HttpComponents HttpClient to version 4.5.4
    • Updated Glassfish jersey to version 2.26
    • Updated Apache HttpComponents HttpCore to version 4.4.8
    • Updated Apache Log4j to version 2.10.0
    • Updated Jline to version 2.4.15
    • Updated Groovy to version 2.4.13
    • Updated Hadoop/HBase to CDH version 5.13.1
    • Updated Jackson to version 2.9.3
    • Updated Jackson core and mapper asl to version 1.9.13
    • Updated argparse4j to version 0.8.1
    • Updated Jansi to version 1.16
    • Updated Jansson to version 2.10
    • Updated Spoofax to version 2.4.1
    • Updated Commons Configuration to version 2.2
    • Updated Jackson for Commons Configuration to version 0.7.0
    • Updated SnakeYAML to version 1.18
    • Updated Protocol Buffers to version 3.5.0
    • Updated Log4js to version 2.5.2
    • Updated Request to version 2.83.0
    • Updated Nodejs to version 8.9.4

Bug Fixes

  • Fixed enterprise scheduler causing JVM to crash during PGX startup if the machine has an unexpectedly formatted /proc/cpuinfo file on Linux.
  • Fixed a bug that could cause a NegativeArraySizeException when executing a BFS if too much memory was available
  • Fixed a bug where PGX.DIST stripped the double quotes from string properties surrounded by double quotes in Flat File graphs
  • Fixed edge label and property names not being encoded properly for Flat File format when storing.
  • Fixed unexpected parsing error connecting to an invalid PGX base URL
  • Fixed a bug where PGX.DIST loaded a string property from Flat File graph.
  • PGQL:
    • Give a meaningful error message instead of an IndexOutOfBoundsException when SELECT * is used in combination with GROUP BY.
    • Allow decimal literals to end with a dot (.).
  • Fixed a bug to prevent having duplicated vertices as seeds for built-in algorithm approximateVertexBetweennessCentrality.
  • Fixed a bug where PGX Server ended up in an infinite loop when a queued task got cancelled
  • Fixed a bug where PGX server threw an IllegalStateException when trying to remove two linked vertices
  • Fixed Spark loader throws an exception if a connection is not established in listening_socket_timeout seconds.
  • Fixed NullPointerException when running out of memory in e.g. a PGQL query
  • Fixed potential memory leak when the auto refresh task of a graph was cancelled
  • Fixed a bug for built-in algorithms diameter, radius, periphery and center computing wrong eccentricity values on disconnected graphs.
  • Fixed a bug when sorting the vertices of an undirected graph by degree that could cause the edge properties and edge labels of the resulting graph to be incorrect
  • Fixed a bug where Green-Marl compilation would fail when using group-assignments involving vectors and scalar values
  • Fixed a bug where connections were being leaked when refreshing a graph from RDBMS
  • Fixed a bug in Green-Marl where the 'edge()' function could not be used on BFS iterator variables
  • Fixed a bug where compilation might fail when using node filters
  • Fixed a bug where creating a path from running shortestPathDijkstra might fail for undirected graph instances

Deprecations

  • The following methods of the oracle.pgx.api.PgxCollection API are now deprecated:
    • EntityType elementType. Use PropertyType contentType instead.
    • EntityType getElementType(). Use PropertyType getContentType() instead.
    • PgxFuture<Void> addAllAsync(Collection<E> source). Use PgxFuture<Void> addAllElementsAsync(Collection<E> source) instead.
    • PgxFuture<Void> removeAllAsync(Collection<E> source). Use PgxFuture<Void> removeAllElementsAsync(Collection<E> source) instead.
    • void addAll(ID... ids). Use void addAllById(ID... ids) from oracle.pgx.api.GraphEntityCollection instead.
    • void removeAll(ID... ids). Use void removeAllById(ID... ids) from oracle.pgx.api.GraphEntityCollection instead.
  • The following SALSA analysis methods are now deprecated:
    • Pair<VertexSequence<ID>, VertexSequence<ID>> salsa(BipartiteGraph graph, int k). Use VertexProperty<ID, Double> salsa(BipartiteGraph graph) instead.
    • Pair<VertexSequence<ID>, VertexSequence<ID>> salsa(BipartiteGraph graph, int k, double maxDiff, double d, int maxIter). Use VertexProperty<ID, Double> salsa(BipartiteGraph graph, double maxDiff, int maxIter) instead.
    • Pair<VertexSequence<ID>, VertexSequence<ID>> salsaAsync(BipartiteGraph graph, int k). Use VertexProperty<ID, Double> salsaAsync(BipartiteGraph graph) instead.
    • Pair<VertexSequence<ID>, VertexSequence<ID>> salsaAsync(BipartiteGraph graph, int k, double maxDiff, double d, int maxIter). Use VertexProperty<ID, Double> salsaAsync(BipartiteGraph graph, double maxDiff, int maxIter) instead.
  • The PGX config fields cctrace, cctrace_out and cctrace_print_stacktraces are now deprecated and ignored if set. Use the corresponding client configuration fields enable_cctrace, cctrace_out and cctrace_print_stacktraces instead.
  • The PGX config field always_use_jni in the enterprise scheduler flags is now deprecated.
  • The public API methods ServerInstance#getPgxVersion and ServerInstance#getPgxVersionAsync are now deprecated, use ServerInstance#getVersion and ServerInstance#getVersionAsync instead to get information about the server's version.
  • The public API method ServerInstance#lookupPreloadedGraph is now deprecated.
  • The REST endpoint /control/graph/<graphName>/preloaded is deprecated.
  • The use_native_analysis PGX config flag is now deprecated.
  • The use_native_loaders PGX config flag is now deprecated.
  • The functions getSessionInfos, getGraphInfos, getSessionInfo, getGraphInfo, getMemoryInfo and getThreadPoolInfo in the control API are now deprecated. Use ServerInstance#getServerState to get information about the current server state

Possible Breaking Changes

  • All graph names must now be globally unique, even across session boundaries. If a graph name is already taken, you'll get a meaningful exception at loading time.
  • whomToFollow now creates its internal bipartite graph in a different way, which may lead to different results.
  • Loading from Flat File format with properties defined in the file but not in the graph configuration no longer throws an exception when config strict mode is enabled. A warning is logged instead.
  • In PGQL, queries making use of missing properties or missing labels will now by default trigger exceptions, which is a change from the PGX 2.6 behavior. However, backwards compatibility is maintained with old clients.

2.6.1

Bug Fixes

  • Zeppelin Interpreter:
    • Fixed the html-error messages that are returned if an invalid command is sent to the PGX-Interpreter. Pretty-Print now returns valid HTML.
  • Distributed server/runtime :
    • Fixed a problem with configuring the number of runtime worker threads that was ignoring the user-provided configuration and could sometimes use a very small number of threads (OL-Jira GM-12065)
  • Fixed a bug where a NPE was thrown when adding new properties to an existing graph via graph builder (Oracle Bug #27169911).
  • Fixed a bug where valid Two Tables RDBMS passwords were wrongfully detected as invalid (OL-Jira GM-12494)

2.6.0

Features and Improvements

  • Added configuration spark_streams to specify the socket port and network interface for spark communication.
  • PGQL 1.0:
    • Added quantifiers for specifying min and max hops in regular path queries (e.g. + for one or more, ? for zero or one, and {1,4} for one to four hops).
    • It is now allowed to omit definitions of PATH patterns that match single edge labels. For example, SELECT * MATCH (n) -/:likes*/-> (m) is now allowed and returns the same result as PATH some_identifier AS () -[:likes]-> () SELECT * MATCH (n) -/:some_identifier*/-> (m).
    • Added an API PgxGraph#explainPgql(String) for explaining the execution plan of a PGQL query.
    • The ORDER BY operator is sped up by pre-computing the data to sort on ahead of time.
    • Added an SQL-like result set iterator for PgqlResultSet. The methods next, previous, last, afterLast, first, beforeFirst, absolute and relative can be used to update a cursor, which can point to a specific row in the result set. Different get methods can be used to obtain a value in a column. The first column and the first row have index 1.
    • Made PgqlResultSet iterable, so you can iterate over the result set directly without calling getResults() first.
  • PGQL 1.1:
    • Introduced PGQL 1.1, which has the following major changes since PGQL 1.0:
      • Introduced a FROM clause for specifying an input graph.
        • The FROM clause is optional in PgxGraph#queryPgql(String), PgxGraph#preparePgql(String) and PgxGraph#explainPgql(String)
        • The FROM clause is mandatory in PgxSession#queryPgql(String), PgxSession#preparePgql(String) and PgxSession#explainPgql(String)
      • The WHERE clause from PGQL 1.0 has been split up into a MATCH clause and a WHERE clause.
        • The MATCH clause contains the graph pattern (vertices, edges and paths) while the WHERE clause in PGQL 1.1 contains only the filter predicates.
        • Common path expressions now have an optional WHERE clause for specifying filter expressions (e.g. PATH connects_to AS () -[e]-> () WHERE e.status = 'OPEN' SELECT * MATCH ...).
      • Inlined filter predicates through WITH ... have been removed.
      • OO-style function-call syntax (e.g. x.label() or x.id()) has been replaced by functional-style syntax (e.g. label(x) or id(x)).
    • NOTE:
      • A full change log for PGQL 1.1 can be found in the PGQL 1.1 Specification.
      • For PGX-specific limitations and extensions in respect to PGQL 1.1, please refer to this page.
      • Both PGQL 1.0 and PGQL 1.1 are available through PgxGraph#queryPgql(String), however, PGQL 1.0 is now considered deprecated and may be discontinued in a future PGX version.
  • Green-Marl:
    • Added support for collections of scalar types (e.g. int, double, string)
  • Distributed server/runtime:
    • Added support for loading edge labels.
    • Cluster can be deployed by IP addresses. Put an IP address in the pgx_hostnames parameter instead of host names.
  • Made GraphBuilder#addVertex, GraphBuilder#addEdge idempotent
  • Javascript Client:
    • Added Collection.contains function.
    • Added vector support in Property, Graph, Scalar classes.
    • Added Graph combine functions.
  • An edge label can now be retrieved with the getString(x) method, where x is the index or name of a column in the result set.
  • Added support for temporal types in filter expressions.
  • Added new community detection built-in algorithm Infomap.
  • Reduced PGX Java client overhead by 20-30ms per request
  • Added support for adding new properties/labels to a graph using GraphChangeSet
  • Added new configuration field context_path to PGX.SM server to allow custom context paths
  • Added OAuth 2 authentication support to PGX Java client. PGX shell has an --access_token option and Java applications can use the ServerInstance getInstance(String baseUrl, String accessToken) API.
  • Added new oracle.pgx.config.ClientConfigBuilder class which makes programmatic creation of client configs more flexible.
  • The maximum tuple size supported by the Spark loader has been increased from 4K to 64MB
  • It is now possible to export a graph into multiple files, see documentation for more details.

Bug Fixes

  • Spark loader
    • Now PGX can load graphs from Spark with null value properties.
    • Fixed a concurrency issue in the Spark loader that occurred when PGX.SM tried to load a graph from a large number of partitions.
    • If the spark_streams_interface option is not given, the Spark loader now chooses the first workable interface.
    • Improved error message if trying to read/write from/to Spark from outside a Spark environment
  • Improved the accuracy of the value returned by PgxGraph.getMemory() (Oracle Bug #26479966)
  • Added automatic conversion of numbers from int to long for the following methods, which accept vertex IDs as parameters (Oracle Bug #26796792):
    • PgxGraph#getVertexAsync
    • PgxGraph#hasVertexAsync
    • GraphChangeSet#addEdge
  • Distributed server/runtime:
    • Fixed hangs in top/bottom K if a machine has no vertices/edges
    • Fixed a segmentation fault if the actual number of CPU cores is smaller than the pgx_num_worker_threads configuration value (Oracle Bug #26797611)
  • Fixed CachedGraph#renameProperty to keep transient properties for a session after a failed renaming (Oracle Bug #26797142).
  • Fixed a bug where calling GraphChangeSet#setRetainEdgeIds(true) doesn't create an edge key mapping (Oracle Bug #26830179)
  • PGQL
    • Fixed bug which caused filter expressions including a comparison of a vertex to a key value with the vertex variable on the right side of the expression to yield incorrect results. (Oracle Bug #26565632)
    • Fixed a bug that could cause a NullPointerException when manipulating the results of an aggregation that was null (Oracle Bug #26899982)
  • Fixed an overflow bug in PGX built-in Local Clustering Coefficient algorithm causing negative values for vertices with large degree (Oracle Bug #26786889).
  • Fixed PGX Java client basic authentication not working if the username and password are passed in via the Java API rather than being part of the URL (Oracle Bug #26871454)
  • Fixed the computation of number of elements in last segment for segment lists (Oracle Bug #26919963)
  • Fixed PGX Zeppelin interpreter failing to execute multi-line paragraphs containing an import statement

Possible Breaking Changes

  • Removed BIN format from the list of supported formats. It was deprecated and replaced by the PGB format a long time ago.
  • When reading from Spark data frames, date columns are now mapped to the PGX LOCAL_DATE type. Previously, they were mapped to the (deprecated) DATE type.
  • Moved ArgumentType from oracle.pgx.api.internal to oracle.pgx.common.types

Deprecations

  • The enable_solaris_studio_labeling PGX config flag is now deprecated.
  • The spark_streams_interface PGX config flag is now deprecated. Use spark_streams_config.network_interface instead.
  • PGQL:
    • PGQL 1.0 is now deprecated and is being replaced by PGQL 1.1.
    • The function getResults() in PgqlResultSet is now deprecated, since PgqlResultSet is now directly iterable.
    • PgqlResult (a result of resultSet.getResults().iterator().next()) is now deprecated. Instead, use PgxResult as returned from resultSet.iterator().next(). Get methods in PgxResult use index 1 for the first column while they used to use index 0 in PgqlResult.
  • The following APIs in oracle.pgx.api.Pgx are now deprecated. Instead, use the new oracle.pgx.config.ClientConfigBuilder class.
    • ServerInstance getInstance(String,String,String,Integer,Integer,Integer,Integer)
    • ServerInstance getInstance(String,String,String,Integer,Integer,Integer,Integer,Boolean)
    • ServerInstance getInstance(String,String,String,Integer,Integer,Integer,Integer,Boolean,Integer)

Miscellaneous

  • Updated versions for http client and http core libraries.

2.5.2

Bug Fixes

  • Fixed a bug where valid Two Tables RDBMS passwords were wrongfully detected as invalid (OL-Jira GM-12494)

2.5.1

Bug Fixes

  • PGQL:
    • Fixed an error in GROUP BY when the query result has zero elements (Oracle Bug #26708468)
    • Fixed non-ASCII characters appearing in escaped form (e.g. \u6075) when printing a ResultSet through resultSet.print(..) (Oracle Bug #26865324)
  • Fixed backwards compatibility where PGX clients older than PGX 2.5.1 couldn't load graphs from a PGX 2.5.1 server (Oracle Bug #26830372)
  • Fixed bug in Spark loader which created inconsistent graphs when given duplicated vertex keys (Oracle Bug #27031666)
  • Distributed server/runtime :
    • Fixed possible crash or memory corruption when running the algorithms with ghosts enabled on graph with vertex properties (Oracle Bug #26940790)

Possible Breaking Changes

  • Removed support for Eclipse Jetty. Please deploy PGX to Apache Tomcat or Oracle Weblogic instead.
  • Setting the delimiter in the graph config is now ignored for Flat Files, since the format specification only allows comma as separator.

Miscellaneous

  • Updated version for log4j

2.5.0

Features and Improvements

  • In/Out degree is now calculated directly from CSR, as opposed to sending all edges and counting them client-side.
  • Added support for Green-Marl code compilation in Scala environments
  • Added a client configuration field max_client_http_connections for controlling the maximum number of connections between a PGX Client and PGX Server.
  • Improved the error message returned if reading from the database fails due to a lack of available connections.
  • Added APIs hasVertexLabels() and hasEdgeLabel() to check whether a PgxGraph has vertex labels / edge labels.
  • Added a method getRandomEdge() to retrieve a random edge from a PgxGraph.
  • Added an API PgxSession#getGraph(String) for getting a reference to a session-bound PgxGraph by name.
  • Added graph loading configuration fields skip_edges and skip_vertices to be able to skip loading vertices or edges when reading two-table formats (CSV, Spark, RDBMS).
  • Added an API for retrieving session bound graphs
  • Added an API for retrieving graph bound collections
  • Apache Spark loader now supports Spark 2.X - use PgxSparkContext in the new oracle.pgx.api.spark2 package.
  • PGX now supports the tilde ~ character in paths as a placeholder for the user's home directory.
  • Added isNil function to vertex class of javascript client.
  • Greater than, greater than equals, less than and less than equals now supported for String values in PGQL and in filter expressions.
  • Added temporal property types local_date, time, timestamp, time_with_timezone and timestamp_with_timezone, replacing the now deprecated type date.
  • Deprecated the RO_STRING_SET, LONG_SET and SPARSE property types. In the future, the access mode, sparsity and value types of sets will not be encoded directly in the property type.
  • Added new built-in algorithms for cycle detection.
  • Improved shutdown log messages
  • Two Tables RDBMS column names are now configurable. e.g. column SVID could be renamed to a custom name like SourceVertexID.
  • Two Tables RDBMS format now supports string, integer and long as vertex ID types.
  • Added a method to copy all the values from an existing graph configuration into a graph configuration builder.
  • Added a method to copy only the base values from an existing graph configuration into a graph configuration builder.
  • Added methods to remove properties from a graph configuration builder.
  • Added a configuration flag to disable the JVM shutdown hook that shuts down PGX.
  • Updated Flat File format to properly recognize value_type 7 which corresponds to Long type, rather than using value_type 2 as before.
  • GZIP support
    • Added support to automatically detect and read graphs from GZIP'ed files without the need to unpack them first.
    • Added graph configuration flag detect_gzip to enable/disable automatic gzip compression detection.
  • Improved Analyst Javadocs.
  • Improved latency of refresh graph requests
  • Improved performance of BFS-based algorithms on large data-sets and moved more BFS memory to off-heap
  • We extended parsing config values from Java properties and the environment to work with nested configuration fields. For example to set the default number of IO threads in the enterprise scheduler configuration you can now use the Java property pgx.enterprise_scheduler_config__num_io_threads_per_task.
  • PGQL (also see PGQL Specification):
    • Added support for prepared statements to safeguard from PGQL injection and speed up execution of repeated queries, see Running Graph Pattern Matching (PGQL).
    • Added support for undirected query edges that match to both incoming and outgoing data edges.
    • Added temporal literal types DATE (maps to PGX property type local_date, not to the deprecated property type date), TIME, TIMESTAMP, TIME WITH TIMEZONE and TIMESTAMP WITH TIMEZONE.
    • Added support for constraints on vertices in PATH patterns. Previously, only constraints on edges in PATH patterns were supported.
      • e.g. PATH connects_to_high_volt_dev := (:Device) -> (:Device WITH voltage > 35000) SELECT ...
    • Added support for Cartesian Product in occurrence of disconnected graph patterns, e.g. SELECT * WHERE (n), (m)
    • Added support for graph patterns that are connected only via a value constraint but no edge, e.g. SELECT * WHERE (n), (m), n.prop = m.prop
    • Added the built-in function all_different(v1, v2, ...) for specifying that a certain set of values (typically vertices or edges) are all different from each other.
    • Added the <> (not equals) operator as a syntactic alternative for != (not equals).
    • Added the possibility to reference missing properties and labels (both for vertices and edges) in queries for convenience: if they are not present in the graph they will be treated as null values. The hasLabel() call returns null if the labels are missing.
    • Added built-in functions to_date, to_time and to_timestamp to convert a string to a temporal type, given a specific format.
    • Added built-in functions ST_PointFromText, ST_X and ST_Y to deal with the spatial type Point2D
  • Distributed server/runtime:
    • Now supporting property type DATE.
    • Now supporting top/bottom K for string properties.
    • Implemented endpoint rename.
    • Implemented endpoint getRandomVertex() and getRandomEdge().
    • Implemented endpoint getSource() and getDestination().
    • Reject null values (see below, Oracle Bug #25491165).
    • Now supporting edge properties of type vector.
    • Implemented endpoint getNeighbors().
    • Implemented endpoint getEdges().
    • Now support loading/storing graphs from/to Apache Spark
  • The environment variable HADOOP_CONF_DIR is now added to the PGX (shared memory) server classpath automatically if set.
  • New Green-Marl compiler based on Spoofax:
    • Green-Marl supports new language features when using the new compiler:
      • vertex/edge filters
      • ordered iteration
      • iteration over keys/values in maps
    • (New) Green-Marl compiler runs on all systems supported by the JDK
    • Foreign expressions are no longer supported in Green-Marl when using the new compiler
    • The old compiler is available as 'legacy' Green-Marl. Use the following code to get the legacy compiler:
        import oracle.pgx.compilers.Compilers
        import oracle.pgx.compilers.Language
        legacyCompiler = Compilers.findCompiler(Language.GM_LEGACY)
  • OAAgraph:
    • oaa.graph function now supports vertexIdType argument for string, integer and long.
    • oaa.graph function now supports loadEdgeIds argument.
    • added missing arguments to graph mutation functions.
    • oaa.create function now supports storeEdgeLabel and storeNodeLabels arguments.
  • Added scripts to launch PGX (shared memory) server on Windows.
  • Added configuration support for internal RTS hardware counters.
  • Improved log messages from property loading error handling, by adding the corresponding property name.
  • Improved log messages when parsing values for temporal typed properties fails.
  • When :timing ON is set, the PGX shell now always outputs time in seconds in the fixed format X+.XXXs, where X is [0-9], instead of switching between seconds and milliseconds. This change should make it easier to parse the logs for timing information.
  • PGX can now load a graph from Spark with a small number of workers.
  • Improved error message for FillProperty and SetProperty (expected type {ValidType} but {WrongType} was given)
  • We introduced a new experimental string pool implementation which reduces on-heap memory consumption and garbage collection. It is disabled by default, but can be enabled by setting the pgx config flag string_pooling_strategy to indexed. Since it is still in an experimental stage there might be unforeseen bugs and performance issues.
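
The nested configuration example above (pgx.enterprise_scheduler_config__num_io_threads_per_task) suggests that nesting levels are joined with a double underscore after the pgx. prefix. The sketch below builds such a property name and sets it programmatically. The helper class is hypothetical, the separator rule is inferred from that single example, and the value 8 is an arbitrary placeholder; the property presumably has to be set before PGX reads its configuration.

```java
// Hypothetical helper (not part of PGX): derives the Java property name
// for a nested config field, assuming "__" separates nesting levels.
public final class NestedConfigProperty {
    private NestedConfigProperty() {}

    public static String propertyName(String... path) {
        return "pgx." + String.join("__", path);
    }

    public static void main(String[] args) {
        String name = propertyName("enterprise_scheduler_config", "num_io_threads_per_task");
        // Same effect as passing -D<name>=8 on the JVM command line.
        System.setProperty(name, "8"); // 8 is an arbitrary example value
        System.out.println(name + " = " + System.getProperty(name));
    }
}
```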

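Because the shell's :timing output now always uses the fixed format X+.XXXs, a log line can be scanned with a single regular expression. The following is a minimal sketch under that assumption only: the class name and the sample log line are hypothetical, and the rest of the line layout is not specified in these notes, so the pattern simply searches for the first matching token.

```java
import java.util.OptionalDouble;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser (not part of PGX) for timings in the X+.XXXs format.
public final class ShellTimingParser {
    // One or more digits, a dot, exactly three digits, then 's'.
    private static final Pattern TIMING =
            Pattern.compile("(?<![\\d.])(\\d+\\.\\d{3})s\\b");

    private ShellTimingParser() {}

    /** Returns the first timing found in the line, in seconds. */
    public static OptionalDouble parseSeconds(String line) {
        Matcher m = TIMING.matcher(line);
        return m.find()
                ? OptionalDouble.of(Double.parseDouble(m.group(1)))
                : OptionalDouble.empty();
    }

    public static void main(String[] args) {
        // The sample line is made up for illustration.
        System.out.println(parseSeconds("Elapsed: 12.345s"));
    }
}
```
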
Bug Fixes

  • Fixed error message when there is a mismatch in the number of vertex properties while loading from a PGB file.
  • Fixed bug in URI resolving if both parent and relative URIs are absolute (Oracle Bug #25682957)
  • Fixed a race condition that could occur when PGX is shutdown concurrently to sessions being destroyed (Oracle Bug #25856469)
  • Fixed a bug where a change set could reorder properties, resulting in wrongfully written values or exceptions (Oracle Bug #25898531)
  • Fixed an ArrayIndexOutOfBoundsException when storing a graph in Two Tables RDBMS format (Oracle Bug #25683246)
  • Added property getter/setter to vertex class in Node.js client (Oracle Bug #25388494)
  • Fixed the overwrite flag not behaving properly under Two Tables RDBMS format when storing a graph (Oracle Bug #25955647)
  • Fixed graph configuration field 'max_prefetched_rows' not being used in Two Tables RDBMS loader (Oracle Bug #26008519)
  • Corrected a bug where the generated push-down filter SQL statements could be wrong (Oracle Bug #25948473)
  • Fixed bug where adding a NIL vertex to an Order-collection could cause an exception
  • Default values for maps with collections as values are not null any longer but proper instances
  • Fixed a segmentation fault in algorithms with graphs over the size of 268 million vertices (Oracle Bug #26163795)
  • Fixed NPE when deserializing empty path returned from PGX server (Oracle Bug #26174519)
  • Fixed built-in BFS search consuming unnecessary on-heap memory (Oracle Bug #26199483)
  • Fixed a bug that prevented storing a graph in PGB format on windows (Oracle Bug #26577458)
  • Fixed a bug that prevented storing to PGB with long strings (Oracle Bug #26593270)
  • Fixed a bug that prevented storing to PGB with long strings, when overwrite == false (Oracle Bug #26632756)
  • PGQL:
    • Fixed ArrayIndexOutOfBoundsException when properties of type vertex or edge are accessed through PGQL. PGQL does not support properties of type vertex or type edge so a proper exception is generated now. (Oracle Bug #25604897)
    • Fixed ArrayOutOfBoundException when recursive path queries are executed on the same graph by multiple sessions concurrently (Oracle Bug #25774827)
    • Fixed bug in REST Server when running a PGQL query with more than 8,000 characters (Oracle Bug #25838489)
    • Fixed order by operator returning incorrect results when using multiple threads (Oracle Bug #26090931)
    • Fixed unescaping of String literal in PGQL queries; tabs, line feeds, carriage returns, backspaces, form feeds and backslashes were previously not properly unescaped. (Oracle Bug #26145160)
  • Distributed server/runtime :
    • Fixed an internal server error being thrown when no entityType is given (Oracle Bug #25804875)
    • Fixed a segfault if K is 0 (for top/bottom K) (Oracle Bug #25804957)
    • Fixed a segfault when setting a string property to null (Oracle Bug #25806035)
    • Fixed negative integer values not being handled when getting a property (Oracle Bug #25811667)
    • Fixed the flat file loader segfaulting when loading an integer edge property (Oracle Bug #25966461)
    • Fixed the flat file loader not loading any node properties (Oracle Bug #25966473)
    • Fixed server not rejecting updates of non-transient (shared) properties
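
The PGQL string-literal fix listed above covers six escape sequences: tabs, line feeds, carriage returns, backspaces, form feeds and backslashes. The following is a minimal sketch of what unescaping them means, not the PGX implementation. The class and method names are hypothetical, and it assumes Java-style backslash escapes (`\t`, `\n`, `\r`, `\b`, `\f`, `\\`).

```java
// Hypothetical helper, for illustration only; not part of the PGX API.
final class StringLiteralUnescaper {
    private StringLiteralUnescaper() {}

    /** Replaces \t, \n, \r, \b, \f and \\ with the characters they denote. */
    static String unescape(String literal) {
        StringBuilder out = new StringBuilder(literal.length());
        for (int i = 0; i < literal.length(); i++) {
            char c = literal.charAt(i);
            if (c != '\\' || i + 1 == literal.length()) {
                out.append(c); // ordinary character, or a trailing lone backslash
                continue;
            }
            char next = literal.charAt(++i); // character following the backslash
            switch (next) {
                case 't': out.append('\t'); break;
                case 'n': out.append('\n'); break;
                case 'r': out.append('\r'); break;
                case 'b': out.append('\b'); break;
                case 'f': out.append('\f'); break;
                case '\\': out.append('\\'); break;
                default: out.append('\\').append(next); // unknown escape: keep as written
            }
        }
        return out.toString();
    }
}
```

Escapes outside the six named in the fix are passed through unchanged here; how PGX treats them is not stated in this changelog.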

Possible Breaking Changes

  • Previously, it was possible to set string and date properties, as well as vertex labels, to null, resulting in corrupted graphs (property graphs don't allow null values) and unexpected exceptions during analyses (e.g. null pointer exceptions during execution of PGQL queries or during subgraph creation using filter expression). This has been fixed now (Oracle Bug #25491165):
    • null values are no longer allowed. Setting a property to null (e.g. vertexProp.set(v1, null)) will throw an exception: property values cannot be NULL
    • Default property values, which are used as an intermediate solution in PGX to represent missing data, have changed as follows:
      • The default value of a String property has changed from null to "" (empty String)
      • The default value of a Date property has changed from null to new Date(0) (epoch time)
      • The default value for vertex labels has changed from null to the empty set
    • Note that these default values are generated when loading sparse data from e.g. Oracle Property Graph Format or Oracle Two Table Format, or when creating new vertex or edge properties using the APIs PgxGraph.createVertexProperty(...) and PgxGraph.createEdgeProperty(...).
    • The methods setProperty(String key, Object value) and addLabel(String label) in the Graph Builder and Graph Change Set APIs no longer allow null to be passed as value/label argument. Previously, null was an indication that the property/label should be set to its default (e.g. 0 for integers, 0.0 for doubles). However, with the change, an exception is thrown instead: property values cannot be null. Now, to set properties to their defaults, it is necessary to explicitly pass the default, for example: setProperty("integerProp1", 0) instead of setProperty("integerProp1", null).
  • We removed the checked exceptions ExecutionException and InterruptedException for synchronized API methods that are expected to finish quickly and are executed directly on the caller thread. A full list of affected methods can be found here: removed exception methods
  • Previously, it was possible to load Integer and Long values in Flat File format by specifying value_type 2, and both were treated as Longs internally. Now, value_type 2 only loads Integer values (treated as Integers). To load Long values, value_type 7 has to be specified.
  • PGQL:
    • Methods ResultSet.getPgqlResultElements() and ResultSet.getNumResults() no longer throw exceptions after the result set has been closed, but instead return the same result as before closing of the result set. This is a result of removing some code complexity from the client implementations into the server implementation. Note: behavior of ResultSet is undefined after it has been closed, so avoid using result sets after closing them.
    • The new ability to reference missing labels and properties in queries makes some previously illegal queries legal: instead of throwing exceptions during query execution, the server now sends null values to the client for missing properties and edge labels, and an empty set for missing vertex labels. Applications that combine an old client with a new server and execute such previously illegal queries may see different exceptions than before, or null values/empty sets returned instead.
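
Because null is no longer accepted as a property value, callers that used to pass null now need to supply the default explicitly. The following is a minimal sketch, independent of the PGX classes, of a helper that substitutes the defaults named in the list above ("" for String, new Date(0) for Date, 0 for integers, 0.0 for doubles). The class name, method name and the use of Class tokens are hypothetical.

```java
import java.util.Date;

// Hypothetical helper, for illustration only; not part of the PGX API.
final class PropertyDefaults {
    private PropertyDefaults() {}

    /** Returns value unchanged, or the default listed in the changelog if value is null. */
    static Object orDefault(Object value, Class<?> type) {
        if (value != null) {
            return value;
        }
        if (type == String.class) {
            return "";            // String default: empty String
        }
        if (type == Date.class) {
            return new Date(0);   // Date default: epoch time
        }
        if (type == Integer.class) {
            return 0;             // integer default
        }
        if (type == Double.class) {
            return 0.0;           // double default
        }
        // Types not mentioned in the changelog are rejected rather than guessed.
        throw new IllegalArgumentException("no default known for " + type.getName());
    }
}
```

With such a helper, a call like setProperty("integerProp1", PropertyDefaults.orDefault(value, Integer.class)) never passes null to the Graph Builder or Graph Change Set APIs.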

Deprecations

  • PgxSparkContext in the oracle.pgx.api package is now deprecated. Use the class in the oracle.pgx.api.spark1 package instead.
  • The REST endpoint /core/graph/<graphname>/randomNode is deprecated. /core/graph/<graphname>/randomEntity should be called instead
  • The graph configuration fields for Spark skip_nodes and skip_edges are deprecated. Use the graph loading configuration fields loading.skip_vertices and loading.skip_edges instead.
  • The graph configuration methods isSkipNodes() and isSkipEdges() are deprecated. Use skipVertexLoading() and skipEdgeLoading() methods instead.
  • The SALSA algorithm algorithms/link_prediction/salsa_deprecated.gm is deprecated. algorithms/ranking_and_walking/salsa.gm should be used instead
  • The CALLER_THREAD PoolType is deprecated.
  • The REST endpoint /core/analysis/<analysisId> with a targetPool is deprecated. Use the workloadCharacteristics field instead
  • The use of the path finding filter argument type is deprecated.
  • The property type DATE is deprecated. Use LOCAL_DATE, TIME, TIMESTAMP, TIME_WITH_TIMEZONE or TIMESTAMP_WITH_TIMEZONE instead.
  • The REST endpoint GET /core/graph/<graphname>/query is deprecated. Use POST to /core/graph/<graphname>/query with query and semantic options in json payload
  • PGQL:
    • User-defined pattern matching semantic (i.e. ISOMORPHISM / HOMOMORPHISM) is deprecated. Homomorphism stays the default semantic but isomorphic constraints should now be specified using either the new built-in PGQL function all_different(v1, v2, ...) or using non-equality constraints (e.g. v1 != v2). The deprecations are as follows:
      • The method PgxGraph.queryPgql(String, PatternMatchingSemantic) (use PgxGraph.queryPgql(String) instead)
      • The method PgxSession.setPatternMatchingSemantic(..)
      • The configuration field pattern_matching_semantic
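
The two replacements for the deprecated isomorphism semantic named above, all_different(v1, v2, ...) and pairwise non-equality constraints, express the same condition. The following is a minimal sketch that generates both forms from a list of query variable names. The class and method names are hypothetical; the sketch only builds constraint strings and leaves combining them into a query to the caller.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only; not part of the PGX API.
final class IsomorphismConstraints {
    private IsomorphismConstraints() {}

    /** One "a != b" constraint per unordered pair of variables. */
    static List<String> pairwiseNotEqual(List<String> variables) {
        List<String> constraints = new ArrayList<>();
        for (int i = 0; i < variables.size(); i++) {
            for (int j = i + 1; j < variables.size(); j++) {
                constraints.add(variables.get(i) + " != " + variables.get(j));
            }
        }
        return constraints;
    }

    /** The equivalent single call to the built-in all_different function. */
    static String allDifferent(List<String> variables) {
        return "all_different(" + String.join(", ", variables) + ")";
    }
}
```

For n variables the pairwise form needs n(n-1)/2 constraints, which is why the single function call is more convenient for larger patterns.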

Misc

  • Changes to third-party Java libraries bundled with PGX
    • Updated Jersey to version 2.25.1
    • Updated Groovy to version 2.4.10
    • Updated Tomcat to version 8.5.14
    • Updated argparse4j to version 0.7.0
    • Updated slf4j to version 1.7.25
    • Updated commons-lang3 to version 3.5
    • Updated jline to version 2.14.3
    • Updated jansi to version 1.15
  • Changes to third-party Node.js libraries bundled with PGX
    • Updated Node.js to version 6.11.1 (LTS)
    • Updated check-types to version 7.1.5
    • Updated cookie to version 0.3.1
    • Updated log4js to version 1.1.1
    • Updated node-uuid to version 1.4.8

2.4.4

Features and Improvements

  • R client (OAAgraph)
    • nrow function was added for oaa.cursor, oaa.partition, oaa.node.set and oaa.node.sequence objects

Possible Breaking Changes

  • R client (OAAgraph)
    • names.oaa.graph function was replaced with getPropertyNames.oaa.graph function to avoid conflict with names function in base package
    • names.oaa.cursor function was replaced with getColumnNames.oaa.cursor function to avoid conflict with names function in base package
    • length function was modified for oaa.cursor, oaa.partition and oaa.node.collection objects to avoid conflict with str function in base package. It now returns the number of columns rather than the number of rows

2.4.3

Bug Fixes

  • Allow underscores in hostnames when connecting to a PGX server

2.4.2

Features and Improvements

  • Improved performance of BFS-based algorithms on large data-sets and moved more BFS memory to off-heap
  • Improved R client (OAAgraph)
    • oaa.getPersistedSnapshots function now returns character(0) instead of an error if there are no snapshots
    • oaa.graphSnapshot function now returns an error if no snapshots were found
    • Added nodeSet option to personalized versions of pagerank and SALSA algorithms
    • oaa.graphSnapshotList function now supports lowercase pattern argument.
    • Added serviceName argument to oaa.graphConnect function to support connecting to Databases which do not have a SID
    • oaa.getProperty function prints a warning and returns NULL if property not found instead of throwing an error.

Bug Fixes

  • Fixed ArrayIndexOutOfBoundsException when storing a graph in Two Tables RDBMS format (Oracle Bug #25683246)
  • Fixed graph configuration field 'max_prefetched_rows' not being used in Two Tables RDBMS loader (Oracle Bug #26008519)
  • Fixed a segmentation fault in algorithms with graphs over the size of 268 million vertices (Oracle Bug #26163795)
  • Fixed NPE when server returns an empty PgxPath (Oracle Bug #26174519)
  • oaa.create function no longer prints warning messages if the table does not exist and the overwrite flag is TRUE

2.4.1

Features and Improvements

  • Backported support for Zeppelin 0.7.0 (previously 0.6.2)
  • Improved R client
    • Added support for ORE 1.5.1
    • now supports character node IDs when reading from data frames
    • added demo illustrating interaction of OAAgraph and ORE
    • added function to set the log level of the underlying PGX client

Bug Fixes

  • Fixed bug in vertex label creation if the number of distinct vertex labels is larger than 64 (Oracle Bug #25791120)

2.4.0

Features and Improvements

  • Initial release of OAAgraph - a PGX client implemented in R
  • Added ability to cast types in PGQL (an extension to the spec, see the PGQL notice).
  • Improved performance of two-tables database loader.
  • Added APIs to control whether or not graph builder / changeset APIs create vertex or edge IDs
  • Added an API for easy programmatic construction of a PgxConfig object
  • Added some convenience APIs to ServerInstance
    • Added an API to start the PGX engine by giving a reference of a PgxConfig object
    • Added an API to retrieve the current PGX config as an instance of PgxConfig
  • Improved Apache Spark loader
    • can now handle vertex or edge IDs of type string
    • can now handle (single value) vertex label and edge label (new VL and EL columns)
  • PGX.D now supports string properties
  • Added more personalization options (Hubs, Authorities and Mixed) for SALSA algorithm
  • Added APIs for new built-in algorithm PRIM to find the minimum spanning tree in an undirected graph
  • Added API to create the transpose of a graph
  • Added edge label support for javascript client
  • The PGB Format got extended to store the property names of the graph
  • The config-less loader can read the property names from the PGB Format
  • A couple of API methods got optimized to execute directly on the caller thread to improve performance.
  • Refreshing a graph from Oracle RDBMS now tries to do a delta-update before reloading the whole graph, even if no auto-refresh is configured.
  • Added a client configuration field for controlling the maximum number of connections between a PGX Client and PGX Server.

Bug Fixes

  • fixed NPE when loading edge label in EDGE list format (Oracle Bug #25077918)
  • fixed PGQL COUNT unsupported property type exception if property has values of type VERTEX/EDGE (Oracle Bug #25077861)
  • fixed PGQL division by zero exception on PATH queries (Oracle Bug #25439910)
  • fixed distributed runtime performance problem when creating partition objects (Oracle Bug #25083223)
  • fixed Spark loader throwing an error when trying to load boolean property types (Oracle Bug #25093154)
  • fixed allocating more memory than necessary when storing string properties into PGB format (Oracle Bug #25093613)
  • fixed RDF loader throwing an error during merge phase (Oracle Bug #25164042)
  • fixed two security issues in REST endpoint (Oracle Bugs #25167906 and #25168157)
  • fixed enterprise scheduler causing JVM to crash during PGX shutdown if JVM is instrumented via profiler (Oracle Bug #25247357)
  • fixed distributed runtime returning wrong vertex ID type if asked for properties of value type vertex (Oracle Bug #25076856)
  • fixed a bug which caused 100% CPU when no task was scheduled to the engine (Oracle Bug #25368409)
  • fixed PGX.D reading from PG (NoSQL/HBase) fails due to invalid config (Oracle Bug #25437504)
  • fixed a bug where a user could easily overwrite the PgxConfig at runtime
  • fixed memory leak when removing properties doing in-place simplify or undirect mutations (Oracle Bug #25456545)
  • fixed an issue where graphs built with the graph-builder won't have a creationTimestamp and creationRequestTimestamp (Oracle Bug #25463066)
  • disabled updating undirected graphs, now throwing an exception when trying to update an undirected graph via the change set API. Before, this led to undefined behavior and potential memory leaks. (Oracle Bug #25476335)
  • fixed NPE when iterating over the top/bottom k values of a string property containing null values (Oracle Bug #25484804)
  • fixed NPE in PGQL when querying string property with null values (Oracle Bug #25491165)
  • fixed ConcurrentModificationException on PGQL path queries when using multiple sessions concurrently (Oracle Bug #25425007)
  • fixed NPE when setting a string property in the graph builder/graph change set to null (Oracle Bug #25491213)
  • fixed a bug that wouldn't update a string value to null over the remote API (Oracle Bug #25497517)
  • fixed applying an empty changeset on a graph with vertex labels resulting in an UnsupportedOperationException (Oracle Bug #25496377)
  • fixed creating a graph using the graph builder with more than 64 distinct vertex labels throws an IllegalArgumentException (Oracle Bug #25506682)
  • fixed a bug where applying a changeset could reorder the properties (Oracle Bug #25526304)
  • fixed a bug where applying a changeset could drop properties (Oracle Bug #25526313)
  • fixed a bug where an undirected graph would show wrong edges (Oracle Bug #25552313)

Misc

  • Updated third-party library Spoofax bundled with PGX to version 2.1.0
  • Updated third-party library Commons Configurations2 (Jackson) bundled with PGX to version 0.6.1
  • Updated third-party library Apache Tomcat bundled with PGX to version 8.0.41

Possible Breaking Changes

  • In previous versions of PGX, a PGQL query using the built-in function id() would throw an exception if the graph has no IDs defined. This behavior changed in PGX 2.4.0. Such a query now succeeds and the internal IDs are returned. This applies to both vertex and edge IDs.
  • In previous versions of PGX, calling setUseVertexPropertyValueAsLabel(...) on a graph config builder implicitly loaded that given property as a vertex label. Since 2.4.0, you additionally have to call setLoadVertexLabel(true) on that same builder object, otherwise the resulting graph will not have any vertex labels.

Deprecations

  • Calling the REST endpoint for /core/graph/build without the config parameter is deprecated

2.3.1

Features and Improvements

  • Green-Marl compiler now supports code generation for edge properties of undirected graphs

Misc

  • Updated third-party library Apache HttpComponents HttpClient bundled with PGX to version 4.3.6 (CVE)
  • Updated third-party library Apache HttpComponents Core bundled with PGX to version 4.3.3 (CVE)

2.3.0

Features and Improvements

  • Added the enterprise scheduler and made it the default scheduler for PGX. This includes
    • Concurrent execution of tasks from multiple sessions
    • More detailed configuration for thread pools with pool weight, priority and maximum number of threads
    • Dynamically sized IO thread pool
    • Detailed, per task settings for each thread pool
  • Added arithmetic expressions to filter expressions, used for subgraph filtering, path finding and collection creation
  • PGB loader improvements
    • PGB loader can now load graphs that were stored with 64 bit vertex ids, as long as they have fewer than 2 billion vertices
    • PGB loader can now load graphs that are not semi-sorted
  • Made undirected graphs first-class citizens of PGX
  • Improved graph simplifying and undirecting API with strategies for reducing edges and edge properties. It is now possible to merge or pick the edge properties of multi-edges using a MutationStrategy
  • Named arguments in the PGX shell analyst can now be abbreviated
  • Added API to create a bipartite subgraph of vertices with in-degree = 0 being the left set
  • Added loader to read and store from and to D3.js forced-layout format
  • Apache Zeppelin PGX interpreter improvements:
    • Cancellation support
    • support for client-side timeouts of paragraph execution
    • non-visual return values and stack traces are now pretty printed
    • stack traces no longer displayed by default
    • updated to support Apache Zeppelin version 0.6.1
    • better PGQL syntax error messages
  • Apache Spark integration: added support for writing back from PGX into Spark RDDs
  • Drastically improved performance of creation and iteration over Partition objects
  • Improved error messages when parsing text data to give more precise information regarding line number, error offset and description
  • Added more built-in algorithms:
    • Periphery
    • Center
    • Local Clustering Coefficient
    • Personalized Weighted Pagerank
  • Added normalization option to Pagerank and its variants (except Approximate)
  • PgxVertex now returns Java Collections instead of Iterable when asked for in/out edges or getIn/OutNeighbors are called. Thereby, operations like vertex.getInNeighbors().size() are possible to determine the degree of a vertex.
  • Added convenience methods in order to determine the in and out degree of a PgxVertex.
  • Added method to PgxEdge that returns its vertices as a Pair.
  • Added method to retrieve the neighbors of a PgxVertex by specifying an edge Direction
  • New built-in algorithms for distributed runtime: Weighted PageRank and Personalized SALSA.
  • Added contains() method to PgxCollection.
  • Added PGX shell launcher scripts for Windows
  • Added API to retrieve a PGQL result set by ID
  • Added API and PGX shell command line option to attach to an existing session
  • Added support for edge keys in the distributed runtime
  • Added support for retrieval of edge property values via REST in the distributed runtime
  • Performance improvements for the distributed loader (now up to 2x faster)
  • Both EDGE_LIST and ADJ_LIST loaders can now read from multiple data files.

Bug Fixes

  • fixed relative paths not being resolved properly in multiple source file loaders
  • fixed reading from files which have spaces in the path (Oracle Bug #24971444)
  • fixed approximatePagerank() in the Analyst calling non-approximated algorithm internally
  • fixed personalizedPagerank() not respecting vertex argument correctly in the Analyst
  • fixed clone() and toMutable() methods of collections ignoring user-specified names in remote mode
  • fixed division/modulo by 0 with floating points behavior in PGQL: now throwing ArithmeticException (Oracle Bug #24971496)
  • fixed memory leak when a PGQL query completes exceptionally (Oracle Bug #24971491)
  • fixed loading graph configs and/or graphs via HTTP/HTTPS (Oracle Bug #24971441)
  • fixed unnecessary large preallocation of memory when loading small graphs (Oracle Bug #24971482)
  • fixed an issue where every vertex was reported as existing when a graph without vertex keys was loaded
  • fixed getVertexLabels() not working in the remote case
  • fixed an issue where vertex sets had the wrong element type in the distributed runtime
  • fixed issue where BFS navigator would be evaluated before necessary information for parent node has been computed
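
For context on the floating-point division/modulo fix listed above: in plain Java, dividing a double by zero does not throw but yields Infinity or NaN, whereas PGQL now throws ArithmeticException. The following is a minimal sketch of that stricter behavior, not the PGX implementation. The class name, method names and exception messages are hypothetical.

```java
// Hypothetical helper, for illustration only; not part of the PGX API.
final class CheckedArithmetic {
    private CheckedArithmetic() {}

    /** Floating-point division that rejects a zero divisor instead of returning Infinity/NaN. */
    static double divide(double dividend, double divisor) {
        if (divisor == 0.0) { // also true for -0.0
            throw new ArithmeticException("division by zero");
        }
        return dividend / divisor;
    }

    /** Floating-point modulo that rejects a zero divisor instead of returning NaN. */
    static double modulo(double dividend, double divisor) {
        if (divisor == 0.0) { // also true for -0.0
            throw new ArithmeticException("modulo by zero");
        }
        return dividend % divisor;
    }
}
```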

Misc

  • Updated third-party library Groovy bundled with PGX to version 2.4.6
  • Updated third-party library Node.js bundled with PGX to version 4.6.1

Deprecations

  • On the GraphConfigBuilder class, the methods forSingleFileFormats() and forMultipleFileFormats() are now deprecated and were unified into a new forFileFormats() which accepts all of the following:
    • a single file via setUri() method
    • multiple files via addUri() method
    • multiple vertex and edge files via addVertexUri() and addEdgeUri() methods

    Similarly, the methods forSingleFileFormat(Format) and forMultipleFileFormat(Format) are now deprecated and unified into a new forFileFormat(Format) method.
  • With introduction of the enterprise scheduler the following PGX config fields are deprecated:

    • num_workers_analysis, num_workers_fast_track_analysis and num_workers_io on the top level are now deprecated. Instead, they must be placed into the new, nested basic_scheduler_config field. For example, the following config file:

      {
        ...
        "num_workers_analysis": 64,
        "num_workers_fast_track_analysis": 1,
        "num_workers_io": 72,
        ...
      }
      

      must now be written like this:

      {
        ...
        "basic_scheduler_config": {
          "num_workers_analysis": 64,
          "num_workers_fast_track_analysis": 1,
          "num_workers_io": 72
        },
        ...
      }
      
    • parallelization_strategy is now deprecated and was replaced by a new field scheduler. For example, the following config file:

      {
        ...
        "parallelization_strategy": "task_stealing_counted",
        ...
      }
      

      must now be expressed like this:

      {
        ...
        "scheduler": "basic_scheduler",
        ...
      }
      

      On a similar note, the previous value rts for parallelization_strategy is mapped to the scheduler value enterprise_scheduler. The remaining two deprecated strategies task_stealing and segmented no longer have any effect and will be treated like setting the scheduler value to basic_scheduler

  • PGQL result element types VERTEX_LABELS and EDGE_LABEL changed to STRING_SET and STRING.

    The column types of a PgqlResultSet, as obtained via getPgqlResultElements().get(i).getElementType(), previously included VERTEX_LABELS in the case the labels of a vertex were returned, and, EDGE_LABEL in the case the label of an edge was returned. VERTEX_LABELS and EDGE_LABEL are now deprecated and are no longer returned from getElementType(). Instead, STRING_SET is returned in the case of vertex labels and STRING is returned in the case of an edge label. Take the following example PGQL query:

    SELECT n.labels(), e.label(), n.stringProp
    WHERE
      (n) -[e]-> (m)
    

    Previously, getPgqlResultElements() would return the list [VERTEX_LABELS, EDGE_LABEL, STRING], but this has now changed to [STRING_SET, STRING, STRING].
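
The scheduler-related config deprecations in this section (moving the three num_workers_* fields into a nested basic_scheduler_config object, and replacing parallelization_strategy with scheduler) can be applied mechanically. The following is a minimal sketch that operates on a config already parsed into a Map; JSON parsing is left out. The class and method names are hypothetical, the nested field name follows the example config above, and the value mapping follows the text: rts becomes enterprise_scheduler, while task_stealing_counted, task_stealing and segmented become basic_scheduler.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, for illustration only; not part of the PGX API.
final class SchedulerConfigMigration {
    private static final String[] WORKER_FIELDS = {
        "num_workers_analysis", "num_workers_fast_track_analysis", "num_workers_io"
    };

    private SchedulerConfigMigration() {}

    /** Returns a migrated copy; the input map is left untouched. */
    static Map<String, Object> migrate(Map<String, Object> oldConfig) {
        Map<String, Object> config = new LinkedHashMap<>(oldConfig);

        // Move the deprecated top-level worker counts into the nested object.
        // Assumes the old config has no basic_scheduler_config entry of its own.
        Map<String, Object> nested = new LinkedHashMap<>();
        for (String field : WORKER_FIELDS) {
            if (config.containsKey(field)) {
                nested.put(field, config.remove(field));
            }
        }
        if (!nested.isEmpty()) {
            config.put("basic_scheduler_config", nested);
        }

        // Replace parallelization_strategy with the corresponding scheduler value.
        Object strategy = config.remove("parallelization_strategy");
        if (strategy != null) {
            switch (String.valueOf(strategy)) {
                case "rts":
                    config.put("scheduler", "enterprise_scheduler");
                    break;
                case "task_stealing_counted":
                case "task_stealing":
                case "segmented":
                    config.put("scheduler", "basic_scheduler");
                    break;
                default:
                    throw new IllegalArgumentException("unknown strategy: " + strategy);
            }
        }
        return config;
    }
}
```

Strategy values other than the four named in this section are rejected rather than guessed.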

2.2.1

  • fixed changeset/graph builder API mixing up values of existing properties (Oracle Bug #24616333)

2.2.0

Features and Improvements

  • Initial release of Apache Spark integration
  • Initial release of a PGX Node.js client
  • A more flexible Analyst API
    • Optional parameters now have default values and can be omitted (overloaded Java methods)
    • The Groovy API in PGX Shell now supports named parameters
    • Existing data structures (e.g. an existing vertex property) can now be reused to hold the result of algorithms
  • Added API to export runtime-compiled algorithms into a .jar file to persist the compilation across different PGX instances. See this tutorial for an example usage.
  • Added support to load graph data in PGB and GraphML formats without the need to specify a graph config first.
  • Added a timing mode to PGX shell
  • Added support for edge keys and string vertex keys in the PGB Format.
  • Added new algorithms to compute the diameter, radius and eccentricity of a graph.
  • Upgraded to PGQL 1.0
  • Optimized PgxGraph#getVertices() and PgxGraph#getEdges() to return an immutable, virtual view on all vertices/edges

Bug Fixes

  • fixed property values of type VERTEX/EDGE not being returned as PgxVertex/PgxEdge objects
  • fixed potential race condition when executing algorithms running on different pools
  • fixed EdgeProperty.ALL giving an error if using remotely during sparsify()
  • fixed problem where running certain algorithms on an undirected graph would throw an exception
  • fixed segmentation fault in bidirectional dijkstra algorithm
  • fixed problem where a user was able to export a graph in flat file format with vertex keys of type string despite the format not supporting this
  • fixed problems in the distributed runtime where loading of node-properties was not performed correctly after graph partitioning
  • fixed segmentation fault while the system is shutting down

Misc

  • Removed support for JDK 7. PGX now requires JDK 8 to be installed.
  • Updated third-party library Spoofax to version 2.0.0
  • Updated third-party library fastutils to version 7.0.12
  • Updated third-party library Apache commons-lang to version 3.4
  • Updated third-party library Apache commons-io to version 2.5
  • Updated third-party library Node.js bundled with PGX.D to version 4.4.7

2.1.0

Features and Improvements

Bug Fixes

  • fixed "graph not found" error when trying to access result of whom-to-follow algorithm (Oracle Bug #23223057)
  • fixed race condition in undirect() implementation which might lead to a corrupt vertex ID mapping on large graphs (Oracle Bug #23527638)
  • fixed race condition in generated code of Bellman Ford algorithm which might lead to incorrect results on large graphs (Oracle Bug #23527644)

2.0.0

Features and Improvements

  • Initial release of the PGX distributed runtime. Unlike the previous versions of PGX, the distributed runtime is capable of leveraging multiple machines. Note that this initial release of the distributed runtime only supports a limited subset of the PGX API.
  • New PGQL features:
    • Added support for regular expressions
    • reduced memory consumption of certain queries
  • Initial release of the PGX server, a web-server based on a preconfigured Tomcat instance, which allows you to quickly deploy PGX on a certain port. PGX server enforces TLS 1.2 with client authentication by default and only allows cipher suites approved by the Oracle security team. You can still deploy the PGX WAR file into a servlet container of your choice.
  • Added new security features to the PGX remote interface:
    • PGX server supports two-way TLS by default
    • Authentication and authorization now based on client certificates by default (no more BASIC auth)
    • PGX now implements the double-submit cookie technique to protect against CSRF attacks
    • Session identifiers are no longer part of the URL, but sent via cookies instead
    • If the PGX server is requiring BASIC auth, the PGX client no longer expects the username and password to be part of the URL
  • Labels are now first-class citizens in PGX. Labels are strings you can use to group a set of vertices or edges together. We support multiple labels per vertex and one label per edge.
    • We updated the ADJ_LIST and EDGE_LIST text file formats as well as the PGB binary format to support the encoding of labels. Note that we updated the formats in a backwards-compatible manner, so older graph files or files without any label information can still be loaded into PGX as before.
    • The edge label in the PG and FLAT_FILE formats is now recognized by PGX.
  • Both FLAT_FILE and TWO_TABLE_TEXT loaders can now read from multiple data files.
  • Added a graph builder API which can be used to build a graph in-memory from scratch or to apply a set of changes to an existing graph.
  • New built-in algorithms: Weighted PageRank and Personalized SALSA.
  • Added new APIs to the PgxVertex class to retrieve all neighbor vertices and all out-going edges.
  • Removed limitation of text file loaders requiring edge keys to be grouped.
  • Added API to disable certain compiler optimizations when compiling Green-Marl code at runtime.
  • Two-tables loader is now reading and storing in parallel.
  • PGX shell now recognizes the Ctrl+C keyboard signal to interrupt the currently running request.
  • Green-Marl compiler improvements:
    • Reduced memory consumption of algorithms which make use of multi-source breadth-first search (MS-BFS)
    • MS-BFS now operates on off-heap memory
    • Reduced initial memory consumption of PGX by lazy initializing of data structures only needed by certain algorithms.
  • Improved internal task dispatching implementation. This improves overall performance if PGX is used in a single-user, batch-mode manner.
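
The double-submit cookie technique mentioned in the security features above works as follows: the server hands the client a random token in a cookie, the client repeats that token in a second place on each request (typically a header or request parameter), and the server accepts the request only if both copies match. The following is a minimal sketch of the token generation and the comparison, not the PGX implementation. The class and method names are hypothetical, and this changelog does not say how PGX names or transports the token.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Base64;

// Hypothetical helper, for illustration only; not part of the PGX API.
final class DoubleSubmitCsrf {
    private static final SecureRandom RANDOM = new SecureRandom();

    private DoubleSubmitCsrf() {}

    /** Creates an unpredictable token (32 random bytes, URL-safe Base64) to place in a cookie. */
    static String newToken() {
        byte[] bytes = new byte[32];
        RANDOM.nextBytes(bytes);
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
    }

    /** Accepts a request only if the cookie copy and the submitted copy are present and equal. */
    static boolean isValid(String cookieToken, String submittedToken) {
        if (cookieToken == null || submittedToken == null || cookieToken.isEmpty()) {
            return false;
        }
        // MessageDigest.isEqual compares the byte arrays without exiting early on a mismatch.
        return MessageDigest.isEqual(
                cookieToken.getBytes(StandardCharsets.UTF_8),
                submittedToken.getBytes(StandardCharsets.UTF_8));
    }
}
```

The protection rests on the fact that a page from another origin can cause the browser to send the cookie but cannot read its value, so it cannot produce the matching second copy.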

Bug Fixes

  • PGX remote deployments now properly support Unicode (Oracle Bug #22713906)
  • fixed binding bug in filter expression conjunctions (Oracle Bug #22713857)
  • FLAT_FILE loader now handles special characters correctly

1.2.1

  • fixed creating a bipartite subgraph not updating edge properties correctly
  • made compiler-generated MS-BFS code more robust for edge cases

1.2.0

Features and Improvements

  • Added basic graph query support - queries can be expressed in PGQL, an SQL-like graph query language specifically designed for property-graph queries.
  • Added in-memory support for vector-type properties and vector scalars.
  • New built-in algorithms: stochastic-gradient-descent, kcore and approximate pagerank.
  • New recommendation API.
  • Added batch breadth-first search optimization: Green-Marl programs which do a BFS search from every vertex in the graph now run up to 100x faster on certain graphs. This optimization is applied to two built-in algorithms: vertex betweenness centrality and closeness centrality.
  • Added support for new text-based file formats: flat-file format
  • Added support for administering a PGX server instance remotely. Users who want to access the administrative interface require special server-side authorization.
  • Added support for more archive formats and protocols: zip, jar, tar, tgz, tbz2, gz, bz2 and ftp(s). All previous formats and protocols (http(s), hdfs, classpath and res) are still supported, but have a new implementation.
  • New Green-Marl compiler features:
    • Performance improvements
    • Added support for edge collections (edge set and edge sequence)
    • Added print() statement
  • PGX Shell improvements:
    • better help screen
    • added support to run scripts via pgx /path/to/script.groovy script-arg1 script-arg2
    • added --max-output-lines parameter to limit the maximum number of elements printed if an iterable is returned
  • New case studies:
  • Added support for one session to point to multiple snapshots of the same graph.
  • Added support for renaming existing transient properties.
  • Performance improvement when setting/getting property values directly on PgxVertex/PgxEdge objects.
  • Transient properties created by the Analyst API now have more meaningful default names.
  • Calls to getVertex(...) and getEdge(...) now verify the given vertex/edge ID exists on the graph.

Misc

  • Updated third-party Cloudera (CDH) dependency to version 5.4.4
  • Updated third-party commons-codec dependency to version 1.10
  • Updated third-party Jackson dependency to version 1.9.2

1.1.1

  • Updated third-party dependency Netty to fix vulnerability (CVE)
  • Updated third-party dependency Groovy to fix vulnerability (CVE)

1.1.0

Features and Improvements

  • Improved Java API
    • A new convenience API has been introduced.
    • The Core interface became internal and thus is not exposed to users.
    • Analyst is no longer session-bound or graph-bound. You can now use the same Analyst to analyze multiple different graphs (or multiple snapshots of the same graph).
    • Every API method has a blocking and a non-blocking version with different names. The non-blocking versions are now suffixed with Async.
    • Updated all documentation to reflect the new API
  • Changes to PGX Shell
    • The new shell (based on Groovy 1.8) is now included in the download package; a separate Groovy installation is no longer required
    • The shell now uses the blocking Java API directly, so shell commands are consistent with the Java API
    • Support for basic UNIX commands (ls, mv, pwd, cp and cat)
    • Built-in javadoc command which prints the Javadocs of shell variables, class names and methods directly in the shell
  • Improved filter expressions: you can now specify both vertex and edge filters. See our filter expression reference.
  • Added more built-in algorithms: degree centrality, degree distribution, filtered Dijkstra, bidirectional Dijkstra, Bellman Ford, hop distance, weakly and strongly connected components
  • Added support for pointing at multiple snapshots of the same graph within the same session
  • Extended data import support
  • Experimental Feature (OTN only) -- Distributed PGX
    • Distributed PGX is our proprietary distributed graph analytic framework that can process very large graph instances by leveraging multiple machines
    • The current version is an experimental preview version intended to eventually evolve into the distributed back-end for PGX
    • Refer to the experimental/dist/doc directory for documentation
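
The blocking/Async naming convention from the Java API items above can be sketched as follows. The class and method names (AsyncNamingSketch, countEdges, countEdgesAsync) are hypothetical and chosen only for illustration; they are not PGX methods. The sketch assumes the non-blocking variant returns a future and the blocking variant simply waits on it.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical illustration of the naming convention; NOT the PGX API.
class AsyncNamingSketch {
    // Non-blocking variant: carries the Async suffix and returns a future.
    static CompletableFuture<Integer> countEdgesAsync(List<int[]> edges) {
        return CompletableFuture.supplyAsync(edges::size);
    }

    // Blocking variant: same name without the suffix, waits for the result.
    static int countEdges(List<int[]> edges) {
        return countEdgesAsync(edges).join();
    }

    public static void main(String[] args) {
        List<int[]> edges = Arrays.asList(new int[] {0, 1}, new int[] {1, 2});
        // Blocking call: the result is available on return.
        System.out.println(countEdges(edges));
        // Non-blocking call: attach a callback, then wait so the JVM does not exit early.
        countEdgesAsync(edges).thenAccept(System.out::println).join();
    }
}
```

Implementing the blocking method on top of the non-blocking one keeps the two variants from drifting apart, which is one plausible reason for pairing them by name.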

Bug Fixes

  • Fixed a bug where the Java client sent an incorrect request when trying to delete a transient property
  • Fixed an occasional crash during reverse edge creation and sorting of huge graphs with a single thread

1.0.0

  • Compiler optimization: improved performance by merging properties
  • Compiler robustness: better dead-code and return statement checks
  • Reduced memory consumption of date properties
  • Reduced download package size
  • Improved Javadocs

0.9.1

  • Improved documentation
  • Fixed a bug that led to an array index out of bounds error when undirecting a certain type of graph
  • Fixed a bug that wrongly rejected the list of given edge properties when requesting a bipartite subgraph in the remote case

0.9.0

  • 64-bit support: ability to load more than 2^32 edges
  • Added remote support
    • Deploy PGX as a web application
    • Core interface exposed via REST
    • Connect to running web application via HTTP, client PGX shell or client Java application
  • Added Hadoop support
    • Load/Store graph data from/to HDFS
    • Run PGX as YARN application (single-node only)
  • New built-in algorithm: Fattest path
  • Allow modification of the PGX runtime configuration values
  • Added support for PGX-managed scalars
  • New APIs to modify PGX-managed maps
  • Added support to load graph data from Oracle NoSQL
  • Added support to load graph data from Apache HBase
  • Create subgraphs from filter expressions
  • Updated Groovy dependency to 2.4.0
  • Support for Green-Marl specification 0.6.2
    • Placeholder in group assignments
    • Read-only input arguments
    • Removed @-syntax for reductions
    • Date/time type and edge built-ins
  • Simplified sparsification APIs
  • Added more Green-Marl compiler optimizations
  • Decreased the size of the Green-Marl compiler binary
  • Added Green-Marl compiler support for 32-bit Linux platforms
  • Improved binary format loader: use memory mapping for better performance
  • Refactored text-based graph loaders
  • Load file-based graph data from classpath
  • Improved graph configuration handling
    • Added support for inheritance of configuration schemas
    • Added schema-specific graph factories and builders
    • Added support for loading configuration files from classpath
    • Added support for configuration files in the Java properties format
  • Analysis timeout is now called task timeout and also applies to loading tasks
  • Added constants Properties.ALL and Properties.NONE to fix inconsistencies between null and empty lists
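
One plausible reading of the last item is that explicit constants replace the ambiguous use of null and empty lists to mean "all properties" or "no properties". The sketch below illustrates that idea with a hypothetical PropertySelection class; the real PGX Properties class and its semantics may differ.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of sentinel constants; NOT the PGX Properties class.
final class PropertySelection {
    // Shared sentinel instances, recognized by identity rather than content.
    static final PropertySelection ALL = new PropertySelection(Collections.emptyList());
    static final PropertySelection NONE = new PropertySelection(Collections.emptyList());

    private final List<String> names;

    private PropertySelection(List<String> names) {
        this.names = names;
    }

    // Explicit selection of named properties.
    static PropertySelection of(String... names) {
        return new PropertySelection(Arrays.asList(names));
    }

    // Resolve the selection against the properties that actually exist.
    List<String> resolve(List<String> available) {
        if (this == ALL) {
            return available;
        }
        if (this == NONE) {
            return Collections.emptyList();
        }
        return names;
    }

    public static void main(String[] args) {
        List<String> available = Arrays.asList("name", "age");
        System.out.println(PropertySelection.ALL.resolve(available));
        System.out.println(PropertySelection.NONE.resolve(available));
        System.out.println(PropertySelection.of("age").resolve(available));
    }
}
```

With named sentinels, a caller never has to guess whether an empty list means "everything" or "nothing".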

0.8.1

  • Added more built-in algorithms
    • personalized PageRank
    • node betweenness centrality
    • approximate node betweenness centrality
    • approximate node betweenness centrality from seeds
    • closeness centrality unit length
    • closeness centrality double length
    • HITS
    • eigenvector centrality
    • out degree centrality
    • in degree centrality
    • random walk with restart
  • Fixed some bugs in groovysh
    • Print false, 0.0, 0, etc. if they are the result of a command (see bug report)

0.8.0

Initial release