PGX 1.2.0
Documentation

Changelog

1.2.0

  • Added basic graph query support: queries can be expressed in PGQL, an SQL-like query language designed specifically for property graphs.
  • Added in-memory support for vector-type properties and vector scalars.
  • New built-in algorithms: stochastic gradient descent, k-core and approximate PageRank.
  • New recommendation API.
  • Added batch breadth-first search optimization: Green-Marl programs that run a BFS from every vertex in the graph now run up to 100x faster on certain graphs. This optimization is applied to two built-in algorithms: vertex betweenness centrality and closeness centrality.
  • Added support for a new text-based file format: flat file.
  • Added support for administering a PGX server instance remotely. Users who want to access the administrative interface require special server-side authorization.
  • Added support for more archive formats and protocols: zip, jar, tar, tgz, tbz2, gz, bz2 and ftp(s). All previous formats and protocols (http(s), hdfs, classpath and res) are still supported, but have a new implementation.
  • New Green-Marl compiler features:
    • Performance improvements
    • Added support for edge collections (edge set and edge sequence)
    • Added print() statement
  • PGX Shell improvements:
    • Better help screen
    • Added support for running scripts via pgx /path/to/script.groovy script-arg1 script-arg2
    • Added --max-output-lines parameter to limit the number of elements printed when an iterable is returned
  • New case studies:
  • Added support for one session to point to multiple snapshots of the same graph.
  • Added support for renaming existing transient properties.
  • Performance improvement when setting/getting property values directly on PgxVertex/PgxEdge objects.
  • Transient properties created by the Analyst API now have more meaningful default names.
  • Calls to getVertex(...) and getEdge(...) now verify the given vertex/edge ID exists on the graph.
  • Updated third-party Cloudera (CDH) dependency to version 5.4.4
  • Updated third-party commons-codec dependency to version 1.10
  • Updated third-party Jackson dependency to version 1.9.2
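
The access pattern targeted by the batch breadth-first search optimization (one BFS from every vertex) can be sketched in plain Java. This is an illustration only: it does not use the PGX API or Green-Marl, the class and method names are hypothetical, and the closeness formula shown (the reciprocal of the summed hop distances) is one common definition that may differ from the built-in algorithm.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

public class ClosenessSketch {
    // Unweighted BFS from one source; returns hop distances (-1 = unreachable).
    public static int[] bfs(int[][] adj, int source) {
        int[] dist = new int[adj.length];
        Arrays.fill(dist, -1);
        dist[source] = 0;
        Deque<Integer> queue = new ArrayDeque<>();
        queue.add(source);
        while (!queue.isEmpty()) {
            int v = queue.poll();
            for (int w : adj[v]) {
                if (dist[w] == -1) {
                    dist[w] = dist[v] + 1;
                    queue.add(w);
                }
            }
        }
        return dist;
    }

    // Closeness of v = 1 / (sum of hop distances from v to every reachable vertex).
    // Vertices that reach nothing get 0.0.
    public static double[] closeness(int[][] adj) {
        double[] result = new double[adj.length];
        for (int v = 0; v < adj.length; v++) { // one full BFS per vertex
            long sum = 0;
            for (int d : bfs(adj, v)) {
                if (d > 0) {
                    sum += d;
                }
            }
            result[v] = (sum == 0) ? 0.0 : 1.0 / sum;
        }
        return result;
    }

    public static void main(String[] args) {
        // Path graph 0 - 1 - 2, stored as undirected adjacency lists.
        int[][] adj = { {1}, {0, 2}, {1} };
        System.out.println(Arrays.toString(closeness(adj)));
    }
}
```

Each call to bfs is independent of the others, so the naive loop repeats a full traversal once per vertex. That repeated work is what a batched BFS is meant to reduce.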

1.1.1

  • Updated third-party dependency Netty to fix vulnerability (CVE)
  • Updated third-party dependency Groovy to fix vulnerability (CVE)

1.1.0

  • Improved Java API
    • A new convenience API has been introduced.
    • The Core interface is now internal and no longer exposed to users.
    • Analyst is no longer session-bound or graph-bound. You can now use the same Analyst to analyze multiple different graphs (or multiple snapshots of the same graph).
    • Every API method now has a blocking and a non-blocking version with different names. The non-blocking versions are suffixed with Async.
    • Updated all documentation to reflect the new API
  • Changes to PGX Shell
    • The new shell (based on Groovy 1.8) is now included in the download package -- a separate Groovy installation is no longer required
    • The shell now uses the blocking Java API directly, so shell commands are consistent with the Java API
    • Support for basic UNIX commands (ls, mv, pwd, cp and cat)
    • Built-in javadoc command which prints the Javadocs of shell variables, class names and methods directly in the shell
  • Improved filter expressions: you can now specify both vertex and edge filters. See our filter expression reference.
  • Added more built-in algorithms: degree centrality, degree distribution, filtered Dijkstra, bidirectional Dijkstra, Bellman-Ford, hop distance, weakly and strongly connected components
  • Added support for pointing to multiple snapshots of the same graph within the same session
  • Extended support for data import
  • Experimental Feature (OTN only) -- Distributed PGX
    • Distributed PGX is our proprietary distributed graph analytic framework that can process very large graph instances by leveraging multiple machines
    • The current version is an experimental preview version intended to eventually evolve into the distributed back-end for PGX
    • Refer to the experimental/dist/doc directory for documentation
  • Bug Fixes
    • Fixed bug: Java client sent an incorrect request when trying to delete a transient property
    • Fixed bug: Occasional crash during reverse edge creation and sorting of huge graphs with a single thread
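
The blocking/non-blocking naming convention introduced with the improved Java API in 1.1.0 can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: countEdges and countEdgesAsync are hypothetical names rather than PGX methods, and the sketch uses the JDK's CompletableFuture, which is not necessarily the future type the PGX API returns.

```java
import java.util.concurrent.CompletableFuture;

public class AsyncNamingSketch {
    // Hypothetical non-blocking variant: returns immediately with a future.
    // By convention, its name carries the Async suffix.
    public static CompletableFuture<Integer> countEdgesAsync(int[][] adj) {
        return CompletableFuture.supplyAsync(() -> {
            int count = 0;
            for (int[] neighbors : adj) {
                count += neighbors.length; // one adjacency entry per directed edge
            }
            return count;
        });
    }

    // Hypothetical blocking variant: same name without the suffix.
    // It simply waits for the asynchronous version to complete.
    public static int countEdges(int[][] adj) {
        return countEdgesAsync(adj).join();
    }

    public static void main(String[] args) {
        int[][] adj = { {1}, {0, 2}, {1} };
        System.out.println(countEdges(adj));
        countEdgesAsync(adj).thenAccept(System.out::println).join();
    }
}
```

Defining the blocking method as a thin wrapper around the Async one keeps the two variants behaviorally identical, which is one plausible way to implement such a convention.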

1.0.0

  • Compiler optimization: improved performance by merging properties
  • Compiler robustness: better dead-code and return statement checks
  • Reduced memory consumption of date properties
  • Reduced download package size
  • Improved Javadocs

0.9.1

  • Improved documentation
  • Fixed bug which led to an array index out of bounds error when undirecting a certain type of graph
  • Fixed bug which incorrectly rejected the list of given edge properties when requesting a bipartite subgraph in the remote case

0.9.0

  • 64-bit support: ability to load more than 2^32 edges
  • Added remote support
    • Deploy PGX as a web application
    • Core interface exposed via REST
    • Connect to running web application via HTTP, client PGX shell or client Java application
  • Added Hadoop support
    • Load/Store graph data from/to HDFS
    • Run PGX as YARN application (single-node only)
  • New built-in algorithm: Fattest path
  • Allow modification of the PGX runtime configuration values
  • Added support for PGX-managed scalars
  • New APIs to modify PGX-managed maps
  • Added support to load graph data from Oracle NoSQL
  • Added support to load graph data from Apache HBase
  • Create subgraphs from filter expressions
  • Updated Groovy dependency to 2.4.0
  • Support for Green-Marl specification 0.6.2
    • Placeholder in group assignments
    • Read-only input arguments
    • Removed @-syntax for reductions
    • Date/time type and edge built-ins
  • Simplified sparsification APIs
  • Added more Green-Marl compiler optimizations
  • Decreased the size of the Green-Marl compiler binary
  • Added Green-Marl compiler support for 32-bit Linux platforms
  • Improved binary format loader: use memory mapping for better performance
  • Refactored text-based graph loaders
  • Load file-based graph data from classpath
  • Improved graph configuration handling
    • Added support for inheritance of configuration schemas
    • Added schema-specific graph factories and builders
    • Added support for loading configuration files from classpath
    • Added support for configuration files in the Java properties format
  • Analysis timeout is now called task timeout and also applies to loading tasks
  • Added constants Properties.ALL and Properties.NONE to fix inconsistencies between null and empty lists

0.8.1

  • Added more built-in algorithms
    • personalized PageRank
    • node betweenness centrality
    • approximate node betweenness centrality
    • approximate node betweenness centrality from seeds
    • closeness centrality unit length
    • closeness centrality double length
    • HITS
    • eigenvector centrality
    • out degree centrality
    • in degree centrality
    • random walk with restart
  • Fixed some bugs in groovysh
    • Print false, 0.0, 0, etc. if they are the result of a command (see bug report)

0.8.0

Initial release