Getting a Data Guide for a Collection with SODA for In-Database JavaScript

A data guide is a summary of the structural and type information contained in a set of JSON documents. It records metadata about the fields used in those documents. Data guides provide insight into the shape of JSON documents and are useful for getting an overview of a data set.

You can create a data guide using SodaCollection.getDataGuide(). To get a data guide in SODA, the collection must be JSON-only and have a JSON search index where the "dataguide" option is "on". SodaCollection.getDataGuide() returns the data guide as JSON content in a SodaDocument. The data guide is inferred from the current contents of the collection. As the collection grows and its documents change, each subsequent call to getDataGuide() returns a new data guide that reflects the collection at the time of the call.
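The index prerequisite described above could be met with something like the following sketch. It assumes that the collection object exposes createIndex() and that the index specification accepts the "name" and "dataguide" fields; the helper name createDataGuideIndex and the index name EMP_SEARCH_DG_IDX are hypothetical and chosen only for illustration.

```javascript
// Hypothetical helper (not part of SODA): creates a JSON search index
// with the data guide option enabled on the given collection.
// Assumption: `col` is a SodaCollection, for example the value returned
// by soda.openCollection('employeesCollection').
function createDataGuideIndex(col, indexName = "EMP_SEARCH_DG_IDX") {
  // Index specification; "dataguide": "on" is the option that
  // getDataGuide() depends on. The index name is illustrative.
  const indexSpec = {
    name: indexName,
    dataguide: "on"
  };

  col.createIndex(indexSpec);
  return indexSpec;
}
```

Passing the collection in as a parameter keeps the helper independent of how the collection was opened, so the same function can be reused for any JSON-only collection.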

Example 8-21 Generating a Data Guide for a Collection

This example gets a data guide for the collection employeesCollection (created in Example 8-8) using the method getDataGuide() and then prints the contents as a string using the method getContentAsString().

export function createDataGuide(){

  // open the collection
  const col = soda.openCollection('employeesCollection');
  if(col === null){
    throw new Error("'employeesCollection' does not exist");
  }

  // generate a data guide (requires a JSON search index
  // with the "dataguide" option set to "on")
  const doc = col.getDataGuide();
  console.log(doc.getContentAsString());
}

The data guide lists every field found in the collection together with its data type. Although the structure of employeesCollection may already be familiar to readers of this chapter, the same approach is a convenient way to analyze unfamiliar JSON documents. The previous code block prints the following data guide:

{
  "type": "object",
  "o:length": 1,
  "properties": {
    "_id": {
      "type": "id",
      "o:length": 24,
      "o:preferred_column_name": "DATA$_id"
    },
    "email": {
      "type": "string",
      "o:length": 16,
      "o:preferred_column_name": "DATA$email"
    },
    "jobId": {
      "type": "string",
      "o:length": 16,
      "o:preferred_column_name": "DATA$jobId"
    },
    "salary": {
      "type": "number",
      "o:length": 8,
      "o:preferred_column_name": "DATA$salary"
    },
    "hireDate": {
      "type": "string",
      "o:length": 32,
      "o:preferred_column_name": "DATA$hireDate"
    },
    "lastName": {
      "type": "string",
      "o:length": 16,
      "o:preferred_column_name": "DATA$lastName"
    },
    "firstName": {
      "type": "string",
      "o:length": 16,
      "o:preferred_column_name": "DATA$firstName"
    },
    "managerId": {
      "type": "string",
      "o:length": 4,
      "o:preferred_column_name": "DATA$managerId"
    },
    "employeeId": {
      "type": "number",
      "o:length": 4,
      "o:preferred_column_name": "DATA$employeeId"
    },
    "phoneNumber": {
      "type": "string",
      "o:length": 16,
      "o:preferred_column_name": "DATA$phoneNumber"
    },
    "departmentId": {
      "type": "string",
      "o:length": 4,
      "o:preferred_column_name": "DATA$departmentId"
    },
    "commissionPct": {
      "type": "string",
      "o:length": 32,
      "o:preferred_column_name": "DATA$commissionPct"
    }
  }
}
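To work with a data guide programmatically rather than reading the printed JSON, its content can be flattened into a list of fields. The following is a minimal sketch, assuming the data guide has the shape shown above (a "properties" object keyed by field name, each entry carrying "type" and "o:length"). The function listDataGuideFields is a hypothetical helper, not part of SODA. It descends into nested objects but not into arrays.

```javascript
// Hypothetical helper (not part of SODA): flattens a data guide into a
// list of { path, type, length } entries.
// Assumption: the input is the string returned by getContentAsString(),
// or the equivalent parsed object, in the format printed above.
function listDataGuideFields(dataGuide, prefix = "") {
  const guide = typeof dataGuide === "string"
    ? JSON.parse(dataGuide)
    : dataGuide;

  const fields = [];
  for (const [name, info] of Object.entries(guide.properties ?? {})) {
    // Build a dotted path so nested fields stay distinguishable.
    const path = prefix ? `${prefix}.${name}` : name;
    fields.push({ path, type: info.type, length: info["o:length"] });

    // Recurse into nested objects that describe their own properties.
    if (info.type === "object" && info.properties) {
      fields.push(...listDataGuideFields(info, path));
    }
  }
  return fields;
}

// Usage with a two-field excerpt of the data guide above.
const excerpt = JSON.stringify({
  type: "object",
  properties: {
    salary: { type: "number", "o:length": 8 },
    lastName: { type: "string", "o:length": 16 }
  }
});
for (const f of listDataGuideFields(excerpt)) {
  console.log(`${f.path}: ${f.type} (${f.length})`);
}
```

Inside the database, the string passed to the helper would typically come from doc.getContentAsString(), as in Example 8-21.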