Refining Data

Refining your data further makes it easier to share data from Data Exchange, enrich it for discovery, and prepare language-enriched data for Generative AI. After you have refined your data for these purposes, you can review the examples for querying JSON or Language data.

JSON Data Ready Automatically

When you share data from Data Exchange, JSON data is prepared automatically on top of the entities in the industry model schema. The JSON Views that provide this data are created automatically in your model schema with a #JSON suffix. For example, the standardized entity PERSON has a corresponding JSON View, PERSON#JSON, created by default.
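
To confirm that a JSON View is available, you can query it directly. The statement below is a minimal sketch, assuming the PERSON entity exists in your model schema and that {your_model_schema} is replaced with your schema name. It selects all columns because the column layout of the view is not described here, and it uses the FETCH FIRST row-limiting clause, which requires Oracle Database 12c or later.

     -- Peek at a few rows of the automatically created JSON View.
     -- Assumption: PERSON is a standardized entity in your model schema.
     SELECT *
       FROM {your_model_schema}.PERSON#JSON
      FETCH FIRST 5 ROWS ONLY;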

Prepping Language Data

Data Exchange can create Language Views that provide semantic enrichment for any entity in the industry model schema for which you have synced data. These Language Views appear in your model schema with a #LANGUAGE suffix. For example, the standardized entity PERSON has a Language View, PERSON#LANGUAGE, created as part of this routine.

To prepare Language Data on top of your Model Schema:

After you have mapped and standardized your data, run the steps below to connect your data relationships and prepare language data.

  1. Run the SYNC_TRIPLE_STORE procedure to dynamically connect data relationships, which simplifies querying your language data. Rerun this procedure whenever new data arrives in your model; running it right after SYNC_COLLECTION keeps the graph relationships up to date. The procedure adds triples to the DE_TRIPLE_STORE table and surfaces dynamic data connections through the DYNAMIC_RELATED_ENTITIES view in your Collection.

     SET SERVEROUTPUT ON
     
     EXEC {your_model_schema}.DE_COLLECTION.SYNC_TRIPLE_STORE;
    
  2. Run the REFRESH_LANGUAGE_VIEWS procedure to create Language Views on top of your industry model data. Every Language View it creates has a #LANGUAGE suffix, and a unifying view named ALL#LANGUAGE is also created to simplify query patterns. You need to rerun this procedure only if you change or add mapping views in your Collection.

     SET SERVEROUTPUT ON
     
     EXEC {your_model_schema}.DE_AI.REFRESH_LANGUAGE_VIEWS;
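
After both procedures complete, a couple of queries can help confirm the results. This is a sketch under stated assumptions rather than a documented verification routine: it assumes your account can see the model schema objects, that {your_model_schema} is replaced with your schema name, and that DE_TRIPLE_STORE is reachable under that schema (if it lives in your Collection schema instead, adjust the owner). ALL_VIEWS is the standard Oracle data dictionary view.

     -- List the Language Views created by REFRESH_LANGUAGE_VIEWS,
     -- including the unifying ALL#LANGUAGE view.
     SELECT view_name
       FROM all_views
      WHERE owner = UPPER('{your_model_schema}')
        AND view_name LIKE '%#LANGUAGE'
      ORDER BY view_name;

     -- Count the triples added by SYNC_TRIPLE_STORE.
     -- Assumption: the table is queryable under this schema name.
     SELECT COUNT(*) AS triple_count
       FROM {your_model_schema}.DE_TRIPLE_STORE;

If the first query returns no rows, the Language Views were likely not created; if the count does not grow after new data arrives, SYNC_TRIPLE_STORE may need to be rerun.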