The Data Enrichment modules increase the usability of your data by discovering value in its content.
The Data Enrichment package bundles a collection of modules together with the logic that associates each module with a column of data. For example, an address column can be detected and associated with a GeoTagger module.
During the sampling phase of the Data Processing workflow, some of the Data Enrichment modules run automatically while others do not. You cannot configure which modules run automatically. However, you can run any module manually from the Transform page in Studio.
When Data Processing is running against a Hive table, the Data Enrichment modules that run automatically obtain their input pre-screened by the sampling stage. For example, only an IP address is ever passed to the IP Address GeoTagger module.
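As a rough illustration of what pre-screening means, the following Python sketch filters a set of sampled values so that only well-formed IP addresses would reach a tagging step. It is not the product's implementation: the function names `is_ip_address` and `prescreen_ip_values` and the sample values are assumptions made for this example, and the validation relies on Python's standard `ipaddress` module.

```python
import ipaddress


def is_ip_address(value):
    """Return True if value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value.strip())
        return True
    except (ValueError, AttributeError):
        # ValueError: not a valid address; AttributeError: value is not a string.
        return False


def prescreen_ip_values(values):
    """Hypothetical pre-screen: keep only values that are valid IP addresses."""
    return [v for v in values if is_ip_address(v)]


# Illustrative sample; only the valid addresses survive the pre-screen.
sample = ["192.0.2.10", "not an address", "2001:db8::1", "", None]
screened = prescreen_ip_values(sample)
```

In this sketch, anything that fails validation is dropped before the hypothetical tagging step, which mirrors the idea that the module itself never has to handle non-IP input.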
All Data Enrichment modules ignore both the primary-key attribute of a record and any attribute whose data type is inappropriate for that module. For example, the Entity extractor works only on string attributes, so numeric attributes are ignored.
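The rule above can be sketched as a simple filter. This is a minimal illustration under stated assumptions, not the product's code: the `Attribute` class, the `eligible_attributes` function, and the type names `"string"` and `"double"` are hypothetical and exist only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    """Hypothetical description of one attribute of a record."""
    name: str
    data_type: str
    is_primary_key: bool = False


def eligible_attributes(attributes, accepted_types):
    """Skip the primary-key attribute and any attribute of an unsupported type."""
    return [
        a for a in attributes
        if not a.is_primary_key and a.data_type in accepted_types
    ]


# Illustrative record layout: a string primary key, a string attribute,
# and a numeric attribute.
attrs = [
    Attribute("record_id", "string", is_primary_key=True),
    Attribute("review_text", "string"),
    Attribute("price", "double"),
]

# A module that accepts only string attributes, as the Entity extractor does.
selected = eligible_attributes(attrs, accepted_types={"string"})
names = [a.name for a in selected]
```

Here `record_id` is skipped because it is the primary key, even though it is a string, and `price` is skipped because its type is not accepted, leaving only `review_text`.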
After the Data Processing workflow finishes, you can manually run any of these modules from the Transform page in Studio.
The supported languages are specific to each module. For details, see the topic for the module.
The types and names of output attributes are specific to each module. For details on output attributes, see the topic for the module.