Work with Cognitive Text Functionality
Cognitive Text functionality, along with Model Repository and Model Deployment, is a component of Oracle Machine Learning Services. It includes the following capabilities:
- Topic detection or topic discovery
- Keyword identification
- Summary
- Sentiment analysis
- Similarity
- Feature extraction
The supported languages for cognitive text include English (American), Spanish, French and Italian.
The jq utility can be used to parse the JSON output into a readable format. It is included in the repositories of all major Linux distributions. On Oracle Linux and Red Hat systems, install it with the following command:
$ sudo yum install jq
Note:
Using jq in the cURL command does not return the HTTP response status and headers. To include them in the output, remove the jq command and add the -i flag to the curl command.
1: Get a List of Model Endpoints
This example demonstrates how to return all cognitive text endpoints provided by OML Services. You cannot create new cognitive text endpoints, but you may browse and score against endpoints depending on access.
Run the following cURL command to get a list of all cognitive text endpoints:
curl -X GET --header "Authorization: Bearer ${token}" "<oml-cloud-service-location-url>/omlmod/v1/cognitive-text"
The command returns the following details:
{
  "items": [
    {
      "name": "topics",
      "description": "A cognitive text scoring endpoint that returns the most important topics in the provided text list.",
      "links": [
        { "rel": "topics", "href": "https://adb.us-sanjose-1.oraclecloud.com/omlmod/v1/cognitive-text/topics" },
        { "rel": "self", "href": "https://adb.us-sanjose-1.oraclecloud.com/omlmod/v1/cognitive-text/." }
      ]
    },
    {
      "name": "keywords",
      "description": "A cognitive text scoring endpoint that returns the most important keywords in the provided text list.",
      "links": [
        { "rel": "keywords", "href": "https://adb.us-sanjose-1.oraclecloud.com/omlmod/v1/cognitive-text/keywords" },
        { "rel": "self", "href": "https://adb.us-sanjose-1.oraclecloud.com/omlmod/v1/cognitive-text/." }
      ]
    },
    {
      "name": "summary",
      "description": "A cognitive text scoring endpoint that returns the summary of the provided text list.",
      "links": [
        { "rel": "summary", "href": "https://adb.us-sanjose-1.oraclecloud.com/omlmod/v1/cognitive-text/summary" },
        { "rel": "self", "href": "https://adb.us-sanjose-1.oraclecloud.com/omlmod/v1/cognitive-text/." }
      ]
    },
    {
      "name": "similarity",
      "description": "A cognitive text scoring endpoint that returns the similarity of the probe text and a list of texts.",
      "links": [
        { "rel": "similarity", "href": "https://adb.us-sanjose-1.oraclecloud.com/omlmod/v1/cognitive-text/similarity" },
        { "rel": "self", "href": "https://adb.us-sanjose-1.oraclecloud.com/omlmod/v1/cognitive-text/." }
      ]
    },
    {
      "name": "sentiment",
      "description": "A cognitive text scoring endpoint that returns the sentiment of the provided text list.",
      "links": [
        { "rel": "sentiment", "href": "https://adb.us-sanjose-1.oraclecloud.com/omlmod/v1/cognitive-text/sentiment" },
        { "rel": "self", "href": "https://adb.us-sanjose-1.oraclecloud.com/omlmod/v1/cognitive-text/." }
      ]
    },
    {
      "name": "features",
      "description": "A cognitive text scoring endpoint that returns the features of the provided text list.",
      "links": [
        { "rel": "features", "href": "https://adb.us-sanjose-1.oraclecloud.com/omlmod/v1/cognitive-text/features" },
        { "rel": "self", "href": "https://adb.us-sanjose-1.oraclecloud.com/omlmod/v1/cognitive-text/." }
      ]
    }
  ],
  "links": [
    { "rel": "self", "href": "https://adb.us-sanjose-1.oraclecloud.com/omlmod/v1/cognitive-text" }
  ]
}
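When the listing is consumed by a script rather than read by eye, it can help to reduce it to a name-to-URL map. The following is a minimal Python sketch, assuming the response has the shape shown above. The function name endpoint_urls is hypothetical, and the sample is an abbreviated copy of the listing with the description fields omitted.

```python
import json


def endpoint_urls(listing: dict) -> dict:
    """Map each endpoint name to the href of the link whose rel equals that name."""
    urls = {}
    for item in listing.get("items", []):
        name = item["name"]
        for link in item.get("links", []):
            # Each item carries a link named after the endpoint plus a "self" link.
            if link.get("rel") == name:
                urls[name] = link["href"]
    return urls


# Abbreviated copy of the listing (two of the six items, descriptions omitted).
sample = json.loads("""
{"items": [
  {"name": "topics", "links": [
    {"rel": "topics", "href": "https://adb.us-sanjose-1.oraclecloud.com/omlmod/v1/cognitive-text/topics"},
    {"rel": "self", "href": "https://adb.us-sanjose-1.oraclecloud.com/omlmod/v1/cognitive-text/."}]},
  {"name": "keywords", "links": [
    {"rel": "keywords", "href": "https://adb.us-sanjose-1.oraclecloud.com/omlmod/v1/cognitive-text/keywords"},
    {"rel": "self", "href": "https://adb.us-sanjose-1.oraclecloud.com/omlmod/v1/cognitive-text/."}]}
]}
""")

for name, href in endpoint_urls(sample).items():
    print(name, href)
```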
2: Return Most Relevant Text Keywords
This example demonstrates how to return the most relevant keywords for any list of text strings.
You can pass the parameter topN as part of the input to determine how many keywords to return per document. By default, 5 keywords are returned. You can also pass a language parameter. The default value is AMERICAN. Other supported values are FRENCH, SPANISH, and ITALIAN. The keywords are sorted by weight.
curl -X POST "<oml-cloud-service-location-url>/omlmod/v1/cognitive-text/keywords" \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer ${token}" \
--data '{
"topN":2,
"textList":["With Oracle Machine Learning, Oracle moves the algorithms to the data. Oracle runs machine learning within the database, where the data reside. This approach minimizes or eliminates data movement, achieves scalability, preserves data security, and accelerates time-to-model deployment. Oracle delivers parallelized in-database implementations of machine learning algorithms and integration with the leading open source environments R and Python. Oracle Machine Learning delivers the performance, scalability, and automation required by enterprise-scale data science projects – both on-premises and in the Cloud."]
}' | jq
In this example, topN is set to 2. The command returns the following two keywords, along with their respective weights:
[
  {
    "text": "With Oracle Machine Learning, Oracle moves the algorithms to the data. Oracle runs machine learning within the database, where the data reside. This approach minimizes or eliminates data movement, achieves scalability, preserves data security, and accelerates time-to-model deployment. Oracle delivers parallelized in-database implementations of machine learning algorithms and integration with the leading open source environments R and Python. Oracle Machine Learning delivers the performance, scalability, and automation required by enterprise-scale data science projects – both on-premises and in the Cloud.",
    "keywordResults": [
      { "keyword": "Oracle", "weight": 0.4857284784454714 },
      { "keyword": "algorithms", "weight": 0.43990432999766216 }
    ]
  }
]
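The same kind of request can also be assembled from a script. The following Python sketch builds, but does not send, a POST request for the keywords endpoint using only the standard library. It rests on several assumptions: the function name build_keywords_request is hypothetical, the base URL and token are placeholders, and the JSON key "language" is assumed from the description of the language parameter rather than confirmed by an example in this topic.

```python
import json
import urllib.request

# Language values listed in this topic.
SUPPORTED_LANGUAGES = {"AMERICAN", "FRENCH", "SPANISH", "ITALIAN"}


def build_keywords_request(base_url, token, text_list, top_n=None, language=None):
    """Build (but do not send) a POST request for the keywords endpoint."""
    payload = {"textList": list(text_list)}
    if top_n is not None:
        payload["topN"] = top_n  # leave out to use the service default
    if language is not None:
        if language not in SUPPORTED_LANGUAGES:
            raise ValueError(f"unsupported language: {language}")
        payload["language"] = language  # assumed key name, see lead-in
    return urllib.request.Request(
        url=f"{base_url}/omlmod/v1/cognitive-text/keywords",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# Placeholder base URL and token; replace them with real values before sending.
req = build_keywords_request("https://example.invalid", "TOKEN", ["some text"], top_n=2)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To send the request: urllib.request.urlopen(req)
```

Validating the language value before the call catches a typo locally instead of relying on an error response from the service.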
3: Return Text Summaries
This example demonstrates how to return a summary for any list of text strings. You can pass the parameter topN as part of the input to determine how many sentences to return per document. By default, topN is set to 3. You can also pass a language parameter. The default value is AMERICAN. Other supported values for the language parameter are FRENCH, SPANISH, and ITALIAN.
Note:
The sentences in the summaries are returned in order of occurrence in the provided text strings.
curl -X POST "<oml-cloud-service-location-url>/omlmod/v1/cognitive-text/summary" \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer ${token}" \
--data '{
"topN":2,
"textList":["With Oracle Machine Learning, Oracle moves the algorithms to the data. Oracle runs machine learning within the database, where the data reside. This approach minimizes or eliminates data movement, achieves scalability, preserves data security, and accelerates time-to-model deployment. Oracle delivers parallelized in-database implementations of machine learning algorithms and integration with the leading open source environments R and Python. Oracle Machine Learning delivers the performance, scalability, and automation required by enterprise-scale data science projects – both on-premises and in the Cloud."]
}' | jq
In this example, topN is set to 2. The command returns the following two summary sentences, along with their respective weights:
[
  {
    "text": "With Oracle Machine Learning, Oracle moves the algorithms to the data. Oracle runs machine learning within the database, where the data reside. This approach minimizes or eliminates data movement, achieves scalability, preserves data security, and accelerates time-to-model deployment. Oracle delivers parallelized in-database implementations of machine learning algorithms and integration with the leading open source environments R and Python. Oracle Machine Learning delivers the performance, scalability, and automation required by enterprise-scale data science projects – both on-premises and in the Cloud.",
    "summaryResults": [
      {
        "sentence": "Oracle delivers parallelized in-database implementations of machine learning algorithms and integration with the leading open source environments R and Python. ",
        "weight": 0.841429024903614
      },
      {
        "sentence": "Oracle Machine Learning delivers the performance, scalability, and automation required by enterprise-scale data science projects – both on-premises and in the Cloud.",
        "weight": 0.7717966493441589
      }
    ]
  }
]
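Because the sentences arrive in order of occurrence, they can be joined directly into a readable summary. The following Python sketch assumes the response shape shown above. The function name summaries is hypothetical, and the sample omits the echoed text field for brevity.

```python
def summaries(response):
    """Join each document's selected sentences into one summary string."""
    return [
        # Strip stray whitespace so the joined sentences are single-spaced.
        " ".join(s["sentence"].strip() for s in doc["summaryResults"])
        for doc in response
    ]


# Summary results from the example output (the "text" field is omitted here).
sample = [
    {
        "summaryResults": [
            {
                "sentence": "Oracle delivers parallelized in-database implementations of machine learning algorithms and integration with the leading open source environments R and Python. ",
                "weight": 0.841429024903614,
            },
            {
                "sentence": "Oracle Machine Learning delivers the performance, scalability, and automation required by enterprise-scale data science projects – both on-premises and in the Cloud.",
                "weight": 0.7717966493441589,
            },
        ]
    }
]

for summary in summaries(sample):
    print(summary)
```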
4: Return Text Sentiment
This example demonstrates how to return the sentiment for any given list of text strings. Sentiment is an enumerated type with the values positive, neutral, and negative.
curl -X POST "<oml-cloud-service-location-url>/omlmod/v1/cognitive-text/sentiment" \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer ${token}" \
--data '{
"textList":["With Oracle Machine Learning, Oracle moves the algorithms to the data. Oracle runs machine learning within the database, where the data reside. This approach minimizes or eliminates data movement, achieves scalability, preserves data security, and accelerates time-to-model deployment. Oracle delivers parallelized in-database implementations of machine learning algorithms and integration with the leading open source environments R and Python. Oracle Machine Learning delivers the performance, scalability, and automation required by enterprise-scale data science projects – both on-premises and in the Cloud."]
}' | jq
The command returns the following output, with a confidence value for each sentiment:
[
  {
    "text": "With Oracle Machine Learning, Oracle moves the algorithms to the data. Oracle runs machine learning within the database, where the data reside. This approach minimizes or eliminates data movement, achieves scalability, preserves data security, and accelerates time-to-model deployment. Oracle delivers parallelized in-database implementations of machine learning algorithms and integration with the leading open source environments R and Python. Oracle Machine Learning delivers the performance, scalability, and automation required by enterprise-scale data science projects – both on-premises and in the Cloud.",
    "sentimentResults": [
      { "sentiment": "neutral", "confidence": 0.7768593976551998 },
      { "sentiment": "negative", "confidence": 0.15795427860813646 },
      { "sentiment": "positive", "confidence": 0.06518632373666372 }
    ]
  }
]
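A script that needs a single label per text can pick the sentiment with the highest confidence. The following Python sketch assumes the response shape shown above. The function name dominant_sentiment is hypothetical, and the sample omits the echoed text field for brevity.

```python
def dominant_sentiment(doc_result):
    """Return the (sentiment, confidence) pair with the highest confidence."""
    best = max(doc_result["sentimentResults"], key=lambda r: r["confidence"])
    return best["sentiment"], best["confidence"]


# Sentiment results from the example output (the "text" field is omitted here).
sample = {
    "sentimentResults": [
        {"sentiment": "neutral", "confidence": 0.7768593976551998},
        {"sentiment": "negative", "confidence": 0.15795427860813646},
        {"sentiment": "positive", "confidence": 0.06518632373666372},
    ]
}

label, confidence = dominant_sentiment(sample)
print(label, confidence)  # neutral 0.7768593976551998
```

Using max rather than taking the first entry avoids relying on the order in which the service lists the sentiments.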
5: Return the Most Relevant Text Topics
This example demonstrates how to return the most relevant topics for any list of text strings. You can pass the parameter topN as part of the input to determine how many topics to return. By default, 2 topics are returned. You can also pass a language parameter. The default value is AMERICAN. Other supported values are FRENCH, SPANISH, and ITALIAN. The topics are sorted by weight.
curl -X POST "<oml-cloud-service-location-url>/omlmod/v1/cognitive-text/topics" \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer ${token}" \
--data '{
"topN":5,
"textList":["With Oracle Machine Learning, Oracle moves the algorithms to the data. Oracle runs machine learning within the database, where the data reside. This approach minimizes or eliminates data movement, achieves scalability, preserves data security, and accelerates time-to-model deployment. Oracle delivers parallelized in-database implementations of machine learning algorithms and integration with the leading open source environments R and Python. Oracle Machine Learning delivers the performance, scalability, and automation required by enterprise-scale data science projects – both on-premises and in the Cloud."]
}' | jq
In this example, topN is set to 5. The command returns the following five topics, along with their respective weights:
[
  {
    "text": "With Oracle Machine Learning, Oracle moves the algorithms to the data. Oracle runs machine learning within the database, where the data reside. This approach minimizes or eliminates data movement, achieves scalability, preserves data security, and accelerates time-to-model deployment. Oracle delivers parallelized in-database implementations of machine learning algorithms and integration with the leading open source environments R and Python. Oracle Machine Learning delivers the performance, scalability, and automation required by enterprise-scale data science projects – both on-premises and in the Cloud.",
    "topicResults": [
      { "topic": "Oracle Database", "weight": 0.31551492019770755 },
      { "topic": "Scalability", "weight": 0.2526506529399862 },
      { "topic": "Oracle Corporation", "weight": 0.2375060259741595 },
      { "topic": "Machine learning", "weight": 0.22823082222493368 },
      { "topic": "Analysis of algorithms", "weight": 0.1507387699576783 }
    ]
  }
]
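Since topN fixes the number of topics rather than their quality, a script may want to keep only topics above some weight. The following Python sketch assumes the response shape shown above. The function name topics_above and the 0.2 cutoff are hypothetical choices for illustration, not service defaults, and the sample omits the echoed text field for brevity.

```python
def topics_above(doc_result, min_weight):
    """Return the topics whose weight is at least min_weight, in response order."""
    return [
        t["topic"]
        for t in doc_result["topicResults"]
        if t["weight"] >= min_weight
    ]


# Topic results from the example output (the "text" field is omitted here).
sample = {
    "topicResults": [
        {"topic": "Oracle Database", "weight": 0.31551492019770755},
        {"topic": "Scalability", "weight": 0.2526506529399862},
        {"topic": "Oracle Corporation", "weight": 0.2375060259741595},
        {"topic": "Machine learning", "weight": 0.22823082222493368},
        {"topic": "Analysis of algorithms", "weight": 0.1507387699576783},
    ]
}

# 0.2 is an arbitrary illustrative cutoff.
print(topics_above(sample, 0.2))
```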
6: Return Text Similarity
This example demonstrates how to send a POST request to a cognitive-text endpoint to return the semantic similarity between a probe text and a list of text strings. You can pass a sort direction: ASC for ascending or DESC for descending. You can also pass a language parameter. The default value is AMERICAN. The other supported values are FRENCH, SPANISH, and ITALIAN.
curl -X POST "<oml-cloud-service-location-url>/omlmod/v1/cognitive-text/similarity" \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer ${token}" \
--data '{
"probe":"algorithms",
"textList":["With Oracle Machine Learning, Oracle moves the algorithms to the data. Oracle runs machine learning within the database, where the data reside. This approach minimizes or eliminates data movement, achieves scalability, preserves data security, and accelerates time-to-model deployment. Oracle delivers parallelized in-database implementations of machine learning algorithms and integration with the leading open source environments R and Python. Oracle Machine Learning delivers the performance, scalability, and automation required by enterprise-scale data science projects – both on-premises and in the Cloud."],
"sortDirection":"DESC"
}' | jq
The command returns a value of 0.43990432999766216 for the semantic similarity between the probe and the text string:
[
  {
    "text": "With Oracle Machine Learning, Oracle moves the algorithms to the data. Oracle runs machine learning within the database, where the data reside. This approach minimizes or eliminates data movement, achieves scalability, preserves data security, and accelerates time-to-model deployment. Oracle delivers parallelized in-database implementations of machine learning algorithms and integration with the leading open source environments R and Python. Oracle Machine Learning delivers the performance, scalability, and automation required by enterprise-scale data science projects – both on-premises and in the Cloud.",
    "similarity": 0.43990432999766216
  }
]
7: Return Numeric Features for Text Strings
The features API represents a document as a vector of 1024 floating-point numbers. The feature vectors can be used in text processing. For example, you can use the feature vectors to look for similar documents based on the vector space. This can save compute resources because the text processing of a large text collection can be done in advance, with only the vectors being used to compute similarity.
This example demonstrates how to return numeric features for any list of text strings. You can pass a language parameter. The default value is AMERICAN. Other supported values are FRENCH, SPANISH, and ITALIAN.
The following command returns the numeric features for the text string Oracle Machine Learning:
curl -X POST "<oml-cloud-service-location-url>/omlmod/v1/cognitive-text/features" \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer ${token}" \
--data '{
"textList":["Oracle Machine Learning"]
}' | jq
[
  {
    "text": "Oracle Machine Learning",
    "scoringResults": {
      "weights": [
        -0.01593325924999146,
        0.019103851234810277,
        0.0015129621606790617,
        -0.00757883632513583,
        -0.01606255829444141,
        -0.0184449494100069,
        -0.04012992671536982,
        ... ...
      ]
    }
  }
]
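One common way to compare documents in a vector space is cosine similarity. The following Python sketch shows how two stored feature vectors could be compared without calling the service again. It is an illustration under stated assumptions: cosine similarity is a generic technique and is not confirmed here as the measure the similarity endpoint uses, the function names are hypothetical, and the three-element vectors are toy stand-ins for the 1024-element vectors the features API returns.

```python
import math


def feature_vector(doc_result):
    """Extract the weights list from one element of a features response."""
    return doc_result["scoringResults"]["weights"]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy three-element vectors standing in for real 1024-element feature vectors.
doc_a = {"text": "first document", "scoringResults": {"weights": [1.0, 2.0, 3.0]}}
doc_b = {"text": "second document", "scoringResults": {"weights": [2.0, 4.0, 6.0]}}
doc_c = {"text": "third document", "scoringResults": {"weights": [3.0, 0.0, -1.0]}}

print(cosine_similarity(feature_vector(doc_a), feature_vector(doc_b)))
print(cosine_similarity(feature_vector(doc_a), feature_vector(doc_c)))
```

Vectors that point in the same direction score close to 1, and orthogonal vectors score 0, so a stored collection can be ranked against a new document by computing this value for each pair.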
8: Cognitive Text Topic Discovery
Run this cURL command for cognitive text topic discovery:
curl -X POST "<oml-cloud-service-location-url>/omlmod/v1/cognitive-text/topics" \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer ${token}" \
--data '{
"topN":5,
"textList":["With Oracle Machine Learning, Oracle moves the algorithms to the data. Oracle runs machine learning within the database, where the data reside. This approach minimizes or eliminates data movement, achieves scalability, preserves data security, and accelerates time-to-model deployment. Oracle delivers parallelized in-database implementations of machine learning algorithms and integration with the leading open source environments R and Python. Oracle Machine Learning delivers the performance, scalability, and automation required by enterprise-scale data science projects both on-premises and in the Cloud."]
}' | jq
In this example, topN is set to 5, so the command returns five topics.
The output ends with the following topic results:
"topicResults": [
  { "topic": "Oracle Corporation", "weight": 0.19987709210173316 },
  { "topic": "Machine learning", "weight": 0.1905437857032045 },
  { "topic": "Scalability", "weight": 0.1796066801401289 },
  { "topic": "Data mining", "weight": 0.1452028606883496 },
  { "topic": "Analysis of algorithms", "weight": 0.13456003371071854 }
]
} ]
9: Cognitive Text Keyword Identification
Run this cURL command for cognitive text keyword identification:
curl -X POST "<oml-cloud-service-location-url>/omlmod/v1/cognitive-text/keywords" \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer ${token}" \
--data '{
"topN":2,
"textList":["With Oracle Machine Learning, Oracle moves the algorithms to the data. Oracle runs machine learning within the database, where the data reside. This approach minimizes or eliminates data movement, achieves scalability, preserves data security, and accelerates time-to-model deployment. Oracle delivers parallelized in-database implementations of machine learning algorithms and integration with the leading open source environments R and Python. Oracle Machine Learning delivers the performance, scalability, and automation required by enterprise-scale data science projects both on-premises and in the Cloud."]
}' | jq
In this example, topN is set to 2, so the command returns two keywords.
The output ends with the following keyword results:
"keywordResults": [
  { "keyword": "data", "weight": 0.5521764418721277 },
  { "keyword": "algorithms", "weight": 0.46610186622828115 }
]
} ]
10: Cognitive Text Sentiment Analysis
Run this cURL command for cognitive text sentiment analysis:
curl -X POST --header 'Accept: application/json' \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer ${token}" \
"<oml-cloud-service-location-url>/omlmod/v1/cognitive-text/sentiment" \
-d '{"textList":["With Oracle Machine Learning, Oracle moves the algorithms to the data, processing data where it resides minimizing or eliminating data movement, achieving scalability, preserving security, and accelerating time-to-model deployment."]}' | jq
The command returns the following output:
[
  {
    "text": "With Oracle Machine Learning, Oracle moves the algorithms to the data, processing data where it resides minimizing or eliminating data movement, achieving scalability, preserving security, and accelerating time-to-model deployment.",
    "sentimentResults": [
      { "sentiment": "neutral", "confidence": 0.8003081130871371 },
      { "sentiment": "negative", "confidence": 0.12303965647159439 },
      { "sentiment": "positive", "confidence": 0.07665223044126848 }
    ]
  }
]