Receive a Partial Response from the LLM
The following code sample shows how to receive a partial response from the LLM using llm.generateTextStreamed(options).
When you send a prompt and related content to the LLM using llm.generateText(options) or llm.evaluatePrompt(options), you must wait until the entire response is generated before you can access the content of the response. By contrast, llm.generateTextStreamed(options) and llm.evaluatePromptStreamed(options) let you access the partial response content before the entire response has been generated. These methods are useful if you're sending a prompt that will generate a lot of content, letting you process the response more quickly and potentially improving the performance of your scripts.
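The difference between the two approaches can be illustrated with a plain JavaScript sketch (this is a conceptual simulation, not the N/llm API): a generator yields tokens one at a time, so the consumer can act on each partial chunk instead of waiting for the complete response.

```javascript
// Conceptual illustration only -- the generator stands in for a streamed
// LLM response, yielding one token at a time.
function* streamTokens(tokens) {
    for (const token of tokens) {
        yield token; // each chunk is available as soon as it is produced
    }
}

const tokens = ["Tigers ", "are ", "apex ", "predators."];

// Blocking approach: nothing is usable until every token has arrived.
const fullText = tokens.join("");

// Streaming approach: partial text is usable after each token.
let partialText = "";
for (const token of streamTokens(tokens)) {
    partialText += token;
    // partialText now holds the response "so far"
}

console.log(fullText === partialText); // true -- both end with the same text
```

Both approaches produce the same final text; streaming simply lets you start processing before the last token arrives.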
This sample provides a short preamble (an optional parameter supported by Cohere models) and a prompt to llm.generateTextStreamed(options), along with the model to use and a set of model parameters. The temperature model parameter controls the randomness and creativity of the response. Higher values (closer to 1) produce more creative or diverse responses, which suits the open-ended prompt used in the sample.
The sample uses an iterator to examine each token returned by the LLM. Here, token.value contains the value of each token returned by the LLM, and response.text contains the partial response up to and including that token. For more information about iterators, see Iterator.
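Outside NetSuite, the iteration pattern can be simulated with a minimal mock (this is an assumption for illustration, not the real N/llm module) that mirrors the shape of the streamed response object: an iterator() whose each() callback receives tokens while response.text accumulates the partial response.

```javascript
// Mock only: mimics the shape of the N/llm streamed response for local testing.
function makeMockStreamedResponse(tokens) {
    const response = {
        text: "",
        iterator: function () {
            return {
                // each() keeps invoking the callback until it returns false
                // or the tokens are exhausted, like SuiteScript iterators.
                each: function (callback) {
                    for (const value of tokens) {
                        response.text += value;
                        if (callback({ value: value }) === false) {
                            break;
                        }
                    }
                }
            };
        }
    };
    return response;
}

const response = makeMockStreamedResponse(["A ", "TV ", "show ", "about ", "tigers."]);
const iter = response.iterator();
iter.each(function (token) {
    // token.value is the current chunk; response.text is the text so far
    console.log("token.value: " + token.value);
    console.log("response.text: " + response.text);
    return true; // continue iterating
});
```

Returning true from the callback continues iteration; returning false stops it early, which is useful if you only need the first part of a long response.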
For instructions about how to run a SuiteScript 2.1 code snippet in the debugger, see On-Demand Debugging of SuiteScript 2.1 Scripts.
This sample script uses the require function so that you can copy it into the SuiteScript Debugger and test it. You must use the define function in an entry point script (the script you attach to a script record and deploy). For more information, see SuiteScript 2.x Script Basics and SuiteScript 2.x Script Types.
/**
 * @NApiVersion 2.1
 */
require(['N/llm'], function(llm) {
    const response = llm.generateTextStreamed({
        preamble: "You are a script writer for TV shows.",
        prompt: "Write a 300 word pitch for a TV show about tigers.",
        modelFamily: llm.ModelFamily.COHERE_COMMAND_R,
        modelParameters: {
            maxTokens: 1000,
            temperature: 0.8, // High temperature values result in more varied
                              // and creative responses
            topK: 3,
            topP: 0.7,
            frequencyPenalty: 0.4,
            presencePenalty: 0
        }
    });
    const iter = response.iterator();
    iter.each(function(token) {
        log.debug("token.value: " + token.value);
        log.debug("response.text: " + response.text);
        return true; // Continue iterating over the remaining tokens
    });
});