FinalResult is an extension to the Result interface that provides
information about a result that has been finalized (that is, recognition
is complete). A finalized result is a Result that has received either a
RESULT_ACCEPTED or RESULT_REJECTED ResultEvent that puts it in either the
ACCEPTED or REJECTED state (indicated by the getResultState method of the
Result interface).
The FinalResult interface provides information for finalized results that
match either a DictationGrammar or a RuleGrammar.
Any result object provided by a recognizer implements both the
FinalRuleResult and FinalDictationResult interfaces. Because both these
interfaces extend the FinalResult interface, which in turn extends the
Result interface, all results implement FinalResult.
The methods of the FinalResult interface provide information about a
finalized result (ACCEPTED or REJECTED state). If any method of the
FinalResult interface is called on a result in the UNFINALIZED state, a
ResultStateError is thrown.
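One way to avoid the ResultStateError is to check the result state before touching any FinalResult method. The following is a minimal sketch, not the library's code: javax.speech is not part of the standard JDK, so the nested FinalResult interface, the ResultStateError class, the state constant values, and the fake() helper are all stand-ins invented for illustration. Only the method names and state names come from this page.

```java
// Stand-in types only: these mirror names from this page so the sketch
// compiles without the javax.speech library on the classpath.
final class StateGuardSketch {
    // Illustrative values; the real constants live on the Result interface.
    static final int UNFINALIZED = 0;
    static final int ACCEPTED = 1;
    static final int REJECTED = 2;

    /** Stand-in for the error thrown when a result is in the wrong state. */
    static class ResultStateError extends Error {
        ResultStateError(String message) { super(message); }
    }

    /** Stand-in exposing just the two methods this sketch needs. */
    interface FinalResult {
        int getResultState();
        boolean isAudioAvailable() throws ResultStateError;
    }

    /** Returns true only when FinalResult methods may be called safely. */
    static boolean isFinalized(FinalResult result) {
        int state = result.getResultState();
        return state == ACCEPTED || state == REJECTED;
    }

    /** Treats an unfinalized result as "no audio" instead of letting it throw. */
    static boolean audioAvailableOrFalse(FinalResult result) {
        return isFinalized(result) && result.isAudioAvailable();
    }

    /** Hypothetical in-memory result used only to exercise the guard. */
    static FinalResult fake(final int state, final boolean audio) {
        return new FinalResult() {
            public int getResultState() { return state; }
            public boolean isAudioAvailable() {
                if (state == UNFINALIZED) {
                    throw new ResultStateError("result is not finalized");
                }
                return audio;
            }
        };
    }
}
```

The guard relies on short-circuit evaluation: isAudioAvailable is never reached for an UNFINALIZED result, so the error path is not taken.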
Three capabilities can be provided by a finalized result:
training/correction, access to audio data, and access to alternative guesses.
All three capabilities are optional because they are not all relevant
to all results or all recognition environments, and they are
not universally supported by speech recognizers.
Training and access to audio data are provided by the FinalResult
interface. Access to alternative guesses is provided by the
FinalDictationResult and FinalRuleResult interfaces (depending upon the
type of grammar matched).
Training / Correction
Because speech recognizers are not always correct, applications need
to consider the possibility that a recognition error has occurred.
When an application detects an error (e.g. a user updates a result),
the application should inform the recognizer so that it can learn
from the mistake and try to improve future performance.
The tokenCorrection method is provided for an application to pass
feedback from user correction to the recognizer.
Sometimes, but certainly not always, the correct result is selected by a
user from amongst the N-best alternatives for a result obtained through
either the FinalRuleResult or FinalDictationResult interfaces. In other
cases, a user may type the correct result or the application may infer
a correction from subsequent user input.
Recognizers must store considerable information to support training from
results. Applications need to be involved in the management of that
information so that it is not stored unnecessarily. The
isTrainingInfoAvailable method tests whether training information is
available for a finalized result. When an application or user has finished
correction or training for a result, it should call releaseTrainingInfo to
free up system resources. Also, a recognizer may choose at any time to free
up training information. In both cases, the application is notified of the
release with a TRAINING_INFO_RELEASED event to ResultListeners.
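The check, correct, release sequence described above might look like the following. This is a sketch under stated assumptions: the nested FinalResult and ResultToken interfaces are stand-ins declared here because javax.speech is not in the standard JDK, the MISRECOGNITION value is a placeholder, and FakeResult is a hypothetical test double that releases synchronously, unlike a real recognizer.

```java
final class TrainingSketch {
    static final int MISRECOGNITION = 1; // placeholder value, not the real constant

    /** Stand-in marker type for a recognized token. */
    interface ResultToken { }

    /** Stand-in with the three training-related methods named on this page. */
    interface FinalResult {
        boolean isTrainingInfoAvailable();
        void tokenCorrection(String[] correctTokens, ResultToken fromToken,
                             ResultToken toToken, int correctionType);
        void releaseTrainingInfo();
    }

    /**
     * Submits one correction if training information is still held, then
     * releases it. Returns true if the correction was actually submitted.
     */
    static boolean correctAndRelease(FinalResult result, ResultToken from,
                                     ResultToken to, String[] correctTokens) {
        boolean submitted = false;
        if (result.isTrainingInfoAvailable()) {
            result.tokenCorrection(correctTokens, from, to, MISRECOGNITION);
            submitted = true;
        }
        // Releasing is harmless even when nothing is held any more.
        result.releaseTrainingInfo();
        return submitted;
    }

    /** Hypothetical fake that counts calls; it releases immediately. */
    static final class FakeResult implements FinalResult {
        boolean hasInfo;
        int corrections;
        int releases;

        FakeResult(boolean hasInfo) { this.hasInfo = hasInfo; }

        public boolean isTrainingInfoAvailable() { return hasInfo; }

        public void tokenCorrection(String[] correctTokens, ResultToken fromToken,
                                    ResultToken toToken, int correctionType) {
            corrections++;
        }

        public void releaseTrainingInfo() {
            hasInfo = false;
            releases++;
        }
    }
}
```

Skipping the tokenCorrection call when no training information is held avoids a call that would have no effect anyway.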
Audio Data
Audio data for a finalized result is optionally provided by recognizers. In dictation systems, audio feedback to users can remind them of what they said and is useful in correcting and proof-reading documents. Audio data can be stored for future use by an application or user and in certain circumstances can be provided by one recognizer to another.
Since storing audio requires substantial system resources, audio data
requires special treatment. If an application wants to use audio data, it
should set the ResultAudioProvided property of the RecognizerProperties
to true (using setResultAudioProvided).
Not all recognizers provide access to audio data. For those recognizers,
setResultAudioProvided has no effect, FinalResult.isAudioAvailable always
returns false, and the getAudio methods always return null.
Recognizers that provide access to audio data cannot always provide
audio for every result. Applications should test audio availability
for every FinalResult and should always test for null on the getAudio
methods.
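The two checks recommended above (availability first, then null) can be combined in one helper. A minimal sketch, assuming stand-in FinalResult and AudioClip interfaces declared locally for illustration; the fake() helper is hypothetical and exists only so the sketch can be exercised.

```java
final class AudioSketch {
    /** Stand-in for the clip type returned by getAudio. */
    interface AudioClip {
        void play();
    }

    /** Stand-in with the two audio methods this sketch needs. */
    interface FinalResult {
        boolean isAudioAvailable();
        AudioClip getAudio();
    }

    /** Plays the utterance audio if present; returns whether anything played. */
    static boolean playIfPresent(FinalResult result) {
        if (!result.isAudioAvailable()) {
            return false;
        }
        AudioClip clip = result.getAudio();
        if (clip == null) {
            return false; // availability does not guarantee a clip
        }
        clip.play();
        return true;
    }

    /** Hypothetical fake result for exercising the helper. */
    static FinalResult fake(final boolean available, final AudioClip clip) {
        return new FinalResult() {
            public boolean isAudioAvailable() { return available; }
            public AudioClip getAudio() { return available ? clip : null; }
        };
    }
}
```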
Field Summary
static int | DONT_KNOW
  The DONT_KNOW flag is used in a call to tokenCorrection to indicate
  that the application does not know whether a change to a result is
  because of MISRECOGNITION or USER_CHANGE.
static int | MISRECOGNITION
  The MISRECOGNITION flag is used in a call to tokenCorrection to
  indicate that the change is a correction of an error made by the
  recognizer.
static int | USER_CHANGE
  The USER_CHANGE flag is used in a call to tokenCorrection to indicate
  that the user has modified the text that was returned by the recognizer
  to something different from what they actually said.
Method Summary
AudioClip | getAudio()
  Get the result audio for the complete utterance of a FinalResult.
AudioClip | getAudio(ResultToken fromToken, ResultToken toToken)
  Get the audio for a token or sequence of tokens.
boolean | isAudioAvailable()
  Test whether result audio data is available for this result.
boolean | isTrainingInfoAvailable()
  Returns true if the Recognizer has training information available for
  this result.
void | releaseAudio()
  Release the result audio for the result.
void | releaseTrainingInfo()
  Release training information for this FinalResult.
void | tokenCorrection(String[] correctTokens, ResultToken fromToken, ResultToken toToken, int correctionType)
  Inform the recognizer of a correction to one or more tokens in a
  finalized result so that the recognizer can re-train itself.
Field Detail
public static final int MISRECOGNITION
The MISRECOGNITION flag is used in a call to tokenCorrection to indicate
that the change is a correction of an error made by the recognizer.
public static final int USER_CHANGE
The USER_CHANGE flag is used in a call to tokenCorrection to indicate
that the user has modified the text that was returned by the recognizer
to something different from what they actually said.
public static final int DONT_KNOW
The DONT_KNOW flag is used in a call to tokenCorrection to indicate that
the application does not know whether a change to a result is because of
MISRECOGNITION or USER_CHANGE.
Method Detail
public boolean isTrainingInfoAvailable() throws ResultStateError
Returns true if the Recognizer has training information available for
this result.
Training is available if the following conditions are met:
  - The isTrainingProvided property of the RecognizerProperties is set
    to true.
  - The training information has not been released (the
    TRAINING_INFO_RELEASED event has not been issued).
Calls to tokenCorrection have no effect if the training information is
not available.
public void releaseTrainingInfo() throws ResultStateError
Release training information for this FinalResult. The release frees
memory used for the training information, which can be substantial.
It is not an error to call the method when training information is not available or has already been released.
This method is asynchronous: the training info is not necessarily
released when the call returns. A TRAINING_INFO_RELEASED event is issued
to the ResultListener once the information is released. The
TRAINING_INFO_RELEASED event is also issued if the recognizer releases
the training information for any other reason (e.g. to reclaim memory).
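Because the release is asynchronous, an application that needs to know when the information is really gone should wait for the event and not assume completion when releaseTrainingInfo returns. The sketch below illustrates that ordering only. Everything in it is an assumption made for illustration: the one-method ResultListener stand-in (its callback name trainingInfoReleased and the missing ResultEvent argument are simplifications), the ReleaseTracker class, and the FakeResult that queues its event until deliverPendingEvents is called.

```java
final class ReleaseEventSketch {
    /** Stand-in listener; simplified to a no-argument callback. */
    interface ResultListener {
        void trainingInfoReleased();
    }

    /** Remembers whether the release notification has arrived. */
    static final class ReleaseTracker implements ResultListener {
        private volatile boolean released;

        public void trainingInfoReleased() { released = true; }

        boolean isReleased() { return released; }
    }

    /** Hypothetical result that queues the event instead of firing it at once. */
    static final class FakeResult {
        private final ResultListener listener;
        private boolean holdsInfo = true;
        private boolean eventPending;

        FakeResult(ResultListener listener) { this.listener = listener; }

        /** Requests release; the notification is deferred. */
        void releaseTrainingInfo() {
            if (holdsInfo) { // repeated calls are harmless
                holdsInfo = false;
                eventPending = true;
            }
        }

        /** Simulates the recognizer delivering its queued event later. */
        void deliverPendingEvents() {
            if (eventPending) {
                eventPending = false;
                listener.trainingInfoReleased();
            }
        }
    }
}
```

With this shape, application code keys its cleanup off ReleaseTracker.isReleased(), which also covers the case where the recognizer releases the information on its own initiative.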
public void tokenCorrection(String[] correctTokens, ResultToken fromToken, ResultToken toToken, int correctionType) throws ResultStateError, IllegalArgumentException
Inform the recognizer of a correction to one or more tokens in a
finalized result so that the recognizer can re-train itself.
The fromToken and toToken parameters indicate the inclusive sequence of
best-guess or alternative tokens that are being trained or corrected.
If toToken is null or if fromToken and toToken are the same, the training
applies to a single recognized token.
The correctTokens token sequence may have the same or a different length
than the token sequence being corrected. Setting correctTokens to null
indicates the deletion of tokens.
The correctionType parameter must be one of MISRECOGNITION, USER_CHANGE,
or DONT_KNOW.
Note: tokenCorrection does not change the result object. Future calls to
the getBestToken, getBestTokens and getAlternativeTokens methods
therefore return exactly the same values as before the call to
tokenCorrection.
Parameters:
  correctTokens - sequence of correct tokens to replace fromToken to toToken
  fromToken - first token in the sequence being corrected
  toToken - last token in the sequence being corrected
  correctionType - type of correction: MISRECOGNITION, USER_CHANGE, DONT_KNOW
FinalResult
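Since tokenCorrection leaves the result object unchanged, an application that displays the text usually has to apply the same replacement to its own copy. The sketch below models the inclusive-range semantics described for tokenCorrection on a plain String array. It is an illustration only: the replaceRange helper and the use of integer indices in place of ResultToken objects are assumptions, not part of the API.

```java
final class CorrectionModelSketch {
    /**
     * Returns a new array in which tokens[from..to] (both ends inclusive)
     * are replaced by correctTokens. A null correctTokens deletes the range,
     * and the replacement may be shorter or longer than the range.
     */
    static String[] replaceRange(String[] tokens, int from, int to,
                                 String[] correctTokens) {
        if (from < 0 || to < from || to >= tokens.length) {
            throw new IllegalArgumentException("bad token range");
        }
        String[] insert = (correctTokens == null) ? new String[0] : correctTokens;
        int removed = to - from + 1;
        String[] out = new String[tokens.length - removed + insert.length];
        // Tokens before the corrected range are kept as they are.
        System.arraycopy(tokens, 0, out, 0, from);
        // The corrected tokens take the place of the range.
        System.arraycopy(insert, 0, out, from, insert.length);
        // Tokens after the corrected range follow the replacement.
        System.arraycopy(tokens, to + 1, out, from + insert.length,
                         tokens.length - to - 1);
        return out;
    }
}
```

For example, replacing indices 0 through 3 of "wreck a nice beach today" with the two tokens "recognize" and "speech" yields a three-token sequence, which shows a replacement shorter than the range it corrects.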
public boolean isAudioAvailable() throws ResultStateError
Test whether result audio data is available for this result.
Audio availability requires that:
  - The ResultAudioProvided property of RecognizerProperties was set to
    true when the result was recognized.
  - The Recognizer was able to collect result audio for the current type
    of FinalResult (FinalRuleResult or FinalDictationResult).
The availability of audio for a result does not mean that all getAudio
calls will return an AudioClip. For example, some recognizers might
provide audio data only for the entire result or only for individual
tokens, or not for sequences of more than one token.
public void releaseAudio() throws ResultStateError
Release the result audio for the result. After the audio is released,
isAudioAvailable will return false.
This call is ignored if result audio is not available or has already
been released.
This method is asynchronous: audio data is not necessarily released
immediately. An AUDIO_RELEASED event is issued to the ResultListener
when the audio is released by a call to this method. An AUDIO_RELEASED
event is also issued if the recognizer releases the audio for some other
reason (e.g. to reclaim memory).
public AudioClip getAudio() throws ResultStateError
Get the result audio for the complete utterance of a FinalResult.
Returns null if result audio is not available or if it has been released.
public AudioClip getAudio(ResultToken fromToken, ResultToken toToken) throws IllegalArgumentException, ResultStateError
Get the audio for a token or sequence of tokens.
Returns null if result audio is not available or if it cannot be obtained
for the specified sequence of tokens.
If toToken is null or if fromToken and toToken are the same, the method
returns audio for fromToken. If both fromToken and toToken are null, it
returns the audio for the entire result (same as getAudio()).
Not all recognizers can provide per-token audio, even if they can provide audio for a complete utterance.
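Given that per-token audio may be missing even when whole-utterance audio exists, an application might try the narrower request first and fall back to the full clip. A minimal sketch, assuming locally declared stand-in interfaces (FinalResult, ResultToken, AudioClip) and a hypothetical fake() helper; the fallback policy is one possible design choice and is not something the API prescribes.

```java
final class TokenAudioSketch {
    /** Stand-in marker types. */
    interface AudioClip { }
    interface ResultToken { }

    /** Stand-in with the two getAudio overloads listed on this page. */
    interface FinalResult {
        AudioClip getAudio();
        AudioClip getAudio(ResultToken fromToken, ResultToken toToken);
    }

    /**
     * Prefers audio for the requested token range and falls back to the
     * whole utterance. May still return null if no audio exists at all.
     */
    static AudioClip audioFor(FinalResult result, ResultToken from, ResultToken to) {
        AudioClip clip = result.getAudio(from, to);
        return (clip != null) ? clip : result.getAudio();
    }

    /** Hypothetical fake returning fixed clips for each overload. */
    static FinalResult fake(final AudioClip whole, final AudioClip perToken) {
        return new FinalResult() {
            public AudioClip getAudio() { return whole; }
            public AudioClip getAudio(ResultToken fromToken, ResultToken toToken) {
                return perToken;
            }
        };
    }
}
```

Callers still need a null check on the returned clip, since neither overload is guaranteed to produce audio.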
Java(TM) Speech API
Copyright 1997-1998 Sun Microsystems, Inc. All rights reserved
Send comments to javaspeech-comments@sun.com