HTMLLexer (Oracle Fusion Middleware Java API Reference for Oracle Extension SDK Reference)

The HTMLLexer is an implementation of the Lexer interface for the HTML language. It can be used to retrieve a token stream for a regular HTML document, as well as a JSP document. To retrieve JSP-specific tags, enable JSP recognition by calling setRecognizeJSP(true). To enable recognition of script text within <script> and </script> tags, use setRecognizeScripts(true).

Note that enabling recognition of JSP tags does not by itself enable recognition of embedded tags; enable that separately by calling setRecognizeEmbeddedTags(true). With embedded-tag recognition on, the lexer properly handles an embedded tag found in an attribute value while scanning an HTML tag. It does not, however, check whether it is legal for the embedded tag to be present; that is beyond the scope of this lexer and is the caller's responsibility.

This lexer does not assist in deciphering the contents of an HTML tag, nor does it help in identifying element bodies. All this lexer does is locate HTML and JSP tags within the document.

See Also: Lexer, HTMLTokens

Nested Class Summary

Nested classes/interfaces inherited from class oracle.javatools.parser.AbstractLexer
`AbstractLexer.DefaultLexerToken`

Field Summary
`protected boolean`	`recognizeEmbeddedTags` Whether to recognize embedded tags.
`protected boolean`	`recognizeJSP` Whether to recognize JSP tags or not.

Fields inherited from class oracle.javatools.parser.AbstractLexer
`currentPos, textBuffer`

Fields inherited from interface oracle.javatools.parser.html.HTMLTokens
`TK_HTML_COMMENT, TK_HTML_DOCUMENT_TYPE, TK_HTML_PROCESSING_INSTRUCTION, TK_HTML_SCRIPT, TK_HTML_STYLE, TK_HTML_TAG, TK_HTML_TEXT, TK_JSP_COMMENT, TK_JSP_DECLARATION, TK_JSP_DIRECTIVE, TK_JSP_EXPRESSION, TK_JSP_SCRIPLET, TK_PHP_ASPTAG, TK_PHP_TAG`

Fields inherited from interface oracle.javatools.parser.Lexer
`TK_EOF, TK_NOT_FOUND`

Constructor Summary
`HTMLLexer()` Constructs a default `HTMLLexer` with a starting position of 0.

Method Summary
`void`	`backup()` Unlexes the last found token.
`protected boolean`	`isEmbeddedTagStart(int searchPosition)` Utility routine to determine whether the given search position is the start of an embedded tag.
`int`	`lex(LexerToken lexedToken)` Scans the text buffer at the current position and returns the token that was found.
`void`	`setCaretPosition(int caretPosition)` Sets a caretPosition. A value other than -1 indicates that the lexer is being used for code insight.
`void`	`setPosition(int offset)` Sets the current lex (read) position to the given offset in the buffer.
`void`	`setRecognizeEmbeddedTags(boolean recognizeEmbeddedTags)` Sets whether the lexer should recognize embedded HTML or JSP expression tags within an attribute value.
`void`	`setRecognizeJSP(boolean recognizeJSP)` Sets whether the `HTMLLexer` should recognize JSP tag symbols.
`void`	`setRecognizePHP(boolean recognizePHP)` Deprecated. The HTMLLexer should not be used for parsing PHP files.
`void`	`setRecognizeScripts(boolean recognizeScripts)` Sets whether the `HTMLLexer` should recognize script start and end tags and generate TK_HTML_SCRIPT tokens for script text.
`void`	`setRecognizeStyles(boolean recognizeStyles)` Sets whether the `HTMLLexer` should recognize style start and end tags and generate TK_HTML_STYLE tokens for style text.
`void`	`setSkipComments(boolean skipComments)` Sets whether the `HTMLLexer` should generate tokens for Java comments.
`protected void`	`skipEmbeddedTag()` Utility routine to skip over a found embedded tag.
`protected void`	`skipHTMLTag()` Utility routine which scans through the text buffer to find the end of an HTML tag.
`protected void`	`skipJSPEL()` Utility routine which scans through the text buffer to find the end of a JSP EL expression.
`protected void`	`skipJSPScriplet()` Utility routine which scans through the text buffer to find the end of a JSP scriptlet tag.
`protected void`	`skipPHPASPTag()` Utility routine which scans to locate the end of an ASP-styled PHP tag.
`protected void`	`skipPHPTag()` Utility routine which scans through the text buffer to locate the end of a PHP tag.
`static java.lang.String`	`tokenToString(int token)` Utility routine to map the token to a string representation of the token (for debug printing).
`static java.lang.String`	`tokenToText(int token)` Utility routine to map the token to the original text (if retrievable) of the token (for debug printing).

Methods inherited from class oracle.javatools.parser.AbstractLexer
`createLexerToken, getTextBuffer, setTextBuffer`

Methods inherited from class java.lang.Object
`clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Field Detail

recognizeJSP

protected boolean recognizeJSP

Whether to recognize JSP tags or not.

recognizeEmbeddedTags

protected boolean recognizeEmbeddedTags

Whether to recognize embedded tags.

Constructor Detail

HTMLLexer

public HTMLLexer()

Constructs a default HTMLLexer with a starting position of 0. Clients must call setTextBuffer() to initialize the text buffer used for the Lexer. To start lexing from an offset other than 0, call setPosition().

Method Detail

setSkipComments

public void setSkipComments(boolean skipComments)

Sets whether the HTMLLexer should generate tokens for Java comments.

Parameters: skipComments - true to ignore comments in token generation

setRecognizeScripts

public void setRecognizeScripts(boolean recognizeScripts)

Sets whether the HTMLLexer should recognize script start and end tags and generate TK_HTML_SCRIPT tokens for script text.

Parameters: recognizeScripts - whether to recognize scripts

setRecognizeStyles

public void setRecognizeStyles(boolean recognizeStyles)

Sets whether the HTMLLexer should recognize style start and end tags and generate TK_HTML_STYLE tokens for style text.

Parameters: recognizeStyles - whether to recognize styles

setRecognizePHP

@Deprecated
public void setRecognizePHP(boolean recognizePHP)

Deprecated. The HTMLLexer should not be used for parsing PHP files.

Sets whether PHP tags (regular and ASP-styled) will be recognized. If they are, TK_PHP_TAG and TK_PHP_ASPTAG tokens will be generated for them. Note that recognition of script tags has to be enabled separately.

setCaretPosition

public void setCaretPosition(int caretPosition)

Sets a caretPosition. A value other than -1 indicates that the lexer is being used for code insight. By default the HTMLLexer treats a less-than character followed by a space as text. If the caretPosition is after the less-than character, the lexer treats it as an incomplete element instead.
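The caret rule can be pictured with a small, self-contained sketch. It does not use the Oracle classes: `CaretRuleSketch`, `isIncompleteElement`, and `NO_CARET` are hypothetical names, and the sketch assumes that "after the less-than character" means the caret sits immediately after the `<`. The actual lexer may apply a broader test.

```java
/** Hypothetical helper illustrating the caret rule; not part of HTMLLexer. */
class CaretRuleSketch {
    static final int NO_CARET = -1;

    /**
     * Decides whether the '<' at ltOffset should be treated as an incomplete element.
     * Assumption: the caret must sit immediately after the '<', at ltOffset + 1.
     */
    static boolean isIncompleteElement(CharSequence text, int ltOffset, int caretPosition) {
        if (text.charAt(ltOffset) != '<') {
            throw new IllegalArgumentException("no '<' at offset " + ltOffset);
        }
        boolean followedBySpace =
            ltOffset + 1 < text.length() && text.charAt(ltOffset + 1) == ' ';
        if (!followedBySpace) {
            return false;   // ordinary tag start; the caret rule does not apply
        }
        // Default: "< " is plain text. With the caret right after '<', the user
        // is probably still typing a tag name, so report an incomplete element.
        return caretPosition != NO_CARET && caretPosition == ltOffset + 1;
    }
}
```

For the text `a < b`, the `<` at offset 2 is reported as text when no caret is set, and as an incomplete element when the caret is at offset 3.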

lex

public int lex(LexerToken lexedToken)

Scans the text buffer at the current position and returns the token that was found. The token and offset information is also stored in the lexedToken instance passed in to the call.

Specified by: lex in interface Lexer
Specified by: lex in class AbstractLexer

Parameters: lexedToken - the instance passed in where token info is stored
Returns: the token that was found, same as calling lexedToken.getToken() (for convenience)
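The contract above implies a calling pattern in which one token instance is reused across calls until the end-of-file token comes back. The following sketch shows that loop without depending on the Oracle classes: `MiniTagLexer`, `MiniToken`, and the token constant values are hypothetical stand-ins, and the tag/text splitting rule is deliberately simplified.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a token holder; not the Oracle LexerToken. */
class MiniToken {
    int token;
    int startOffset;
    int endOffset;
}

/** Simplified tag/text lexer illustrating the lex(token) calling pattern. */
class MiniTagLexer {
    // Token values are made up for this sketch.
    static final int TK_EOF = 0;
    static final int TK_TAG = 1;
    static final int TK_TEXT = 2;

    private final String text;
    private int pos = 0;

    MiniTagLexer(String text) {
        this.text = text;
    }

    /** Fills in the token and returns its type, mirroring the documented contract. */
    int lex(MiniToken out) {
        out.startOffset = pos;
        if (pos >= text.length()) {
            out.token = TK_EOF;
        } else if (text.charAt(pos) == '<') {
            int close = text.indexOf('>', pos);
            pos = (close < 0) ? text.length() : close + 1;
            out.token = TK_TAG;
        } else {
            int open = text.indexOf('<', pos);
            pos = (open < 0) ? text.length() : open;
            out.token = TK_TEXT;
        }
        out.endOffset = pos;
        return out.token;
    }

    /** Collects "type:start-end" strings for every token before EOF. */
    static List<String> lexAll(String text) {
        MiniTagLexer lexer = new MiniTagLexer(text);
        MiniToken token = new MiniToken();   // one instance, reused on every call
        List<String> result = new ArrayList<>();
        while (lexer.lex(token) != TK_EOF) {
            result.add(token.token + ":" + token.startOffset + "-" + token.endOffset);
        }
        return result;
    }
}
```

Because the return value equals the token stored in the passed-in instance, the loop condition can test the result of `lex()` directly and read the offsets from the token object inside the loop body.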

backup

public void backup()

Unlexes the last found token. The next call to lex() will return the last token and offset information found.

Specified by: backup in interface Lexer
Specified by: backup in class AbstractLexer
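A one-token pushback of this kind is typically used for lookahead: the caller lexes a token, decides it belongs to the next construct, and unlexes it. The sketch below illustrates the contract only. `PushbackLexerSketch` is a hypothetical class that replays a pre-lexed array of token values, and it is not how HTMLLexer is implemented.

```java
/** Hypothetical one-token pushback illustrating the backup() contract. */
class PushbackLexerSketch {
    static final int TK_EOF = -1;   // value chosen for this sketch only

    private final int[] tokens;     // pre-lexed token stream standing in for real scanning
    private int next = 0;
    private int lastToken = TK_EOF;
    private boolean replayLast = false;

    PushbackLexerSketch(int... tokens) {
        this.tokens = tokens;
    }

    /** Returns the next token, or replays the previous one after backup(). */
    int lex() {
        if (replayLast) {
            replayLast = false;
            return lastToken;
        }
        lastToken = (next < tokens.length) ? tokens[next++] : TK_EOF;
        return lastToken;
    }

    /** Unlexes the last token so the next lex() call returns it again. */
    void backup() {
        replayLast = true;
    }
}
```

With tokens 10, 20, 30, calling `lex()`, `lex()`, `backup()`, `lex()` yields 10, 20, and then 20 again before the stream continues with 30.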

setPosition

public void setPosition(int offset)

Sets the current lex (read) position to the given offset in the buffer. It is the client's responsibility to ensure that this offset corresponds to the start of a token; otherwise unexpected (and incorrect) results may occur.

Specified by: setPosition in interface Lexer
Overrides: setPosition in class AbstractLexer

Parameters: offset - the offset for the next lex() operation.

skipPHPTag

protected void skipPHPTag()

Utility routine which scans through the text buffer to locate the end of a PHP tag.

skipPHPASPTag

protected void skipPHPASPTag()

Utility routine which scans to locate the end of an ASP-styled PHP tag. The ending (%>) is the same as a JSP scriptlet ending, so this just delegates.

tokenToString

public static java.lang.String tokenToString(int token)

Utility routine to map the token to a string representation of the token (for debug printing).

Parameters: token - the token to map
Returns: a printable representation of the token

tokenToText

public static java.lang.String tokenToText(int token)

Utility routine to map the token to the original text (if retrievable) of the token (for debug printing).

Parameters: token - the token to map
Returns: a printable representation of the token

setRecognizeJSP

public void setRecognizeJSP(boolean recognizeJSP)

Sets whether the HTMLLexer should recognize JSP tag symbols.

Parameters: recognizeJSP - true to recognize JSP tag symbol characters

setRecognizeEmbeddedTags

public void setRecognizeEmbeddedTags(boolean recognizeEmbeddedTags)

Sets whether the lexer should recognize embedded HTML or JSP expression tags within an attribute value. Note that this lexer simply determines whether an attribute value contains an embedded tag; it does not determine whether it is legal for it to do so. Whether it is legal or not is up to the client to determine.

Parameters: recognizeEmbeddedTags - whether to recognize embedded tags

skipHTMLTag

protected void skipHTMLTag()

Utility routine which scans through the text buffer to find the end of an HTML tag. Sets the current position following the trailing '>', or at the end of the file.

isEmbeddedTagStart

protected boolean isEmbeddedTagStart(int searchPosition)

Utility routine to determine whether the given search position is the start of an embedded tag. This only searches for whether it is the start of a non-comment HTML tag, or the start of a JSP expression tag.

Parameters: searchPosition - the offset in the buffer to check for the start
Returns: true if the offset is the start of an embedded tag
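One plausible reading of this check is sketched below. It is not the Oracle implementation: `EmbeddedTagSketch` and `looksLikeEmbeddedTagStart` are hypothetical names, and the sketch assumes that a non-comment HTML tag starts with `<` followed by a letter or `/`, and that a JSP expression tag starts with `<%=`.

```java
/** Hypothetical check for the start of an embedded tag; not part of HTMLLexer. */
class EmbeddedTagSketch {
    /**
     * Assumptions for this sketch: a non-comment HTML tag starts with '<'
     * followed by a letter or '/', and a JSP expression tag starts with "<%=".
     */
    static boolean looksLikeEmbeddedTagStart(CharSequence text, int pos) {
        if (pos < 0 || pos + 1 >= text.length() || text.charAt(pos) != '<') {
            return false;
        }
        char next = text.charAt(pos + 1);
        if (Character.isLetter(next) || next == '/') {
            return true;                       // e.g. <b> or </b>
        }
        // "<%=" opens a JSP expression; "<!--", "<%--" and plain "<%" are rejected.
        return next == '%' && pos + 2 < text.length() && text.charAt(pos + 2) == '=';
    }
}
```

In the attribute value of `<a href="<%= url %>">`, the check succeeds at offset 9, where the `<%=` begins, while an HTML comment, a JSP comment, or a plain scriptlet opener is rejected.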

skipEmbeddedTag

protected void skipEmbeddedTag()

Utility routine to skip over a found embedded tag. This assumes that the caller already found an embedded tag start with isEmbeddedTagStart(), and that the current position is still at the start of the embedded tag. This will place the position at the character after the end of the tag.

skipJSPScriplet

protected void skipJSPScriplet()

Utility routine which scans through the text buffer to find the end of a JSP scriptlet tag. Note that for JSP tags that contain scriptlet code, the %> immediately ends the tag, even if it is in a String or comment. Note that if the String or comment is unterminated, the translated *.java file will probably not be compilable.
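The consequence of this rule is easiest to see on a scriptlet whose Java string literal happens to contain `%>`. The sketch below is a minimal illustration, not the Oracle implementation: `ScriptletEndSketch` and `skipScriptlet` are hypothetical names, and returning the text length for an unterminated scriptlet is an assumption.

```java
/** Hypothetical scan for the end of a JSP scriptlet; not part of HTMLLexer. */
class ScriptletEndSketch {
    /**
     * Returns the offset just past the first "%>" at or after pos.
     * Assumption: an unterminated scriptlet runs to the end of the text.
     */
    static int skipScriptlet(String text, int pos) {
        // No attempt is made to track Java strings or comments:
        // the first "%>" wins, wherever it appears.
        int end = text.indexOf("%>", pos);
        return (end < 0) ? text.length() : end + 2;
    }
}
```

For the input `<% String s = "%>"; %>`, the scan stops right after the `%>` inside the string literal, so the scriptlet body becomes `String s = "`, which leaves an unterminated string in the translated Java source.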

skipJSPEL

protected void skipJSPEL()

Utility routine which scans through the text buffer to find the end of a JSP EL expression. Note that for JSP tags that contain EL, the '}' immediately ends the tag, even if it is in a String or comment. Note that if the String or comment is unterminated, the translated *.java file will probably not be compilable.

Oracle Fusion Middleware Java API Reference for Oracle Extension SDK Reference
11g Release 1 (11.1.1.5.0)
E13403-06

oracle.javatools.parser.html Class HTMLLexer