Oracle Fusion Middleware Java API Reference for Oracle Extension SDK Reference
11g Release 1 (11.1.1.5.0)

E13403-06


oracle.javatools.parser.html
Class HTMLLexer

java.lang.Object
  extended by oracle.javatools.parser.AbstractLexer
      extended by oracle.javatools.parser.html.HTMLLexer

All Implemented Interfaces:
HTMLTokens, Lexer

public class HTMLLexer
extends AbstractLexer
implements HTMLTokens

The HTMLLexer is an implementation of the Lexer interface for the HTML language. It can be used to retrieve a token stream for a regular HTML document, as well as a JSP document. To retrieve JSP-specific tags, enable JSP recognition by calling setRecognizeJSP(true). To enable recognition of script text within <script> and </script> tags, use setRecognizeScripts(true).

Note that enabling recognition of JSP tags does not by itself enable recognition of embedded tags; that must be enabled separately with setRecognizeEmbeddedTags(true). When it is enabled, the lexer properly handles an embedded tag found in an attribute value while scanning an HTML tag. It does not, however, check whether it is legal for the embedded tag to be present; that is beyond the scope of this lexer and is the caller's responsibility.

This lexer does not assist in deciphering the contents of an HTML tag, nor does it help in identifying element bodies. All this lexer does is locate HTML and JSP tags within the document.
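As a rough illustration of what "locating tags" means here, the following self-contained sketch splits a buffer into text, HTML tag, and JSP tag spans. It is not the HTMLLexer implementation and does not use the Oracle API: the class TagLocatorSketch, the method locate(), and the "TEXT:"/"TAG:"/"JSP:" labels are hypothetical names chosen for this example, and the scanning rules are deliberately simplified (no comments, scripts, styles, or embedded tags).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in, NOT the Oracle HTMLLexer: locates tags and nothing more. */
final class TagLocatorSketch {

    /** Splits the buffer into labelled spans without interpreting tag contents. */
    static List<String> locate(String buffer) {
        List<String> spans = new ArrayList<>();
        int pos = 0;
        while (pos < buffer.length()) {
            int start = pos;
            if (buffer.charAt(pos) == '<') {
                // Assumption: "<%" opens a JSP tag closed by "%>"; any other '<' opens an HTML tag.
                boolean jsp = buffer.startsWith("<%", pos);
                String terminator = jsp ? "%>" : ">";
                int end = buffer.indexOf(terminator, pos + (jsp ? 2 : 1));
                // An unterminated tag runs to the end of the buffer.
                pos = (end < 0) ? buffer.length() : end + terminator.length();
                spans.add((jsp ? "JSP:" : "TAG:") + buffer.substring(start, pos));
            } else {
                // Plain text runs up to the next '<' or the end of the buffer.
                int next = buffer.indexOf('<', pos);
                pos = (next < 0) ? buffer.length() : next;
                spans.add("TEXT:" + buffer.substring(start, pos));
            }
        }
        return spans;
    }

    public static void main(String[] args) {
        for (String span : locate("<p>Hi <%= name %></p>")) {
            System.out.println(span);
        }
    }
}
```

The sketch reports where each tag begins and ends but says nothing about attributes or element bodies, which mirrors the division of labor described above: interpreting the located tags is left to the caller.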

See Also:
Lexer, HTMLTokens

Nested Class Summary

 

Nested classes/interfaces inherited from class oracle.javatools.parser.AbstractLexer
AbstractLexer.DefaultLexerToken

 

Field Summary
protected  boolean recognizeEmbeddedTags
          Whether to recognize embedded tags.
protected  boolean recognizeJSP
          Whether to recognize JSP tags or not.

 

Fields inherited from class oracle.javatools.parser.AbstractLexer
currentPos, textBuffer

 

Fields inherited from interface oracle.javatools.parser.html.HTMLTokens
TK_HTML_COMMENT, TK_HTML_DOCUMENT_TYPE, TK_HTML_PROCESSING_INSTRUCTION, TK_HTML_SCRIPT, TK_HTML_STYLE, TK_HTML_TAG, TK_HTML_TEXT, TK_JSP_COMMENT, TK_JSP_DECLARATION, TK_JSP_DIRECTIVE, TK_JSP_EXPRESSION, TK_JSP_SCRIPLET, TK_PHP_ASPTAG, TK_PHP_TAG

 

Fields inherited from interface oracle.javatools.parser.Lexer
TK_EOF, TK_NOT_FOUND

 

Constructor Summary
HTMLLexer()
          Constructs a default HTMLLexer with a starting position of 0.

 

Method Summary
 void backup()
          Unlexes the last found token.
protected  boolean isEmbeddedTagStart(int searchPosition)
          Utility routine to determine whether the given search position is the start of an embedded tag.
 int lex(LexerToken lexedToken)
          Scans the text buffer at the current position and returns the token that was found.
 void setCaretPosition(int caretPosition)
          Sets the caret position; a value other than -1 indicates that the lexer is being used for code insight.
 void setPosition(int offset)
          Sets the current lex (read) position to the given offset in the buffer.
 void setRecognizeEmbeddedTags(boolean recognizeEmbeddedTags)
          Sets whether the lexer should recognize embedded HTML or JSP expression tags within an attribute value.
 void setRecognizeJSP(boolean recognizeJSP)
          Sets whether the HTMLLexer should recognize JSP tag symbols.
 void setRecognizePHP(boolean recognizePHP)
          Deprecated. The HTMLLexer should not be used for parsing PHP files.
 void setRecognizeScripts(boolean recognizeScripts)
          Sets whether the HTMLLexer should recognize script start and end tags and generate TK_HTML_SCRIPT tokens for script text.
 void setRecognizeStyles(boolean recognizeStyles)
          Sets whether the HTMLLexer should recognize style start and end tags and generate TK_HTML_STYLE tokens for style text.
 void setSkipComments(boolean skipComments)
          Sets whether the HTMLLexer should skip comments instead of generating tokens for them.
protected  void skipEmbeddedTag()
          Utility routine to skip over a found embedded tag.
protected  void skipHTMLTag()
          Utility routine which scans through the text buffer to find the end of an HTML tag.
protected  void skipJSPEL()
          Utility routine which scans through the text buffer to find the end of a JSP EL expression.
protected  void skipJSPScriplet()
          Utility routine which scans through the text buffer to find the end of a JSP scriptlet tag.
protected  void skipPHPASPTag()
          Utility routine which scans to locate the end of an ASP-styled PHP tag.
protected  void skipPHPTag()
          Utility routine which scans through the text buffer to locate the end of a PHP tag.
static java.lang.String tokenToString(int token)
          Utility routine to map the token to a string representation of the token (for debug printing).
static java.lang.String tokenToText(int token)
          Utility routine to map the token to the original text (if retrievable) of the token (for debug printing).

 

Methods inherited from class oracle.javatools.parser.AbstractLexer
createLexerToken, getTextBuffer, setTextBuffer

 

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

 

Field Detail

recognizeJSP

protected boolean recognizeJSP
Whether to recognize JSP tags or not.

recognizeEmbeddedTags

protected boolean recognizeEmbeddedTags
Whether to recognize embedded tags.

Constructor Detail

HTMLLexer

public HTMLLexer()
Constructs a default HTMLLexer with a starting position of 0. Clients must call setTextBuffer() to initialize the text buffer used for the Lexer. To start lexing from an offset other than 0, call setPosition().

Method Detail

setSkipComments

public void setSkipComments(boolean skipComments)
Sets whether the HTMLLexer should skip comments instead of generating tokens for them.
Parameters:
skipComments - true to ignore comments in token generation

setRecognizeScripts

public void setRecognizeScripts(boolean recognizeScripts)
Sets whether the HTMLLexer should recognize script start and end tags and generate TK_HTML_SCRIPT tokens for script text.
Parameters:
recognizeScripts - whether to recognize scripts

setRecognizeStyles

public void setRecognizeStyles(boolean recognizeStyles)
Sets whether the HTMLLexer should recognize style start and end tags and generate TK_HTML_STYLE tokens for style text.
Parameters:
recognizeStyles - whether to recognize styles

setRecognizePHP

@Deprecated
public void setRecognizePHP(boolean recognizePHP)
Deprecated. The HTMLLexer should not be used for parsing PHP files.
Sets whether PHP tags (regular and ASP-styled) are recognized. If they are, TK_PHP_TAG and TK_PHP_ASPTAG tokens are generated for them. Note that recognition of script tags has to be enabled separately.

setCaretPosition

public void setCaretPosition(int caretPosition)
Sets the caret position. A value other than -1 indicates that the lexer is being used for code insight. By default, the HTMLLexer treats a less-than character followed by a space as text. If the caretPosition is after the less-than character, the lexer treats it as an incomplete element instead.
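The rule for a less-than character followed by a space can be sketched as a small decision function. This is only an illustration under stated assumptions, not the HTMLLexer logic: the class CaretInsightSketch, the method classify(), and the result labels are hypothetical, and the sketch assumes that "after the less-than character" means the caret sits immediately after it.

```java
/** Hypothetical illustration of the caret rule; NOT taken from the HTMLLexer source. */
final class CaretInsightSketch {

    /** Mirrors the documented convention that -1 means "no caret set". */
    static final int NO_CARET = -1;

    /**
     * Classifies the '<' found at ltOffset.
     * Assumption: the caller guarantees buffer.charAt(ltOffset) == '<'.
     */
    static String classify(String buffer, int ltOffset, int caretPosition) {
        boolean spaceFollows = ltOffset + 1 < buffer.length()
                && buffer.charAt(ltOffset + 1) == ' ';
        if (!spaceFollows) {
            return "TAG_START";
        }
        // Assumption: the caret must sit directly after the '<' to change the result.
        boolean caretAfterLessThan = caretPosition != NO_CARET
                && caretPosition == ltOffset + 1;
        return caretAfterLessThan ? "INCOMPLETE_ELEMENT" : "TEXT";
    }

    public static void main(String[] args) {
        System.out.println(classify("a < b", 2, NO_CARET)); // no caret set
        System.out.println(classify("a < b", 2, 3));        // caret right after '<'
    }
}
```

The point of the distinction is that, during code insight, a user who has just typed a less-than character is probably starting an element, so treating it as text would hide the element from completion.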

lex

public int lex(LexerToken lexedToken)
Scans the text buffer at the current position and returns the token that was found. The token and offset information is also stored in the lexedToken instance passed in to the call.
Specified by:
lex in interface Lexer
Specified by:
lex in class AbstractLexer
Parameters:
lexedToken - the instance passed in where token info is stored
Returns:
the token that was found, same as calling lexedToken.getToken() (for convenience)

backup

public void backup()
Unlexes the last found token. The next call to lex() will return the last token and offset information found.
Specified by:
backup in interface Lexer
Specified by:
backup in class AbstractLexer
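
The one-token pushback contract of backup() can be sketched with a stand-in that remembers where the last token started. The class BackupSketch below is hypothetical and operates on pre-split strings rather than a text buffer; it only illustrates the behavior described above, namely that the next lex() after backup() returns the same token again.

```java
/** Hypothetical stand-in showing one-token pushback; NOT the Oracle implementation. */
final class BackupSketch {
    private final String[] tokens;
    private int next = 0;        // index of the token the next lex() will return
    private int lastStart = -1;  // where the most recent lex() began, or -1 if none

    BackupSketch(String... tokens) {
        this.tokens = tokens.clone();
    }

    /** Returns the next token, or "EOF" once the input is exhausted. */
    String lex() {
        lastStart = next;
        if (next >= tokens.length) {
            return "EOF";
        }
        return tokens[next++];
    }

    /** Rewinds to the start of the last token so that lex() returns it again. */
    void backup() {
        if (lastStart >= 0) {
            next = lastStart;
            lastStart = -1; // only a single level of pushback is kept in this sketch
        }
    }

    public static void main(String[] args) {
        BackupSketch lexer = new BackupSketch("<p>", "text", "</p>");
        lexer.lex();
        String peeked = lexer.lex();
        lexer.backup();
        System.out.println(peeked.equals(lexer.lex())); // prints "true"
    }
}
```

A parser typically uses this pattern to peek at a token, decide it belongs to a different production, and hand it back untouched.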

setPosition

public void setPosition(int offset)
Sets the current lex (read) position to the given offset in the buffer. It is the client's responsibility to ensure that this offset corresponds to the start of a token; otherwise unexpected (and incorrect) results may occur.
Specified by:
setPosition in interface Lexer
Overrides:
setPosition in class AbstractLexer
Parameters:
offset - the offset for the next lex() operation.

skipPHPTag

protected void skipPHPTag()
Utility routine which scans through the text buffer to locate the end of a PHP tag.

skipPHPASPTag

protected void skipPHPASPTag()
Utility routine which scans to locate the end of an ASP-styled PHP tag. The ending (%>) is the same as a JSP scriptlet ending, so this just delegates.

tokenToString

public static java.lang.String tokenToString(int token)
Utility routine to map the token to a string representation of the token (for debug printing).
Parameters:
token - the token to map
Returns:
a printable representation of the token

tokenToText

public static java.lang.String tokenToText(int token)
Utility routine to map the token to the original text (if retrievable) of the token (for debug printing).
Parameters:
token - the token to map
Returns:
a printable representation of the token

setRecognizeJSP

public void setRecognizeJSP(boolean recognizeJSP)
Sets whether the HTMLLexer should recognize JSP tag symbols.
Parameters:
recognizeJSP - true to recognize JSP tag symbol characters

setRecognizeEmbeddedTags

public void setRecognizeEmbeddedTags(boolean recognizeEmbeddedTags)
Sets whether the lexer should recognize embedded HTML or JSP expression tags within an attribute value. Note that this lexer simply determines whether an attribute value contains an embedded tag; it does not determine whether it is legal for it to do so. Whether it is legal or not is up to the client to determine.
Parameters:
recognizeEmbeddedTags - whether to recognize embedded tags
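
To see why embedded-tag recognition matters when scanning an HTML tag, consider an attribute value that contains a JSP expression. The sketch below is a simplified, hypothetical stand-in (the class EmbeddedTagSkipSketch and the method skipTag() are invented for this example) and handles only embedded JSP expression tags, not embedded HTML tags.

```java
/** Hypothetical sketch of scanning past an HTML tag; NOT the Oracle implementation. */
final class EmbeddedTagSkipSketch {

    /**
     * Returns the offset just past the tag that starts at 'start',
     * or the buffer length if the tag is unterminated.
     */
    static int skipTag(String buffer, int start, boolean recognizeEmbedded) {
        int pos = start + 1; // step past the opening '<'
        while (pos < buffer.length()) {
            if (recognizeEmbedded && buffer.startsWith("<%=", pos)) {
                // Jump over the whole embedded expression so that its '>' is not
                // mistaken for the end of the enclosing HTML tag.
                int end = buffer.indexOf("%>", pos + 3);
                if (end < 0) {
                    return buffer.length();
                }
                pos = end + 2;
                continue;
            }
            if (buffer.charAt(pos) == '>') {
                return pos + 1;
            }
            pos++;
        }
        return buffer.length();
    }

    public static void main(String[] args) {
        String buffer = "<a href=\"<%= url %>\">x";
        System.out.println(buffer.substring(0, skipTag(buffer, 0, true)));
        System.out.println(buffer.substring(0, skipTag(buffer, 0, false)));
    }
}
```

With recognition turned off, the scan stops at the '>' that closes the embedded expression and truncates the anchor tag; with it turned on, the full tag is covered. Neither mode judges whether the embedded tag is legal in that position.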

skipHTMLTag

protected void skipHTMLTag()
Utility routine which scans through the text buffer to find the end of an HTML tag. Sets the current position following the trailing '>', or at the end of the file.

isEmbeddedTagStart

protected boolean isEmbeddedTagStart(int searchPosition)
Utility routine to determine whether the given search position is the start of an embedded tag. This only searches for whether it is the start of a non-comment HTML tag, or the start of a JSP expression tag.
Parameters:
searchPosition - the offset in the buffer to check for the start
Returns:
true if the offset is the start of an embedded tag
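
A check of this kind might look like the following sketch. It is not the HTMLLexer code: the class EmbeddedTagStartSketch and the method looksLikeEmbeddedTagStart() are hypothetical, and the rule that an HTML tag starts with '<' followed by a letter or '/' is an assumption made for the example.

```java
/** Hypothetical sketch of an embedded-tag-start test; NOT the Oracle implementation. */
final class EmbeddedTagStartSketch {

    /** Returns true if pos looks like the start of a JSP expression or non-comment HTML tag. */
    static boolean looksLikeEmbeddedTagStart(String buffer, int pos) {
        if (pos < 0 || pos >= buffer.length() || buffer.charAt(pos) != '<') {
            return false;
        }
        if (buffer.startsWith("<%=", pos)) {
            return true;  // JSP expression tag
        }
        if (buffer.startsWith("<!--", pos)) {
            return false; // HTML comments are excluded
        }
        if (pos + 1 >= buffer.length()) {
            return false; // a lone '<' at the end of the buffer
        }
        // Assumption: an HTML tag begins with a letter (open tag) or '/' (close tag).
        char following = buffer.charAt(pos + 1);
        return Character.isLetter(following) || following == '/';
    }

    public static void main(String[] args) {
        String value = "x<%= id %><!-- c --><b>";
        System.out.println(looksLikeEmbeddedTagStart(value, 1));  // JSP expression
        System.out.println(looksLikeEmbeddedTagStart(value, 10)); // HTML comment
    }
}
```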

skipEmbeddedTag

protected void skipEmbeddedTag()
Utility routine to skip over a found embedded tag. This assumes that the caller already found an embedded tag start with isEmbeddedTagStart(), and that the current position is still at the start of the embedded tag. This will place the position at the character after the end of the tag.

skipJSPScriplet

protected void skipJSPScriplet()
Utility routine which scans through the text buffer to find the end of a JSP scriptlet tag. Note that for JSP tags that contain scriptlet code, the %> immediately ends the tag, even if it appears inside a String or comment. If the String or comment is left unterminated as a result, the translated *.java file will probably not compile.
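
The "first %> wins" rule can be illustrated with a short sketch. The class ScriptletEndSketch and the method skipScriptlet() are hypothetical names for this example, not part of the Oracle API; the sketch simply searches for the first %> without tracking Java strings or comments.

```java
/** Hypothetical sketch of the scriptlet-end rule; NOT the Oracle implementation. */
final class ScriptletEndSketch {

    /**
     * Returns the offset just past the first "%>" at or after 'from',
     * or the buffer length if no "%>" is found.
     */
    static int skipScriptlet(String buffer, int from) {
        int end = buffer.indexOf("%>", from);
        return (end < 0) ? buffer.length() : end + 2;
    }

    public static void main(String[] args) {
        // The "%>" inside the Java string literal still terminates the scriptlet.
        String buffer = "<% String s = \"%>\"; %>";
        int end = skipScriptlet(buffer, 2);
        System.out.println(buffer.substring(0, end));
    }
}
```

In this input the scriptlet ends in the middle of the string literal, leaving the literal unterminated in the scriptlet body, which is the situation the note about the translated *.java file describes.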

skipJSPEL

protected void skipJSPEL()
Utility routine which scans through the text buffer to find the end of a JSP EL expression. Note that for JSP tags that contain EL, the '}' immediately ends the tag, even if it appears inside a String or comment. If the String or comment is left unterminated as a result, the translated *.java file will probably not compile.

Copyright © 1997, 2011, Oracle. All rights reserved.