public class TagLexer extends AbstractLexer implements TagTokens
TagLexer is an implementation of the Lexer interface for lexing an HTML tag. Specifically, it is used for retrieving the pieces that make up an HTML tag, such as the element name, the attribute name, or the attribute value.
For the purposes of syntax highlighting, the '>', '<' and '=' characters can be returned as tokens by calling setSkipSymbols(false). For parsing purposes, when you wish to ignore these delimiter characters, call setSkipSymbols(true).
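As a rough picture of the two modes, the sketch below uses a small stand-in tokenizer rather than TagLexer itself. The class SkipSymbolsSketch and its tokenize method are hypothetical names invented for this illustration; only the behaviour they mimic (returning or skipping the '<', '>' and '=' characters) comes from the description above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, NOT the real TagLexer: it only mimics how the
// symbol characters '<', '>' and '=' are either reported or skipped.
final class SkipSymbolsSketch {

    static List<String> tokenize(String tag, boolean skipSymbols) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < tag.length()) {
            char c = tag.charAt(pos);
            if (Character.isWhitespace(c)) {
                pos++;                                  // whitespace never produces a token
            } else if (c == '<' || c == '>' || c == '=') {
                if (!skipSymbols) {
                    tokens.add(String.valueOf(c));      // highlighting mode: report the symbol
                }
                pos++;
            } else if (c == '"') {
                // Quoted attribute value: scan to the closing quote (or end of text).
                int end = tag.indexOf('"', pos + 1);
                if (end < 0) {
                    end = tag.length() - 1;
                }
                tokens.add(tag.substring(pos, end + 1));
                pos = end + 1;
            } else {
                // Element or attribute name: scan to the next delimiter.
                int start = pos;
                while (pos < tag.length() && !isDelimiter(tag.charAt(pos))) {
                    pos++;
                }
                tokens.add(tag.substring(start, pos));
            }
        }
        return tokens;
    }

    private static boolean isDelimiter(char c) {
        return Character.isWhitespace(c) || c == '<' || c == '>' || c == '=';
    }
}
```

With the input `<a href="x.html">`, passing skipSymbols as false yields the six tokens `<`, `a`, `href`, `=`, `"x.html"`, `>`, while passing true yields only `a`, `href`, `"x.html"`.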
Note that enabling recognition of JSP tags does not by itself enable recognition of embedded tags; you must enable that separately. With embedded tag recognition enabled, the lexer properly handles an embedded tag found in an attribute value while scanning an HTML tag. It does not, however, check whether it is legal for the embedded tag to be present; that is beyond the scope of this lexer and is the caller's responsibility.
Note: if the tags that you are examining with this lexer include JSP directive tags, make sure to enable JSP recognition, so that the '*@' characters are recognized as symbol characters instead of as part of the element names.
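To illustrate why embedded tag recognition matters for attribute values, here is a minimal sketch under stated assumptions. EmbeddedValueSketch and endOfQuotedValue are hypothetical names, and the sketch assumes an embedded tag is delimited by `<%` and `%>`; it is not TagLexer's actual scanning code.

```java
// Hypothetical stand-in, NOT TagLexer's implementation: finds the end of a
// quoted attribute value, optionally stepping over embedded <% ... %> tags.
final class EmbeddedValueSketch {

    /**
     * Returns the index just past the closing quote of the value whose
     * opening quote is at openQuote, or text.length() if it never closes.
     */
    static int endOfQuotedValue(String text, int openQuote, boolean recognizeEmbeddedTags) {
        char quote = text.charAt(openQuote);
        int pos = openQuote + 1;
        while (pos < text.length()) {
            if (recognizeEmbeddedTags && text.startsWith("<%", pos)) {
                // Step over the whole embedded tag so quotes inside it are ignored.
                int close = text.indexOf("%>", pos + 2);
                pos = (close < 0) ? text.length() : close + 2;
            } else if (text.charAt(pos) == quote) {
                return pos + 1;                         // closing quote of the value
            } else {
                pos++;
            }
        }
        return text.length();
    }
}
```

For the text `v="<%= f("x") %>" w`, the scan ends after the real closing quote when recognition is on, but stops at the first quote inside the embedded expression when it is off, which cuts the value short.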
Inherited nested class: AbstractLexer.DefaultLexerToken
Modifier and Type | Field and Description |
---|---|
protected boolean | recognizeEmbeddedTags - Whether to recognize embedded tags. |
protected boolean | recognizeJSP - Whether to recognize JSP tags or not. |
Inherited fields:
currentPos, textBuffer
TK_ATTRIBUTE_NAME, TK_ATTRIBUTE_NAME_EMBEDDED, TK_ATTRIBUTE_VALUE, TK_ATTRIBUTE_VALUE_EMBEDDED, TK_ELEMENT_NAME, TK_SYMBOL
TK_EOF, TK_NOT_FOUND
Constructor and Description |
---|
TagLexer() - Constructs a default TagLexer with a starting position of 0. |
Modifier and Type | Method and Description |
---|---|
void | backup() - Unlexes the last found token. |
protected boolean | isEmbeddedTagStart(int searchPosition) - Utility routine to determine whether the given search position is the start of an embedded tag. |
int | lex(LexerToken lexedToken) - Scans the text buffer at the current position and returns the token that was found. |
void | setPosition(int offset) - Sets the current lex (read) position to the given offset in the buffer. |
void | setRecognizeEmbeddedTags(boolean recognizeEmbeddedTags) - Sets whether the lexer should recognize embedded HTML or JSP expression tags within an attribute value. |
void | setRecognizeJSP(boolean recognizeJSP) - Sets whether the TagLexer should recognize JSP tag symbols. |
void | setRecognizeSlash(boolean recognizeSlash) - Sets whether the TagLexer should generate a symbol token for the forward slash character. |
void | setSkipSymbols(boolean skipSymbols) - Sets whether the TagLexer should skip the '>', '<' and '=' symbol characters instead of generating tokens for them. |
protected void | skipEmbeddedTag() - Utility routine to skip over a found embedded tag. |
protected void | skipHTMLTag() - Utility routine which scans through the text buffer to find the end of an HTML tag. |
protected void | skipJSPEL() - Utility routine which scans through the text buffer to find the end of a JSP EL expression. |
protected void | skipJSPScriplet() - Utility routine which scans through the text buffer to find the end of a JSP scriptlet tag. |
int | skipNameOrValue(boolean recognizeOpenSquareBracket) - Utility routine which scans through the text buffer to find the end of a name or value based on whitespace. |
static java.lang.String | tokenToString(int token) - Utility routine to map the token to a string representation of the token (for debug printing). |
static java.lang.String | tokenToText(int token) - Utility routine to map the token to the original text (if retrievable) of the token (for debug printing). |
Inherited methods: createLexerToken, getTextBuffer, setTextBuffer
protected boolean recognizeJSP
protected boolean recognizeEmbeddedTags
public TagLexer()
Constructs a default TagLexer with a starting position of 0. Clients must call setTextBuffer() to initialize the text buffer used for the Lexer. To start lexing from an offset other than 0, call setPosition().

public void setSkipSymbols(boolean skipSymbols)
Sets whether the TagLexer should skip the '>', '<' and '=' symbol characters instead of generating tokens for them.
Parameters: skipSymbols - true to skip symbol characters in token generation

public void setRecognizeSlash(boolean recognizeSlash)
Sets whether the TagLexer should generate a symbol token for the forward slash character.
Parameters: recognizeSlash - true to generate a symbol token for slash

public int lex(LexerToken lexedToken)
Scans the text buffer at the current position and returns the token that was found. Token information is stored in the lexedToken instance passed in to the call.
lex in interface Lexer
lex in class AbstractLexer
Parameters: lexedToken - the instance passed in where token info is stored
Returns: the same value as lexedToken.getToken() (for convenience)

public void backup()
Unlexes the last found token. The next call to lex() will return the last token and offset information found.
backup in interface Lexer
backup in class AbstractLexer
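
The unlex behaviour described for backup() can be pictured as a one-token push-back. The sketch below is a simplified stand-in over a fixed token array; BackupSketch, its fields, and the "EOF" marker string are hypothetical and are not taken from TagLexer or AbstractLexer.

```java
// Hypothetical stand-in, NOT the real lexer: shows a one-token push-back
// where backup() makes the next lex() return the last token again.
final class BackupSketch {
    private final String[] tokens;
    private int next = 0;       // index of the token the next lex() will return
    private int previous = 0;   // value of 'next' before the most recent lex()

    BackupSketch(String... tokens) {
        this.tokens = tokens;
    }

    String lex() {
        previous = next;        // remember where this token started
        if (next >= tokens.length) {
            return "EOF";       // placeholder end marker for this sketch
        }
        return tokens[next++];
    }

    void backup() {
        next = previous;        // rewind to the start of the last token
    }
}
```

Calling lex(), lex(), backup(), lex() on the tokens "a" and "href" returns "a", "href", and then "href" again, which is the pattern a parser relies on when it needs to look one token ahead.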

public void setPosition(int offset)
setPosition in interface Lexer
setPosition in class AbstractLexer
Parameters: offset - the offset for the next lex() operation.

public int skipNameOrValue(boolean recognizeOpenSquareBracket)

public static java.lang.String tokenToString(int token)
Parameters: token - the token to map

public static java.lang.String tokenToText(int token)
Parameters: token - the token to map

public void setRecognizeJSP(boolean recognizeJSP)
Sets whether the TagLexer should recognize JSP tag symbols.
Parameters: recognizeJSP - true to recognize JSP tag symbol characters

public void setRecognizeEmbeddedTags(boolean recognizeEmbeddedTags)
Parameters: recognizeEmbeddedTags - whether to recognize embedded tags

protected void skipHTMLTag()

protected boolean isEmbeddedTagStart(int searchPosition)
Parameters: searchPosition - the offset in the buffer to check for the start

protected void skipEmbeddedTag()
Utility routine to skip over a found embedded tag. Assumes that the start of the tag has already been confirmed with isEmbeddedTagStart(), and that the current position is still at the start of the embedded tag. This will place the position at the character after the end of the tag.

protected void skipJSPScriplet()

protected void skipJSPEL()