Sun[TM] Identity Manager 8.0 Resources Reference
Chapter 2
Implementing the AttrParse Object
The AttrParse object encapsulates a grammar used to parse user listings. It is used primarily by mainframe-based resource adapters that receive a screen of data at a time and must parse out the desired results. (This technique is often called screen scraping.) The Shell Script and Scripted Gateway adapters also use AttrParse with getUser and getAllUsers actions.
The adapters that use the AttrParse object model the screen as a Java string. An instantiation of an AttrParse object contains one or more tokens. Each token defines a portion of the screen. These tokens are used to “tokenize” the screen string and allow the adapters to discover the user properties from the user listing.
After parsing a user listing, AttrParse returns a map of user attribute name/value pairs.
Configuration
As with all other Identity Manager objects, the AttrParse objects are serialized to XML for persistent storage. AttrParse objects can then be configured to support differences in customer environments. For example, the ACF2 mainframe security system is often customized to include additional fields and field lengths. Since AttrParse objects reside in the repository, they can be changed and configured to account for these differences without requiring that a custom adapter be written.
As with all Identity Manager configuration objects, objects that are to be changed should be copied, renamed, and then modified.
- From the Debug page, select AttrParse from the drop-down menu adjacent to the List Objects button. Click List Objects.
- From the list of available objects, select the object you want to edit.
- Copy, edit, and rename the object in your XML editor-of-choice.
- From the Configure page, select Import Exchange File to import the new file into Identity Manager.
- In your resource, change the AttrParse resource attribute to the name of the new AttrParse string.
For examples of AttrParse objects that ship with Identity Manager, see the sample\attrparse.xml file. It lists the default AttrParse objects used by the screen scraping adapters.
AttrParse Element and Tokens
AttrParse Element
The AttrParse element defines the AttrParse object.
Attributes
Attribute    Description
name         Uniquely defines the AttrParse object. This value is specified on the Resource Parameters page for the adapter.
Data
One or more tokens that parse user listings. The tokens supported by the AttrParse object are described in the token sections below.
Example
The following example reads the first 19 characters of a line, trims extraneous whitespace, and assigns the string as the value of the USERID resource attribute. It then skips forward five characters and extracts the NAME resource attribute. This attribute has a maximum of 21 characters, and whitespace is trimmed. The sample then checks for the string “Phone number: ”. A telephone number is parsed out and assigned to the PHONE resource attribute. The phone number begins after the space in “Phone number: ” and ends at the next space encountered. The trailing space is trimmed.
<AttrParse name='Example AttrParse'>
<str name='USERID' trim='true' len='19'/>
<skip len='5'/>
<str name='NAME' trim='true' len='21'/>
<t offset='-1'>Phone number: </t>
<str name='PHONE' trim='true' term=' '/>
</AttrParse>
The following strings satisfy the Example AttrParse grammar. (The · symbols represent spaces.)
gwashington123·····ABCD·George·Washington····Phone·number:·123-1234·
alincoln···········XYZ··Abraham·Lincoln······Phone·number:·321-4321·
In the first case after parsing, the user attribute map would contain:
USERID=“gwashington123”, NAME=“George Washington”, PHONE=“123-1234”
Similarly, the second user attribute map would contain:
USERID=“alincoln”, NAME=“Abraham Lincoln”, PHONE=“321-4321”
The rest of the text is ignored.
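To make the walkthrough concrete, here is a minimal Python sketch that mimics the Example AttrParse grammar on one line of screen text. It is not the Identity Manager implementation. The function name `parse_example` is hypothetical, and treating `offset='-1'` as "search forward for the label" is an assumption based on the description above.

```python
def parse_example(screen: str) -> dict:
    """Mimic the Example AttrParse grammar on one line of screen text."""
    attrs = {}
    pos = 0

    # <str name='USERID' trim='true' len='19'/>
    attrs["USERID"] = screen[pos:pos + 19].strip()
    pos += 19

    # <skip len='5'/>
    pos += 5

    # <str name='NAME' trim='true' len='21'/>
    attrs["NAME"] = screen[pos:pos + 21].strip()
    pos += 21

    # <t offset='-1'>Phone number: </t>  (assumed: search forward for the label)
    label = "Phone number: "
    found = screen.find(label, pos)
    if found < 0:
        raise ValueError("label not found: " + label)
    pos = found + len(label)

    # <str name='PHONE' trim='true' term=' '/>  (read up to the next space)
    end = screen.find(" ", pos)
    if end < 0:
        end = len(screen)
    attrs["PHONE"] = screen[pos:end].strip()
    return attrs


# Build the sample lines with explicit padding so the field widths are visible.
line1 = ("gwashington123".ljust(19) + "ABCD".ljust(5)
         + "George Washington".ljust(21) + "Phone number: 123-1234 more text")
line2 = ("alincoln".ljust(19) + "XYZ".ljust(5)
         + "Abraham Lincoln".ljust(21) + "Phone number: 321-4321 more text")
print(parse_example(line1))
print(parse_example(line2))
```

Each step advances a single parse position, which is why the fixed-width fields must be padded to their full length for the later tokens to line up.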
collectCsvHeader Token
The collectCsvHeader token reads a line designated as the header of a comma-separated values (CSV) file.
The Scripted Gateway adapter is the only adapter that can use this token. The collectCsvHeader and collectCsvLines tokens are the only tokens that determine attributes that can be used with this adapter.
Each name in the header must be the same as a resource user attribute on the schema map on the resource adapter. If a string in the header does not match a resource user attribute name, it and the values in the corresponding position in the subsequent data lines will be ignored.
Attributes
Data
None
Example
The following example identifies accountId as the value to be used for the account ID. Whitespace and quotation marks are removed from values.
<collectCsvHeader idHeader='accountId' delim=',' trim='true' unQuote='true'/>
collectCsvLines Token
The collectCsvLines token parses a line in a comma-separated values (CSV) file. The collectCsvHeader token must have been previously invoked.
The Scripted Gateway adapter is the only adapter that can use this token. The collectCsvHeader and collectCsvLines tokens are the only tokens that determine attributes that can be used with this adapter.
Attributes
If any of the following attributes are not specified, then the value is inherited from the previously-issued collectCsvHeader token.
Data
None
Example
The following example removes whitespace and quotation marks from values.
<collectCsvLines trim='yes' unQuote='yes'/>
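The interplay between the header line and the data lines can be sketched in Python as follows. This is an illustration only, not the adapter's code: `collect_csv` and `schema_attrs` (standing in for the resource user attributes on the schema map) are hypothetical names, keying the result by account ID is an assumption, and the simple `split` does not handle delimiters embedded in quoted values.

```python
def collect_csv(lines, schema_attrs, id_header="accountId",
                delim=",", trim=True, un_quote=True):
    """Map CSV data lines to attribute dicts using the header line."""
    def clean(value):
        if trim:
            value = value.strip()
        if un_quote and len(value) >= 2 and value[0] == value[-1] == '"':
            value = value[1:-1]
        return value

    # The first line plays the role of collectCsvHeader.
    header = [clean(name) for name in lines[0].split(delim)]
    users = {}
    # Every later line plays the role of collectCsvLines.
    for line in lines[1:]:
        values = [clean(v) for v in line.split(delim)]
        # Keep only columns whose header matches a schema map attribute.
        row = {name: value for name, value in zip(header, values)
               if name in schema_attrs}
        users[row[id_header]] = row
    return users


listing = [
    'accountId, fullname, shoeSize',
    '"jdoe", "Jane Doe", 7',
    '"rroe", "Richard Roe", 9',
]
users = collect_csv(listing, schema_attrs={"accountId", "fullname"})
print(users["jdoe"])
```

The `shoeSize` column is dropped because it has no matching schema map attribute, which mirrors the rule that unmatched header names and their values are ignored.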
eol Token
The eol token matches the end of line character (\n). The parse position will be advanced to the first character on the next line.
Attributes
None
Data
None
Example
The following token matches the end of line character.
<eol/>
flag Token
The flag token is often used inside an opt token to determine if a flag that defines an account property exists on a user account. This token searches for a specified string. If the text is found, AttrParse assigns the boolean value true to the attribute, then adds the entry to the attribute map.
The parse position will be advanced to the first character after the matched text.
Attributes
Data
The text to match.
Examples
- The following token will match AUDIT at the current parse position, and if found, adds AUDIT_FLAG=true to the user attribute map.
<flag offset='-1' name='AUDIT_FLAG'>AUDIT</flag>
- The following token will match xxxxCICS at the current parse position, where xxxx are any four characters, including spaces. If this string is found, AttrParse adds CICS=true to the user attribute map.
<flag offset='4' name='CICS'>CICS</flag>
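The positive-offset case can be pictured with a short Python sketch. It is a simplified model rather than the product's logic: `match_flag` is a hypothetical helper, it covers only offsets of zero or more, and leaving the parse position unchanged when the text is absent is an assumption.

```python
def match_flag(screen, pos, name, text, offset, attrs):
    """Skip `offset` characters, then try to match `text`.

    Returns the new parse position.
    """
    start = pos + offset
    if screen.startswith(text, start):
        attrs[name] = True
        return start + len(text)  # first character after the matched text
    return pos  # no match: position unchanged (assumption)


attrs = {}
screen = "ABCDCICS rest of line"
new_pos = match_flag(screen, 0, "CICS", "CICS", 4, attrs)
print(attrs, new_pos)
```

Here the four characters `ABCD` are skipped regardless of their content, and the flag is set only because `CICS` starts exactly at the fifth character.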
int Token
The int token captures an account attribute that is an integer. The attribute name and integer value will be added to the account attribute map. The parse position will be advanced to the first character after the integer.
Attributes
Data
None
Examples
- The following token matches a 6-digit integer and puts the integer value of those digits into the attribute value map for the SALARY attribute.
<int name='SALARY' len='6'/>
If the value 010250 is found, AttrParse adds SALARY=10250 to the value map.
- The following token matches any number of digits and adds that integer value to the attribute map for the AGE attribute.
<int name='AGE' len='-1' noval='NOT GIVEN'/>
If the value 34 is found, for example, AGE=34 is added to the attribute map. If the string NOT GIVEN is found instead, no value is added to the attribute map for the AGE attribute.
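The two int examples can be approximated in Python as shown below. This is a sketch under stated assumptions: `parse_int` is a hypothetical helper, and consuming the `noval` text when it is present is a guess about how the parse position moves.

```python
def parse_int(screen, pos, name, length, attrs, noval=None):
    """Read an integer field at `pos`; return the position after it."""
    # If the "no value" marker is present, consume it and add nothing
    # (consuming the marker is an assumption).
    if noval is not None and screen.startswith(noval, pos):
        return pos + len(noval)
    if length >= 0:
        end = pos + length
    else:
        # len='-1': take as many consecutive digits as are present.
        end = pos
        while end < len(screen) and screen[end].isdigit():
            end += 1
    attrs[name] = int(screen[pos:end])  # int() drops leading zeros
    return end


attrs = {}
parse_int("010250", 0, "SALARY", 6, attrs)
parse_int("34 years", 0, "AGE", -1, attrs, noval="NOT GIVEN")
parse_int("NOT GIVEN", 0, "HEIGHT", -1, attrs, noval="NOT GIVEN")
print(attrs)
```

The third call adds nothing to the map, so only SALARY and AGE appear in the result.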
loop Token
The loop token repeatedly executes the elements it contains until the input is exhausted.
Attributes
None
Data
Varies
Example
The following example reads the contents of a CSV file.
<loop>
<skipLinesUntil token=',' minCount='4' />
<collectCsvHeader idHeader='accountId' />
<collectCsvLines />
</loop>
multiLine Token
The multiLine token matches a pattern that recurs on multiple lines. If the next line matches the multiLine's internal AttrParse string, the parsed output will be added to the account attribute map at the top level. The parse position will be advanced to the first line that doesn't match the internal AttrParse string.
Attributes
Data
Any AttrParse tokens to parse a line of data.
Example
The following multiLine token matches multiple group lines that have a GROUPS[space][space][space]= tag and a space delimited group list.
<multiLine opt='true'>
<t>GROUPS[space][space][space]=</t>
<str name='GROUP' multi='true' delim=' ' trim='true'/>
<skipToEol/>
</multiLine>
AttrParse would add GROUP = {Group1,Group2,Group3,Group4} to the account attribute map, given that the following string is read as input:
GROUPS[space][space][space]= Group1[space]Group2\n
GROUPS[space][space][space]= Group3[space]Group4\n
Unrelated text...
opt Token
The opt token parses optional strings that are arbitrarily complex, such as those that are composed of multiple tokens. If the match token is present, then the internal AttrParse string is used to parse the next part of the screen. If an optional section is present, the parse position will be advanced to the character after the end of the optional section. Otherwise, the parse position is unchanged.
Attributes
None
Data
Contains the apMatch token, followed by an AttrParse token.
apMatch — Contains the token to match to determine whether the optional section is present. apMatch is a subtoken that can be used only within the opt token. apMatch token always contains the flag token as a subtoken.
AttrParse — Specifies how to parse the optional part of the screen. This version of the AttrParse element does not use the name argument. It can contain any other token.
Example
The following opt token attempts to match a CONSNAME= text token. If it is found, then it will parse a string of length 8, trim whitespace, and add the string to the account attribute map for the NETVIEW.CONSNAME attribute.
<opt>
<apMatch>
<t offset='-1'> CONSNAME= </t>
</apMatch>
<AttrParse>
<str name='NETVIEW.CONSNAME' len='8' trim='true' />
</AttrParse>
</opt>
skip Token
The skip token passes over areas of the screen that contain no useful information about the user and therefore do not need to be parsed. The parse position will be advanced to the first character after the skipped characters.
Attributes
Data
None
Examples
In the following examples, the first token skips 17 characters, while the second skips only one character.
<skip len='17'/>
<skip len='1'/>
skipLinesUntil Token
The skipLinesUntil token skips over lines of input until one is found that has at least the specified number of instances of a given string.
Attributes
Attribute    Description
token        The string to search for.
minCount     The minimum number of instances of the string specified in the token attribute that must be present.
Data
None
Example
The following token skips forward to the next line that contains at least two commas. The parse position will be at the first character of that line.
<skipLinesUntil token=',' minCount='2'/>
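The line-skipping rule can be expressed in a few lines of Python. This is only a sketch: `skip_lines_until` is a hypothetical helper, it assumes the parse position starts at the beginning of a line, and raising an error when no line qualifies is an assumption.

```python
def skip_lines_until(screen, pos, token, min_count):
    """Return the start index of the first line at or after `pos` that
    contains at least `min_count` occurrences of `token`."""
    while pos < len(screen):
        end = screen.find("\n", pos)
        if end < 0:
            end = len(screen)
        if screen[pos:end].count(token) >= min_count:
            return pos
        pos = end + 1  # move to the first character of the next line
    raise ValueError("no line with enough occurrences of " + repr(token))


screen = "Report for today\nNo data, yet\nid,name,dept\n1,Ann,Ops\n"
pos = skip_lines_until(screen, 0, ",", 2)
print(screen[pos:].splitlines()[0])
```

The second line is skipped even though it contains a comma, because one occurrence is below the minimum of two.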
skipToEol Token
The skipToEol token skips all characters from the current parse position to the end of the current line. The parse position will be advanced to the first character on the next line.
Attributes
None
Data
None
Example
The following token skips all characters until the end of the current line. The parse position will be at the first character of the next line.
<skipToEol/>
skipWhitespace Token
The skipWhitespace token is used to skip any number of whitespace characters. The system uses Java's definition of whitespace. The parse position will be advanced to the first non-whitespace character.
Attributes
None
Data
None
Example
The following token skips all the whitespace at the current parse position.
<skipWhitespace/>
str Token
The str token captures an account attribute that is a string. The attribute name and string value will be added to the account attribute map. The parse position will be advanced to the first character after the string.
Attributes
Data
None
Examples
- The following token matches a string of length 21 characters and trims whitespace off of the front and back.
<str name='NAME' trim='true' len='21'/>
Given the string [space][space]George Washington[space][space], AttrParse adds NAME=“George Washington” to the account attribute map.
- The following token matches a string of arbitrary length terminated by a ) (right parenthesis).
<str name='STATISTICS.SEC-VIO' term=')' />
Given the string 2 – Monday, Wednesday - )text, AttrParse adds STATISTICS.SEC-VIO=“2 – Monday, Wednesday - ” to the account attribute map.
- The following token matches a list of words delimited by spaces from the current parse position to the end of the current line.
<str name='GROUP' multi='true' delim=' ' trim='true'/>
Given the string, Group1 Group2 newGroup lastGroup\n, AttrParse adds a list of group name strings {Group1, Group2, newGroup, lastGroup} to the account attribute map for the GROUP attribute.
- The following token performs the same function as the previous example, except the account attribute map will contain GROUP={Group1:Group2:newGroup:lastGroup}
<str name='GROUP' multi='true' delim=' ' trim='true' append='true' appendSeperator=':' />
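The first three str examples can be approximated with one small Python function. It is a sketch and not the adapter's parser: `parse_str` is a hypothetical helper, reading to the end of the line when neither a length nor a terminator is given is inferred from the third example, and the append behavior of the last example is deliberately not modeled.

```python
def parse_str(screen, pos, name, attrs, length=None, term=None,
              multi=False, delim=" ", trim=False):
    """Read a string attribute at `pos`; return the position after it."""
    if length is not None:
        end = pos + length                      # fixed-width field
    elif term is not None:
        end = screen.find(term, pos)            # read up to the terminator
        if end < 0:
            raise ValueError("terminator not found: " + repr(term))
    else:
        end = screen.find("\n", pos)            # read to the end of the line
        if end < 0:
            end = len(screen)
    value = screen[pos:end]
    if trim:
        value = value.strip()
    if multi:
        # Split into a list of values, dropping empty pieces.
        value = [item for item in value.split(delim) if item]
    attrs[name] = value
    return end


attrs = {}
parse_str("  George Washington  ", 0, "NAME", attrs, length=21, trim=True)
parse_str("2 - Monday, Wednesday - )text", 0, "STATISTICS.SEC-VIO", attrs, term=")")
parse_str("Group1 Group2 newGroup lastGroup\n", 0, "GROUP", attrs,
          multi=True, trim=True)
print(attrs)
```

Note that the terminated string keeps its trailing space because `trim` is not set on that token, matching the second example.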
t Token
The t token is used to tokenize text. It is commonly used to recognize labels during screen scraping and provide knowledge of where on the screen you are parsing. The parse position will be advanced to the first character after the matched text. The parser always moves left to right within a line of text.
Attributes
Data
The text to match
Examples
- The following token matches Address Line 1:[space] at the current parse position.
<t offset='-1'>Address Line 1: </t>
- The following token matches xxZip Code:[space] at the current parse position, where xx can be any two characters, including spaces.
<t offset='2'>Zip Code: </t>
- The following token matches Phone:[space] at the current parse position. If AttrParse finds the string Employee ID first, then it will generate an error.
<t offset='-1' termToken='Employee ID'>Phone: </t>