Sun Identity Manager 8.1 Resources Reference

Chapter 49 Implementing the AttrParse Object

The AttrParse object encapsulates a grammar used to parse user listings. It is used primarily by mainframe-based resource adapters that receive a screen of data at a time and must parse out the desired results. (This technique is often called screen scraping.) The Shell Script and Scripted Gateway adapters also use AttrParse with getUser and getAllUsers actions.

The adapters that use the AttrParse object model the screen as a Java string. An instantiation of an AttrParse object contains one or more tokens. Each token defines a portion of the screen. These tokens are used to “tokenize” the screen string and allow the adapters to discover the user properties from the user listing.

After parsing a user listing, AttrParse returns a map of user attribute name/value pairs.

Configuration

As with all other Identity Manager objects, the AttrParse objects are serialized to XML for persistent storage. AttrParse objects can then be configured to support differences in customer environments. For example, the ACF2 mainframe security system is often customized to include additional fields and field lengths. Since AttrParse objects reside in the repository, they can be changed and configured to account for these differences without requiring that a custom adapter be written.

As with all Identity Manager configuration objects, objects that are to be changed should be copied, renamed, and then modified.

Editing an AttrParse Object

  1. From the Debug page, select AttrParse from the drop-down menu adjacent to the List Objects button. Click List Objects.

  2. From the list of available objects, select the object you want to edit.

  3. Copy, edit, and rename the object in your XML editor-of-choice.

  4. From the Configure page, select Import Exchange File to import the new file into Identity Manager.

  5. In your resource, change the AttrParse resource attribute to the name of the new AttrParse object.

    For examples of AttrParse objects that ship with Identity Manager see the sample\attrparse.xml file. It lists the default AttrParse objects used by the screen scraping adapters.
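As a minimal sketch of step 3, a copied object might look like the following after it has been renamed and edited. The object name Custom Example AttrParse and the widened field length are hypothetical, and any wrapper elements present in the exported XML should be kept as they are.

```xml
<!-- Hypothetical copy of a default object, renamed before import -->
<AttrParse name='Custom Example AttrParse'>
   <!-- Assumed site-specific change: USERID field widened from 19 to 24 characters -->
   <str name='USERID' trim='true' len='24'/>
   <skip len='5'/>
   <str name='NAME' trim='true' len='21'/>
</AttrParse>
```

In step 5, the AttrParse resource attribute would then be set to Custom Example AttrParse.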

AttrParse Element and Tokens

AttrParse Element

The AttrParse element defines the AttrParse object.

Attributes

Attribute

Description

name

Uniquely defines the AttrParse object. This value will be specified on the Resource Parameters page for the adapter. 

Data

One or more tokens that parse user listings. The tokens supported by the AttrParse object are described in the sections that follow.

Example

The following example reads the first 19 characters of a line, trims extraneous white space, and assigns the resulting string to the USERID resource attribute. It then skips the next five characters and extracts the NAME resource attribute, which has a maximum of 21 characters and is also trimmed. Next, the sample checks for the string “Phone number: ”. The telephone number is parsed and assigned to the PHONE resource attribute. The phone number begins after the space in “Phone number: ” and ends at the next space encountered. The trailing space is trimmed.

<AttrParse name='Example AttrParse'>
   <str name='USERID' trim='true' len='19'/>
   <skip len='5'/>
   <str name='NAME' trim='true' len='21'/>
   <t offset='-1'>Phone number: </t>
   <str name='PHONE' trim='true' term=' '/>
</AttrParse>

The following strings satisfy the Example AttrParse grammar. (The · symbols represent spaces.)

gwashington123·····ABCD·George·Washington····Phone·number:·123-1234·
alincoln···········XYZ··Abraham·Lincoln······Phone·number:·321-4321·

In the first case after parsing, the user attribute map would contain:

USERID="gwashington123", NAME="George Washington", PHONE="123-1234"

Similarly, the second user attribute map would contain:

USERID="alincoln", NAME="Abraham Lincoln", PHONE="321-4321"

The rest of the text is ignored.

collectCsvHeader Token

The collectCsvHeader token reads a line designated as the header of a comma-separated values (CSV) file.

The Scripted Gateway adapter and Shell Script adapter, among others, can use this token. The collectCsvHeader and collectCsvLines tokens are the only tokens that the Scripted Gateway adapter can use.

Each name in the header must match a resource user attribute in the schema map of the resource adapter. If a name in the header does not match a resource user attribute name, that name and the values in the corresponding position of each subsequent data line are ignored.

Attributes

Attribute

Description

idHeader

Specifies which value in the header is considered the account ID. This attribute is optional, but recommended. If it is not specified, then the value for the nameHeader attribute will be used.

nameHeader

Specifies which value in the header is considered the name for the account. This is often the same value as idHeader, and if not specified, the value in idHeader is used. This attribute is optional but recommended.

delim

Optional. The string that separates values in the header. The default value is , (comma). 

minCount

Specifies the minimum number of instances of the string specified in the delim attribute that a valid header must have.

trim

Optional. If set to true, leading and trailing blanks are removed from each value. The default is false.

unQuote

Optional. If set to true, enclosing quotation marks are removed from each value. The default is false.

Data

None

Example

The following example identifies accountId as the value to be used for the account ID. White space and quotation marks are removed from values.

<collectCsvHeader idHeader='accountId' delim=',' trim='true' unQuote='true'/>

collectCsvLines Token

The collectCsvLines token parses a line in a comma-separated values (CSV) file. The collectCsvHeader token must have been previously invoked.

The Scripted Gateway adapter and Shell Script adapter, among others, can use this token. The collectCsvHeader and collectCsvLines tokens are the only tokens that the Scripted Gateway adapter can use.

Attributes

If any of the following attributes are not specified, then the value is inherited from the previously-issued collectCsvHeader token.

Attribute

Description

idHeader

Specifies which value is considered the account ID. 

nameHeader

Specifies which value is considered the name for the account. 

delim

Optional. The string that separates values in each data line. The default value is , (comma). 

trim

Optional. If set to true, leading and trailing blanks are removed from each value. The default is false.

unQuote

Optional. If set to true, enclosing quotation marks are removed from each value. The default is false.

Data

None

Example

The following example removes white space and quotation marks from values.

<collectCsvLines trim='true' unQuote='true'/>
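Putting the two CSV tokens together, an object for a script that prints a header line followed by data lines might look like the following sketch. The object name and the header names accountId, fullname, and email are hypothetical. Each header name would have to match a resource user attribute in the schema map.

```xml
<!-- Assumed script output:
       accountId,fullname,email
       "jdoe","John Doe","jdoe@example.com"
-->
<AttrParse name='Example CSV AttrParse'>
   <collectCsvHeader idHeader='accountId' nameHeader='accountId' delim=',' trim='true' unQuote='true'/>
   <!-- No attributes given here, so all values are inherited from collectCsvHeader -->
   <collectCsvLines/>
</AttrParse>
```

Whether these tokens need to be wrapped in a loop token depends on the adapter and the shape of the output. The loop Token section shows that form.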

eol Token

The eol token matches the end of line character (\n). The parse position will be advanced to the first character on the next line.

Attributes

None

Data

None

Example

The following token matches the end-of-line character.

<eol/>

flag Token

The flag token is often used inside an opt token to determine if a flag that defines an account property exists on a user account. This token searches for a specified string. If the text is found, AttrParse assigns the boolean value true to the attribute, then adds the entry to the attribute map.

The parse position will be advanced to the first character after the matched text.

Attributes

Attribute

Description

name

The name of the attribute to use in the attribute value map. The name is usually the same as a resource user attribute on the schema map on the resource adapter, but this is not a requirement. 

offset

The number of characters to skip before searching for the text for the token. The offset can have the following values: 

  • 1 or higher moves the specified number of characters before trying to match the token’s text.

  • 0 searches for text at the current parse position. This is the default value.

  • -1 indicates the token’s text will be matched at the current parse position, but the parse position will not go past the string specified in the termToken attribute, if present.

termToken

A string to use as an indicator that the text being searched for is not present. This string is often the first word or label in the next line on the screen output. 

The parse position will be the character after the termToken string. 

The termToken attribute can only be used if the offset attribute is negative one (-1). 

Data

The text to match.

Examples

flag Token Examples

  1. The following token will match AUDIT at the current parse position and, if it is found, adds AUDIT_FLAG=true to the user attribute map.

    <flag offset='-1' name='AUDIT_FLAG'>AUDIT</flag>

  2. The following token will match xxxxCICS at the current parse position, where xxxx are any four characters, including spaces. If this string is found, AttrParse adds CICS=true to the user attribute map.

    <flag offset='4' name='CICS'>CICS</flag>
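The termToken attribute can bound the search for a flag. The following sketch is based only on the attribute descriptions above. The GROUPS label is a hypothetical marker for the start of the next field on the screen.

```xml
<!-- Match AUDIT, but do not search past the (assumed) GROUPS label -->
<flag offset='-1' termToken='GROUPS' name='AUDIT_FLAG'>AUDIT</flag>
```

If AUDIT is found, AUDIT_FLAG=true is added to the attribute map. If GROUPS is reached first, the text is treated as not present and the parse position moves to the character after GROUPS.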

int Token

The int token captures an account attribute that is an integer. The attribute name and integer value will be added to the account attribute map. The parse position will be advanced to the first character after the integer.

Attributes

Attribute

Description

name

The name of the attribute to use in the attribute value map. The name is usually the same as a resource user attribute on the schema map on the resource adapter, but this is not a requirement. 

len

Indicates the exact length of the expected integer. The length can have the following values: 

  • 1 or higher captures the specified number of characters and checks to see if the text is an integer value or if it matches the characters specified in the noval attribute.

  • -1 indicates the parser will take the longest string of digits starting at the current parse position unless the next characters equal the noval attribute. This is the default value.

noval

Optional. A label on the screen that indicates the attribute does not have an integer value. Essentially, it is a null value indicator. The parse position will be advanced to the first character after the noval string.

Data

None

Examples

int Token Examples

  1. The following token matches a 6-digit integer and puts the integer value of those digits into the attribute value map for the SALARY attribute.

    <int name='SALARY' len='6'/>

    If the value 010250 is found, AttrParse adds SALARY=10250 to the value map.

  2. The following token matches any number of digits and adds that integer value to the attribute map for the AGE attribute.

    <int name='AGE' len='-1' noval='NOT GIVEN'/>

    If the value 34 is found, for example, AGE=34 would be added to the attribute map. If the string NOT GIVEN is found, no value is added to the attribute map for the AGE attribute.

loop Token

The loop token repeatedly executes the elements it contains until the input is exhausted.

Attributes

None

Data

Varies

Example

The following example reads the contents of a CSV file.

<loop>
   <skipLinesUntil token=',' minCount='4' />
   <collectCsvHeader idHeader='accountId' />
   <collectCsvLines />
</loop>

multiLine Token

The multiLine token matches a pattern that recurs on multiple lines. If the next line matches the multiLine’s internal AttrParse string, the parsed output will be added to the account attribute map at the top level. The parse position will be advanced to the first line that doesn’t match the internal AttrParse string.

Attributes

Attribute

Description

opt

Indicates that the internal AttrParse string is optional. There might be no lines that match it, in which case parsing continues with the next token. 

Data

Any AttrParse tokens to parse a line of data.

Example

The following multiLine token matches multiple group lines that have a GROUPS[space][space][space]= tag and a space delimited group list.

<multiLine opt='true'>
   <t>GROUPS[space][space][space]=</t>
   <str name='GROUP' multi='true' delim=' ' trim='true'/>
   <skipToEol/>
</multiLine>

AttrParse would add GROUP = {Group1,Group2,Group3,Group4} to the account attribute map when the following string is read as input:

GROUPS[space][space][space]= Group1[space]Group2\n
GROUPS[space][space][space]= Group3[space]Group4\n
Unrelated text...

opt Token

The opt token parses optional strings that are arbitrarily complex, such as those that are composed of multiple tokens. If the match token is present, then the internal AttrParse string is used to parse the next part of the screen. If an optional section is present, the parse position will be advanced to the character after the end of the optional section. Otherwise, the parse position is unchanged.

Attributes

None

Data

Contains the apMatch token, followed by an AttrParse token.

apMatch. Contains the token to match to determine whether the optional section is present. apMatch is a subtoken that can be used only within the opt token. The apMatch token typically contains a flag token as a subtoken, although the example below uses a t token.

AttrParse. Specifies how to parse the optional part of the screen. This version of the AttrParse element does not use the name attribute. It can contain any other token.

Example

The following opt token attempts to match a CONSNAME= text token. If it is found, then it will parse a string of length 8, trim white space, and add the string to the account attribute map for the NETVIEW.CONSNAME attribute.

<opt>
   <apMatch>
      <t offset='-1'> CONSNAME= </t>
   </apMatch>
   <AttrParse>
      <str name='NETVIEW.CONSNAME' len='8' trim='true' />
   </AttrParse>
</opt>

skip Token

The skip token passes over areas of the screen that do not contain user information that needs to be parsed. The parse position will be advanced to the first character after the skipped characters.

Attributes

Attribute

Description

len

Indicates the number of characters to skip on the screen. 

Data

None

Examples

In the following examples, the first token skips 17 characters, while the second skips only one character.

<skip len='17'/>
<skip len='1'/>

skipLinesUntil Token

The skipLinesUntil token skips over lines of input until one is found that has at least the specified number of instances of a given string.

Attributes

Attribute

Description

token

The string to search for. 

minCount

The minimum number of instances of the string specified in the token attribute that must be present. 

Data

None

Example

The following token skips forward to the next line that contains two commas. The parse position will be at the first character of that line.

<skipLinesUntil token=',' minCount='2'/>

skipToEol Token

The skipToEol token skips all characters from the current parse position to the end of the current line. The parse position will be advanced to the first character on the next line.

Attributes

None

Data

None

Example

The following token skips all characters until the end of the current line. The parse position will be at the first character of the next line.

<skipToEol/>

skipWhitespace Token

The skipWhitespace token is used to skip any number of white-space characters. The system uses Java’s definition of white space. The parse position will be advanced to the first non-white-space character.

Attributes

None

Data

None

Example

The following token skips all the white space at the current parse position.

<skipWhitespace/>
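A typical use is to step over padding of unknown width between a label and its value. The following sketch combines skipWhitespace with the t and str tokens described elsewhere in this chapter. The DEPT: label and the DEPT attribute name are hypothetical.

```xml
<!-- Match the (assumed) label at the current parse position -->
<t>DEPT:</t>
<!-- Padding after the label may vary in width -->
<skipWhitespace/>
<!-- Default len of -1: capture up to the next white-space character -->
<str name='DEPT'/>
```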

str Token

The str token captures an account attribute that is a string. The attribute name and string value will be added to the account attribute map. The parse position will be advanced to the first character after the string.

Attributes

Attribute

Description

name

The name of the attribute to use in the attribute value map. The name is usually the same as a resource user attribute on the schema map on the resource adapter, but this is not a requirement. 

len

Indicates the exact length of the expected string. The length can have the following values: 

  • 1 or higher captures the specified number of characters, unless the characters equal the noval attribute.

  • -1 captures all the characters from the current parse position until the next white-space character, unless the next characters equal the noval attribute. This is the default.

term

A string that indicates parsing should stop for this str token when any of the characters in the string is reached. If the len attribute is 1 or higher, the str token ends either after len characters or at the term character, whichever comes first.

termToken

A string to use as an indicator that the text being searched for is not present. This string is often the first word or label in the next line on the screen output. 

The parse position will be the character after the termToken string. The string added to the attribute map will be all the characters before the termToken was found. 

The termToken attribute can only be used if the len attribute is negative one (-1). 

trim

Optional. A true or false value that indicates whether the returned value or multiple values (if the multi attribute is specified) are trimmed before being added to the account attribute map. The default value is false.

noval

A label on the screen that indicates the attribute doesn’t have a string value. Essentially, it is a null value indicator. The parse position will be advanced to the first character after the noval string.

multiLine

A true or false value that indicates whether the string will span multiple screen lines.

This attribute can only be used if a len attribute is supplied and is assigned a value greater than zero. If multiLine is present, end of line characters will be skipped until the number of characters specified in the len attribute have been parsed. 

multi

A true or false value that indicates that the string captured is a multi-valued attribute that must be further parsed to find each sub-value. The multiple values can either be appended together using the appendSeparator or can be turned into a list of values.

delim

A delimiter for parsing the multi-valued string. This attribute can only be used if the multi attribute is specified. 

If this is not specified, then the multi str token is assumed to be delimited by spaces.

append

A true or false value that indicates that the multiple values should be appended together into a string using the appendSeparator. If append is not present, the multiple values will be put into a list for the account attribute value map. This attribute is used in conjunction with the multi attribute.

appendSeparator

Indicates the string to separate the multiple values for an append token. This attribute is only valid if the append attribute is set to true. If the appendSeparator is not present, the append attribute does not use a separator. Instead, it concatenates the multiple values into the result string.

Data

None

Examples
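The following sketches illustrate common attribute combinations. They are based only on the attribute descriptions above and are not taken from the shipped sample file. The attribute names (USERID, OWNER, LAST_LOGIN, GROUPS) and the NEVER label are hypothetical.

```xml
<!-- Fixed width: capture exactly 8 characters and trim the padding -->
<str name='USERID' len='8' trim='true'/>

<!-- term: stop when either a comma or a semicolon is reached -->
<str name='OWNER' term=',;'/>

<!-- noval: NEVER on the screen means the attribute has no value -->
<str name='LAST_LOGIN' len='10' noval='NEVER'/>

<!-- multi: split the captured string on spaces and return a list of values -->
<str name='GROUPS' multi='true' delim=' ' trim='true'/>

<!-- multi with append: join the values into one string, separated by commas -->
<str name='GROUPS' multi='true' delim=' ' append='true' appendSeparator=','/>
```

The last two tokens are alternatives for the same attribute. Only one of them would appear in a given AttrParse object.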

t Token

The t token is used to tokenize text. It is commonly used to recognize labels during screen scraping and provide knowledge of where on the screen you are parsing. The parse position will be advanced to the first character after the matched text. The parser always moves left to right within a line of text.

Attributes

Attribute

Description

offset

The number of characters to skip before searching for the text for the token. The offset can have the following values: 

  • 1 or higher moves the specified number of characters before trying to match the token’s text.

  • 0 searches for text at the current parse position. This is the default value.

  • -1 indicates the token’s text will be matched at the current parse position, but the parse position will not go past the string specified in the termToken attribute, if present.

termToken

A string that indicates parsing should stop for this token. The parse position will be the character after the termToken string. 

The termToken attribute can only be used if the offset attribute is negative one (-1). 

Data

The text to match

Examples
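The following sketches show the three kinds of offset value described above. The labels NAME=, OWNER=, PHONE=, and DEPT= are hypothetical screen labels.

```xml
<!-- offset 0 (the default): search for the label at the current parse position -->
<t>NAME=</t>

<!-- offset 4: move 4 characters, then try to match the label -->
<t offset='4'>OWNER=</t>

<!-- offset -1 with termToken: match PHONE=, but do not go past the DEPT= label -->
<t offset='-1' termToken='DEPT='>PHONE=</t>
```

In each case a successful match leaves the parse position at the first character after the matched text, so a str or int token can follow to capture the value.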