Oracle® Outside In Content Access
Release 8.3.5

Content Description

This chapter discusses tagged content and other content topics.

SCCCA_BEGINTAG/SCCCA_ENDTAG: Tagged Content

The SCCCA_BEGINTAG and SCCCA_ENDTAG content types are used to tag or delimit other content for a particular purpose. This can be especially useful when searching for specific document property values like the author or title of a document. It can also be used to separate subdocument text like headers, footers, and footnotes from the main document text. Tagged text may be nested inside other tagged text, and tags may overlap each other.

Though most tag types are not particularly useful to developers, the Data Access technology provides all of the tag types rather than make a judgment as to usability. Each is briefly described below.

SCCCA_BEGINTAG Content Description

This section lists the applicable parameters and corresponding values.

  • dwType

    • SCCCA_BEGINTAG: Beginning of tagged content

    • SCCCA_ENDTAG: End of tagged content

  • dwSubType: Tag type - see "Tag Types"

  • dwData1: Additional ID - see "Document Property IDs" or "Mail Field IDs"

  • dwData2: Not used

  • dwData3: Reserved

  • dwData4: Reserved

  • pDataBuf: Not used
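Because tagged content may nest and tags may overlap, a single stack of open tags is not enough to know which tags currently apply. The following is a minimal sketch that keeps one open-count per tag type instead. The `ContentItem` struct, the `TagTracker` type, `MAX_TAG_TYPES`, and all numeric constant values are assumptions made for illustration; real code would take the constants from sccca.h and read the fields from the content structure supplied by the technology.

```c
#include <stdint.h>

/* Placeholder values for illustration only; real code should take these
   constants from sccca.h. */
#define SCCCA_BEGINTAG   1u
#define SCCCA_ENDTAG     2u
#define SCCCA_HEADER     10u
#define SCCCA_FOOTER     11u
#define SCCCA_HYPERLINK  12u

#define MAX_TAG_TYPES 64u  /* hypothetical upper bound on tag type values */

/* Hypothetical stand-in holding only the fields described above. */
typedef struct {
    uint32_t dwType;
    uint32_t dwSubType;
    uint32_t dwData1;
} ContentItem;

/* One open-count per tag type. A single stack would assume strict
   nesting, which does not hold when tags overlap. */
typedef struct {
    unsigned depth[MAX_TAG_TYPES];
} TagTracker;

/* Updates the open-counts for a begin or end tag; other items are ignored. */
void tag_tracker_update(TagTracker *t, const ContentItem *item)
{
    if (item->dwSubType >= MAX_TAG_TYPES)
        return;
    if (item->dwType == SCCCA_BEGINTAG)
        t->depth[item->dwSubType]++;
    else if (item->dwType == SCCCA_ENDTAG && t->depth[item->dwSubType] > 0)
        t->depth[item->dwSubType]--;
}

/* Returns nonzero while at least one tag of the given type is open. */
int tag_is_open(const TagTracker *t, uint32_t tagType)
{
    return tagType < MAX_TAG_TYPES && t->depth[tagType] > 0;
}
```

A caller would zero-initialize a `TagTracker`, pass every content item to `tag_tracker_update`, and query `tag_is_open(&t, SCCCA_HEADER)` before deciding whether text belongs to the main document.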

Tag Types

This section lists the applicable values and corresponding descriptions.

  • SCCCA_ALTFONTDATA: Reserved

  • SCCCA_ANNOTATIONREFERENCE: Tags content that references an annotation

  • SCCCA_BOOKMARK: Delimits content tagged as a bookmark

  • SCCCA_CAPTIONTEXT: Tags content that is used as a caption on objects such as tables, equations and figures

  • SCCCA_CHARACTER: Reserved

  • SCCCA_COMPILEDFIELD: Tags content resulting from an application compiling a field code such as a date. The lack of consistent support by applications for this field makes it unreliable as a search property.

  • SCCCA_CONDITIONALSTYLE: Reserved

  • SCCCA_COUNTERFORMAT: Reserved

  • SCCCA_CUSTOMDATAFORMAT: Reserved

  • SCCCA_DATEDEFINITION: Reserved

  • SCCCA_DIAGRAM: Reserved

  • SCCCA_DIAGRAM_*: Reserved

  • SCCCA_DOCUMENTPROPERTY: Tags document property content - see "Document Property IDs"

  • SCCCA_DOCUMENTPROPERTYNAME: Name of a user-defined document property (SCCCA_USERDEFINEDPROP)

  • SCCCA_EMAILFIELD: Tags fields associated with email formats - see "Mail Field IDs"

  • SCCCA_EMAILFIELDNAME: Tags the name of a non-standard email field.

  • SCCCA_EMAILTABLE: Table of email fields

  • SCCCA_ENDNOTEREFERENCE: Tags content that references an endnote

  • SCCCA_FONTANDGLYPHDATA: Tags content that references font or glyph data

  • SCCCA_FOOTER: Delimits content tagged as footer

  • SCCCA_FOOTNOTEREFERENCE: Tags content that references a footnote

  • SCCCA_FRAME: Tags content stored within a frame

  • SCCCA_FRAME_EX: Tags content that references extended frames

  • SCCCA_GENERATEDFIELD: Reserved

  • SCCCA_GENERATOR: Reserved

  • SCCCA_HEADER: Delimits content tagged as header

  • SCCCA_HYPERLINK: Delimits content tagged as a hypertext link

  • SCCCA_INDEX: Reserved

  • SCCCA_INDEXENTRY: Delimits content that should be placed in the index

  • SCCCA_INLINEDATAFORMAT: Reserved

  • SCCCA_LINKEDOBJECT: Tags content referencing a linked object

  • SCCCA_LISTENTRY: Reserved

  • SCCCA_MERGEENTRY: Reserved

  • SCCCA_NAMEDCELLRANGE: Reserved

  • SCCCA_REFERENCEDTEXT: Tags text for later reference

  • SCCCA_SLIDENOTES: Tags content stored in speaker/slide notes in a presentation document

  • SCCCA_SSHEADERFOOTER: Tags content that references headers or footers in spreadsheet files

  • SCCCA_STYLE: Delimits a style definition. Styles may contain text, but typically do not.

  • SCCCA_SUBDOCPROPERTY: Tags metadata associated with a subdocument, such as a comment.

  • SCCCA_SUBDOCTEXT: Delimits content stored in subdocuments like headers, footers, frames and notes.

  • SCCCA_TOA: Reserved

  • SCCCA_TOAENTRY: Reserved

  • SCCCA_TOC: Reserved

  • SCCCA_TOCENTRY: Reserved

  • SCCCA_TOF: Reserved

  • SCCCA_VECTORSAVETAG: Reserved

  • SCCCA_XMPDATA: Document properties parsed out of the XMP data

  • SCCCA_XREF: Reserved

When dwSubType is SCCCA_DOCUMENTPROPERTY, dwData1 will be one of the values listed in the header file sccca.h. The following section, Document Property IDs, lists many of the common document property types. Any content generated between the begin and end tag defines the value of the document property.

When dwSubType is SCCCA_EMAILFIELD, dwData1 will be one of the values in "Mail Field IDs", and any content generated between the begin and end tag defines the value of the email field.
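The begin/end rule for property values can be sketched as a small state machine that collects the text delivered between the two tags. This sketch captures the SCCCA_TITLE property only. The `ContentItem` and `TitleCapture` types and all numeric constant values are hypothetical. It also assumes single-byte output text, that document property tags do not nest inside one another, and that the value arrives as SCCCA_TEXT content.

```c
#include <stdint.h>
#include <string.h>

/* Placeholder values for illustration only; real code should take these
   constants from sccca.h. */
#define SCCCA_TEXT              1u
#define SCCCA_BEGINTAG          2u
#define SCCCA_ENDTAG            3u
#define SCCCA_DOCUMENTPROPERTY  20u
#define SCCCA_TITLE             30u

/* Hypothetical stand-in holding only the fields used here. */
typedef struct {
    uint32_t    dwType;
    uint32_t    dwSubType;
    uint32_t    dwData1;   /* property ID for tags, character count for text */
    const char *pDataBuf;  /* assumed single-byte text in this sketch */
} ContentItem;

typedef struct {
    int    inTitle;     /* nonzero between the title begin and end tags */
    char   value[256];  /* collected value, always NUL-terminated */
    size_t len;
} TitleCapture;

/* Feeds one content item to the capture state. */
void title_capture_feed(TitleCapture *c, const ContentItem *item)
{
    if (item->dwType == SCCCA_BEGINTAG &&
        item->dwSubType == SCCCA_DOCUMENTPROPERTY &&
        item->dwData1 == SCCCA_TITLE) {
        c->inTitle = 1;
        c->len = 0;
        c->value[0] = '\0';
    } else if (item->dwType == SCCCA_ENDTAG &&
               item->dwSubType == SCCCA_DOCUMENTPROPERTY) {
        c->inTitle = 0;
    } else if (item->dwType == SCCCA_TEXT && c->inTitle) {
        /* Append as much as fits, leaving room for the terminator. */
        size_t room = sizeof c->value - 1 - c->len;
        size_t n = item->dwData1 < room ? item->dwData1 : room;
        memcpy(c->value + c->len, item->pDataBuf, n);
        c->len += n;
        c->value[c->len] = '\0';
    }
}
```

The same pattern would apply to SCCCA_EMAILFIELD by comparing dwData1 against one of the mail field IDs instead.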

Document Property IDs

The following is a list of document property IDs.

  • SCCCA_ABSTRACT

  • SCCCA_ACCOUNT

  • SCCCA_ADDRESS

  • SCCCA_APPVERSION

  • SCCCA_ATTACHMENTS

  • SCCCA_AUTHORIZATION

  • SCCCA_BACKUPDATE

  • SCCCA_BASEFILELOCATION

  • SCCCA_BILLTO

  • SCCCA_BLINDCOPY

  • SCCCA_CARBONCOPY

  • SCCCA_CATEGORY

  • SCCCA_CHECKEDBY

  • SCCCA_CLIENT

  • SCCCA_COMPANY

  • SCCCA_COMPLETEDDATE

  • SCCCA_COUNTBYTES

  • SCCCA_COUNTCHARS

  • SCCCA_COUNTCHARSWITHSPACES

  • SCCCA_COUNTLINES

  • SCCCA_COUNTMMCLIPS

  • SCCCA_COUNTNOTES

  • SCCCA_COUNTPAGES

  • SCCCA_COUNTPARAS

  • SCCCA_COUNTSLIDES

  • SCCCA_COUNTSLIDESHIDDEN

  • SCCCA_COUNTWORDS

  • SCCCA_CREATIONDATE

  • SCCCA_DEPARTMENT

  • SCCCA_DESTINATION

  • SCCCA_DISPOSITION

  • SCCCA_DIVISION

  • SCCCA_DOCCOMMENT

  • SCCCA_DOCNUMBER

  • SCCCA_DOCTYPE

  • SCCCA_EDITMINUTES

  • SCCCA_EDITOR

  • SCCCA_FORWARDTO

  • SCCCA_GROUP

  • SCCCA_HEADINGPAIRS

  • SCCCA_KEYWORD

  • SCCCA_LANGUAGE

  • SCCCA_LASTPRINTDATE

  • SCCCA_LASTSAVEDATE

  • SCCCA_LASTSAVEDBY

  • SCCCA_LINKSDIRTY

  • SCCCA_MAILSTOP

  • SCCCA_MANAGER

  • SCCCA_MATTER

  • SCCCA_OFFICE

  • SCCCA_OPERATOR

  • SCCCA_OWNER

  • SCCCA_PRESENTATIONFORMAT

  • SCCCA_PRIMARYAUTHOR

  • SCCCA_PROJECT

  • SCCCA_PUBLISHER

  • SCCCA_PURPOSE

  • SCCCA_RECEIVEDFROM

  • SCCCA_RECORDEDBY

  • SCCCA_RECORDEDDATE

  • SCCCA_REFERENCE

  • SCCCA_REVISIONDATE

  • SCCCA_REVISIONNOTES

  • SCCCA_REVISIONNUMBER

  • SCCCA_SCALECROP

  • SCCCA_SECONDARYAUTHOR

  • SCCCA_SECTION

  • SCCCA_SECURITY

  • SCCCA_SOURCE

  • SCCCA_STATUS

  • SCCCA_SYSTEM_FILECREATED

  • SCCCA_SYSTEM_FILEMODIFIED

  • SCCCA_SYSTEM_FILESIZE

  • SCCCA_SUBJECT

  • SCCCA_TITLE

  • SCCCA_TITLEOFPARTS

  • SCCCA_TYPIST

  • SCCCA_USERDEFINEDPROP

  • SCCCA_VERSIONDATE

  • SCCCA_VERSIONNOTES

  • SCCCA_VERSIONNUMBER

    Note:

    Document Properties with IDs of SCCCA_USERDEFINEDPROP or above are user-defined properties.
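The note above amounts to a simple range check on the property ID. A minimal sketch follows. The value given to SCCCA_USERDEFINEDPROP is a placeholder, not the real one from sccca.h, and the helper name `is_user_defined_property` is hypothetical.

```c
#include <stdint.h>

/* Placeholder value for illustration only; real code should take this
   constant from sccca.h. */
#define SCCCA_USERDEFINEDPROP 1000u

/* Returns nonzero when the ID found in dwData1 denotes a user-defined
   document property rather than one of the predefined IDs. */
int is_user_defined_property(uint32_t propertyId)
{
    return propertyId >= SCCCA_USERDEFINEDPROP;
}
```

For such properties, the property name is delivered separately as content tagged SCCCA_DOCUMENTPROPERTYNAME.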

SCCCA_SUBDOCPROPERTY Document Properties

The following values are properties of SCCCA_SUBDOCPROPERTY:

  • SCCCA_SUBDOC_AUTHOR

  • SCCCA_SUBDOC_CREATEDATE

  • SCCCA_SUBDOC_LASTSAVEDATE

  • SCCCA_SUBDOC_TITLE

  • SCCCA_SUBDOC_NOTES

  • SCCCA_SUBDOC_AUTHORSHORT

Mail Field IDs

  • SCCCA_MAIL_ALTERNATE_RECIPIENT_ALLOWED

  • SCCCA_MAIL_ATTACHMENT

  • SCCCA_MAIL_ATTENDEES

  • SCCCA_MAIL_ATTR_HIDDEN

  • SCCCA_MAIL_ATTR_READONLY

  • SCCCA_MAIL_ATTR_SYSTEM

  • SCCCA_MAIL_AUTO_FORWARDED

  • SCCCA_MAIL_BCC

  • SCCCA_MAIL_CATEGORIES

  • SCCCA_MAIL_CC

  • SCCCA_MAIL_CCME

  • SCCCA_MAIL_CLIENT_SUBMIT_TIME

  • SCCCA_MAIL_COMPANY

  • SCCCA_MAIL_CONVERSATION_INDEX

  • SCCCA_MAIL_CONVERSATION_TOPIC

  • SCCCA_MAIL_CREATION_TIME

  • SCCCA_MAIL_CREATOR_ENTRYID

  • SCCCA_MAIL_CREATOR_NAME

  • SCCCA_MAIL_DEFERRED_DELIVERY_TIME

  • SCCCA_MAIL_DELETE_AFTER_SUBMIT

  • SCCCA_MAIL_EMAIL

  • SCCCA_MAIL_ENTRYID

  • SCCCA_MAIL_EXPIRES

  • SCCCA_MAIL_EXPIRY_TIME

  • SCCCA_MAIL_FLAGSTS

  • SCCCA_MAIL_FROM

  • SCCCA_MAIL_FULLNAME

  • SCCCA_MAIL_HOMEPHONE

  • SCCCA_MAIL_IMPORTANCE

  • SCCCA_MAIL_INET_MAIL_OVERRIDE_FORMAT

  • SCCCA_MAIL_INTERNET_ARTICLE_NUMBER

  • SCCCA_MAIL_INTERNET_CPID

  • SCCCA_MAIL_INTERNET_MESSAGE_ID

  • SCCCA_MAIL_JOBTITLE

  • SCCCA_MAIL_LASTMODIFIED

  • SCCCA_MAIL_LAST_MODIFIER_ENTRYID

  • SCCCA_MAIL_LAST_MODIFIER_NAME

  • SCCCA_MAIL_LATEST_DELIVERY_TIME

  • SCCCA_MAIL_LOCATION

  • SCCCA_MAIL_MESSAGE_CLASS

  • SCCCA_MAIL_MESSAGE_CODEPAGE

  • SCCCA_MAIL_MESSAGE_LOCALE_ID

  • SCCCA_MAIL_MESSAGE_SUBMISSION_ID

  • SCCCA_MAIL_MSGFLAG

  • SCCCA_MAIL_MSG_EDITOR_FORMAT

  • SCCCA_MAIL_NEWSGROUPS

  • SCCCA_MAIL_NORMALIZED_SUBJECT

  • SCCCA_MAIL_NT_SECURITY_DESCRIPTOR

  • SCCCA_MAIL_ORIGINATOR_DELIVERY_REPORT_REQUESTED

  • SCCCA_MAIL_PRIORITY

  • SCCCA_MAIL_PROFILE_CONNECT_FLAGS

  • SCCCA_MAIL_RCVD_BY_FLAGS

  • SCCCA_MAIL_RCVD_REPRESENTING_ADDRTYPE

  • SCCCA_MAIL_RCVD_REPRESENTING_EMAIL_ADDRESS

  • SCCCA_MAIL_RCVD_REPRESENTING_ENTRYID

  • SCCCA_MAIL_RCVD_REPRESENTING_FLAGS

  • SCCCA_MAIL_RCVD_REPRESENTING_NAME

  • SCCCA_MAIL_RCVD_REPRESENTING_SEARCH_KEY

  • SCCCA_MAIL_READ_RECEIPT_REQUESTED

  • SCCCA_MAIL_RECEIVED

  • SCCCA_MAIL_RECEIVED_BY_ADDRTYPE

  • SCCCA_MAIL_RECEIVED_BY_EMAIL_ADDRESS

  • SCCCA_MAIL_RECEIVED_BY_ENTRYID

  • SCCCA_MAIL_RECEIVED_BY_NAME

  • SCCCA_MAIL_RECEIVED_BY_SEARCH_KEY

  • SCCCA_MAIL_RECIPIENT_REASSIGNMENT_PROHIBITED

  • SCCCA_MAIL_REPLY_REQUESTED

  • SCCCA_MAIL_REPLY_TIME

  • SCCCA_MAIL_REPORT_TAG

  • SCCCA_MAIL_RESPONSE_REQUESTED

  • SCCCA_MAIL_RTFBODY

  • SCCCA_MAIL_RTF_IN_SYNC

  • SCCCA_MAIL_RTF_SYNC_BODY_COUNT

  • SCCCA_MAIL_RTF_SYNC_BODY_CRC

  • SCCCA_MAIL_RTF_SYNC_BODY_TAG

  • SCCCA_MAIL_RTF_SYNC_PREFIX_COUNT

  • SCCCA_MAIL_RTF_SYNC_TRAILING_COUNT

  • SCCCA_MAIL_SEARCH_KEY

  • SCCCA_MAIL_SENDER_ADDRTYPE

  • SCCCA_MAIL_SENDER_EMAIL_ADDRESS

  • SCCCA_MAIL_SENDER_ENTRYID

  • SCCCA_MAIL_SENDER_FLAGS

  • SCCCA_MAIL_SENDER_NAME

  • SCCCA_MAIL_SENDER_SEARCH_KEY

  • SCCCA_MAIL_SENSITIVITY

  • SCCCA_MAIL_SENT_REPRESENTING_ADDRTYPE

  • SCCCA_MAIL_SENT_REPRESENTING_EMAIL_ADDRESS

  • SCCCA_MAIL_SENT_REPRESENTING_ENTRYID

  • SCCCA_MAIL_SENT_REPRESENTING_FLAGS

  • SCCCA_MAIL_SENT_REPRESENTING_NAME

  • SCCCA_MAIL_SENT_REPRESENTING_SEARCH_KEY

  • SCCCA_MAIL_SIZE

  • SCCCA_MAIL_SUBJECT

  • SCCCA_MAIL_SUBMITTIME

  • SCCCA_MAIL_TO

  • SCCCA_MAIL_TRANSPORT_MESSAGE_HEADERS

  • SCCCA_MAIL_TRUST_SENDER

  • SCCCA_MAIL_WEBPAGE

  • SCCCA_MAIL_WORKPHONE

SCCCA_COMMENTREFERENCE Content Description

An SCCCA_COMMENTREFERENCE is placed at the actual location of the comment. The body of the comment may appear elsewhere; it is tagged with an SCCCA_BEGINTAG of type SCCCA_SUBDOCTEXT and has the same Id as the SCCCA_COMMENTREFERENCE.

SCCCA_BREAK: Content Breaks

This content type is used internally, and may be ignored.

SCCCA_FILEPROPERTY: File Property Content

Returns the file identification information for a document. This property is generated by the CAReadFirst function.

SCCCA_FILEPROPERTY Content Description

This section lists the applicable parameters and corresponding values.

  • dwType: SCCCA_FILEPROPERTY

  • dwSubType: SCCCA_FILEID

  • dwData1: One of the file identifier values (FI_*) defined in sccfi.h

  • dwData2: The input file's initial character set

  • dwData3: Reserved

  • dwData4: Reserved

  • pDataBuf: Not used

SCCCA_GENERATED: Generated Information

Identical to SCCCA_TEXT, except that the characters come not from the original document, but from some other non-character data (numbers in spreadsheets, dates, etc.). Because the text is not from the original document, the characters do not contribute toward character counts.

SCCCA_GENERATED Content Description

This section lists the applicable parameters and corresponding values.

  • dwType: SCCCA_GENERATED

  • dwSubType: Possible values include the following:

    • SCCCA_DOCUMENTTEXT: Regular document text is returned with this subtype.

    • SCCCA_SPECIALTEXT: Used to return text elements that are manufactured by the technology due to special formatting attributes.

    • SCCCA_REVISIONDELETE: Will be OR-ed with either SCCCA_DOCUMENTTEXT or SCCCA_SPECIALTEXT when text has been deleted from the final version of a document as a result of a revision.

    • SCCCA_URLTEXT: Text for the Link Location part of a URL.

    • SCCCA_XMPMETADATA: Text from embedded XMP metadata.

  • dwData1: Number of characters provided in pDataBuf

  • dwData2: Original character set of the text in pDataBuf

  • dwData3: Reserved

  • dwData4: Reserved

  • pDataBuf: Text buffer. Filled with one or more single- or double-byte characters.
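Since generated text does not contribute toward character counts, an indexer that records character positions would advance its position for SCCCA_TEXT but not for SCCCA_GENERATED. The sketch below illustrates only that rule. The `ContentItem` struct and the numeric constant values are assumptions, and the sketch counts every SCCCA_TEXT item because this section does not say whether some tagged text should be excluded as well.

```c
#include <stdint.h>

/* Placeholder values for illustration only; real code should take these
   constants from sccca.h. */
#define SCCCA_TEXT       1u
#define SCCCA_GENERATED  4u

/* Hypothetical stand-in holding only the fields used here. */
typedef struct {
    uint32_t dwType;
    uint32_t dwData1;  /* number of characters in pDataBuf */
} ContentItem;

/* Returns the running character position after consuming one item.
   Only SCCCA_TEXT advances the position; SCCCA_GENERATED and all other
   content types leave it unchanged. */
uint32_t advance_char_position(uint32_t pos, const ContentItem *item)
{
    if (item->dwType == SCCCA_TEXT)
        return pos + item->dwData1;
    return pos;
}
```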

SCCCA_OBJECT: SubObjects

This content type is provided to allow the developer to access the content of SubObjects, like embedded graphics or objects in an archive. The SubObject can then be opened by DAOpenDocument, filling the IOSPECSUBOBJECT or the IOSPECARCHIVEOBJECT parameter with the values described below.

SCCCA_OBJECT Content Description

  • dwType: SCCCA_OBJECT

  • dwSubType: Set to SCCCA_EMBEDDEDOBJECT (0) if the SubObject is an embedding, or to the type of node if the object is from an archive. Possible values include the following:

    • SCCCA_EMBEDDEDOBJECT

    • SCCCA_ARCHIVEITEMCONTAINER

    • SCCCA_COMPRESSEDFILE

    • SCCCA_MESSAGE

    • SCCCA_CONTACT

    • SCCCA_CALENDARENTRY

    • SCCCA_NOTE

    • SCCCA_TASK

    • SCCCA_JOURNALENTRY

    • SCCCA_ATTACHMENT

  • dwData1: The internal SubObject identifier or a node identifier.

  • dwData2: Stream identifier for an alternate graphic.

  • dwData3: Stream identifier for an OLE object if one exists. Otherwise, it is CA_INVALIDITEM.

  • dwData4: Reserved

  • pDataBuf: Not used
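The field rules above can be summarized in a small helper that separates embeddings from archive nodes and checks whether an OLE stream is present. This is a sketch under stated assumptions: `ContentItem` and `SubObjectInfo` are hypothetical types, and the values of SCCCA_OBJECT and CA_INVALIDITEM are placeholders. Only the value 0 for SCCCA_EMBEDDEDOBJECT comes from the text.

```c
#include <stdint.h>

/* Placeholder values for illustration only, except SCCCA_EMBEDDEDOBJECT,
   whose value 0 is stated in the text. Real code should use the headers. */
#define SCCCA_OBJECT          7u
#define SCCCA_EMBEDDEDOBJECT  0u
#define CA_INVALIDITEM        0xFFFFFFFFu

/* Hypothetical stand-in holding only the fields used here. */
typedef struct {
    uint32_t dwType;
    uint32_t dwSubType;
    uint32_t dwData1;
    uint32_t dwData2;
    uint32_t dwData3;
} ContentItem;

typedef struct {
    int      fromArchive;   /* nonzero when dwSubType is an archive node type */
    uint32_t id;            /* SubObject identifier or node identifier */
    int      hasOleStream;  /* nonzero when dwData3 is a valid stream ID */
    uint32_t oleStreamId;   /* meaningful only when hasOleStream is set */
} SubObjectInfo;

/* Fills *out from an SCCCA_OBJECT item. Returns 0 for any other type. */
int describe_subobject(const ContentItem *item, SubObjectInfo *out)
{
    if (item->dwType != SCCCA_OBJECT)
        return 0;
    out->fromArchive  = item->dwSubType != SCCCA_EMBEDDEDOBJECT;
    out->id           = item->dwData1;
    out->hasOleStream = item->dwData3 != CA_INVALIDITEM;
    out->oleStreamId  = out->hasOleStream ? item->dwData3 : 0;
    return 1;
}
```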

SCCCA_SHEET: Sheet Names

This content type contains only the sheet name (worksheet in a spreadsheet, slide in presentation, etc.). This content is not optional. It is always created if the information is present. Of course, the client can ignore this text when it is returned.

SCCCA_SHEET Content Description

This section lists the applicable parameters and corresponding values.

  • dwType: SCCCA_SHEET

  • dwSubType: Reserved

  • dwData1: The length of the name in pDataBuf in characters.

  • dwData2: The original character set of the name in pDataBuf.

  • dwData3: Reserved

  • dwData4: Reserved

  • pDataBuf: Points to the sheet name in whatever output character set has been requested.

SCCCA_STYLECHANGE: Style Information

The SCCCA_STYLECHANGE content type is used to indicate changes in style information. This style information can be used to delimit particularly interesting content.

SCCCA_STYLECHANGE Content Description

This section lists the applicable parameters and corresponding values.

  • dwType: SCCCA_STYLECHANGE

  • dwSubType: Possible values include the following:

    • SCCCA_PARASTYLE: pDataBuf indicates the name of the style.

    • SCCCA_HEIGHTANDSPACING: When dwSubType is SCCCA_HEIGHTANDSPACING, dwData1 can be SCCCA_HEIGHT (dwData2 represents the new character height), SCCCA_SPACING (dwData3 represents the new line spacing) or both of these values OR-ed together.

    • SCCCA_INDENTS: When dwSubType is SCCCA_INDENTS, dwData1 can be SCCCA_LEFTINDENT (dwData2 represents the left indent), SCCCA_RIGHTINDENT (dwData3 represents the right indent), SCCCA_FIRSTINDENT (dwData4 represents the first line indent), or any of these values OR-ed together.

    • SCCCA_OCE: This subtype provides information about the original character sets of the characters that follow. dwData1 represents the character set as defined in vtchars.h.

  • dwData1: Depends on the value of dwSubType.

  • dwData2: Depends on the value of dwSubType.

  • dwData3: Depends on the value of dwSubType.

  • dwData4: Depends on the value of dwSubType.

  • pDataBuf: Text buffer. Filled with one or more single- or double-byte characters.

  • dwDataBufSize: Size of pDataBuf, in bytes.
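For the subtypes whose dwData1 is a set of OR-ed flags, each flag indicates which of dwData2 through dwData4 carries a new value. The following sketch decodes SCCCA_INDENTS; SCCCA_HEIGHTANDSPACING would follow the same pattern with its two flags. The `ContentItem` and `IndentChange` types and all numeric constant values are assumptions for illustration, and the units of the indent values are not stated in this section.

```c
#include <stdint.h>

/* Placeholder values for illustration only; real code should take these
   constants from sccca.h. */
#define SCCCA_STYLECHANGE  8u
#define SCCCA_PARASTYLE    1u
#define SCCCA_INDENTS      3u
#define SCCCA_LEFTINDENT   0x1u
#define SCCCA_RIGHTINDENT  0x2u
#define SCCCA_FIRSTINDENT  0x4u

/* Hypothetical stand-in holding only the fields used here. */
typedef struct {
    uint32_t dwType;
    uint32_t dwSubType;
    uint32_t dwData1;
    uint32_t dwData2;
    uint32_t dwData3;
    uint32_t dwData4;
} ContentItem;

typedef struct {
    int      hasLeft, hasRight, hasFirst;  /* which indents changed */
    uint32_t left, right, first;           /* valid only when flagged */
} IndentChange;

/* Decodes an SCCCA_INDENTS style change. Returns 0 for anything else. */
int decode_indents(const ContentItem *item, IndentChange *out)
{
    if (item->dwType != SCCCA_STYLECHANGE ||
        item->dwSubType != SCCCA_INDENTS)
        return 0;
    out->hasLeft  = (item->dwData1 & SCCCA_LEFTINDENT)  != 0;
    out->hasRight = (item->dwData1 & SCCCA_RIGHTINDENT) != 0;
    out->hasFirst = (item->dwData1 & SCCCA_FIRSTINDENT) != 0;
    out->left  = out->hasLeft  ? item->dwData2 : 0;
    out->right = out->hasRight ? item->dwData3 : 0;
    out->first = out->hasFirst ? item->dwData4 : 0;
    return 1;
}
```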

SCCCA_TEXT: Text Content

This content type denotes document text, including special characters such as page breaks and tabs.

The technology guarantees that the text generated by the Content Access technology is identical to the text generated by the Outside In Viewer technology raw-text feature. This allows character counts generated at indexing time using Content Access to be directly mapped to viewer positions at viewing time for search-hit highlighting. However, Content Access has abilities beyond the raw-text feature of the Viewer, such as the ability to retrieve non-visible text such as document properties and hidden text, and the ability to retrieve text from embedded documents.

When the output character set is DBCS or Unicode, the character count will not be the same as the buffer byte count because these character sets may generate more than one byte per character. The byte ordering used for multi-byte character sets such as these will be system-dependent; on a computer using an Intel processor, the low byte will be first.
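The difference between character count and byte count can be sketched for the Unicode case. The sketch assumes that Unicode output uses exactly two bytes per character in the system's native byte order. It does not cover DBCS, where the number of bytes per character can vary. The helper names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bytes occupied by `chars` characters, assuming 2-byte Unicode output. */
size_t unicode_byte_count(uint32_t chars)
{
    return (size_t)chars * 2;
}

/* Reads the i-th 16-bit character from a text buffer in native byte
   order. memcpy avoids alignment problems with arbitrary buffers. */
uint16_t unicode_char_at(const void *buf, size_t i)
{
    uint16_t unit;
    memcpy(&unit, (const unsigned char *)buf + i * 2, sizeof unit);
    return unit;
}
```

A caller that receives a character count in dwData1 would size any copy of the buffer with `unicode_byte_count(dwData1)` rather than with the character count itself.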

It is important to note that generated numeric data fields, such as date, time, and spreadsheet numbers, are not included in the content returned by SCCCA_TEXT. For information on how such text can be returned by Content Access, see "SCCCA_GENERATED: Generated Information".

SCCCA_TEXT Content Description

This section lists the applicable parameters and corresponding values.

  • dwType: SCCCA_TEXT

  • dwSubType: One of the following values:

    • SCCCA_DOCUMENTTEXT: Regular document text is returned with this subtype.

    • SCCCA_SPECIALTEXT: Used to return text elements that are manufactured by the technology due to special formatting attributes.

    SCCCA_DOCUMENTTEXT or SCCCA_SPECIALTEXT can be optionally OR-ed with any of the following to specify the type of text to be returned:

    • SCCCA_ALLCAPS

    • SCCCA_BOLD

    • SCCCA_DUNDERLINE

    • SCCCA_HIDDEN

    • SCCCA_ITALIC

    • SCCCA_OUTLINE

    • SCCCA_REVISIONDELETE: Text that has been deleted from the final version of a document as a result of a revision.

    • SCCCA_REVISIONADD: Text that has been added to the final version of a document as a result of a revision.

    • SCCCA_SMALLCAPS

    • SCCCA_STRIKEOUT

    • SCCCA_UNDERLINE

    • SCCCA_UNKNOWNMAP: This flag is set when PDF files don't contain a ToUnicode map. This indicates that the mappings may or may not be correct.

  • dwData1: Number of characters provided in pDataBuf

  • dwData2: Original character set of the text in pDataBuf

  • dwData3: Reserved

  • dwData4: Reserved

  • pDataBuf: Text buffer. Filled with one or more single- or double-byte characters.
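Because the attribute values are OR-ed into dwSubType, a client tests them with bit masks. The sketch below shows one possible indexing policy that skips hidden text and text deleted by a revision. The policy itself, the helper names, and every numeric constant value are assumptions for illustration; the real bit assignments are defined in sccca.h.

```c
#include <stdint.h>

/* Placeholder bit values for illustration only; real code should take
   these constants from sccca.h. */
#define SCCCA_DOCUMENTTEXT    0x0001u
#define SCCCA_SPECIALTEXT     0x0002u
#define SCCCA_BOLD            0x0010u
#define SCCCA_HIDDEN          0x0020u
#define SCCCA_REVISIONDELETE  0x0040u

/* Returns nonzero when the given attribute flag is set in dwSubType. */
int text_has_attribute(uint32_t dwSubType, uint32_t flag)
{
    return (dwSubType & flag) != 0;
}

/* Example policy: index text unless it is hidden or was deleted from
   the final version of the document by a revision. */
int text_is_indexable(uint32_t dwSubType)
{
    return (dwSubType & (SCCCA_HIDDEN | SCCCA_REVISIONDELETE)) == 0;
}
```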

Special Text Character Substitutions

  • Email Delimiter: 0x09

  • End of Database Record: 0x0A

  • End of File: 0x0D

  • End of Paragraph: 0x0D

  • End of Table Cell: 0x0D

  • End of Table Row: 0x0D

  • Hard Hyphen: 0x2D

  • Hard Line Break: 0x0A

  • Hard Page Break: 0x0C

  • Hard Space: 0x20

  • Implied Space: 0x20

  • Section Separator: 0x0D

  • Syllable Hyphen: 0x2D

  • Tab: 0x09
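Several structural breaks in the list above share one code: 0x0D marks the end of a paragraph, table cell, table row, section, and file. A client therefore cannot tell these apart from the character alone. The sketch below only normalizes breaks for plain-text display by turning 0x0D and 0x0C into newlines. It assumes single-byte output text, and the helper name is hypothetical.

```c
#include <stddef.h>

/* Rewrites paragraph-style breaks (0x0D) and hard page breaks (0x0C)
   as '\n' in place. Hard line breaks (0x0A) are already newlines, and
   tabs (0x09) are left untouched. Assumes single-byte text. */
void normalize_breaks(char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)buf[i];
        if (c == 0x0D || c == 0x0C)
            buf[i] = '\n';
    }
}
```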

SCCCA_TREENODELOCATOR: Tree Node Locator

This content type contains information to be used in the SOTREENODELOCATOR structure, which is used by DAOpenRandomTreeRecord and DASaveRandomTreeRecord.

SCCCA_TREENODELOCATOR Content Description

  • dwType: SCCCA_TREENODELOCATOR

  • dwSubType: Reserved

  • dwData1: SOTREENODELOCATOR.dwSpecialFlags

  • dwData2: SOTREENODELOCATOR.dwData1

  • dwData3: SOTREENODELOCATOR.dwData2

  • dwData4: Reserved

  • pDataBuf: Not used
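The field mapping above can be sketched as a direct copy from the content item into a locator. `TreeNodeLocator` is a hypothetical stand-in that contains only the three members named in the list. The real SOTREENODELOCATOR structure is declared in the product headers and may have additional members. `ContentItem` and the value of SCCCA_TREENODELOCATOR are likewise assumptions.

```c
#include <stdint.h>

/* Placeholder value for illustration only; real code should take this
   constant from sccca.h. */
#define SCCCA_TREENODELOCATOR 9u

/* Hypothetical stand-in holding only the fields used here. */
typedef struct {
    uint32_t dwType;
    uint32_t dwData1;
    uint32_t dwData2;
    uint32_t dwData3;
} ContentItem;

/* Hypothetical stand-in for SOTREENODELOCATOR, limited to the members
   named in the list above. */
typedef struct {
    uint32_t dwSpecialFlags;
    uint32_t dwData1;
    uint32_t dwData2;
} TreeNodeLocator;

/* Copies the locator fields out of an SCCCA_TREENODELOCATOR item.
   Returns 0 and leaves *loc untouched for any other content type. */
int locator_from_content(const ContentItem *item, TreeNodeLocator *loc)
{
    if (item->dwType != SCCCA_TREENODELOCATOR)
        return 0;
    loc->dwSpecialFlags = item->dwData1;
    loc->dwData1        = item->dwData2;
    loc->dwData2        = item->dwData3;
    return 1;
}
```

Note the offset by one: the content item's dwData2 and dwData3 map to the locator's dwData1 and dwData2.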