5 Searching Documents
5.1 Searching
A developer starts a text search by calling the Search method (see Search). Its parameters specify the search direction, case sensitivity, and starting location, and can optionally display a default search dialog for entering the search text. The method returns 0 if the text is found, 1 if the end of the file (EOF) was reached before the text was found, and -1 if an error occurred. When the search succeeds, the located text is highlighted and scrolled into the viewing window. All properties related to text selection are updated, and the caret is placed at the beginning of the highlighted text.
SearchNext (see SearchNext) will continue (in either the same or a different direction) to look for the next occurrence of the text specified in the previous Search method. The return values and behavior are the same as for the Search method.
Private Sub Search_Ctrl_Click()
    Dim ret As Integer

    ' If the search text has changed, call the Search method;
    ' otherwise call SearchNext.
    If LastSearchText = SearchText_Ctrl.Text Then
        ret = oixctrl1.SearchNext(SCCVW_SEARCHFORWARD)   ' search forward
    Else
        ' Search for the text from the SearchText edit box control with
        ' no dialog, case sensitivity, from the current caret position,
        ' searching forward.
        ret = oixctrl1.Search(OIX_FALSE, SearchText_Ctrl.Text, _
            SCCVW_SEARCHCASE, SCCVW_SEARCHCURRENT, SCCVW_SEARCHFORWARD)
    End If

    If ret = 0 Then
        ' Change the search button to a "search next" button and
        ' save the text used in the search.
        Search_Ctrl.Caption = "Search Next"
        LastSearchText = SearchText_Ctrl.Text
    ElseIf ret = 1 Then
        ' EOF reached. Display a message and start over from the top.
        MsgBox "End of File Reached - wrapping to beginning of document"
        ret = oixctrl1.Search(OIX_FALSE, SearchText_Ctrl.Text, _
            SCCVW_SEARCHCASE, SCCVW_SEARCHTOP, SCCVW_SEARCHFORWARD)
    End If
End Sub

Private Sub SearchText_Ctrl_Change()
    ' When the text in the search text edit box changes, forget the
    ' last search and reset the button caption to "Search".
    LastSearchText = ""
    Search_Ctrl.Caption = "Search"
End Sub
5.2 Positions
The ActiveX control uses the concept of a position to specify locations within a file. Each position is a placeholder or bookmark into the currently viewed document. It is important to understand how positions are manipulated to be able to use some of the more powerful features of the Outside In control. For example, a number of the annotation related methods/properties work with positions to specify the location in the file for the annotation.
A position is a COM object that both Visual Basic and Visual C++ refer to as an OixPos, and it is easily manipulated in both environments. One rule of thumb applies when working with an OixPos: when passing it to a method, pass it as an object (or by value). That is, always create a new OixPos before sending it to a method. OixPos properties, by contrast, hold a reference to an OixPos object.
For the currently viewed document there is always a caret position defined. The caret position is the location on the screen where the cursor is displayed and is stored relative to the beginning of the file. The caret position can only be manipulated by the user through the use of the mouse and/or keyboard and may appear anywhere within the displayed text of the document.
There are four methods available to set an OixPos object/variable:
- SetPositionToCurrent (see SetPositionToCurrent) sets the position object to the location of the caret position.
- If the user has selected an area of text, SetPositionToSelection (see SetPositionToSelection) sets two parameters to the starting and ending positions of the currently selected text.
- You can also set the position directly using the SetActualCount method (see SetActualCount). This method sets a position relative to the number of characters from the beginning of the document. Because different document types have different notions of page layout, this method is mostly used with word-processing documents. For example, the number of characters into a spreadsheet and a presentation can take on completely different meanings.
- The last method to set an OixPos object is FindPosition (see FindPosition). FindPosition takes a starting position and a technique for locating the resulting OixPos object. The found position can be any of the following:
  - First position in the current page (starting position is ignored)
  - Last position in the current page (starting position is ignored)
  - The position just to the left of the starting position
  - The position just to the right of the starting position
  - Start of the current selection (starting position is ignored)
  - End of the current selection (starting position is ignored)
  - Location at the top/left of the viewing window (starting position is ignored)
  - Location at the bottom/right of the viewing window (starting position is ignored)
  - Beginning of the line which contains the starting position
  - End of the line which contains the starting position
  - Beginning of the previous line (relative to the starting position)
  - End of the previous line (relative to the starting position)
  - Beginning of the word to the left of the starting position
  - Beginning of the next word to the right of the starting position
  - Beginning of the section to the left of the starting position
  - Beginning of the next section to the right of the starting position
Four methods can be used to manipulate or access the position objects.
CopyPosition and ComparePositions (see CopyPosition and ComparePositions) both take two OixPos objects as parameters. CopyPosition copies the location information from one to the other; ComparePositions reports which one is closer to the beginning of the file. GetActualCount (see GetActualCount) is the counterpart to the SetActualCount method (see SetActualCount) and returns the number of characters between the beginning of the file and the position specified by the OixPos object.
The DisplayPosition (see DisplayPosition) method will redraw the currently viewed document relative to the OixPos parameter. The following flags determine where the position will be located relative to the viewing window.
- Top of the view window
- Middle of the view window
- Bottom of the view window
The following example illustrates the use of these methods.
Private Sub Annotate_Ctrl_Click()
    Dim oixStart As New OixPos
    Dim oixEnd As New OixPos

    ' Check to see if any text is selected. If so, set oixStart and
    ' oixEnd to the selected text. If not, set oixStart to the current
    ' caret position and oixEnd to the beginning of the next word.
    If Selection_Ctrl.Caption <= 0 Then
        oixctrl1.SetPositionToCurrent oixStart
        oixctrl1.FindPosition oixStart, oixEnd, 14   ' end = beginning of the next word
        oixctrl1.FindPosition oixEnd, oixStart, 13   ' start = beginning of the current word
        oixctrl1.SelectionAnchor = oixStart
        oixctrl1.SelectionEnd = oixEnd
    Else
        oixctrl1.SetPositionToSelection oixStart, oixEnd
        ' Swap the start and end positions if oixEnd is closer to the
        ' top of the document than oixStart (assumes ComparePositions
        ' returns a positive value when its first argument follows
        ' the second).
        If oixctrl1.ComparePositions(oixStart, oixEnd) > 0 Then
            Dim oixTemp As New OixPos
            oixctrl1.CopyPosition oixTemp, oixStart
            oixctrl1.CopyPosition oixStart, oixEnd
            oixctrl1.CopyPosition oixEnd, oixTemp
        End If
    End If

    ' 0 = hilite, one of the lType values handled by AnnotateSelection
    AnnotateSelection 0, oixStart, oixEnd
    oixctrl1.DisplayPosition oixStart, 1   ' reposition text near the top of viewer
End Sub
5.3 Annotations
Outside In provides a powerful way to bookmark or annotate selected areas of text and to locate previously defined annotations, bookmarks, or URLs. Annotations, document-defined bookmarks, and URLs are treated as individual annotation types. Document-defined bookmarks and URLs are annotated automatically based on the information present in the file being processed. Annotations are user-defined and must be created programmatically using the Annotation API. There are three types of annotations: hilited (text is highlighted), hidden (text is hidden), and picture (a picture is inserted into the document).
To hilite or annotate a block of text, two OixPos objects are needed: the starting position and the ending position. These can be obtained by retrieving the user's current text selection, by using position methods to locate text, or by using the Search method (see Search). Given two position objects, the text can be hilited using the AddAnnotationHilite method or hidden using the AddAnnotationHideText method (see AddAnnotationHilite and AddAnnotationHideText).
Another way to add an annotation is to insert a picture as an annotation. This type of annotation will be inserted at the designated OixPos position when the AddAnnotationPicture method (see AddAnnotationPicture) is called with a picture object as a parameter. This picture will be displayed inline as the document is viewed.
Private Sub AnnotateSelection(lType As Integer, _
        Optional ByRef oixStart As OixPos, _
        Optional ByRef oixEnd As OixPos)
    ' Create an annotation associated with the selection between
    ' oixStart and oixEnd, based on the type requested in the
    ' lType parameter.
    Dim Text As String

    Select Case lType
        Case 0   ' Hilite
            Text = GetAnnotation(oixStart, annotateid)
            oixctrl1.AddAnnotationHilite annotateid, 1, SCCVW_ANNOTATION_SCLICK, _
                0, 0, Text, oixStart, oixEnd
        Case 1   ' Hide Text
            oixctrl1.AddAnnotationHideText annotateid, 0, oixStart, oixEnd
        Case 2   ' Picture
            Dim P As New StdPicture
            CommonDialog1.ShowOpen
            Set P = LoadPicture(CommonDialog1.FileName)
            oixctrl1.AddAnnotationPicture annotateid, P, 2, 0, oixStart
    End Select

    annotateid = annotateid + 1
End Sub

Private Function GetAnnotation(ByRef Oix As OixPos, annotate As Integer) As String
    ' AnnotationText_Form is a form with a Title text field, an
    ' Annotation Text control, and a message text control. It is
    ' "popped up" with a Show method to collect annotation text.
    AnnotationText_Form.Left = oixctrl1.Left + 0.25 * oixctrl1.Width + Form1.Left
    AnnotationText_Form.Top = oixctrl1.Top + 0.4 * oixctrl1.Height + Form1.Top
    AnnotationText_Form.Title.Text = "Annotation #: " + Str(annotate)
    AnnotationText_Form.AnnotationText_Ctrl.Text = ""
    AnnotationText_Form.Show 1, Form1
    GetAnnotation = AnnotationText_Form.AnnotationText_Ctrl.Text
End Function
When adding an annotation, one of the parameters to the AddAnnotationHilite method (see AddAnnotationHilite) describes the user interaction that will trigger an AnnotationEvent event (see AnnotationEvent). The possible interactions are single-click, double-click, and cursor-over. When one of these actions occurs, the action type and the annotation information are passed to the AnnotationEvent event handler. The developer can then use this information to display additional annotation information.
Another parameter used when adding an annotation is an annotation style. Annotation styles are defined programmatically using the HiliteStyle method (see HiliteStyle), and they must be defined before annotations are added. Each style is uniquely identified by an ID and defines the way the viewer hilites the annotated text: foreground color, background color, and font attributes can all be defined. Once defined, the unique style identifier can be passed as an argument to the AddAnnotationHilite method (see AddAnnotationHilite). Once a style has been defined and associated with an ID, that ID cannot be reused or reassigned until the currently loaded file changes.
Private Sub Form_Load()
    Dim fg As OLE_COLOR
    Dim bg As OLE_COLOR

    fg = RGB(255, 255, 255)
    bg = RGB(128, 128, 128)
    oixctrl1.HiliteStyle 1, 3, fg, bg, 0
End Sub

Private Sub oixctrl1_AnnotationEvent(ByVal lEvent As Long, ByVal varData As Variant)
    ' If the user single-clicks on the annotation, pop up the
    ' annotation form.
    If lEvent And SCCVW_ANNOTATION_SCLICK Then
        ShowAnnotation oixctrl1.AnnotationDataType, oixctrl1.AnnotationId, varData
    End If
End Sub

Private Sub ShowAnnotation(lType As Long, annotate As Long, data As Variant)
    ' Display annotation text. Called from AnnotationEvent and from
    ' the AnnotationListBox_Ctrl_DblClick event.
    Select Case lType
        Case 0   ' User annotation
            AnnotationText_Form.Title.Text = "Annotation #: " + Str(annotate) + "(" + Hex(annotate) + ")"
        Case 1   ' URL
            AnnotationText_Form.Title.Text = "URL: " + Str(annotate) + "(" + Hex(annotate) + ")"
        Case 2   ' Bookmark
            AnnotationText_Form.Title.Text = "Bookmark: " + Str(annotate) + "(" + Hex(annotate) + ")"
    End Select

    AnnotationText_Form.AnnotationPict.Visible = False
    AnnotationText_Form.AnnotationText_Ctrl.Visible = True
    AnnotationText_Form.AnnotationText_Ctrl.Text = data
    AnnotationText_Form.Left = oixctrl1.Left + 0.25 * oixctrl1.Width + Form1.Left
    AnnotationText_Form.Top = oixctrl1.Top + 0.4 * oixctrl1.Height + Form1.Top
    AnnotationText_Form.Show 0, Form1
End Sub
Annotations can be manipulated using the ClearAnnotations, FindAnnotation, and GoToAnnotation methods (see ClearAnnotations, FindAnnotation, and GoToAnnotation). Each of these methods takes a parameter that allows the developer to select multiple annotations using an ID mask. An annotation matches if the logical "and" of the annotation ID and the ID mask is equal to the ID mask. For example, if the ID mask is 225 (11100001), the annotation ID 227 (11100011) matches, but the annotation ID 226 (11100010) does not. Calling the ClearAnnotations method (see ClearAnnotations) removes all matching annotations.
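The matching rule can be checked with a few lines of plain Visual Basic. This is only a sketch of the bitwise test described above, not a call into the control; the names MatchesIdMask and DemoIdMask are hypothetical.

```vb
' Returns True when every bit set in idMask is also set in
' annotationId, i.e. (annotationId And idMask) = idMask.
' Hypothetical helper; it does not call the Outside In control.
Private Function MatchesIdMask(ByVal annotationId As Long, _
        ByVal idMask As Long) As Boolean
    MatchesIdMask = ((annotationId And idMask) = idMask)
End Function

Private Sub DemoIdMask()
    ' 227 And 225 = 225, so ID 227 matches the mask 225.
    Debug.Print MatchesIdMask(227, 225)   ' prints True
    ' 226 And 225 = 224, so ID 226 does not match.
    Debug.Print MatchesIdMask(226, 225)   ' prints False
End Sub
```

Because a mask matches any ID that contains all of its bits, a mask of 0 matches every annotation ID under this rule.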
Existing annotations are located using the FindAnnotation and GoToAnnotation methods (see FindAnnotation and GoToAnnotation). Both methods locate user-defined annotations, document-defined bookmarks, and URLs. Both can start from the beginning or end of the document or look for the next or previous annotation, and both match annotations against an ID mask as described above. The main difference between the two is what they update. FindAnnotation updates the read-only properties but does not scroll the view window to bring the annotation into view. GoToAnnotation scrolls the view window to display the matched annotation but does not update the read-only property values. These can be updated manually by calling the GetAnnotationData method (see GetAnnotationData) with the annotation ID and type (annotation, bookmark, or URL).
One final method copies the OixPos position of the last found annotation to the caret position. If the AnnotationSetPos method (see AnnotationSetPos) is passed a TRUE value, the text associated with the annotation is selected. Otherwise, the caret is moved to the beginning of the annotation and the view is updated.
The Annotation methods and the Annotation event populate the following read-only properties. AnnotationId contains the last used annotation ID (see AnnotationId). AnnotationStartPos and AnnotationEndPos store the OixPos objects for the last used annotation (see AnnotationStartPos and AnnotationEndPos). Each annotation can have additional data associated with it. The data and its type for the last used annotation are stored in the AnnotationData and AnnotationDataType read-only properties (see AnnotationData). For document-defined bookmarks and URLs, the AnnotationData property should be interpreted as a BSTR object.
It should be noted that all annotations are volatile. Once a new file is loaded, the annotation information is cleared. Therefore, the developer should provide a mechanism to save and restore annotations each time a document is viewed.
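One possible mechanism is to record each annotation's ID, character counts, and text in a small file and read the records back when the document is reopened. The sketch below is only an illustration and is not part of the Outside In API: the SavedAnnotation type, its fields, and both procedure names are hypothetical, and it assumes the start and end character counts were captured (for example with GetActualCount) when each annotation was created.

```vb
' Hypothetical record describing one user-defined annotation.
Private Type SavedAnnotation
    Id As Long            ' annotation ID
    StartCount As Long    ' character count of the start position
    EndCount As Long      ' character count of the end position
    Note As String        ' annotation text
End Type

' Writes the first count records of items() to a text file.
Private Sub SaveAnnotations(ByVal path As String, _
        items() As SavedAnnotation, ByVal count As Long)
    Dim f As Integer
    Dim i As Long

    f = FreeFile
    Open path For Output As #f
    Write #f, count
    For i = 0 To count - 1
        Write #f, items(i).Id, items(i).StartCount, _
            items(i).EndCount, items(i).Note
    Next i
    Close #f
End Sub

' Reads the records back into items() and returns how many were read.
Private Function LoadAnnotations(ByVal path As String, _
        items() As SavedAnnotation) As Long
    Dim f As Integer
    Dim i As Long
    Dim count As Long

    f = FreeFile
    Open path For Input As #f
    Input #f, count
    If count > 0 Then ReDim items(0 To count - 1)
    For i = 0 To count - 1
        Input #f, items(i).Id, items(i).StartCount, _
            items(i).EndCount, items(i).Note
    Next i
    Close #f
    LoadAnnotations = count
End Function
```

After loading, each record could be turned back into a hilite by calling SetActualCount on two new OixPos objects and then AddAnnotationHilite. Write # and Input # handle the quoting of the text field, but annotation text that itself contains double quotation marks would need to be escaped first.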
Private Sub ListAnnotations_Click()
    ' Use the FindAnnotation method to enumerate through the document
    ' and populate the AnnotationListBox_Ctrl list box.
    Dim currentOix As New OixPos
    Dim Text As String

    oixctrl1.SetActualCount currentOix, 0   ' initialize currentOix to top of file
    While (oixctrl1.FindAnnotation(3, 0, 0, currentOix))
        oixctrl1.CopyPosition currentOix, oixctrl1.AnnotationEndPos
        Select Case oixctrl1.AnnotationDataType
            Case 0   ' Annotation
                Text = "AN:"
            Case 1   ' URL
                Text = "URL:"
            Case 2   ' Bookmark
                Text = "BM:"
        End Select
        AnnotationListBox_Ctrl.AddItem Text + Str(oixctrl1.AnnotationId)
        AnnotationListBox_Ctrl.ItemData(AnnotationListBox_Ctrl.NewIndex) = oixctrl1.AnnotationId
    Wend
End Sub

Private Sub AnnotationListBox_Ctrl_Click()
    ' On a single click of the annotation list box, scroll the
    ' document to the selected annotation.
    If (oixctrl1.GoToAnnotation(0, 1, _
            AnnotationListBox_Ctrl.ItemData(AnnotationListBox_Ctrl.ListIndex)) = 0) Then
        MsgBox "Annotation not found"
    Else
        oixctrl1.AnnotationSetPos 0   ' set the caret position to the annotation
        oixctrl1.SetFocus             ' reset the focus to the viewer so arrow keys work
    End If
End Sub

Private Sub AnnotationListBox_Ctrl_DblClick()
    ' On a double click of the annotation list box, display the data
    ' of the annotation in the annotation data form.
    oixctrl1.FindAnnotation 0, 2, _
        AnnotationListBox_Ctrl.ItemData(AnnotationListBox_Ctrl.ListIndex), 0
    ShowAnnotation oixctrl1.AnnotationDataType, _
        AnnotationListBox_Ctrl.ItemData(AnnotationListBox_Ctrl.ListIndex), _
        oixctrl1.AnnotationData
End Sub

Private Sub AnnotationListBox_Ctrl_KeyDown(KeyCode As Integer, Shift As Integer)
    ' If the delete key is pressed while an annotation is selected in
    ' the annotation list box, delete it using ClearAnnotations.
    If KeyCode = vbKeyDelete Then
        oixctrl1.ClearAnnotations 3, _
            AnnotationListBox_Ctrl.ItemData(AnnotationListBox_Ctrl.ListIndex)
    End If
End Sub
5.4 Raw Text
Often used in conjunction with the Annotation API, the Raw Text methods, events and properties allow the developer to programmatically access the data in the viewer as if it were all text. For example, a routine could be written to search through the raw text as the document was being loaded and add an annotation for each occurrence of a given character string. When the document is viewed, each matched character string would be hilited.
To enable raw text processing, the SystemRawText property (see SystemRawText) must be set to TRUE, and a RawTextEvent event handler must be written (see RawTextEvent). During the initial reading of the document, the RawTextEvent event is passed an ID that locates the raw text. The GetRawText method (see GetRawText) will populate RawTextOffset, RawTextString, and RawTextCharSet properties based on this ID (see RawTextOffset, RawTextString, and RawTextCharSet). Because the document is read in small increments, the developer should expect to see many calls to the RawTextEvent handler (each with a different raw text locator) before the entire document has been read.
As each page of the document is read, the raw text is accumulated into a raw text buffer within the control. The ID that is passed to the RawTextEvent event handler is actually an offset into this raw text buffer. These offsets may be stored in array fashion for later use. Passing the offset value into the GetRawText method copies the raw text information into the read-only properties.
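A minimal sketch of keeping the offsets in an array follows. The module-level names RawTextOffsets and RawTextOffsetCount and the RememberRawTextOffset procedure are hypothetical; only the offset value passed to the RawTextEvent handler comes from the control.

```vb
' Hypothetical module-level storage for raw text offsets.
Private RawTextOffsets() As Long
Private RawTextOffsetCount As Long

' Appends one offset to the array, growing it by one element.
Private Sub RememberRawTextOffset(ByVal lTextOffset As Long)
    ReDim Preserve RawTextOffsets(0 To RawTextOffsetCount)
    RawTextOffsets(RawTextOffsetCount) = lTextOffset
    RawTextOffsetCount = RawTextOffsetCount + 1
End Sub
```

A RawTextEvent handler could call RememberRawTextOffset with the offset it receives. Later, any stored offset can be passed to the GetRawText method to repopulate the read-only raw text properties for that block.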
Private Sub MainOIX_RawTextEvent(ByVal lTextOffset As Long)
    ' The RawText event handler is only called when the user checks
    ' the appropriate check box on the form.
    Dim oixStart As New OixPos
    Dim oixEnd As New OixPos
    Dim pos As Long
    Dim searchStr As String
    Dim Cs As Variant   ' character set of the raw text (not used further here)

    searchStr = "the"
    pos = 1
    MainOIX.GetRawText (lTextOffset)
    Cs = MainOIX.RawTextCharSet
    pos = InStr(pos, MainOIX.RawTextString, searchStr, vbTextCompare)
    While (pos)
        ' pos is 1-based as returned from InStr, but the character
        ' count is 0-based.
        MainOIX.SetActualCount oixStart, lTextOffset + (pos - 1)
        MainOIX.SetActualCount oixEnd, lTextOffset + (pos - 1) + Len(searchStr)
        MainOIX.AddAnnotationHilite Annotateid, 1, SCCVW_ANNOTATION_SCLICK, _
            0, 0, "Search Text", oixStart, oixEnd
        pos = InStr(pos + 1, MainOIX.RawTextString, searchStr, vbTextCompare)
        Annotateid = Annotateid + 1
    Wend
End Sub