4 File ID Specification

The Outside In Technology File ID module uses an extremely fast and accurate proprietary algorithm to inspect data in a file until it can be matched with known data characteristics of a particular file type. This chapter provides an overview of the functions specific to the File ID SDK.

This chapter describes the following functions:

4.1 FIDeInit

This function tells the File Identification module that it will not be asked to read additional documents, so it should perform any necessary cleanup tasks. This function should be called at application shutdown time, and only if the module was successfully initialized with a call to FIInit.

Prototype

VTDWORD FIDeInit() 

Return Values

  • SCCERR_OK: Returned if the cleanup was successful. Otherwise, one of the other SCCERR_ values in sccerr.h is returned.

4.2 FIGetFirstId

This function is called to get the first of all possible IDs that can be returned by FIIdFile and FIIdFileEx.

Prototype

VTBOOL FIGetFirstId(
   PFIGET   pFiGet,
   VTWORD   * pType,
   VTLPTSTR pTypeName,
   VTWORD   wNameCount);

Parameters

  • pFiGet: Pointer to a FIGET structure that is used internally by FI to track the GetFirst / GetNext process. You do not need to initialize this structure.

  • pType: Pointer to the 16-bit value that receives a file ID.

  • pTypeName: A buffer that receives the name of the ID returned through pType. For example, if 1500 (defined in sccfi.h as FI_BMP) were returned through pType, the string "Windows Bitmap" would be returned in this buffer.

  • wNameCount: Must contain the maximum number of bytes that can be placed in pTypeName.

Return Values

  • TRUE: An ID was returned and there may be more IDs.

  • FALSE: No ID was returned and there are no more IDs.

4.3 FIGetIDString

Returns the string associated with a particular FI ID. If no string is available for the specified ID, a value of zero is returned and the pTypeName buffer is not filled.

Prototype

VTWORD FIGetIDString(
   VTWORD     wType,
   VTLPTSTR   pTypeName,
   VTWORD     wNameCount);

Parameters

  • wType: The file type ID with which the returned string is associated.

  • pTypeName: The buffer that is filled with the file type string.

  • wNameCount: Must contain the maximum number of bytes that can be placed in pTypeName.

Return Values

  • n: The number of characters placed in pTypeName, or 0 if no string is available for wType.
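The zero-return case described above can be handled as in the following sketch. It is illustrative only: the typedefs and the FIGetIDString body are stand-ins written so the fragment compiles without the SDK, and fi_type_name is a hypothetical helper, not part of the File ID API.

```c
#include <string.h>

/* Stand-ins so the sketch compiles without the SDK (assumptions, not the
   real declarations). In a real build, include sccfi.h instead. */
typedef unsigned short VTWORD;
typedef char          *VTLPTSTR;
#define FI_BMP 1500                       /* value quoted in this chapter */

static VTWORD FIGetIDString(VTWORD wType, VTLPTSTR pTypeName, VTWORD wNameCount)
{
    static const char bmp[] = "Windows Bitmap";
    if (wType != FI_BMP || wNameCount < sizeof(bmp))
        return 0;                         /* stub behavior only */
    memcpy(pTypeName, bmp, sizeof(bmp));
    return (VTWORD)(sizeof(bmp) - 1);
}

/* Hypothetical helper: returns buf filled with the type name, or a
   fallback literal when FIGetIDString reports that no string exists. */
static const char *fi_type_name(VTWORD wType, char *buf, VTWORD bufSize)
{
    if (FIGetIDString(wType, buf, bufSize) == 0)
        return "(no name available)";     /* buf was not filled */
    return buf;
}
```

Checking the return value before reading the buffer matters because the buffer is left untouched when no string is available.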

4.4 FIGetNextId

This function is called to get the next of all possible IDs that can be returned by FIIdFile and FIIdFileEx.

Prototype

VTBOOL FIGetNextId(
   PFIGET    pFiGet,
   VTWORD    * pType,
   VTLPTSTR  pTypeName,
   VTWORD    wNameCount);

Parameters

  • pFiGet: Pointer to a FIGET structure that is used internally by FI to track the GetFirst / GetNext process. Must have been initialized by a call to FIGetFirstId.

  • pType: Pointer to the 16-bit value that receives a file ID.

  • pTypeName: A buffer that receives the name of the ID returned through pType. For example, if 1500 (defined in sccfi.h as FI_BMP) were returned through pType, the string "Windows Bitmap" would be returned in this buffer.

  • wNameCount: Must contain the maximum number of bytes that can be placed in pTypeName.

Return Values

  • TRUE: An ID was returned and there may be more IDs.

  • FALSE: No ID was returned and there are no more IDs.
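FIGetFirstId and FIGetNextId are meant to be used together as a loop. The following is a minimal sketch of that loop. The typedefs, the FIGET layout, the two-entry ID table, and both function bodies are stand-ins so the fragment compiles without the SDK; fi_list_ids is a hypothetical helper.

```c
#include <stdio.h>

/* Stand-ins so the sketch compiles without the SDK (assumptions, not the
   real declarations). In a real build, include sccfi.h instead. */
typedef unsigned short VTWORD;
typedef int            VTBOOL;
typedef char          *VTLPTSTR;
typedef struct { int next; } FIGET, *PFIGET;  /* real layout is private to FI */

static const struct { VTWORD id; const char *name; } stubIds[] = {
    { 1500, "Windows Bitmap" },           /* FI_BMP, quoted in this chapter */
    { 9999, "Made-Up Format" },           /* invented for the stub */
};

static VTBOOL FIGetNextId(PFIGET pFiGet, VTWORD *pType,
                          VTLPTSTR pTypeName, VTWORD wNameCount)
{
    if (pFiGet->next >= (int)(sizeof(stubIds) / sizeof(stubIds[0])))
        return 0;                         /* FALSE: no more IDs */
    *pType = stubIds[pFiGet->next].id;
    snprintf(pTypeName, wNameCount, "%s", stubIds[pFiGet->next].name);
    pFiGet->next++;
    return 1;                             /* TRUE: an ID was returned */
}

static VTBOOL FIGetFirstId(PFIGET pFiGet, VTWORD *pType,
                           VTLPTSTR pTypeName, VTWORD wNameCount)
{
    pFiGet->next = 0;
    return FIGetNextId(pFiGet, pType, pTypeName, wNameCount);
}

/* Hypothetical helper: prints every ID with its name, returns the count. */
static int fi_list_ids(void)
{
    FIGET  fiGet;            /* needs no initialization before FIGetFirstId */
    VTWORD type = 0;
    char   name[128];
    int    count = 0;
    VTBOOL more = FIGetFirstId(&fiGet, &type, name, (VTWORD)sizeof(name));

    while (more) {
        printf("%u\t%s\n", (unsigned)type, name);
        count++;
        more = FIGetNextId(&fiGet, &type, name, (VTWORD)sizeof(name));
    }
    return count;
}
```

The same FIGET variable is passed to every call, since FI uses it to track its position in the list.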

4.5 FIIdFile

This function is called to retrieve the type ID of a file.

Prototype

VTWORD FIIdFile(
   VTDWORD   dwSpecType,
   VTVOID    * pSpec,
   VTDWORD   dwFlags,
   VTWORD    * pType);

Parameters

  • dwSpecType: Defines the file to be identified.

    • IOTYPE_ANSIPATH: Windows only. pSpec points to a NULL-terminated full path name using the ANSI character set and FAT 8.3 (Win16) or NTFS (Win32 and Win64) file name conventions.

    • IOTYPE_UNICODEPATH: Windows only. pSpec points to a NULL-terminated full path name using the Unicode character set and NTFS (Win32 and Win64) file name conventions.

    • IOTYPE_UNIXPATH: X Windows on UNIX platforms only. pSpec points to a NULL-terminated full path name using the system default character set and UNIX path conventions.

    • IOTYPE_REDIRECT: All platforms. pSpec points to a developer-defined structure that allows the developer to redirect the IO routines used to read the file. For more information, see Redirected IO.

  • pSpec: Defines the file to be identified. See the description of individual pSpec values in the preceding list.

  • dwFlags: One of the following values:

    • FIFLAG_NORMAL: This is the default value. When this flag is set, the File Identification code identifies all formats supported by Outside In, as it did prior to version 6.0.

    • FIFLAG_EXTENDEDFI: When this flag is set, the set of possible text values that may be returned includes FI_7BITTEXT, FI_ANSI8, FI_UNICODE, and FI_UTF8. FI_UTF8 is not guaranteed to be returned for all UTF-8 files, which are very difficult to distinguish from non-UTF-8-encoded 8-bit plain text.

  • pType: Pointer to the 16-bit value that receives the file's ID.

Return Values

  • 0: The file was successfully identified.

  • -1: File identification failed.
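A call to FIIdFile with a path specification might look like the following sketch. The typedefs, the numeric values given to the IOTYPE_ and FIFLAG_ constants, and the FIIdFile body are stand-ins so the fragment compiles without the SDK (the stub reports every non-NULL path as FI_BMP). PATH_SPEC_TYPE and fi_identify_path are hypothetical names.

```c
#include <stdio.h>

/* Stand-ins so the sketch compiles without the SDK (assumptions, not the
   real declarations). In a real build, include the SDK headers instead. */
typedef unsigned short VTWORD;
typedef unsigned long  VTDWORD;
typedef void           VTVOID;
#define IOTYPE_ANSIPATH 1                 /* placeholder value */
#define IOTYPE_UNIXPATH 2                 /* placeholder value */
#define FIFLAG_NORMAL   0                 /* placeholder value */
#define FI_BMP          1500              /* value quoted in this chapter */

static VTWORD FIIdFile(VTDWORD dwSpecType, VTVOID *pSpec,
                       VTDWORD dwFlags, VTWORD *pType)
{
    (void)dwSpecType; (void)dwFlags;
    if (pSpec == NULL || pType == NULL)
        return (VTWORD)-1;                /* stub: identification failed */
    *pType = FI_BMP;                      /* stub: every file is a bitmap */
    return 0;
}

/* Pick the path-based spec type that the text lists for each platform. */
#ifdef _WIN32
#define PATH_SPEC_TYPE IOTYPE_ANSIPATH
#else
#define PATH_SPEC_TYPE IOTYPE_UNIXPATH
#endif

/* Hypothetical helper: stores the file ID in *pType.
   Returns 1 on success, 0 if identification failed. */
static int fi_identify_path(const char *path, VTWORD *pType)
{
    /* pSpec is not const in the prototype, hence the cast. */
    VTWORD rc = FIIdFile(PATH_SPEC_TYPE, (VTVOID *)path, FIFLAG_NORMAL, pType);
    return rc == 0;                       /* test for 0, the success value */
}
```

The helper compares the result against 0 rather than -1, which sidesteps any question about how -1 is represented in the VTWORD return type.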

4.6 FIIdFileEx

This function is called to retrieve the type ID of a file, including text file types.

Prototype

VTWORD FIIdFileEx(
   VTDWORD   dwSpecType,
   VTVOID    * pSpec,
   VTDWORD   dwFlags,
   VTWORD    * pType,
   VTLPTSTR  pTypeName,
   VTWORD    wNameCount);

Parameters

  • dwSpecType: Defines the file to be identified.

    • IOTYPE_ANSIPATH: Windows only. pSpec points to a NULL-terminated full path name using the ANSI character set and FAT 8.3 (Win16) or NTFS (Win32 and Win64) file name conventions.

    • IOTYPE_UNICODEPATH: Windows only. pSpec points to a NULL-terminated full path name using the Unicode character set and NTFS (Win32 and Win64) file name conventions.

    • IOTYPE_UNIXPATH: X Windows on UNIX platforms only. pSpec points to a NULL-terminated full path name using the system default character set and UNIX path conventions.

    • IOTYPE_REDIRECT: All platforms. pSpec points to a developer-defined structure that allows the developer to redirect the IO routines used to read the file. For more information, see Redirected IO.

  • pSpec: Defines the file to be identified. See the description of individual pSpec values in the preceding list.

  • dwFlags: One of the following values:

    • FIFLAG_NORMAL: This is the default value. When this flag is set, all types with specific identification criteria are identified.

    • FIFLAG_EXTENDEDFI: When this flag is set, the set of possible text values that may be returned includes FI_7BITTEXT, FI_ANSI8, FI_UNICODE, and FI_UTF8. FI_UTF8 is not guaranteed to be returned for all UTF-8 files, which are very difficult to distinguish from non-UTF-8-encoded 8-bit plain text.

  • pType: Pointer to the 16-bit value that receives the file's ID.

  • pTypeName: A buffer that receives the name of the ID returned through pType. For example, if 1500 (defined in sccfi.h as FI_BMP) were returned through pType, the string "Windows Bitmap" would be returned in this buffer.

  • wNameCount: Must contain the maximum number of bytes that can be placed in pTypeName.

Return Values

  • 0: The file was successfully identified.

  • -1: File identification failed.
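Because FIIdFileEx returns the ID and its name in one call, it suits code that reports results to a user. The following sketch requests the extended text types and formats a one-line description. The typedefs, the numeric values of the constants, and the FIIdFileEx body are stand-ins so the fragment compiles without the SDK; fi_describe is a hypothetical helper.

```c
#include <stdio.h>

/* Stand-ins so the sketch compiles without the SDK (assumptions, not the
   real declarations). In a real build, include the SDK headers instead. */
typedef unsigned short VTWORD;
typedef unsigned long  VTDWORD;
typedef void           VTVOID;
typedef char          *VTLPTSTR;
#define IOTYPE_UNIXPATH   2               /* placeholder value */
#define FIFLAG_EXTENDEDFI 1               /* placeholder value */

static VTWORD FIIdFileEx(VTDWORD dwSpecType, VTVOID *pSpec, VTDWORD dwFlags,
                         VTWORD *pType, VTLPTSTR pTypeName, VTWORD wNameCount)
{
    (void)dwSpecType; (void)dwFlags;
    if (pSpec == NULL)
        return (VTWORD)-1;                /* stub: identification failed */
    *pType = 1500;                        /* FI_BMP, quoted in this chapter */
    snprintf(pTypeName, wNameCount, "%s", "Windows Bitmap");
    return 0;
}

/* Hypothetical helper: writes "<path>: <type name> (ID <n>)" into out.
   Returns 1 on success, 0 if identification failed. */
static int fi_describe(const char *path, char *out, size_t outSize)
{
    VTWORD type = 0;
    char   name[128] = "";

    /* Use IOTYPE_ANSIPATH or IOTYPE_UNICODEPATH on Windows. */
    if (FIIdFileEx(IOTYPE_UNIXPATH, (VTVOID *)path, FIFLAG_EXTENDEDFI,
                   &type, name, (VTWORD)sizeof(name)) != 0)
        return 0;
    snprintf(out, outSize, "%s: %s (ID %u)", path, name, (unsigned)type);
    return 1;
}
```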

4.7 FIInit

This function tells the File Identification module to perform any necessary initialization it needs to prepare for document access. This function must be called before the first time the application uses the module to retrieve data from any document.

FIInit should only be called once per application, at application startup time. Any number of documents can be opened for file identification between calls to FIInit and FIDeInit. If FIInit succeeds, FIDeInit must be called regardless of any other API calls.

Prototype

VTDWORD FIInit()

Return Values

  • SCCERR_OK: Returned if the initialization was successful. Otherwise, one of the other SCCERR_ values in sccerr.h is returned.
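The pairing rule between FIInit and FIDeInit can be sketched as follows. The VTDWORD typedef, the value of SCCERR_OK, and both function bodies are stand-ins so the fragment compiles without the SDK; fi_run is a hypothetical helper.

```c
#include <stdio.h>

/* Stand-ins so the sketch compiles without the SDK (assumptions, not the
   real declarations). In a real build, include the SDK headers instead. */
typedef unsigned long VTDWORD;
#define SCCERR_OK 0                       /* placeholder value */
static int fiReady = 0;                   /* stub state, for illustration */
static VTDWORD FIInit(void)   { fiReady = 1; return SCCERR_OK; }
static VTDWORD FIDeInit(void) { fiReady = 0; return SCCERR_OK; }

/* Hypothetical helper: runs work() between FIInit and FIDeInit.
   Returns 0 on success, 1 if initialization or cleanup failed. */
static int fi_run(void (*work)(void))
{
    VTDWORD err = FIInit();               /* once, at application startup */
    if (err != SCCERR_OK) {
        fprintf(stderr, "FIInit failed: %lu\n", (unsigned long)err);
        return 1;                         /* no FIDeInit after a failed init */
    }
    if (work != NULL)
        work();                           /* identify any number of files */
    err = FIDeInit();                     /* required once FIInit succeeded */
    return err == SCCERR_OK ? 0 : 1;
}
```

FIDeInit is skipped on the failure path because it should be called only if the module was successfully initialized.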

4.8 FIThreadInit

Multiple threads are supported only on the Windows, Linux, and Sun Solaris platforms, and the FIThreadInit function is implemented only on Sun Solaris and Linux. Windows applications can use multiple threads without calling this function. A failed call to this function does not impair other API calls. If the function is not called or fails, stub functions are called instead of mutex functions.

FIThreadInit initializes the technology, preparing it to be run in a thread. This preparation includes setting up mutex function pointers to prevent threads from clashing in critical sections of the technology's code. The developer must actually code the threads after this function has been called. FIThreadInit should be called just before the call to FIInit and only once per process. Both functions should be called before the developer's application begins the thread.

Prototype

VTLONG FIThreadInit(VTSHORT ThreadOption)

Parameters

  • ThreadOption: One of the following values:

    • FITHREAD_INIT_NOTHREADS: No thread support requested.

    • FITHREAD_INIT_PTHREADS: Support for PTHREADS requested.

    • FITHREAD_INIT_NATIVETHREADS: Support for native threading requested. Supported only on Solaris (Sun).

Return Values

  • FI_THREADINIT_SUCCESS: Thread initialization was successful.

  • FI_THREADINIT_FAILED: Thread initialization was unsuccessful.

  • FI_THREADINIT_ALREADY_CALLED: FIThreadInit has already been initialized. This value is returned if FIThreadInit is called more than once in an application.
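The required call order (FIThreadInit, then FIInit, then thread creation) can be sketched as follows for Solaris and Linux. The typedefs, the numeric values of all constants, and the FIThreadInit and FIInit bodies are stand-ins so the fragment compiles without the SDK; fi_startup is a hypothetical helper.

```c
#include <stdio.h>

/* Stand-ins so the sketch compiles without the SDK (assumptions, not the
   real declarations). In a real build, include the SDK headers instead. */
typedef long          VTLONG;
typedef short         VTSHORT;
typedef unsigned long VTDWORD;
#define SCCERR_OK                    0    /* placeholder value */
#define FITHREAD_INIT_PTHREADS       1    /* placeholder value */
#define FI_THREADINIT_SUCCESS        0    /* placeholder value */
#define FI_THREADINIT_ALREADY_CALLED 2    /* placeholder value */

static int threadInitCalls = 0;           /* stub state, for illustration */
static int initAfterThreadInit = 0;

static VTLONG FIThreadInit(VTSHORT ThreadOption)
{
    (void)ThreadOption;
    return threadInitCalls++ == 0 ? FI_THREADINIT_SUCCESS
                                  : FI_THREADINIT_ALREADY_CALLED;
}

static VTDWORD FIInit(void)
{
    initAfterThreadInit = (threadInitCalls > 0);
    return SCCERR_OK;
}

/* Hypothetical helper: call once per process, before any worker thread
   is started. Sets *mutexesReady to 1 if FIThreadInit succeeded.
   Returns 1 if FIInit succeeded, 0 otherwise. */
static int fi_startup(int *mutexesReady)
{
    /* Solaris and Linux only; omit this call on Windows. */
    VTLONG trc = FIThreadInit(FITHREAD_INIT_PTHREADS);
    *mutexesReady = (trc == FI_THREADINIT_SUCCESS);
    if (!*mutexesReady)
        fprintf(stderr, "FIThreadInit returned %ld\n", (long)trc);

    /* A failed FIThreadInit does not prevent FIInit from being called. */
    return FIInit() == SCCERR_OK;
}
```

A caller would start worker threads only after fi_startup returns. When mutexesReady is 0, stub functions are used in place of mutex functions, so an application may prefer to stay single-threaded in that case.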

4.9 FIThreadInitExt

Note:

Multiple threads are supported only on the Windows, Linux, and Sun Solaris platforms, and the FIThreadInitExt function is implemented only on Sun Solaris and Linux. Windows applications can use multiple threads without calling this function. A failed call to this function does not impair other API calls. If the function is not called or fails, stub functions are called instead of mutex functions.

FIThreadInitExt initializes the technology, preparing it to be run in a thread. This preparation includes setting up mutex function pointers that the caller passes in to prevent threads from clashing in critical sections of the technology's code. The developer must actually code the threads after this function has been called. FIThreadInitExt should be called just before the call to FIInit and only once per process. Both functions should be called before the developer's application begins the thread.

Prototype

VTLONG FIThreadInitExt(
   VTLONG (*Lock)(VOID *),
   VTLONG (*UnLock)(VOID *))

Parameters

  • Lock: A function pointer to a mutex locking function such as pthread_mutex_lock.

  • UnLock: A function pointer to a mutex unlocking function such as pthread_mutex_unlock.

Return Values

  • FI_THREADINIT_SUCCESS: Thread initialization was successful.

  • FI_THREADINIT_FAILED: Thread initialization was unsuccessful.

  • FI_THREADINIT_ALREADY_CALLED: Thread support has already been initialized. This value is returned if FIThreadInitExt is called more than once in an application.