If crawled content cannot be indexed as-is, the crawler code must create a temporary file for indexing.
The following steps describe a typical custom mechanism to create a temporary indexable file with as little extraneous information as possible and to set the content type and file name using the appropriate response headers. In most cases, the resource has already been accessed in attachToDocument, so there is no need to call the back-end system again. This example does not use credentials. If you do not want to create temporary files, you can instead implement an indexing servlet that returns indexable content directly.
logger.Debug("Entering Index.Page_Load()");

// try to get the .tmp file name from the Content Crawler
string indexFileName = Request[Constants.INDEX_FILE];
if (indexFileName != null)
{
    StreamReader sr = null;
    string filePath = "";
    try
    {
        filePath = HttpUtility.UrlDecode(indexFileName);
        string shortFileName = filePath.Substring(filePath.LastIndexOf('\\') + 1);

        // set the proper response headers
        Response.ContentType = "text/plain";
        Response.AddHeader("Content-Disposition", "inline; filename=" + shortFileName);

        // open the file
        sr = new StreamReader(filePath);

        // stream the file contents into the response
        string line = sr.ReadLine();
        while (line != null)
        {
            Response.Output.WriteLine(line);
            line = sr.ReadLine();
        }
    }
    catch (Exception ex)
    {
        logger.Error("Exception while trying to write index file: " + ex.Message, ex);
    }
    finally
    {
        // close and delete the temporary index file even if there is an error
        if (sr != null) { sr.Close(); }
        if (!filePath.Equals("")) { File.Delete(filePath); }
    }

    // done
    return;
}
...
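If you prefer the servlet-style alternative mentioned above, the same headers can be set on a response that streams the indexable content directly, with no temporary file on disk. The following is only a minimal sketch of such an endpoint as an ASP.NET IHttpHandler; the `docId` query parameter and the `GetIndexableContent` helper are hypothetical names assumed for illustration, not part of the crawler API.

```
using System;
using System.Web;

// Hypothetical handler that returns indexable plain text directly,
// avoiding temporary files. GetIndexableContent is an assumed helper
// that converts the crawled resource to plain text.
public class IndexingHandler : IHttpHandler
{
    public bool IsReusable { get { return true; } }

    public void ProcessRequest(HttpContext context)
    {
        // assumed identifier parameter naming the resource to index
        string docId = context.Request["docId"];

        // set the same headers the temporary-file approach uses, so the
        // search service treats the response as an indexable text document
        context.Response.ContentType = "text/plain";
        context.Response.AddHeader("Content-Disposition",
            "inline; filename=" + docId + ".txt");

        // render and write the content; nothing is written to disk
        context.Response.Output.Write(GetIndexableContent(docId));
    }

    private string GetIndexableContent(string docId)
    {
        // placeholder: fetch or convert the resource from the back-end system
        return "indexable content for " + docId;
    }
}
```

Because nothing is written to the file system, this approach needs no cleanup step, but the endpoint must be reachable by the search service and should re-use the content already retrieved during crawling where possible.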