If your crawls download files with a lot of content (for example, large PDF or SWF files), you may see WARN messages reporting that pages were skipped because the content limit was exceeded. To solve this problem, increase the download content limit to a value large enough for all content to be downloaded.
Any content longer than the size limit is not downloaded (i.e., the page is skipped).
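Before raising the limit, it can help to know which URLs were actually skipped. The following is a minimal sketch, not part of the crawler itself: it assumes each warning is written on a single line, names the logger com.endeca.itl.web.UrlProcessor, and contains the phrase "Content limit exceeded for" followed by the URL. The function name skipped_urls is hypothetical.

```python
import re

# Pattern modeled on the content-limit warning described in this section.
# Assumption: the whole warning sits on one log line.
SKIPPED = re.compile(
    r"WARN\s+com\.endeca\.itl\.web\.UrlProcessor\s+"
    r"Content limit exceeded for (?P<url>\S+)\. Page is skipped\."
)


def skipped_urls(log_lines):
    """Return the URLs of pages skipped because of the content limit."""
    urls = []
    for line in log_lines:
        match = SKIPPED.search(line)
        if match:
            urls.append(match.group("url"))
    return urls


sample = [
    "WARN com.endeca.itl.web.UrlProcessor Content limit exceeded for "
    "http://xyz.com/pdf/B2B_info.pdf. Page is skipped.",
    "INFO some unrelated log line",
]
print(skipped_urls(sample))
```

Any iterable of lines can be passed, including an open file handle for the crawl log. The sizes of the files behind the reported URLs then indicate how high the limit needs to be.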
A skipped page is logged with a message such as:
WARN com.endeca.itl.web.UrlProcessor Content limit exceeded for http://xyz.com/pdf/B2B_info.pdf. Page is skipped.
To set the download content limit: