Use XML Sitemaps for Content Acquisition

We strongly recommend that you use XML sitemaps to define the contents of external collections. Sitemaps are structured indexes created specifically for use by search engines. They list the URLs of the documents in your site and include important metadata about each document, such as when it was last updated and how frequently it changes. Using XML sitemaps is the most efficient way to specify the documents that content processing will include in a web collection.

You can use existing XML sitemaps, or create new sitemaps using any of the many available tools. You can also manually create sitemap files. The following example shows a simple XML sitemap.

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.exampleofsitemap.com/schemas/oursitemap/0.5">
	<url>
		<loc>http://www.location.com/</loc>
		<lastmod>2018-01-01</lastmod>
		<changefreq>monthly</changefreq>
		<priority>0.6</priority>
	</url>
</urlset>
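
If you would rather generate a sitemap than write it by hand, a short script can produce a file with the same structure as the example above. The following is a minimal sketch in Python that uses only the standard library. The function name build_sitemap and the layout of the entry dictionaries are hypothetical, chosen for illustration. The default namespace is the one defined by the sitemaps.org protocol, which is an assumption here; substitute the namespace that your sitemaps actually use.

```python
import xml.etree.ElementTree as ET

# Namespace from the sitemaps.org protocol (assumed default);
# replace it with the namespace your own sitemaps declare.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries, namespace=SITEMAP_NS):
    """Return sitemap XML as bytes for a list of entry dictionaries.

    Each entry is expected to carry a "loc" key. The "lastmod",
    "changefreq", and "priority" keys are optional and are written
    only when present. (Hypothetical helper, not a product API.)
    """
    urlset = ET.Element("urlset", {"xmlns": namespace})
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        # Emit child elements in the same order as the example sitemap.
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in entry:
                ET.SubElement(url, tag).text = str(entry[tag])
    # xml_declaration=True adds the leading <?xml ...?> line (Python 3.8+).
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)


if __name__ == "__main__":
    sitemap = build_sitemap([
        {
            "loc": "http://www.location.com/",
            "lastmod": "2018-01-01",
            "changefreq": "monthly",
            "priority": 0.6,
        },
    ])
    print(sitemap.decode("utf-8"))
```

The sketch returns bytes so that the result can be written directly to a file opened in binary mode. ElementTree escapes special characters such as & in URLs, which is easy to miss when creating sitemap files manually.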