public interface ResourceRewriter
This interface defines the contract for rewriting links inside downloaded markup. In Site Capture, each resource downloaded by the crawler is represented as a WebResource.
Out of the box, Site Capture provides a default implementation based on regular expressions, PatternResourceRewriter, which rewrites the links inside the downloaded markup.
Refer to the developer documentation for more details on PatternResourceRewriter.
Modifier and Type | Method and Description |
---|---|
byte[] | rewrite(WebResource resource): Invoked automatically by the crawler framework; implement it to provide the resource-rewriting algorithm. |
byte[] rewrite(WebResource resource)
Parameters:
resource - A WebResource object containing information about the resource downloaded as part of the crawl session.
Throws:
IOException - if there is any problem rewriting the markup.
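
The contract described above can be illustrated with a minimal sketch of a custom rewriter. It is not the PatternResourceRewriter implementation. Because the real WebResource accessors are not listed on this page, the sketch declares hypothetical stand-ins for WebResource (with an assumed getContent() method) and ResourceRewriter so that it compiles on its own; the class name HostStrippingRewriter, the host www.example.com, and the regular expression are likewise assumptions made for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

class RewriterSketch {

    // Hypothetical stand-in for Site Capture's WebResource.
    // getContent() is an assumed accessor, not a documented one.
    interface WebResource {
        byte[] getContent();
    }

    // Stand-in mirroring the method documented on this page.
    interface ResourceRewriter {
        byte[] rewrite(WebResource resource) throws IOException;
    }

    // Example rewriter: turns absolute links to one host into root-relative links.
    static class HostStrippingRewriter implements ResourceRewriter {
        // Matches href="..." or src="..." values that point at www.example.com.
        private static final Pattern ABSOLUTE_LINK =
                Pattern.compile("(href|src)=\"https?://www\\.example\\.com(/[^\"]*)\"");

        @Override
        public byte[] rewrite(WebResource resource) throws IOException {
            byte[] content = resource.getContent();
            if (content == null) {
                throw new IOException("Resource has no content to rewrite");
            }
            String markup = new String(content, StandardCharsets.UTF_8);
            // Keep the attribute name ($1) and the path ($2); drop scheme and host.
            String rewritten = ABSOLUTE_LINK.matcher(markup).replaceAll("$1=\"$2\"");
            return rewritten.getBytes(StandardCharsets.UTF_8);
        }
    }

    // Convenience wrapper that runs the rewriter on a markup string.
    static String demo(String markup) {
        WebResource resource = () -> markup.getBytes(StandardCharsets.UTF_8);
        try {
            byte[] out = new HostStrippingRewriter().rewrite(resource);
            return new String(out, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo("<a href=\"http://www.example.com/about/index.html\">About</a>"));
    }
}
```

In a real crawler configuration the class would implement the Site Capture ResourceRewriter interface directly and read the markup through whatever accessors WebResource actually exposes; only the shape of rewrite(WebResource), returning the rewritten bytes and signalling failures with IOException, is taken from this page.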