ArchivalWare SPDR

ArchivalWare SPDR is a content management tool that enables users to search specific Internet and intranet sites quickly and automatically in order to harvest documents and metadata. The software then synchronizes with the ArchivalWare repository to update the archive. This two-step process automatically keeps the repository current with relevant documents and metadata from targeted websites.


Creating Crawls

The process starts with the creation of a “crawl job.” A crawl job is a set of search parameters that tell the Spider what, when, where, and how to search Internet and intranet sites for relevant documents. Once a crawl job has been defined, it runs automatically as scheduled, retrieving documents and metadata from the specified sites. Users can manage a crawl job through GUIs designed to facilitate the updating process.
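As an illustration only, a crawl job can be pictured as a small record of search parameters. The sketch below is not ArchivalWare's actual configuration format; the class name `CrawlJob`, its fields, and the `in_scope` helper are all hypothetical, chosen to mirror the "what, when, where, and how" described above.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class CrawlJob:
    """Hypothetical crawl job definition (not the product's real schema)."""
    name: str
    start_urls: list[str]                                    # where to search
    include_extensions: tuple[str, ...] = (".pdf", ".html")  # what to collect
    schedule: str = "daily 02:00"                            # when to run (free-form label)
    max_depth: int = 3                                       # how far to follow links

    def in_scope(self, url: str) -> bool:
        """Return True if the URL is on a targeted host and has a wanted extension."""
        target_hosts = {urlparse(u).netloc for u in self.start_urls}
        parsed = urlparse(url)
        return (
            parsed.netloc in target_hosts
            and parsed.path.lower().endswith(self.include_extensions)
        )


job = CrawlJob(name="policy-docs", start_urls=["https://intranet.example.com/docs/"])
print(job.in_scope("https://intranet.example.com/docs/report.PDF"))
print(job.in_scope("https://other.example.com/a.pdf"))
```

Keeping the parameters in one record means a scheduler could re-run the same job unchanged, which matches the "define once, run as scheduled" behavior described above.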

Executing Crawls

Each crawl job scans the Internet or intranet and collects two types of data: URL path information, which is used for metadata, and the retrieved documents themselves. The URL path information is converted to a metadata record and stored in the database. Each retrieved document is linked to its metadata record and stored for ArchivalWare synchronization. The crawl job can be set to re-run automatically. On each re-run it picks up new and changed documents and also tracks documents that are no longer available and must be removed from the ArchivalWare repository.
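One way to picture how a re-run could detect new, changed, and missing documents is to compare content fingerprints between two runs. This is a minimal sketch under that assumption; the source does not say how the Spider detects changes, and the function names `fingerprint` and `compare_runs` are hypothetical.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Return a SHA-256 hex digest used to detect content changes."""
    return hashlib.sha256(content).hexdigest()


def compare_runs(previous: dict[str, str], current: dict[str, str]) -> dict[str, list[str]]:
    """Compare two {url: fingerprint} maps from consecutive crawl runs."""
    new = sorted(u for u in current if u not in previous)
    changed = sorted(u for u in current if u in previous and current[u] != previous[u])
    removed = sorted(u for u in previous if u not in current)
    return {"new": new, "changed": changed, "removed": removed}


last_run = {
    "https://example.com/a.pdf": fingerprint(b"version 1"),
    "https://example.com/b.pdf": fingerprint(b"unchanged"),
    "https://example.com/c.pdf": fingerprint(b"gone soon"),
}
this_run = {
    "https://example.com/a.pdf": fingerprint(b"version 2"),
    "https://example.com/b.pdf": fingerprint(b"unchanged"),
    "https://example.com/d.pdf": fingerprint(b"brand new"),
}
print(compare_runs(last_run, this_run))
```

The "removed" list corresponds to documents that are no longer available and would be candidates for removal from the repository.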

Adding Crawl Results

Once a crawl job is finished, the retrieved documents are synchronized with ArchivalWare. A set of user-defined criteria automatically determines which results are archived and placed in the ArchivalWare repository and which are deleted. The ArchivalWare Spider can re-run the crawl job and re-synchronize as frequently as desired, and can be set to run automatically at specific dates and times.
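The synchronization step can be sketched as applying a list of rules to each crawl result. The record fields (`url`, `size_bytes`), the example rules, and the `plan_sync` function below are assumptions made for illustration; the actual criteria are defined by the user inside the product and are not described in the source.

```python
from typing import Callable

# Hypothetical metadata record: a plain dict with "url" and "size_bytes" keys.
Record = dict
Rule = Callable[[Record], bool]


def plan_sync(records: list[Record], criteria: list[Rule]) -> tuple[list[str], list[str]]:
    """Split crawl results into URLs to archive and URLs to delete.

    A record is archived only if it satisfies every rule; otherwise it is deleted.
    """
    archive: list[str] = []
    delete: list[str] = []
    for record in records:
        if all(rule(record) for rule in criteria):
            archive.append(record["url"])
        else:
            delete.append(record["url"])
    return archive, delete


# Example user-defined criteria (assumed, not taken from the product).
criteria = [
    lambda r: r["url"].lower().endswith(".pdf"),  # keep PDF documents only
    lambda r: r["size_bytes"] > 0,                # skip empty downloads
]
results = [
    {"url": "https://example.com/report.pdf", "size_bytes": 20480},
    {"url": "https://example.com/banner.gif", "size_bytes": 512},
    {"url": "https://example.com/empty.pdf", "size_bytes": 0},
]
to_archive, to_delete = plan_sync(results, criteria)
print(to_archive)
print(to_delete)
```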

Copyright © 2010 PTFS, Inc. All rights reserved.