Frequently Asked Questions: General
While ArchivalWare™ performs similar data management functions and can be configured to act like a digital library created with, say, DSpace, it uses a very sophisticated search engine and retrieval mechanism for finding and retrieving documents. ArchivalWare™ was built using Convera's RetrievalWare as its core data management and search system. As a result, ArchivalWare™ can search for terms in a variety of ways beyond simple Boolean or field searching. It can also search the entire full text of all added documents, similar to what you would find in Google's Book Search or Amazon's "Search Inside the Book" feature.
While most digital library systems let you perform a variety of Boolean and field searches, ArchivalWare™ takes it many steps further. Typical searching is a keyword-based endeavour where searchers attempt, as best they can, to retrieve relevant documents accurately by matching up their keywords with the same words used somewhere in the documents. With ArchivalWare's Pattern and Concept Search modes, variations in spelling as well as synonyms of query terms are searched. With ArchivalWare, you can additionally create a hierarchical, folder-based way of browsing the same documents.
Concept Search
Concept search mimics human communication patterns, using logic and inference to account for the use of different terms to express similar or related concepts. This ability enables RetrievalWare to return documents where most retrieval tools would fail, because ArchivalWare’s concept search finds documents about a given concept rather than documents containing only specific keywords or terms. As an example, a search for “international commerce” will find documents that discuss “foreign trade” or “global markets.” A process called disambiguation is applied to the query to ensure that ArchivalWare™ returns only the conceptually relevant documents for that particular query. For example, the word “tank,” when surrounded by words such as “military” and “vehicle,” is more likely to refer to a fighting vehicle and less likely to refer to a container for holding fuel.
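The two ideas in this answer, expanding a query to related phrases and disambiguating a word by its neighbours, can be illustrated with a toy sketch. The thesaurus, the sense table, and the function names below are hypothetical and hand-built for this example; they are not RetrievalWare's semantic network or API.

```python
# Toy illustration of concept expansion and word-sense disambiguation.
# RELATED and SENSES are hypothetical, hand-built data for this sketch only.

RELATED = {
    "international commerce": ["foreign trade", "global markets"],
}

SENSES = {
    "tank": {
        "fighting vehicle": {"military", "vehicle", "armor", "army"},
        "fuel container": {"fuel", "gallon", "storage", "liquid"},
    },
}


def expand_query(query):
    """Return the query plus any related phrases from the toy thesaurus."""
    key = query.lower()
    return [key] + RELATED.get(key, [])


def concept_search(query, documents):
    """Return ids of documents containing the query or a related phrase."""
    phrases = expand_query(query)
    return [
        doc_id
        for doc_id, text in documents.items()
        if any(phrase in text.lower() for phrase in phrases)
    ]


def disambiguate(word, context_words):
    """Pick the sense whose cue words overlap most with the context."""
    context = {w.lower() for w in context_words}
    senses = SENSES[word]
    return max(senses, key=lambda sense: len(senses[sense] & context))


docs = {
    "a": "A report on foreign trade between the two countries.",
    "b": "Notes on gardening in dry climates.",
    "c": "Global markets reacted to the announcement.",
}
print(concept_search("international commerce", docs))
print(disambiguate("tank", ["military", "vehicle", "convoy"]))
```

Neither matching document contains the words "international" or "commerce", which is the point of the example: the match comes from the related phrases, not from the keywords.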
Pattern Searching
Often, words are misspelled due to Optical Character Recognition errors, typographical errors, foreign transcription errors, and legitimate spelling variations. Pattern searching is used to overcome these problems. By using a sophisticated internal voting and rating scheme that considers a number of different features of the pattern instead of just character pairs, pattern searching can find the words, and therefore the right documents, missed by weaker, approximate searches based at the character level.
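To make the idea of tolerant matching concrete, here is a minimal sketch that finds OCR-damaged and misspelled variants of a term. It uses the similarity ratio from Python's standard `difflib` module as a stand-in; this is a single character-level measure, so it is closer to the simple approximate matching described above than to the product's own voting and rating scheme, which is not documented here. The word list and the `pattern_search` name are assumptions for illustration.

```python
from difflib import get_close_matches


def pattern_search(term, index_terms, cutoff=0.75):
    """Return indexed words that closely resemble the query term."""
    return get_close_matches(term.lower(), index_terms, n=5, cutoff=cutoff)


# Hypothetical index vocabulary containing OCR errors ("arch1ve", "archlve")
# and a transposition typo ("retreival").
index_terms = ["archive", "arch1ve", "archlve", "retrieval", "retreival", "library"]

print(pattern_search("archive", index_terms))
print(pattern_search("retrieval", index_terms))
```

Raising `cutoff` makes the match stricter; lowering it admits more distant variants at the cost of more false hits.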
Boolean Searching
Boolean search is the common denominator among almost all search offerings. ArchivalWare™ also provides a powerful Boolean search capability, which is very useful as a filter when used in concert with concept and/or pattern search to limit results to those most specific to the user’s query.
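Using Boolean terms as a filter over a broader result set might look like the following sketch. The candidate list stands for whatever a concept or pattern search returned; the function name, parameters, and sample documents are hypothetical.

```python
import re


def boolean_filter(doc_ids, documents, must=(), must_not=()):
    """Keep documents that contain every `must` word and no `must_not` word."""
    kept = []
    for doc_id in doc_ids:
        words = set(re.findall(r"\w+", documents[doc_id].lower()))
        has_all = all(w.lower() in words for w in must)
        has_excluded = any(w.lower() in words for w in must_not)
        if has_all and not has_excluded:
            kept.append(doc_id)
    return kept


docs = {
    "a": "Foreign trade grew last year.",
    "b": "Foreign trade tariffs were raised.",
    "c": "Global markets fell.",
}
candidates = ["a", "b", "c"]  # e.g. hits from a broader concept search
print(boolean_filter(candidates, docs, must=["trade"], must_not=["tariffs"]))
```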
Rather than just displaying a long list of ranked results, ArchivalWare™ can dynamically classify search results into folders using one or more predefined, browsable classification hierarchies.
Once the best document or documents are found, pinpointing relevant hits within them is made easier by hit-highlighting and hit-navigation. RetrievalWare’s hit-highlighting feature displays the relevant terms in the document in highlighted form (see Figure 4). This makes it possible to quickly scroll through the document to find what is most relevant. RetrievalWare also offers an enhanced hit-navigation method called Best Hit. This feature instantly takes a user to the highest rated hit within the document.
For single-byte languages, basic language support for the keyword, Boolean and pattern search modes is handled out of the box by ensuring that RetrievalWare supports the character set for that language and that the character maps support the language’s characters. This makes it feasible to index and search across documents in a great number of character sets and languages:
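As a rough sketch of what grouping results into a browsable hierarchy involves, the example below builds a nested folder tree from results that already carry a category path. How the product assigns documents to categories is not described here; the paths, titles, and the `_documents` key are assumptions for illustration.

```python
def classify_into_folders(results):
    """Build a nested folder tree from (title, category_path) pairs."""
    tree = {}
    for title, path in results:
        node = tree
        for folder in path.split("/"):
            node = node.setdefault(folder, {})
        # Documents filed directly in a folder are kept under a reserved key.
        node.setdefault("_documents", []).append(title)
    return tree


# Hypothetical search results with pre-assigned category paths.
results = [
    ("Harbor survey 1902", "Maps/Coastal"),
    ("City plan 1910", "Maps/Urban"),
    ("Letter to the council", "Correspondence"),
]
folders = classify_into_folders(results)
print(sorted(folders))
print(sorted(folders["Maps"]))
```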
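The behaviour described here can be approximated in a few lines. The sketch below marks whole-word hits and picks the position of the highest-weighted term as a "best hit"; the marker characters, the term weights, and both function names are hypothetical, and RetrievalWare's actual rating of hits is not documented in this FAQ.

```python
import re


def highlight_hits(text, terms, marker="**"):
    """Wrap each whole-word occurrence of a query term in marker characters."""
    if not terms:
        return text
    alternatives = "|".join(re.escape(t) for t in terms)
    pattern = re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)
    return pattern.sub(lambda m: f"{marker}{m.group(0)}{marker}", text)


def best_hit(text, weighted_terms):
    """Return (offset, matched_text) for the highest-weighted term found."""
    best = None
    for term, weight in weighted_terms.items():
        match = re.search(r"\b" + re.escape(term) + r"\b", text, re.IGNORECASE)
        if match and (best is None or weight > best[0]):
            best = (weight, match.start(), match.group(0))
    return None if best is None else (best[1], best[2])


print(highlight_hits("The tank crossed the river. A second Tank followed.", ["tank"]))
print(best_hit("Trade in global markets grew.", {"trade": 0.5, "markets": 0.9}))
```

A viewer would use the returned offset to scroll straight to that position in the document.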
- ISO 8859-1 (Latin1) West European: English, French, Spanish, Catalan, Basque, Portuguese, Italian, Albanian, Dutch, German, Danish, Swedish, Norwegian, Finnish, Faeroese, Icelandic, Irish, Scottish, Afrikaans and Swahili
- ISO 8859-2 (Latin2) Central and Eastern Europe: Czech, Hungarian, Polish, Romanian, Croatian, Slovak, Slovenian, Serbian
- ISO 8859-3 (Latin3) Esperanto, Maltese
- ISO 8859-4 (Latin4) Latvian, Lithuanian, Greenlandic, Lappish
- ISO 8859-5 (Cyrillic) Bulgarian, Belarusian, Macedonian, Russian, Serbian and pre-1990 Ukrainian
- ISO 8859-6 (Arabic) Arabic
- ISO 8859-7 (Greek) Greek
- ISO 8859-8 (Hebrew) Hebrew, Yiddish
- ISO 8859-9 (Latin5) Turkish
- ISO 8859-10 (Latin6) Nordic
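The reason the character set must be known for each document is that the same byte means different characters in different ISO 8859 parts. The short sketch below shows this with Python's built-in codecs; the sample strings are arbitrary, and nothing here reflects how RetrievalWare's own character maps are configured.

```python
# Sample text for a few of the single-byte character sets listed above.
SAMPLES = {
    "iso8859_1": "Señor Müller",   # Latin1
    "iso8859_2": "Łódź",           # Latin2
    "iso8859_5": "Москва",         # Cyrillic
    "iso8859_7": "Αθήνα",          # Greek
}


def to_unicode(raw, charset):
    """Decode single-byte text using the named character set."""
    return raw.decode(charset)


for charset, text in SAMPLES.items():
    raw = text.encode(charset)
    # One byte per character, and decoding restores the original text.
    print(charset, len(raw) == len(text), to_unicode(raw, charset) == text)

# The same byte decodes to a different character in each character set.
print([to_unicode(b"\xe6", cs) for cs in ("iso8859_1", "iso8859_2", "iso8859_5")])
```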
While Digital Library comes configured as extended Dublin Core, the product includes 50 ad hoc fields and allows the Dublin Core fields to be renamed. As such, Digital Library could be configured in an EAD format rather than the standard configuration, which is based on extended Dublin Core.
ArchivalWare™ is compliant with OAI-PMH and supports harvesting of its metadata records in the required standard format.
ArchivalWare™ is configured out of the box to support extended Dublin Core. Digital Library supports 50 ad hoc fields. Currently the product can be configured to support most of the METS requirements. A future release of version 4 of the product will allow for complete METS compliance.
A fully customizable, web-based interface for all public-facing components. ArchivalWare™ has an API and a flexible architecture, which make it easy to customize and integrate with other applications. Therefore, a web interface that meets the organization’s needs can be developed for ArchivalWare™ easily and cost-effectively. ArchivalWare™ can also be easily integrated with COTS portal products.
Access rights management components including support for LDAP v3. Digital Library is LDAP v3 compliant. When a user logs into Digital Library, an authority level designation can be transferred from the organization’s LDAP v3 authentication system to Digital Library, where it is mapped to a set of rights and permissions within Digital Library.
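For readers unfamiliar with OAI-PMH, a harvester asks a repository for records with a plain HTTP request. The sketch below builds a `ListRecords` request using the protocol's standard `verb`, `metadataPrefix`, `from`, and `set` arguments, with `oai_dc` (unqualified Dublin Core) as the metadata format. The base URL is a placeholder, not a real ArchivalWare endpoint, and the function name is an assumption.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date   # harvest only records changed since this date
    if set_spec:
        params["set"] = set_spec     # restrict the harvest to one set
    return base_url + "?" + urlencode(params)


# Placeholder repository address for illustration only.
print(list_records_url("https://archive.example.org/oai", from_date="2008-01-01"))
```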
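The mapping step described above can be pictured as a simple lookup from an authority level to a set of permissions. In the sketch below, the attribute name `authorityLevel`, the level names, and the permission names are all hypothetical; the real values depend on the organization's LDAP schema and on how Digital Library is configured. No LDAP server is contacted: the entry is a plain dictionary standing in for attributes returned after login.

```python
# Hypothetical mapping from an LDAP-supplied authority level to rights.
RIGHTS_BY_LEVEL = {
    "public": {"search", "view"},
    "staff": {"search", "view", "add", "edit"},
    "admin": {"search", "view", "add", "edit", "delete", "configure"},
}


def rights_for(ldap_entry, attribute="authorityLevel"):
    """Map a user's authority level to rights, defaulting to public access."""
    level = ldap_entry.get(attribute, "public")
    return RIGHTS_BY_LEVEL.get(level, RIGHTS_BY_LEVEL["public"])


# Stand-in for the attributes of an authenticated user.
user = {"uid": "jdoe", "authorityLevel": "staff"}
print(sorted(rights_for(user)))
```

Falling back to the most restrictive level when the attribute is missing or unrecognized is a deliberate choice in this sketch, so that a misconfigured account never gains extra rights.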
