  Seeking OpenVMS Web Search Engine? 
 The Question is:
 
Hello,
I have just installed the Secure Web Server 1.0-1 and I am looking for a search
 engine for my intranet.
Thanks and regards,
JL Rossier
 
 
 The Answer is:
 
Search engine alternatives for Compaq Secure Web Server (see
URLs below):
 
- Commercial Product
- Open Source Solution
- Remote Search Service
 
Which option you choose will depend not only on your search
engine requirements (feature set, support, and maintenance),
but also on OpenVMS platform availability.
 
Commercially available search engines for OpenVMS are scarce,
but a few companies offer pure Java search engine
implementations that may run on OpenVMS unmodified (confirm
this with the vendor).
 
Open source solutions exist, but primarily for the UNIX
platform.
 
The original SWISH (Simple Web Indexing System for Humans)
source code has been ported to OpenVMS, but its more powerful
successor, SWISH-E, has not. Compaq may include the source
code for SWISH on the OpenVMS Freeware CD since it is no
longer readily available on the Internet.
 
ht://Dig is probably the most popular open source search engine.
It has not yet been ported to OpenVMS.
 
Remote search services do not (usually) require software to
be installed on the host system. The remote service simply
indexes your site and hosts a search interface. This reduces
maintenance costs.
 
Search engines differ fundamentally in the indexing method
they use: filesystem or spider. Filesystem indexing simply
scans the site's local filesystem for files to index. This
method is fast, but it has three disadvantages: the index is
restricted to one host (multiple sites cannot be indexed),
indexing operates on the raw files, so the output of
server-side includes and JSP pages is not indexed, and obsolete
files are included in the index (even when no link points to
them). In contrast, spider indexing starts at a given URL and
follows every link it finds. This is slower, but does not
suffer the disadvantages of filesystem indexing.
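
The difference between the two indexing methods can be
illustrated with a minimal Python sketch. It is not taken from
any of the products listed here. The SITE dictionary, its page
names, and the two function names are hypothetical; a real
indexer would read files from disk or fetch pages over HTTP
rather than use an in-memory table.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical in-memory site: path -> HTML source.
# "/old.html" still exists "on disk" but no page links to it.
SITE = {
    "/index.html": '<a href="about.html">About</a> <a href="/docs/a.html">A</a>',
    "/about.html": '<a href="/index.html">Home</a>',
    "/docs/a.html": "<p>Doc A</p>",
    "/old.html": "<p>Obsolete page</p>",
}


class LinkParser(HTMLParser):
    """Collects the href values of <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def filesystem_index(site):
    """Filesystem method: index every file found, linked or not."""
    return sorted(site)


def spider_index(site, start):
    """Spider method: index only pages reachable by links from start."""
    seen, queue = set(), [start]
    while queue:
        url = queue.pop(0)
        if url in seen or url not in site:
            continue
        seen.add(url)
        parser = LinkParser()
        parser.feed(site[url])
        for href in parser.links:
            # Resolve relative links against the current page.
            queue.append(urljoin(url, href))
    return sorted(seen)


print(filesystem_index(SITE))            # includes /old.html
print(spider_index(SITE, "/index.html"))  # omits /old.html
```

In this sketch the filesystem walk picks up the unlinked page,
while the spider never reaches it because no link leads there.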
 
Finally, search engines differ in their search capabilities
and whether a front-end user interface is included. Features
range from simple boolean searches to fuzzy searches, synonyms,
and meta data.
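
As a rough illustration of the simplest of these features,
boolean searching, the following Python sketch builds a small
inverted index and answers AND, OR, and NOT queries. The
document names, sample text, and function names are made up
for the example and do not describe how any listed product
works internally.

```python
import re
from collections import defaultdict

# Hypothetical sample documents: name -> text.
DOCS = {
    "a.html": "OpenVMS web server setup",
    "b.html": "Web search engine for OpenVMS",
    "c.html": "UNIX search tools",
}


def build_index(docs):
    """Map each lowercased word to the set of documents containing it."""
    index = defaultdict(set)
    for name, text in docs.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(name)
    return index


def search_and(index, *terms):
    """Documents containing every term."""
    sets = [index.get(t.lower(), set()) for t in terms]
    return set.intersection(*sets) if sets else set()


def search_or(index, *terms):
    """Documents containing at least one term."""
    return set().union(*(index.get(t.lower(), set()) for t in terms))


def search_not(index, include, exclude):
    """Documents containing include but not exclude."""
    return index.get(include.lower(), set()) - index.get(exclude.lower(), set())


INDEX = build_index(DOCS)
print(search_and(INDEX, "openvms", "search"))
print(search_or(INDEX, "server", "tools"))
print(search_not(INDEX, "search", "unix"))
```

Fuzzy matching, synonyms, and meta data searching build on the
same kind of index but need extra tables or term rewriting,
which this sketch leaves out.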
 
Here are some criteria to consider when selecting a search engine:
 
- How many web pages are served?
- What document formats must be supported (HTML, PDF, XML, DOC, etc.)?
- What indexing features are required?
	+ Duplicate page detection
	+ Indexing control and scheduling
	+ Robot indexing
	+ Indexing secure pages
	+ Indexing meta data
	+ Multiple character sets
- What search features are required?
	+ Phrase searching
	+ Boolean searching
	+ Wild card searching
	+ Field and meta data options (title, URL, etc.)
	+ Date-range searching
	+ Relevance-ranking customization
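
To make one of the indexing criteria above concrete, duplicate
page detection can be approximated by hashing the normalized
text of each page. This Python sketch is only one simple
approach under stated assumptions: the PAGES table and the
function names are hypothetical, the tag stripping is a crude
regular expression, and only exact duplicates (after case and
whitespace normalization) are detected, not near-duplicates.

```python
import hashlib
import re

# Hypothetical pages: URL path -> HTML source.
PAGES = {
    "/index.html": "<h1>Welcome</h1> <p>Hello</p>",
    "/home.html": "<h1>welcome</h1><p>Hello</p>",
    "/about.html": "<p>About us</p>",
}


def fingerprint(html):
    """Hash the page text with tags removed and whitespace/case normalized."""
    text = re.sub(r"<[^>]+>", " ", html)
    text = " ".join(text.lower().split())
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def find_duplicates(pages):
    """Return {duplicate_url: first_url_with_same_content}."""
    first_seen = {}
    duplicates = {}
    for url in sorted(pages):
        key = fingerprint(pages[url])
        if key in first_seen:
            duplicates[url] = first_seen[key]
        else:
            first_seen[key] = url
    return duplicates


print(find_duplicates(PAGES))
```

An indexer could use such a table to skip the duplicate URLs
so that the same content does not appear twice in results.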
 
Feature Comparison of SWISH/SWISH-E and ht://Dig:
 
SWISH (old)
 
- Ported to OpenVMS: Yes
- Implementation language: C
- Local file system indexing only (shows obsolete pages,
  does not scan multiple sites, does not process server
  side includes or JSP output)
- Smaller sites
- No HTML front end
- Search capabilities: boolean, field
- File formats: text, HTML
 
SWISH-E (new)
 
- Ported to OpenVMS: No
- Implementation language: C
- Spider-based indexing beginning with version 1.2 (skips
  obsolete pages, scans multiple servers, processes server-side
  includes or JSP output)
- Smaller sites
- No HTML front end
- Search capabilities: boolean, field, meta-data support, word
  stemming, prefix wild-carding
- File formats: text, HTML
 
ht://Dig
 
- Ported to OpenVMS: No
- Implementation language: C/C++
- Spider-based indexing (skips obsolete pages, scans multiple
  servers, processes server-side includes or JSP output)
- Larger sites
- Includes HTML front end
- Search capabilities: boolean, field, meta-data support, word
  stemming, prefix wild-carding, synonym, soundex, metaphone
- File formats: text, HTML, PDF/Postscript/Word (with separate
  external converters)
 
Commercial Software:
 
ASTAware SearchKey Pro (pure Java) (http://www.astaware.com/r_prod_info.html)
Trident Search Site Server (pure Java) (http://www.noviforum.si/)
 
Open Source Search Tools:
 
VMS
 
WWWVMSINDEX (http://www.sil.org/ftp/pub/software/vms/)
SWISH 1.1 - replaced by SWISH-E (see below)
 
Non-VMS
 
ht://Dig (http://htdig.sourceforge.net/)
Harvest (http://www.tardis.ed.ac.uk/harvest/)
PLweb Turbo (http://www.pls.com/plweb.htm)
SWISH-E (http://sunsite.berkeley.edu/SWISH-E/)
SWISH++ (http://homepage.mac.com/pauljlucas/software/swish/)
Isearch (http://www.cnidr.org/ir/isearch.html)
mnoGoSearch (was UdmSearch) (http://search.mnogo.ru/)
 
Remote Search Services:
 
Atomz (http://www.atomz.com/)
FreeFind (http://www.freefind.com/)
PicoSearch (http://www.picosearch.com/)
SearchButton (http://www.SearchButton.com/)
SiteMiner (http://siteminer.mycomputer.com/)
