Web and Internet Search Engine FAQ (WISE FAQ) Dec'98


Archive-name: web/wisefaq
Posting-Frequency: monthly

Web and Internet Search Engine FAQ 
(WISE FAQ (copyright) 1997-1998)
Copyright 1997-1998  Ken Bogucki  

krb@infobasic.com
kenbog@netcom.com

WISE FAQ (c) Ver. 3.6  DEC, 1998 

An HTML version of this FAQ can be obtained at:
  http://www.infobasic.com/pageone.htm

The current ASCII version of this FAQ can be found at: 
  ftp://rtfm.mit.edu

This web site undergoes considerable change and new information
is added weekly.  This web site also contains a collection of
various search sites & internet databases with links to some of
the best core sites on the net.

CHANGES IN THIS FAQ...

1. Some minor sections have been deleted.

2. The inclusion of a listing of select search engines, general, 
Meta and Geo specific search engines: Section 9.0

3. Minor revisions to Excite and Lycos.

Beginning immediately, the WISE FAQ and its associated
documents can be found at our new web site
http://www.infobasic.com/pageone.htm  All email queries, complaints
or corrections should be, when possible, addressed to

wisefaq@infobasic.com  

COPYRIGHT
This FAQ is copyrighted material. The copyright is owned by the
author of this FAQ, Ken Bogucki kenbog@netcom.com  This FAQ may
not be reproduced or distributed, in whole or in part, for
commercial purposes without the express written permission of the
author. This FAQ may be used for non-commercial purposes as long
as the author is notified in advance and the entire FAQ is used
without alterations (except for formatting purposes) and the
copyright notice & warranty notice remain intact and a part of
the FAQ.
  

WARRANTY.
This FAQ is an AS-IS document.

!! WEB SITE ADDRESSES ARE CASE-SENSITIVE !!
When necessary, double brackets [] are used in this FAQ for
clarity.  These brackets are not part of any search expression. 
Their only purpose is to separate the search words,
expressions and results from the surrounding text.


CONTENTS***

1.0  Introduction 
     1.1  (Reserved)
     1.2  Definitions

2.0  Search Engine Queries, A Quick Tutorial
     2.1  All Search Engines Are Not Created Equal 
     2.2  Understanding Search Syntax & Odds and Ends

3.   General Search Engines
     
3A   Alta Vista  http://www.altavista.digital.com
     3A.1  Alta Vista Simple Searches
     3A.2  Alta Vista Complex Searches
     3A.3  Restricting A Simple and Complex Search
     3A.4  Sorting Results by Ranking
      3A.4.1  Simple Search Ranking
      3A.4.2  Complex Search Ranking
     3A.5  Misc. Information about Alta Vista

3B   Excite  http://www.excite.com
     3B.1  Excite Concept Based Queries
     3B.2  Excite Advanced Queries
     3B.3  Excite Exact Match Queries

3C   Lycos  http://www.lycos.com
     3C.1  Lycos Simple Searches
     3C.2  Lycos Complex Searches
     
3D   Infoseek   http://www.infoseek.com
     3D.1  Infoseek Simple Searches
     3D.2  Infoseek Complex Searches

3E   Web Crawler  http://www.webcrawler.com
     3E.1  Basic Searches
     3E.2  Using Logical Word Operators

3F   Yahoo  http://www.yahoo.com
     3F.1  Yahoo Menu/Simple Searches
     3F.2  Yahoo Complex Searches
     
3G   Euroferret  http://www.euroferret.com
     
3H   (Reserved)

3I   Hot Bot  http://www.hotbot.com
     3I.1 Hot Bot Simple Searches
     3I.2 Hot Bot Complex Searches

4.0  Meta Search Engines  
4A   Internet Sleuth  http://www.isleuth.com
     4A.1 Accessible Search Engines
4B   Meta Crawler  http://www.metacrawler.com

4C   ProFusion  http://profusion.ittc.ukans.edu

5.0  Specialized Search Engines

6.0  (Reserved)
     
7.0  Subject Trees

8.0  Quick Reference Card
     8.1  Alta Vista
     8.2  Excite
     8.3  Lycos
     8.4  Web Crawler
     8.5  Yahoo
     8.6  Infoseek

9.0  A Partial List of Select Search Engines
     9.1  General Search Engines
     9.2  Meta Search Engines
     9.3  Geo Specific Search Engines

10.0  Contacting the Author

****
1.2  DEFINITIONS

***These definitions are applicable only to this FAQ.

-APPLET    A Java program found on some Web pages.
-DOMAIN    Last portion of an internet address; .com, .mil,
           .net, .uk, .it
-HOST      The computer where the Web page is located 
-META      A program used to manipulate other programs.
-URL       Full internet address, http://www.xyz.com;
           ftp://abc.xyz.com; etc.
-WILDCARD  Symbol used to denote a number of missing letters,
           usually this symbol is a "*". 
-POINTER   A search result that "points" to other sources of
           information.

****
2.0   SEARCH ENGINE QUERIES, A QUICK TUTORIAL.

2.1  ALL SEARCH ENGINES ARE NOT CREATED EQUAL 

Different search engines accomplish their job by taking different
approaches to indexing the web.  Some engines index every word of
every page, some index the first hundred words, some index every
word and filter out noise words.  Noise words are words like:
but, the, are, is, at, --words that have no particular meaning
when used alone.  In the phrase, [ "the quick brown fox jumped
over the lazy dog" ], the noise words might be: the, quick, over,
the. The definition of "noise words" will vary from search engine
to search engine.

To get a better understanding of this concept and how it's
applied,  go to the Excite search engine, http://www.excite.com,
and run the following phrase search (a phrase search is any group
of words enclosed in quotation marks):  [ "to be or not to be" ].
Excite will not find matches for this phrase search.  Excite
considers all the words in this phrase as "noise words" and 
Excite does not index noise words.  One of the most famous
phrases in the English language cannot be found at Excite by
using that phrase as the search criteria. Now go to Alta Vista,
http://www.altavista.digital.com, and run the same phrase search.
Alta Vista will display more than 500 hits for this query. This
does not make Alta Vista the best search engine for all your
needs.  It is, however, the most inclusive search engine.  
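
The effect of noise-word filtering can be sketched in a few lines
of Python.  This is only an illustration: the NOISE_WORDS list
and the indexable_terms function are hypothetical and are not
taken from Excite or any other search engine.

```python
# Hypothetical noise-word list for illustration only; every search
# engine keeps its own list and none of them is reproduced here.
NOISE_WORDS = {"to", "be", "or", "not", "the", "but", "are", "is", "at"}

def indexable_terms(query):
    """Return the query words that survive noise-word filtering."""
    return [w for w in query.lower().split() if w not in NOISE_WORDS]

print(indexable_terms("to be or not to be"))   # prints []
print(indexable_terms("the lazy dog"))         # prints ['lazy', 'dog']
```

With a list like this one, nothing is left of the first phrase
to look up, which is why an engine that drops noise words cannot
match it.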

2.2  UNDERSTANDING SEARCH SYNTAX, ODDS AND ENDS 

These are general suggestions; however, they do apply to most
search engines.

Some expressions used in this tutorial:

" " this is used to denote a phrase expression or search.  All
the words in the [ " " ] must be found at a web site to produce a
hit.

[ OR ] a OR b will find either the "a" keyword or the "b" keyword

[ AND ] a AND b must find both "a" and "b" at a web site to
produce a hit.

[ ( ) ] these are used to organize a complex search expression


A.) On the surface, two different queries may appear the same. 
However, search engines will interpret the queries differently,
consequently the results will be dissimilar.

For example: [ (labor OR labour) AND union ] is not the same as 
[ "labor union" OR "labour union" ].  The queries appear to ask
the same question, however, search engines will see differences
in the structure of the two queries.  These differences will
affect the result.

The first expression will find those web pages that contain the
word "union" together with either "labor" or "labour", anywhere
in the document, regardless of the number of words separating
the two sides of the AND expression.
The first expression will find: [ labor should organize into a
union ], [ labor and management should realize that success
depends on the union of their interests and aims ], etc. 

The second expression will only find those web pages where the
words "labor" and "union" or "labour" and "union" appear next to
each other.  This is because the [" "] in the second expression
makes that query a phrase search.  Phrase searches require the
words in the phrase to be next to each other in the web document.
The second expression will find: [ a labor union is in the
interest of workers ], [ a labour union is the best way to
counter management ].  The second expression will not find 
[ labor should organize into a union ].  Note, however, that the
first expression will also find every page found by the second
expression.  The reverse is not true.
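
The difference between the two expressions can be sketched in
Python.  This is a simplified model, not the way any particular
search engine works; the function names are made up for this
example and a "page" is just a string of text.

```python
import re

def words(text):
    # Lower-case the text and split it into plain words.
    return re.findall(r"[a-z]+", text.lower())

def boolean_match(text):
    # (labor OR labour) AND union: the words may be anywhere.
    found = set(words(text))
    return ("labor" in found or "labour" in found) and "union" in found

def phrase_match(text):
    # "labor union" OR "labour union": the words must be adjacent.
    w = words(text)
    pairs = set(zip(w, w[1:]))
    return ("labor", "union") in pairs or ("labour", "union") in pairs

page = "labor should organize into a union"
print(boolean_match(page))   # prints True
print(phrase_match(page))    # prints False
```

Any page that passes phrase_match also passes boolean_match,
but not the other way around.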

B.) The web is referred to as the world wide web.  It is
important to realize that words and phrases that are common in
North America, for example, are not necessarily common anywhere
else in the world.  Searching for corrugated steel in the UK is
probably useless.  In the UK corrugated steel is usually called
corrugated iron.  Likewise, there are regional differences in
terms and concepts.  Either of the words soda or pop can refer
to a soft drink.  In some parts of the USA, however, soda is
what you mix with Scotch and pop is a soft drink.

Also keep in mind differences in spelling: labor/labour,
color/colour, organise/organize.  A world wide search for
[ "labor organization" ] works better when the query covers both
spellings: [ "labor organization" OR "labour organisation" ].
Better still, add regional synonyms: [ "labor organization" OR
"labour organisation" OR "trade unions" ].

Allow for the possibility of misspelled words.  One search for
politics also found hits when the search word was misspelled
"polotics".  Remember, English is not always the first language
of the people publishing web pages.  

C.) Probably one of the more flexible search options available at
most search engines is the "*" operator or wild card operator.
Wild card searches allow  queries to contain incomplete words,
however, this kind of query will probably yield a considerable
number of unnecessary hits. For example: 
[ orang* ] will produce hits for [ orange ], [ oranges ] and 
[ orangutan ].  If you're searching for something to eat instead
of something that co-starred with Clint Eastwood, consider
restricting wildcard searches with additional search words.  For
example: [ orang* AND fruit ] will not produce hits about Clint
Eastwood's co-star.  The search has been limited with the
inclusion of the word [ fruit ].
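
As a rough sketch, the effect of adding a second search word to
a wildcard query can be modeled in Python with the standard
fnmatch module.  The wildcard_and function is hypothetical and
only mimics the [ orang* AND fruit ] query above.

```python
from fnmatch import fnmatchcase

def wildcard_and(text, pattern, required):
    """True if some word matches the wildcard pattern AND the
    required word is also present in the text."""
    words = text.lower().split()
    return required in words and any(fnmatchcase(w, pattern) for w in words)

print(wildcard_and("an orange is a citrus fruit", "orang*", "fruit"))   # prints True
print(wildcard_and("the orangutan starred in a film", "orang*", "fruit"))  # prints False
```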

D.) The position and organization of the keywords in the search
query is also important. For example, if you're looking for
timely information on earthquakes your keywords might be:
"earthquake", "information", and "important". If you run a
complex search at Alta Vista using the following query:
[ "earthquake information" AND important ], Alta Vista will
display about 960 hits.  The query [ important AND "earthquake
information" ] will generate about 850 hits and the query
[ "important earthquake information" ] will generate only 1 hit.
The last expression may seem the most logical expression to use;
however, things are not always that simple.

SUMMARY OF VARIOUS SEARCHES.........................

earthquake AND important AND information  22846 hits
earthquake AND "important information"     1012 hits
"earthquake information" AND important      960 hits
"important information" AND earthquake      957 hits
important AND "earthquake information"      850 hits
"important earthquake information"            1 hit

....................................................

If you're looking for office furniture on the net there are a
number of possible search expressions and each expression will
provide varying degrees of success.

For example, these search expressions were run at Alta Vista;
the result of each search is listed.
...................................................

"office furniture for sale"          40 hits
"for sale office furniture"           8 hits
...................................................

E.) Also important at most search engines is the case of the
query.  In most instances search engines assume a lower case
query is a case insensitive query.  This means that the search
engine will find both upper and lower case occurrences of the
search expression.  However, if the search expression contains
upper case letters the search engine will treat the query as
a case sensitive query and will only find exact matches for the
query.  Obviously this will affect the results of any query.  For
example:

................................................

"Apples Peaches Pumpkin Pie"   65 hits
"apples peaches pumpkin pie"  116 hits
.................................................

Both of these queries were run at Alta Vista.  The first
expression is a case sensitive search.  The second expression is
a case insensitive search.  This second expression produced
results that included web sites where the case sensitive
expression, "Apples Peaches Pumpkin Pie", could also be found.
The first expression, the case sensitive expression, only found
exact matches to that search query.  In most cases, the value of
an upper case query rests in its use as a utility to restrict a
search. 
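
The case rule described above can be sketched as follows.  This
is a simplified model that treats the query as a plain substring
of the page text; the case_rule_match function is made up for
this example.

```python
def case_rule_match(query, text):
    """Phrase match using the rule: an all lower case query is
    case insensitive, anything else must match exactly."""
    if query == query.lower():
        return query in text.lower()
    return query in text

page = "apples peaches pumpkin pie"
print(case_rule_match("apples peaches pumpkin pie", page))   # prints True
print(case_rule_match("Apples Peaches Pumpkin Pie", page))   # prints False
```

This is why the upper case query above returns fewer hits than
the lower case query: it can only lose matches, never gain them.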

F.) If searching for information in a specific country, consider
using a search engine that will allow you to restrict the search
to a specific country domain.  For a list of country domain names
go to http://www.infobasic.com/100codes.htm.  The following
sections in this FAQ, about specific search engines, will explain
the process of restricting searches based on domain names.

G.) Lastly, some keywords used in a search expression are useless.
This is not because the keywords are not specific enough, it is
because the keywords are too common on the web.  For example: if
you're looking for a piece of shareware and you run the query [
shareware AND download ] Alta Vista will report 280,000 hits.
However a search for a specific piece of shareware (by name),
[ "xyz.zip" ], will produce fewer and more precise hits.  Even a
partial file name, [ xy*.* ], is more effective than the first
example. The keywords, "shareware" and "download", are too common
on the web to produce any kind of meaningful result.

One last word, some search engines go to some lengths to
advertise the fact that their site will generate twice as many
hits as "xyz" or that they index twice as many pages as so and
so.  The issue of quantity is secondary.  The real question
relates to the quality of the first 10, 20 or 30 hits.  If your
query is properly structured, the information you're looking for
should show up in the first several dozen hits.  If you haven't
found the information you need in the first two or three pages or
if the ranking falls below 75%--consider restructuring your query
and try the search again.


****
3A.0  ALTA VISTA SEARCH ENGINE  http://www.altavista.digital.com
Alta Vista is one of the more complex search engines.  It may
seem intimidating, however, for those with a serious interest or
pressing need to find information, Alta Vista may be the place to
go.  

Like other search engines, Alta Vista has simple and complex
searches.  It also contains several other options that allow the
user to optimize their time and efforts.  These include ordering
search results by ranking (not necessarily confined to the
original search criteria) and restricting the search to certain
types and locations of Web pages.

3A.1  ALTA VISTA SIMPLE SEARCHES

apples peaches "orange juice" : documents where "apples" or
"peaches" or the phrase "orange juice" appears.

+apples +pears -"orange juice" : documents where both "apples"
and "pears" appear but not the phrase "orange juice".

Wildcard Operator "*"

app* : all documents that contain the words "apples", "applets",
"appraise",  etc.  It will not find "applications" or
"applicable".  The "*" notation can only be used to represent a
max. of 5 characters.  

The above Operators can be used in any combination.  For example:


+oranges -app* : documents that contain the word "oranges" but
not the words "apples", "apply" and "applets", etc.

3A.2  ALTA VISTA COMPLEX SEARCHES

There are two ways to construct an Alta Vista complex search. 
You can use either Logical Word Expressions or Logical Symbol
Expressions in the search request.  Alta Vista will interpret
both types of logical expressions the same way.

WORD EXPRESSION   is the same as  SYMBOL EXPRESSION 
----------------------------------------------------
a AND b           is the same as       a & b
a OR b            is the same as       a | b
a NOT b           is the same as       a ! b
a NEAR b          is the same as       a ~ b


SPECIAL NOTE: Logical word and symbol expressions are precise
search tools.  The search expression... apple AND peach...will
find "apple" and "peach" but not "apples" and "peaches".

In Alta Vista, the complex search page contains an editing window
3 lines by 70 characters.  This window allows you to view and edit
the entire complex search expression at one glance.

AND
apple AND orange : sites that contain the word "apple" as well as
the word "orange", however, this expression will not display
those sites that have "apples" and "oranges" in the same
document. (See Special Note above)

OR
apple OR orange : sites that contain either the word "apple" or
the word "orange".

NOT
apples NOT oranges : sites that contain the word "apples" but not
the word "oranges"

NEAR
apple NEAR juice :  will generate a list of pages where the word
"juice" is within ten words of the word "apple". Note, the Alta
Vista NEAR operator uses a default 10 word range.
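
A proximity test of this kind can be sketched in Python.  The
near function is hypothetical, and it assumes that "within ten
words" means the two word positions differ by no more than ten;
Alta Vista's exact counting rule is not documented here.

```python
import re

def near(text, a, b, max_gap=10):
    """True if word a occurs within max_gap words of word b."""
    words = re.findall(r"[a-z]+", text.lower())
    pos_a = [i for i, w in enumerate(words) if w == a]
    pos_b = [i for i, w in enumerate(words) if w == b]
    return any(abs(i - j) <= max_gap for i in pos_a for j in pos_b)

print(near("fresh apple and pear juice", "apple", "juice"))   # prints True
```
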

3A.3  RESTRICTING A SIMPLE AND COMPLEX SEARCH

This is a method of confining the Web search to certain pages or
sites that meet specific criteria. [partial list] 

anchor:click-here : only search pages that contain the phrase
"click-here" in the text of a hyperlink.

applet:<java class> : only search pages that have the specified
Java class applet in the applet tag of the Web page.

domain:ie : only search pages that originate in the domain .ie
(Ireland), or any of the other country codes and the
miscellaneous standard codes, .com, .org, .mil, etc.

host:xyz.com : only search those pages that reside at the host
name xyz.com.

image:apples.jpg : search those sites that contain the image tag,
"apples.jpg".

link:xyz.com : search those sites with a link to xyz.com.  If you
have a Web page and are curious about how many other pages carry
a link to your page then run this search; 
link:www.yourhomepage.com.

title:"Apples and Oranges" : search those pages that have "Apples
and Oranges" in the title of the Web page.


3A.4  SORTING RESULTS BY RANKING 

Ranking results, simply, is a way to sort the results of
your search. For example, if you use a complex search for "apples"
and "oranges", you can instruct Alta Vista to sort the results so
that those sites with the most references to "apples" appear
first in the result list.  Simple searches are sorted
automatically by Alta Vista.

3A.4.1  Simple Search Ranking

Alta Vista automatically uses a formula to sort the results of a
simple query.  Results are ranked according to the following
criteria:
  1.  results score highest if the search criteria are met in
the first few words of a document
  2.  query words and phrases are found close to each other in a
document
  3.  query words or phrases appear more than once in a Web
document.

3A.4.2  Complex Search Ranking

On the complex search page, there is a separate window for
ranking.  After establishing the search expression, go to the
ranking window and insert those words (these words need not be
the same words you used in the search expression) that will be
used to sort the result list.  For example, if your search
expression is "apples & oranges", you may then use the ranking
window and include the word "California".  The search will
produce all those documents that contain the word "apples" and
the word "oranges" in the same document.  Alta Vista will then
sort the result list so that all documents that have a reference
to "California" appear first in the list.  More than one word or
phrase may be used in the ranking window.
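
The idea of ranking by words that are separate from the search
expression can be sketched in Python.  The rank_results function
and its scoring (a simple count of ranking words) are assumptions
made for this example; Alta Vista's real formula is not known
here.

```python
def rank_results(documents, ranking_words):
    """Sort matching documents so that those mentioning the
    ranking words most often come first."""
    def score(doc):
        words = doc.lower().split()
        return sum(words.count(w.lower()) for w in ranking_words)
    return sorted(documents, key=score, reverse=True)

# Both documents already match the search "apples & oranges".
hits = ["apples and oranges from Florida",
        "California apples and oranges"]
print(rank_results(hits, ["California"])[0])
# prints California apples and oranges
```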


3A.5  MISCELLANEOUS INFORMATION ABOUT ALTA VISTA

1.  Alta Vista can handle phrases in a search expression in a
number of ways, however, the best way to search for a phrase is
with the use of double quotes.  For example: "United States Army"
or "orange juice", etc.

2.  In Alta Vista a lower case search expression is a
case-insensitive search.  Using capital letters in the search
expression restricts the results to exact matches.  For example,
if you search for "oranges" you will get "oranges", "Oranges",
"oRanges", etc. (case-insensitive search).  If you search for
"Oranges" you will only get "Oranges" in your search results.
The results will not show instances of "oranges", "oRanges",
"oRanGes", etc. (case-sensitive search).

3. The wildcard marker [ * ] has certain restrictions.  The "*"
marker requires that at least three letters precede it; for
example, "*go" & "or*" will not work, however, "ora*" and
"appl*" will work.  Also, the "*" marker will only match from
0-5 letters;  "appl*" will find "apples" & "applets" but not
"applications".

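The two wildcard restrictions in item 3 above (at least three
letters before the marker, and 0-5 letters matched by it) can be
sketched with a regular expression.  The altavista_wildcard
function is hypothetical, and matching only the letters a-z is
an assumption made to keep the example short.

```python
import re

def altavista_wildcard(pattern):
    """Turn a pattern like 'appl*' into a compiled regex, or
    return None if the pattern breaks the restrictions."""
    prefix = pattern.rstrip("*")
    if not pattern.endswith("*") or len(prefix) < 3 or "*" in prefix:
        return None
    # The marker stands for 0 to 5 further letters.
    return re.compile(re.escape(prefix) + r"[a-z]{0,5}", re.IGNORECASE)

rx = altavista_wildcard("appl*")
print(bool(rx.fullmatch("applets")))        # prints True
print(bool(rx.fullmatch("applications")))   # prints False
print(altavista_wildcard("or*"))            # prints None
```
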
****
3B  EXCITE SEARCH ENGINE  http://www.excite.com

Excite uses several methods for finding the requested
information.  One is a concept based query, another is an advanced
based query and the last is an exact match query.

NOTE: Excite provides its own relevancy rating.  The user cannot
directly change or alter this rating.

Excite uses " " marks to indicate a phrase search, for example,
"apple butter" will find those sites where the phrase --apple
butter-- can be found but not those sites that list only the word
apple.

3B.1  CONCEPT BASED QUERIES

A concept based query utilizes the relationship between words and
ideas to find matches.  For example, in a concept based search
the keyword "fruit" will yield "fruit", but also, "apples",
"oranges", etc. Concept based queries rely on the user requesting
information in the form of one or more keywords.

3B.2  ADVANCED BASED QUERIES

In an advanced based query the operators "+" and "-" are used.

+apples +oranges : documents that have the word "apples" and the
word "oranges" on the same page.

-apples +oranges : documents that have the word "oranges" but not
the word  "apples".

+apple -pears -tarts : documents that have the word "apple" but
not the words "pears" or "tarts".  This query will not return
"apple tarts" but will return "apple turnovers".

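The "+" and "-" operators above can be modeled with a short
Python sketch.  The plus_minus_match function is made up for this
example; it treats a page as a bag of words and ignores any query
word that carries no operator.

```python
def plus_minus_match(query, text):
    """True if every +word is present and no -word is present."""
    words = set(text.lower().split())
    for term in query.lower().split():
        if term.startswith("+") and term[1:] not in words:
            return False
        if term.startswith("-") and term[1:] in words:
            return False
    return True

print(plus_minus_match("+apple -pears -tarts", "apple turnovers"))  # prints True
print(plus_minus_match("+apple -pears -tarts", "apple tarts"))      # prints False
```
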
3B.3  EXACT MATCH QUERIES

Exact match queries use Logical Word Expressions to find
documents.  The logical word operators are: AND, OR, AND NOT plus
().  Using the logical word operators will turn off Excite's
concept based search.  A keyword search for "fruit" will instruct
Excite to search only for those sites that contain the word
"fruit".  Excite will not display sites that contain only related
words like "apples", "oranges", etc.

apples AND oranges : sites that contain both the words "apples" &
"oranges" in the same document.

apples OR oranges : sites that contain either the word "apples"
or the word "oranges".

apples AND NOT oranges : sites that contain the word "apples"
but not the word "oranges".

() is an organizational operator. For example, "apples AND
NOT(oranges OR peaches)" will produce sites that contain the word
"apples" but not the words "oranges" or "peaches".

****
3C  LYCOS SEARCH ENGINE   http://www.lycos.com

Lycos has two search levels, simple and complex.  In the case of
Lycos, the complex search function is menu driven and not
difficult to use, however, because of its menu interface this
Lycos search is somewhat more restrictive than other search
engines.

3C.1  STANDARD SEARCH (Simple)
Standard searches do not use Logical Word Operators.  

apples oranges peaches  :  will yield sites in which all three
words appear

[ - ] This is a restrictive operator. 
apples oranges -berries :  all documents in which "apples" and
"oranges" appear but not those pages where "berries" appear.  If
"apples", "oranges" and "berries" appear in the same document,
this document will not appear in the search results.

[ $ ] This is a wildcard operator.
app$ : will yield all pages in which the words, "apples",
"applications", "applets" appear.

[ . ] This is a delimiting tag.  Searching for "apple" will yield
"apples" and "apple", however, if the search were "apple." then
only those documents with the word "apple" will be returned and
not those pages with the word "apples".

3C.2  CUSTOM SEARCHES(Pro Search)

Complex searches are done through a menu interface.  All of this
is fairly intuitive.  

Just a very brief explanation is required here.  Everything that
appears on the complex search page has a corresponding on screen
example and explanation.

****
3D  INFOSEEK  http://www.infoseek.com
Infoseek has two search options, simple and complex.  Both search
options provide only limited query syntax.  Infoseek has no way
to rank search results.  However, Infoseek is fast and is more
than suitable for those quick search needs.  The site is low
graphics and works well with text browsers.

3D.1  INFOSEEK SIMPLE SEARCHES

Infoseek's simple searches use a combination of commas, plus and
minus signs, quotes (to make phrase searches) and caps.

apples oranges : will find pages with either "apples" or
"oranges".

+apples oranges :  normally will return pages with just "apples",
however, pages that contain "oranges" as well are acceptable. 
Those pages, however, will receive a lower ranking.

"apple juice" : will display those pages where the words "apple"
and "juice" appear next to each other.

Caps are used to indicate proper names and a case sensitive
search:
Johnny Appleseed  : will find only pages with the name "Johnny
Appleseed".

Johnny,Appleseed  : will find pages with either name.  Note:
commas are only used to separate names.

apples -grapes : will find pages with "apples" but not with the
word "grapes".


3D.2  INFOSEEK COMPLEX SEARCHES

There are only a few additional symbols that distinguish a
complex query from a simple query.

The pipe symbol [ | ] is used to construct a search within a set
of search results.

fruit | apple | juice :  will find pages that refer to "fruit"
then search out those pages within that result that contain the
word "apple". Finally, the last group of results will be searched
for any pages that contain the word "juice".
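
The step by step narrowing performed by the pipe symbol can be
sketched in Python.  The pipe_search function is hypothetical and
works on a plain list of page texts; it is not Infoseek's actual
implementation.

```python
def pipe_search(query, documents):
    """Apply each term of 'a | b | c' to the results of the
    previous term, narrowing the list at every step."""
    results = documents
    for term in (t.strip().lower() for t in query.split("|")):
        results = [d for d in results if term in d.lower().split()]
    return results

pages = ["fruit apple juice recipes", "fruit apple pie", "fruit salad"]
print(pipe_search("fruit | apple | juice", pages))
# prints ['fruit apple juice recipes']
```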

title:fruit : will find any pages where the word "fruit" appears
in the title of the web page.

url:www.orange.com : will find those sites that contain the
address "www.orange.com".  The search expression [ url:fruit ]
will find those sites that have the word "fruit" in the URL, for
example, "www.fruit.com".

link:www.juice.com : will find those sites that are linked to the
specified URL

site:xyz.com : will bring up all the sites located at the
specified address.

****
3E  WEBCRAWLER   http://www.webcrawler.com

One of the better Web search engines is WebCrawler, simply
because of its flexibility.

3E.1  BASIC SEARCHES

apples oranges pineapples  :  will provide information on those
documents that contain any of the words: "apples", "oranges",
"pineapples".  A simple search expression.

3E.2  USING LOGICAL WORD OPERATORS

AND
apples AND oranges  :  will provide information on documents
where both the words "apples" and "oranges" appear.  

OR
apples OR oranges  :  will display information on pages that
contain either of the two search words.  This is similar to the
Simple Search example except that this search employs specific
logical word operators.  The first search could also be run as:
apples OR oranges OR pineapples.

NOT
fruit NOT apples  :  displays information about "fruit" but not
those pages that reference "apples".

NEAR
cheese NEAR/15 wine  :  will display those pages where the word
"cheese" is within 15 words of the word "wine".  Note, you can
specify any number of words in the NEAR operator, NEAR/20,
NEAR/5, etc.

ADJ
world ADJ war  :  will display Web pages that contain the word
"world" immediately followed by the word "war"

"  "
Quotes have the same effect as the ADJ command above: "world war"
will provide the same results as:  world ADJ war.
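
The NEAR/n and ADJ operators can be compared in a short Python
sketch.  The proximity_match function is made up for this
example and handles only two-word expressions.  It assumes NEAR
accepts the words in either order, while ADJ requires the second
word to follow the first immediately.

```python
import re

def proximity_match(expr, text):
    """Evaluate 'a NEAR/n b' or 'a ADJ b' against a text."""
    m = re.fullmatch(r"(\w+)\s+(?:NEAR/(\d+)|ADJ)\s+(\w+)", expr.strip())
    if m is None:
        raise ValueError("unsupported expression: " + expr)
    left, dist, right = m.group(1).lower(), m.group(2), m.group(3).lower()
    words = re.findall(r"\w+", text.lower())
    for i, w in enumerate(words):
        if w != left:
            continue
        for j, v in enumerate(words):
            if v != right:
                continue
            if dist is None:
                # ADJ: the right word must come directly after.
                if j == i + 1:
                    return True
            elif abs(i - j) <= int(dist):
                # NEAR/n: within n words, in either order.
                return True
    return False

print(proximity_match("world ADJ war", "the second world war"))   # prints True
print(proximity_match("world ADJ war", "war of the world"))       # prints False
```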

()
Parentheses are used to organize complex search expressions. For
example:
(wine NEAR/10 cheese) AND apples
or
"California wine" AND prices NOT (white OR rose)

****
3F   YAHOO  http://www.yahoo.com

Yahoo is one of the most intuitive search engines to use. There
are two ways to search Yahoo, one is a very simple, menu driven
search and the second is by use of logical word operators. 
However, this second search option is also a menu driven search.

3F.1  MENU/BASIC SEARCHES
The Menu interface is easy to use and understand.  Simply select
the type of material you want to search (WEB, Usenet, etc.) and
how the search should be conducted. Select how the results should
be displayed, 20, 30, 40 per page and click the search button.

3F.2  MENU/ADVANCED SEARCHES 

[ + ] 
apples +oranges : those sites that have "apples" as well as
"oranges" in the same document.

[ - ]
apples -oranges : those sites that have "apples" but not those
sites that have "oranges".

[ t: ]
A restriction operator that will confine the search to Web page
titles. For example, t:apples will restrict the search to pages
with the word "apples" in the title of the page.  It will not
search a page if the page title is "Oranges".  The correct usage
of the "t:" operator in a search expression is [ +t:oranges
+apples ]; this expression will yield documents that have the
word "apples" in the Web page and the word "oranges" in the Web
page title.  The expression "+apples t:oranges" is incorrect.
The "t:" operator must immediately precede the search word.

[ u: ]
A restrictive operator. Confines the search for the keywords to
certain URLs.
For example, [ u:xyz ] will restrict the search to URLs that have
an "xyz" in the url address.  The "u:" operator follows the same
rules listed for the "t:" operator.

[" "]
Phrase combining operator: "orange juice", "apple juice", etc. 

[ * ]
Wildcard search.  For example, "pea*" will return "pears",
"peas", etc.

****
3G  EUROFERRET at http://www.euroferret.com

EuroFerret is a small search engine run off several Sun
computers. This search engine specializes in web pages located in
the European community.  The search syntax is extremely simple.  

Euroferret accomplishes its magic by examining web pages and
deciding on the 60 most important words and 12 key phrases in
each document.  Euroferret works on the assumption, for
example, that page titles are more important than disclaimers.

The search at Euroferret is very intuitive.  Once a search is
run, Euroferret will list the best possible matches to the query
and will suggest terms, through check boxes, that might be used
to further refine the search.

However, because of the way that EuroFerret indexes web pages, do
not expect miraculous results. Though it purports to index more
pages than Alta Vista, Euroferret's indexing is less concise and
less inclusive than that of other search engines.

That said, Euroferret is still a good place to go if the
information you're looking for is located in the European
community and you have a reasonable handle on the subject matter.
The engine is fast and the results reliable.

Euroferret also displays a text only version for people who may
not want the graphics or who use a text browser like Lynx.

****
3I  HOT BOT at http://www.hotbot.com

This is a service of WIRED MAGAZINE.

Hot Bot uses a graphic interface with pull down menus and check
boxes to make searching easier.  However, Hot Bot lacks some of
the sophisticated query options available at other sites.  Even
some of the more elemental query options are missing from Hot
Bot.  For example, Hot Bot does not allow proximity searches
("apple" within 10 words of "juice") and Hot Bot does not support
wild card searches.  At most search engines, a search for "appl*"
will yield results that contain "apple", "apples", "applejack",
and "applesauce."  This wild card search is not possible at Hot
Bot.

*****
3I.1  HOT BOT SIMPLE SEARCHES

A phrase or group of words is entered and a pull down menu
specifies whether the result should include all the words, some
of the words, a Boolean search or whether the words entered
should appear in the title of a web page.

Hot Bot allows simple searches to be enhanced by permitting the
user to select which countries the web search should concentrate
on.  In addition, the user can refine the search by specifying
whether the web page should be several weeks old, several months
old or several years old (a number of time parameters can be
selected).  The user can further restrict the search by specifying
that the web pages must have audio, video, images or Shockwave
material.  This pull down menu interface is very intuitive and
needs little explanation.

3I.2  HOT BOT COMPLEX SEARCHES

The complex search or Super Search is an expanded version of the
simple search option. The date can be made specific: before a
given date, after a given date, etc.  The word or phrase search
itself can be further broken down, and the media types included
in a search are expanded to include Java, JavaScript, Acrobat and
ActiveX, and can also include file extensions (.gif, .txt, .dbf,
etc.). All of this is accomplished through either pull down menus
or check boxes.


****
4.0  META SEARCH ENGINES

A Meta Search engine will search a number of general search 
engines at the same time from a single query.

4A  INTERNET SLEUTH  http://www.isleuth.com

Internet Sleuth is a unique search engine.  It will allow you to
search several Web search engines simultaneously, up to six
different search engines.  However, it is important to realize
that the search expression must be understood by every engine
selected, so it must use syntax common to all of them.  Simple
searches and phrase searches work best.  For example, the phrase
search "a basket of apples and oranges" is understood by most
search engines.

Internet Sleuth also allows you to use multiple search engines
for Usenet, Web Reviews, News and Headlines, Business and
Finance, and software searches.  Below is a list of some of the
search engines available for various topics.

Internet Sleuth uses a graphic interface.  The interface is self
explanatory.

4A.1  ACCESSIBLE SEARCH ENGINES

WEB SEARCH ENGINES AVAILABLE
  Lycos
  Excite
  Alta Vista
  Magellan
  Web Crawler 
  Yahoo

REVIEWED WEB SITES
  Excite Previews
  Lycos Top 5%
  Yahoo, New Listings
  Magellan Reviewed Sites

NEWS AND HEADLINES
  AP Headlines
  News Tracker
  Washington Post Headlines
  Electronic Newsstand

BUSINESS AND FINANCE
  CNN Financial News
  Business Wire
  Hoover's Company Capsules
  PR Newswire
  APL Quote Service

SOFTWARE
  Info-Mac
  shareware.com
  Winsite Windows Software

USENET NEWS
  Alta Vista Usenet News
  Deja News
  Hotbot
  Reference.com

****
4B  METACRAWLER   http://www.metacrawler.com

Metacrawler is a multiple search engine site.  MetaCrawler will
simultaneously run searches on several search engines and display
the results.  Currently, Metacrawler uses Infoseek, Excite,
Lycos, Yahoo and Alta Vista to run its simultaneous searches.

There are two search options in MetaCrawler, a standard search
page and a power search page.

STANDARD SEARCH
In the standard search you simply type in your keywords (no
special syntax, or Logical Word or Logical Symbol Operators) then
click the appropriate button if you want all words found, any of
the words found or if you want the keywords treated as a phrase. 
Click the "GO" button and MetaCrawler will process your request
through the various search engines. The results will be displayed
in the usual format.
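
The three choices (all words, any words, phrase) can be sketched in
Python as follows.  This is a simplified illustration, not
MetaCrawler's implementation; the matches function and its mode
names are assumptions made for the example.

```python
def matches(text, keywords, mode="all"):
    """Check `text` against `keywords` using one of three modes.

    mode="all"     every keyword must appear
    mode="any"     at least one keyword must appear
    mode="phrase"  the keywords must appear together, in order
    """
    words = text.lower().split()
    terms = [k.lower() for k in keywords]
    if mode == "all":
        return all(t in words for t in terms)
    if mode == "any":
        return any(t in words for t in terms)
    if mode == "phrase":
        n = len(terms)
        # Slide a window of n words across the text.
        return any(words[i:i + n] == terms
                   for i in range(len(words) - n + 1))
    raise ValueError("unknown mode: " + mode)

text = "fresh apple juice and orange slices"
print(matches(text, ["apple", "orange"], "all"))
print(matches(text, ["apple", "pear"], "any"))
print(matches(text, ["juice", "apple"], "phrase"))
```

The first two calls succeed, while the last one fails because a
phrase search also requires the words to be in the order given.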

POWER SEARCH
The power search is basically the same as the standard search. 
There are, however, several additional options included in the
power search page.  These  options allow you to decide how many
search results are to appear per page, how many results per
source, and where, geographically, the results should be
obtained.  For example: everywhere, North America, Europe, South
America, etc.

****
4C  PROFUSION   http://profusion.ittc.ukans.edu

There are two pages for the ProFusion search engine. The
first page requires the use of a browser that supports tables and
Java scripts.  The second page does not have these requirements. 
The search syntax for both pages is the same.

(Java enabled and table capable browsers)
http://profusion.ittc.ukans.edu

(other browsers)
http://profusion.ittc.ukans.edu/ProFusion1.html

ProFusion allows the user to search either the Web or the Usenet.

There are three types of searches available at this site: default,
Boolean and phrase searches.  For the sake of clarity and
uniformity, Boolean searches will be referred to as either
Logical Word or Logical Symbol searches.

A default search is nothing more than a list of multiple keywords
or a single keyword search.

A phrase search is any search enclosed in [ " " ].  In a phrase
search all the words must appear together exactly as they appear
in the " " marks.

Logical Word or Logical Symbol expressions allow for greater
versatility in the query.  The Logical Word and Logical Symbol
expressions used at ProFusion are the same as those used at Alta
Vista, and their meaning and use are identical.

ProFusion uses the following search engines.  The number of
engines and the choice of engines are left up to the user.  When
ProFusion returns the results it will delete any duplication
among the selected search engines.

Alta Vista
Excite
Lycos
Open Text
Yahoo
Infoseek
Hot Bot
Magellan.
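
The removal of duplicates can be sketched in Python as follows.
This is a minimal illustration of merging result lists, not
ProFusion's actual method.  The merge_results function, the engine
labels, the example URLs and the crude URL normalization are all
assumptions made for the example.

```python
def merge_results(results_by_engine):
    """Merge per-engine result lists, keeping the first copy of each URL."""
    seen = set()
    merged = []
    for engine, urls in results_by_engine.items():
        for url in urls:
            # Crude normalization (an assumption): ignore case and a
            # trailing slash when comparing URLs.
            key = url.lower().rstrip("/")
            if key not in seen:
                seen.add(key)
                merged.append((url, engine))
    return merged

results = {
    "Engine A": ["http://www.example.com/apples",
                 "http://www.example.org/"],
    "Engine B": ["http://www.example.org",
                 "http://www.example.net/pears"],
}
for url, engine in merge_results(results):
    print(engine, url)
```

The page at www.example.org is reported by both engines but is
listed only once, credited to the first engine that returned it.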

*****
5.0  SPECIALIZED SEARCH ENGINES

This section covers search engines that seek out specific types of
information from the web and internet: medical, legal, etc.  As
the web becomes more complex, search engines of this type will
become more commonplace.  Most of these engines work in a fashion
similar to the general search engines, and usually the search
syntax is not as complicated as that of the general engines.

This FAQ provides general information on these sites and an
explanation of the search syntax when necessary.  In most cases
the search syntax utilizes simple logical word or logical symbol
expressions.

5.1  

5.2  Internet Legal 
     http://www.ilrg.com/ 
"...Internet Legal Resource Guide. A categorized index of 3100
select web sites in 238 nations, islands, and territories, as
well as more than 850 locally stored web pages and other files,
this site was established to serve as a comprehensive resource of
the information available on the Internet concerning law and the
legal profession ... Designed for everyone...it is quality
controlled to include only the most substantive legal resources
online."

5.3  Newswise (Medical Research) 
     http://www.newswise.com/search-1.htm 

5.4  Satellite Ency.
     http://www.tele-satellit.com/cgi-bin/local_search 

5.5  Sydney Math Search 
     http://www.maths.usyd.edu.au:8000/MathSearch.html 
Provides a search ability for over 90,000 documents on mathematics
and statistics around the web.  Most of the documents relate to
research or university level mathematics.  Search instructions are
very easy and the site is usable with both a text based browser
and the usual graphics browser.

5.6  U.S. Business Advisor 
     http://bacchus.fedworld.gov/Search_Online.html 
A web site that provides searching capability for business
information from USA Federal sites. It indexes the contents of
over half a million government sites and notes those sites that
contain information of value to business.  The search expression
is a plain language or natural language query. The site provides
access to a text only version.


****
7.0  SUBJECT TREES

Subject trees are not search engines.  Subject trees are pages
where web sites and sources of information are arranged according
to subject. For example, the subject heading "History" might lead
to subsections: "American," "European," "African," "Asian," each
subsection listing an appropriate list of general web sites. 
Following the "American" link might lead to even more web sites
also sorted by specific headings: "American Revolution,"
"American Civil War," "Mexican-American War," etc.  Each section
leads to additional web sites, and each section is again broken
down into more specific headings.  For example, "American Civil
War" might lead to subheadings: "Union Forces," "Naval Battles,"
etc., each subsection with appropriate web site listings.
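
The structure of a subject tree can be pictured as nested headings.
The Python sketch below models part of the example above with
nested dictionaries.  The subheadings function is an assumption
made for illustration, and the web site listings that a real
subject tree attaches to each heading are left out.

```python
# Each heading maps to a dictionary of its subheadings.
subject_tree = {
    "History": {
        "American": {
            "American Revolution": {},
            "American Civil War": {
                "Union Forces": {},
                "Naval Battles": {},
            },
            "Mexican-American War": {},
        },
        "European": {},
        "African": {},
    }
}

def subheadings(tree, path):
    """Follow `path` down the tree and list the headings found there."""
    node = tree
    for heading in path:
        node = node[heading]
    return sorted(node)

print(subheadings(subject_tree,
                  ["History", "American", "American Civil War"]))
```

Browsing a subject tree amounts to following one such path, one
heading at a time, until the listings are specific enough.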

Two of the best general subject trees around are:

BUBL at http://bubl.ac.uk

Berkeley Subject Tree at
http://sunsite.berkeley.edu/InternetIndex.html

These sites are worth a first visit when beginning any net
research project.

****
8.0  REFERENCE CARD

NOTE: This reference card is designed on the assumption that you
have a basic understanding of the search expressions and criteria
covered in prior sections of this FAQ.

The square brackets [ ] in the reference card are not part of the
query syntax.  

****
8.1  ALTA VISTA  http://www.altavista.digital.com

[apples "orange juice"]     "apples" or the phrase "orange juice"
[+apples -"orange juice"]   "apples" & not the phrase
"orange juice"
[app* (wildcard)]           "apples", "applets", "appraise"
(wildcard in Alta Vista requires Min. of three letters before the
wildcard and will return from 0-5 characters Max.)
Complex Searches (Can use either logical word or symbol
expressions)
AND or &, OR or |, NOT or !, NEAR or ~
[apple AND orange]          "apple" & the word "orange"
[apple OR orange]           "apple" or the word "orange"
[apples NOT oranges]        "apples" but not the word "oranges"
[apple NEAR juice]          "juice" within ten words of "apple"
 
RESTRICTING A SIMPLE AND COMPLEX SEARCH
[anchor:click-here]         pages with "click-here" in the
hyperlink.
[applet:<java class>]       pages with the Java class in the
applet tag
[domain:xyz]                pages in the domain "xyz"
[host:xyz.com]              sites at the host name xyz.com.
[image:a.jpg]               sites with an image tag, "a.jpg".
[link:xyz.com]              sites with a link to xyz.com.
[text:orange]               sites with "orange" in the visible
text
[title:"A, B and C"]        sites with "A, B and C" in the title.

RANKING
Simple searches: The ranking is automatic.

Complex searches: Enter any word or groups of words in the
ranking window. Alta Vista will sort the results based on these
words.
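
To make the + and - notation of the simple search concrete, the
Python sketch below splits such a query into required, excluded
and optional terms.  It is an illustration only and not Alta
Vista's parser; the parse_simple_query function and its result
layout are assumptions made for the example.

```python
import re

# An optional + or - sign, followed by a quoted phrase or a single word.
TOKEN = re.compile(r'([+-]?)(?:"([^"]*)"|(\S+))')

def parse_simple_query(query):
    """Split a +/- style query into required, excluded and optional terms."""
    parsed = {"required": [], "excluded": [], "optional": []}
    for sign, phrase, word in TOKEN.findall(query):
        term = phrase or word
        if sign == "+":
            parsed["required"].append(term)
        elif sign == "-":
            parsed["excluded"].append(term)
        else:
            parsed["optional"].append(term)
    return parsed

print(parse_simple_query('+apples -"orange juice" pears'))
```

Here "apples" is required, the phrase "orange juice" is excluded,
and "pears" is optional.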


****
8.2  EXCITE  http://www.excite.com

Concept Based Search
[+apples +pears]            "apples" and "pears" 
[-apples +peach]            "peach" but not "apples" 
[+apples -pears -berries]   "apples" but not "pears" or
"berries"

Exact match queries use Logical Word Expressions to find Web
documents.  The Logical Word Operators are: AND, OR, AND NOT. 
Using logical word expressions will turn off Excite's concept
based option.  Precise searches require the use of Logical Word
Operators.

[apples AND peaches]         pages with "apples" and "peaches"
[apples OR peaches]          pages with either "apples" or
"peaches"
[apples AND NOT peaches]     pages with "apples" but not
with "peaches"
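
The three Logical Word Operators correspond to simple set
operations on the lists of pages that contain each word.  The
Python sketch below illustrates this.  It is not how Excite is
implemented; the tiny index of page numbers and the boolean_search
function are assumptions made for the example.

```python
# Hypothetical index: each word maps to the set of pages containing it.
index = {
    "apples":  {1, 2, 3},
    "peaches": {2, 4},
}

def boolean_search(index, left, operator, right):
    """Combine the page sets for two words with AND, OR or AND NOT."""
    a = index.get(left, set())
    b = index.get(right, set())
    if operator == "AND":
        return a & b        # pages containing both words
    if operator == "OR":
        return a | b        # pages containing either word
    if operator == "AND NOT":
        return a - b        # pages with the first word but not the second
    raise ValueError("unknown operator: " + operator)

for op in ("AND", "OR", "AND NOT"):
    print(op, sorted(boolean_search(index, "apples", op, "peaches")))
```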

****
8.3  LYCOS  http://www.lycos.com

STANDARD SEARCH
Standard searches do not use logical word operators.  
[apples oranges peaches]    pages where any of the words appear
[apples +berries]           "apples" and "berries"
[apples -berries]           "apples" but not "berries" 
[app$ (wildcard)]           "apples", "applets" etc.
[apple.]                    "apple" but not the word "apples"

CUSTOM SEARCHES

Complex searches are done through an intuitive menu interface.  

****
8.4  WEBCRAWLER  http://www.webcrawler.com

[apples oranges or apples OR oranges]  pages that contain any of
the words.
[apples AND oranges]        "apples" and "oranges"
[fruit NOT apples]          "fruit" but not "apples"
[cheese NEAR/(x) wine]      "wine" is within "x" words of
"cheese"
[world ADJ war]             "world" & "war" are next to each
other
[".. "  Phrases searches]   "us army", "jack and jill went up the
hill"
[(..)]                       used to organize search expressions
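
The two proximity operators can be sketched in Python as follows.
This is an illustration of the idea only, not WebCrawler's code.
The near and adj functions are assumptions made for the example,
and adj assumes that the words must appear in the order given.

```python
def positions(words, term):
    """Return every position at which `term` occurs in `words`."""
    return [i for i, w in enumerate(words) if w == term]

def near(text, a, b, distance):
    """True if `a` and `b` occur within `distance` words of each other."""
    words = text.lower().split()
    return any(abs(i - j) <= distance
               for i in positions(words, a)
               for j in positions(words, b))

def adj(text, a, b):
    """True if `a` is immediately followed by `b` (order is assumed)."""
    words = text.lower().split()
    return any(j - i == 1
               for i in positions(words, a)
               for j in positions(words, b))

text = ("the second world war ended and a cheese plate "
        "arrived with red wine")
print(near(text, "cheese", "wine", 5))
print(near(text, "cheese", "wine", 4))
print(adj(text, "world", "war"))
```

In the sample text "wine" is five words after "cheese", so the
first call succeeds and the second, with a limit of four, fails.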

****
8.5  YAHOO  http://www.yahoo.com

Advanced Options:

[apples +oranges]           "apples" as well as "oranges" 
[apples -oranges]           "apples" but not with "oranges".
[t:]                        confines the search to certain Web
titles.
[u:]                        confines the search to certain URLs.
[" "] phrase operator       "orange juice", "apple juice", etc. 
[pea*  (wildcard)]          "pears", "peas", "peaches" etc. 

****
8.6  INFOSEEK  http://www.infoseek.com

Simple Searches

[apples oranges]           either "apples" or "oranges".
[+apples oranges]          "apples", pages with "oranges" are
ranked lower.
["apple juice"]            "apple" and "juice" appear next to
each other.

Caps are used to indicate proper names and a case sensitive
search:
[Johnny Appleseed]         will find the name "Johnny Appleseed".
[Johnny,Appleseed]         will find either name.  
Note: commas are only used to separate names.

[apples -grapes]           "apples" but not "grapes".

Complex Searches

[fruit | apple | juice]    will find "fruit" then search results
for "apple" then search those results for "juice".
[title:fruit]              "fruit" in the title of the page.
[url:www.orange.com]       sites with address "www.orange.com".  
[url:fruit]                sites with "fruit" in the URL,
"www.fruit.com" or "www.fruitandnuts.com".
[link:www.juice.com]       will find sites linked to the
specified URL
[site:xyz.com]             will find all sites at the specified
address.
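
The pipe operator narrows a search step by step, each term being
applied only to the results of the term before it.  The Python
sketch below illustrates this.  It is not Infoseek's
implementation; the pipe_search function and the sample pages are
assumptions made for the example.

```python
def pipe_search(pages, query):
    """Narrow `pages` one term at a time, from left to right."""
    results = list(pages)
    for term in (t.strip().lower() for t in query.split("|")):
        # Keep only the pages from the previous step that contain the term.
        results = [p for p in results if term in p.lower().split()]
    return results

pages = [
    "fruit apple juice recipes",
    "fruit apple pie",
    "fruit salad",
    "apple juice",
]
print(pipe_search(pages, "fruit | apple | juice"))
```

The page "apple juice" is not returned: it contains the last two
terms but was already dropped at the first step, "fruit".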

****
9.0 Partial List of Select Search Engines


9.1 General Search Engines

Alta Vista at http://www.altavista.digital.com/ 
AT1 at http://www.at1.com/ 
Excite at http://www.excite.com/ 
Galaxy at http://www.einet.net/search.html 
Go2.com at http://www.goto.com/ 
HotBot at http://www.hotbot.com/ 
i-Explorer at http://www.i-explorer.com/home.dll?? 
Identify at http://www.identify.com/ 
Infohiway at http://www.infohiway.com/ 
Infoseek at http://guide.infoseek.com/ 
Internet Explorer at http://www.iexplorer.com/ 
Internic Directory at http://www.internic.net/dod/ 
Intuitive Web Index at http://intuitive.iexp.com/ 
Jayde at http://www.jayde.com/ 
Aliweb at http://www.nexor.com/public/aliweb/search/doc/form.html 
LEO at http://www.leo.org/cgi-bin/leo-search 
Linkcentre at http://linkcentre.com/ 
LinkMaster at http://linkmaster.com/ 
LinkMonster at http://www.linkmonster.com/ 
LinkStar at http://www.linkstar.com/home/partners/search-engines 
Lycos at http://www.lycos.com/ 
Magellan at http://www.mckinley.com/ 
Matilda at http://www.aaa.com.au/ 
Nerd World at http://www.nerdworld.com/ 
NetFind at http://www.aol.com/netfind/ 
Northern Light at http://www.northernlight.com/ 
Open Text at http://index.opentext.net/ 
REX at http://www.skyline.net/REX/ 
Tradewave Galaxy at http://galaxy.tradewave.com/ 
web://411 at http://www.sserv.com/web411/ 
WebCrawler at http://www.webcrawler.com/ 
Websitez at http://www.websitez.com/ 
What-U-Seek at http://www.whatuseek.com/ 
WWWWorm at http://wwwmcb.cs.colorado.edu/wwww.html 
Yahoo at http://www.yahoo.com/ 

9.2 Meta Search Engines

http://www.all4one.com/ All4One
http://www.cyber411.com/ Cyber411
http://www.dogpile.com/ Dogpile
http://www.w3com.com/fsearch/ FrameSearch
http://www.highway61.com/ Highway 61
http://m5.inference.com/ifind/ i-Find
http://www.isleuth.com/ Internet Sleuth
http://www.mamma.com/ Mamma
http://www.metacrawler.com/ MetaCrawler
http://metasearch.com/ MetaSearch
http://www.cosmix.com/motherload/insane/ Mother Load Insane Search
http://www.primecomputing.com/pssearch.htm Prime Search
http://www.designlab.ukans.edu/profusion/ Pro-Fusion
http://guaraldi.cs.colostate.edu:2000/form Savvy Search
http://search.onramp.net/ Search.onramp.net
http://www.he.net/~kamus/use2en.htm Use It!       

9.3 Geo Specific Search Engines

http://www.countries.com/index.shtml Countries.com
http://www.arab.net/search/welcome.html Arab.net
http://www.samilan.com/ South Asian Internet Resources
http://www.intercom.com.au/wombat/ Web Wombat --Australian
http://www.argos.com.br/ Argos --Brazil
http://www.cade.com.br/ Cade --Brazil
http://www.radaruol.com.br/index.html Radar --Brazil
http://canada411.sympatico.ca/ Canada 411
http://www.canlinks.net/ CANLinks --Can.
http://maplesquare.com/ Maple Square --Can.
http://www.chilnet.cl/buscai.htm? ChilNet --Chile
http://www.euroferret.com/ Euroferret
http://www.god.co.uk/ G.O.D. --Europe
http://lokace.iplus.fr/ Lokace --Fr.
http://vroom.web.de/ web.de --Ger.
http://www.genius.net/indolink/ INDOLink --India
http://www.arianna.it/ Arianna --It.
http://www.keycomm.it/ricerche.htm Ricerche --It.
http://www.ipoline.com/~man/jpsearch.htm Japan Super Search
http://senrigan.ascii.co.jp/index-e.html Senrigan --Japan
http://simmany.hnc.net/ Simmany --Korea
http://www.nois.nl/nlurl2/ NL-URL --Dutch
http://www.zoek.nl/ Zoek --Dutch
http://accessnz.co.nz/ Access New Zealand
http://nzexplorer.co.nz/ NZExplorer --New Zealand
http://www.aeiou.pt/ aeiou --Port.
http://www.cusco.viatecla.pt/ Cusco --Port.
http://www.sapo.pt/ Sapo --Port.
http://scotland.org/ Scotland.org
http://www.ananzi.co.za/ Ananzi --So. Africa
http://charybdis.marques.co.za/zaworm.htm ZA Worm --So. Africa
http://www.elcano.com/ Elcano --Sp.
http://www.search.ch/ Swiss Search
http://www.ipoline.com/~man/twsearch.htm Taiwan Super Search


****
10.0  Contact Information

Corrections, additions or comments can be sent to:

Ken Bogucki
krb@infobasic.com      

http://www.infobasic.com/pageone.htm  

END WISE FAQ (c)
=========================




 



Send corrections/additions to the FAQ Maintainer:
kenbog@netcom.com (Ken Bogucki)





Last Update March 27 2014 @ 02:12 PM