Summary:
File sizes and load times of Findfast vs. full-text search are calculated to
show what makes Findfast superior in terms of speed and bandwidth.
Over DSL at 3 Mbps, even a full-text program searching an inverted index
the size of the database is reasonably fast. However, there are viewers with
analogue connections surfing at 14.4 or 28.8 kbps. A file that loads
in 1 second over DSL needs about three and a half minutes at 14.4 kbps.
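The gap between the two connection types can be checked with simple arithmetic. The sketch below assumes nominal line rates of 3 Mbps for DSL and 14.4 kbps (kilobits per second) for the analogue modem, and ignores protocol overhead, so real transfers will be slower. The function and constant names are illustrative only.

```python
def transfer_seconds(size_bytes: int, rate_bits_per_second: float) -> float:
    """Ideal transfer time: payload bits divided by the line rate (no overhead)."""
    return size_bytes * 8 / rate_bits_per_second


DSL_BPS = 3_000_000   # 3 Mbps, as stated in the text
MODEM_BPS = 14_400    # 14.4 kbps analogue modem, nominal rate (assumption)

# A payload that takes exactly one second over the DSL line.
one_second_payload = DSL_BPS // 8  # size in bytes

slowdown = DSL_BPS / MODEM_BPS
modem_seconds = transfer_seconds(one_second_payload, MODEM_BPS)

print(f"slowdown factor: {slowdown:.0f}x")
print(f"time over the modem: {modem_seconds:.0f} s ({modem_seconds / 60:.1f} min)")
```

At nominal rates the modem is roughly 200 times slower than DSL, which is why file size matters so much more to viewers on analogue lines.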
Many export managers are not aware that prospective customers in some
developing countries who are willing to order online cannot do so, because the
next power cut will happen long before the first full-text search operation in
a large database is completed.
Calculating load times is a complicated process.
The figures presented here were obtained with the load time calculator of
Microsoft FrontPage.
On Findfast startup at 14.4 kbps, the slowest analogue connection, the
browser loads:
File to load             Size     Load time
HTML file with images    7 kB     41.2 seconds
Program                  24 kB    24.8 seconds
Database                 48 kB    50.5 seconds
By experimenting with databases of different sizes it was found that every
100 additional listings of the type currently in use add 12.625 seconds to the
load time. An accommodation directory of this type with 10,000 listings would
require a database in the current format that loads in 5 minutes 16 seconds at
14.4 kbps, 1 minute 9 seconds over ISDN, or 2 seconds over DSL.
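The per-listing figure can be turned into a simple estimator. The sketch below assumes load time grows strictly linearly with the number of listings at 12.625 seconds per 100 listings on the slowest connection; the function names are illustrative and not part of Findfast.

```python
SECONDS_PER_100_LISTINGS = 12.625  # measured figure quoted in the text


def database_load_seconds(listings: int) -> float:
    """Estimated database load time, assuming strictly linear growth."""
    return listings / 100 * SECONDS_PER_100_LISTINGS


def format_duration(seconds: float) -> str:
    """Render a duration as whole minutes plus seconds with one decimal."""
    minutes = int(seconds // 60)
    return f"{minutes} min {seconds - minutes * 60:.1f} s"


for count in (400, 1_000, 10_000):
    print(f"{count:>6} listings: {format_duration(database_load_seconds(count))}")
```

For a directory of 400 listings this model gives 50.5 seconds, the same value the table lists for the database. Strict linear scaling to 10,000 listings yields about 21 minutes at that speed, so the shorter figures quoted above presumably rest on assumptions that are not stated in the text.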
Once the database has been loaded it remains available in a cache for the
duration of the session. The program can detect the viewer's Internet access
speed. When a viewer with slow Internet access loads the database, Findfast
recommends downloading the database as a .ZIP file to be installed for future
use. If the user accepts, the browser will not load the database again. Instead
the locally installed copy will be used.
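How the speed detection works is not described here. One plausible approach, sketched below, is to time a download of known size and compare the measured throughput with a cut-off. The 56 kbps threshold, the class, and the function name are all assumptions made for illustration.

```python
from dataclasses import dataclass

SLOW_LINK_THRESHOLD_BPS = 56_000  # hypothetical cut-off, not taken from Findfast


@dataclass
class SpeedSample:
    """One timed download: how many bytes arrived and how long it took."""
    bytes_received: int
    elapsed_seconds: float

    @property
    def bits_per_second(self) -> float:
        if self.elapsed_seconds <= 0:
            raise ValueError("elapsed_seconds must be positive")
        return self.bytes_received * 8 / self.elapsed_seconds


def should_offer_zip_download(sample: SpeedSample,
                              threshold_bps: float = SLOW_LINK_THRESHOLD_BPS) -> bool:
    """Recommend the offline .ZIP copy when the measured link is slow."""
    return sample.bits_per_second < threshold_bps


# The 48 kB database arriving in 50.5 s (modem) versus 0.2 s (fast line).
modem = SpeedSample(bytes_received=48_000, elapsed_seconds=50.5)
fast_line = SpeedSample(bytes_received=48_000, elapsed_seconds=0.2)

print(should_offer_zip_download(modem))
print(should_offer_zip_download(fast_line))
```

Measuring the database download itself avoids a separate test transfer, at the cost of only knowing the speed after the first load has finished.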
Once the database is loaded, Findfast's operating speed in the first
phase depends solely on the viewer's input speed. All steps, from selecting a
search option to the start of the listing display, run on the viewer's own
computer.
The next phase is the display of the listings. The menu offers one button to
click for every group of listings. Associated with each button is a string of
text file names copied from the 3-dimensional array when the search was
performed. When one of the buttons is clicked, a dynamic HTML page is loaded
into which the text files, each one an HTML table, are copied. The page is 4 kB
in size and loads at 14.4 kbps in 45 seconds. Populating it with 10 listings
takes about a minute at that speed.
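The display step can be pictured roughly as follows. This is a minimal sketch, not Findfast's actual code: the button identifiers, file names, and page template are invented, and an in-memory dictionary stands in for the text files that would be fetched from the server.

```python
# Hypothetical data: each menu button carries a space-separated string of
# listing file names, collected when the search was performed.
BUTTON_FILES = {
    "group-1": "hotel_a.txt hotel_b.txt",
    "group-2": "hotel_c.txt",
}

# Stand-in for the text files on the server; each holds one HTML table.
LISTING_FILES = {
    "hotel_a.txt": "<table><tr><td>Hotel A</td></tr></table>",
    "hotel_b.txt": "<table><tr><td>Hotel B</td></tr></table>",
    "hotel_c.txt": "<table><tr><td>Hotel C</td></tr></table>",
}

PAGE_TEMPLATE = "<html><body>\n{listings}\n</body></html>"


def build_listing_page(button_id: str) -> str:
    """Copy the tables named by the clicked button into the display page."""
    file_names = BUTTON_FILES[button_id].split()
    tables = [LISTING_FILES[name] for name in file_names]
    return PAGE_TEMPLATE.format(listings="\n".join(tables))


print(build_listing_page("group-1"))
```

Only the files named by the clicked button are touched, so the cost of displaying a group depends on the group size and not on the size of the whole directory.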
In the accommodation directory where Findfast is implemented, 400 listings
(1.2 MB in total) contain, in condensed form, the essential information of 400
dedicated pages, the documents to be retrieved (3.5 MB in total). Apart from
names and addresses, the information is expressed by icons.
10,000 Findfast listings would have a total size of 30 MB.
10,000 dedicated pages (documents) would have a total size of 87.5 MB.
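Both totals follow from scaling the current averages. The short sketch below reproduces the arithmetic, assuming the average size per listing and per page stays the same as the directory grows; the names are illustrative.

```python
CURRENT_COUNT = 400
LISTINGS_TOTAL_MB = 1.2   # 400 Findfast listings
PAGES_TOTAL_MB = 3.5      # 400 dedicated pages (documents)


def scaled_total_mb(total_mb: float, current_count: int, target_count: int) -> float:
    """Scale a total size linearly, keeping the average item size constant."""
    return total_mb * target_count / current_count


listings_10k = scaled_total_mb(LISTINGS_TOTAL_MB, CURRENT_COUNT, 10_000)
pages_10k = scaled_total_mb(PAGES_TOTAL_MB, CURRENT_COUNT, 10_000)

print(f"average listing: {LISTINGS_TOTAL_MB * 1000 / CURRENT_COUNT:.2f} kB")
print(f"average page:    {PAGES_TOTAL_MB * 1000 / CURRENT_COUNT:.2f} kB")
print(f"10,000 listings: {listings_10k:.1f} MB")
print(f"10,000 pages:    {pages_10k:.1f} MB")
```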
The inverted index of a database with 87.5 MB of text would be smaller than 87.5
MB because stopwords would not be indexed. Assuming that 87.5 MB of text requires
a 30 MB inverted index, Findfast and full-text search speed can be compared.
To perform a full-text search on an 87.5 MB database, a 30 MB index would
have to be saved on the viewer's hard disk. A viewer who wants to find only a
few kB of text would have to wait until the inverted index is downloaded (if
there is sufficient free disk space).
The same viewer searching a Findfast-managed database with 87.5 MB of text would
have to load only the 1.2 MB of management files, in most cases keeping them in
memory.
After loading the files, a search in a database with 10,000 documents would not
take noticeably more time than a search in a database with 400 documents.
Displaying a group of 10 of 10,000 listings cannot take more time than loading
10 of 400 because listings not viewed are not stored on the viewer's computer.
To the viewer listings not viewed and documents not accessed simply do not
exist.
Full-text search is more like a lottery: a keyword match is no guarantee that a
document found has anything to do with the viewer's scope of interest. Entering
too many keywords can exclude the very documents the viewer wanted to find.
Full-text search was developed long before the dawn of the Internet, running
under the CP/M operating system to retrieve text from a CD-ROM. After 30 years
it is time to say bye-bye to good old full-text search.
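The keyword problem described above can be reproduced with a toy inverted index. The sketch below is a generic illustration, not the code of any particular full-text product: the three documents and the stopword list are invented, and the search returns only documents that contain every keyword.

```python
STOPWORDS = {"a", "the", "in", "with", "and", "of", "near"}  # illustrative list

DOCUMENTS = {
    1: "Quiet hotel with a pool in the old town",
    2: "Family hotel near the beach with pool and garden",
    3: "The history of hotel pool design",
}


def build_inverted_index(documents: dict[int, str]) -> dict[str, set[int]]:
    """Map every non-stopword to the set of documents that contain it."""
    index: dict[str, set[int]] = {}
    for doc_id, text in documents.items():
        for word in text.lower().split():
            if word not in STOPWORDS:
                index.setdefault(word, set()).add(doc_id)
    return index


def search_all(index: dict[str, set[int]], query: str) -> set[int]:
    """Return the documents that contain every non-stopword keyword."""
    keywords = [w for w in query.lower().split() if w not in STOPWORDS]
    if not keywords:
        return set()
    result = set(index.get(keywords[0], set()))
    for word in keywords[1:]:
        result &= index.get(word, set())
    return result


INDEX = build_inverted_index(DOCUMENTS)

# Two keywords also match document 3, which is not an accommodation at all.
print(search_all(INDEX, "hotel pool"))
# Four keywords exclude both hotels, because neither contains all of them.
print(search_all(INDEX, "hotel pool beach quiet"))
```

The first query returns an irrelevant document alongside the two hotels, and the second returns nothing, although each hotel matches three of the four keywords.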
A suggestion for a new product that Findfast makes possible:
Classified telephone directories (Yellow Pages) have an established position
in the market. To avoid the necessity of loading huge index files, their
publishers cut them into relatively small segments, which makes operation
unbearably slow when, for instance, a buyer is looking for manufacturers of
Blue Widgets in the entire country. The alphabetical order that was practical
when only a few companies had telephones is still being used, although it was
already obsolete 20 years ago.
What the market needs is a directory of goods and services with listings
that the advertisers themselves create. Only they know how to describe their
products. Only they know which words people use when they want to become
customers.
Such a directory would take over all the advertisers of specialized directories
on the web, because buyers will soon refuse to visit content sites that offer
nothing better than full-text search.
Sooner or later one company will fill the market niche.
Implementing Findfast does not affect appearance.
Flash, Shockwave, MP3, MIDI can be used as before.
The "look" and "feel" of the website will not
change.