S.O.S, hints and examples

Single word

The single word version of S.O.S is written in C and uses an NDBM database.

Substitutions
The special character tokens &ouml;, &Ouml;, &aring;, &Aring;, &auml; and &Auml; are replaced with the corresponding characters (ö, Ö, å, Å, ä and Ä) before the word is added to the database.

Other characters like -,!:() are removed from the word before it is added, and the word is split where they occur.
For example, the word "Mac-lab!" causes the single words "MAC" and "LAB" to be inserted separately.
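
The substitutions above can be sketched as follows. This is only an illustration in Python, not the C code of S.O.S itself. The function name index_keys is made up, and the sketch assumes that the listed punctuation characters act as word separators and that keys are stored in upper case, as the "MAC" and "LAB" example suggests.

```python
import re

# Entity tokens named in the text, mapped to the characters they stand for.
ENTITIES = {
    "&ouml;": "ö", "&Ouml;": "Ö",
    "&aring;": "å", "&Aring;": "Å",
    "&auml;": "ä", "&Auml;": "Ä",
}

# Assumption: the characters listed in the text act as word separators.
SEPARATORS = re.compile(r"[-,!:()]+")


def index_keys(word):
    """Return the keys that would be stored for one input word (hypothetical helper)."""
    for token, char in ENTITIES.items():
        word = word.replace(token, char)
    parts = SEPARATORS.split(word)
    # Assumption: upper-casing the key is how caseless matching is achieved.
    return [part.upper() for part in parts if part]


print(index_keys("Mac-lab!"))   # ['MAC', 'LAB']
print(index_keys("&ouml;l"))    # ['ÖL']
```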

Important!

The program performs a caseless, exact match on single words.
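
What "caseless exact match" means in practice can be shown with a small sketch. A Python dictionary stands in for the NDBM database here. The names add_word and lookup and the file names are invented for the illustration.

```python
# Hypothetical in-memory index standing in for the NDBM database:
# upper-cased word -> list of documents containing it.
index = {}


def add_word(word, document):
    index.setdefault(word.upper(), []).append(document)


def lookup(query):
    # Exact match on the whole key: no prefixes, no substrings.
    return index.get(query.upper(), [])


add_word("LAB", "maclab.html")
add_word("LABORATION", "course.html")

print(lookup("lab"))   # ['maclab.html']
print(lookup("la"))    # []
```

A query only finds documents whose stored word is identical to it apart from case, so "la" finds nothing even though two stored words begin with those letters.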

An average search takes about 0.3 seconds, excluding transfer time.


Perl regular expression

The Perl regular expression version is written in Perl.

Substitutions
> and < are replaced by &gt; and &lt;

Characters like ö are substituted so that they match both the character itself and the standard HTML token. For example, the search string öl would be replaced with öl|&ouml;l .
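
A rough sketch of these two substitutions, written in Python rather than the Perl of S.O.S itself. The function name expand_pattern and the exact set of letters are assumptions. The sketch also wraps each letter in its own group, (?:ö|&ouml;), instead of duplicating the whole string as in the example above. For a single special letter the two forms match the same text.

```python
import re

# Assumption: these six letters are the ones that get expanded.
HTML_TOKENS = {
    "å": "&aring;", "ä": "&auml;", "ö": "&ouml;",
    "Å": "&Aring;", "Ä": "&Auml;", "Ö": "&Ouml;",
}


def expand_pattern(pattern):
    """Rewrite a search pattern so it also matches HTML-encoded text (hypothetical helper)."""
    # First substitution: angle brackets become their HTML tokens.
    pattern = pattern.replace("<", "&lt;").replace(">", "&gt;")
    # Second substitution: each special letter matches itself or its token.
    out = []
    for ch in pattern:
        if ch in HTML_TOKENS:
            out.append("(?:" + ch + "|" + HTML_TOKENS[ch] + ")")
        else:
            out.append(ch)
    return "".join(out)


expanded = expand_pattern("öl")
print(expanded)                                               # (?:ö|&ouml;)l
print(bool(re.search(expanded, "&ouml;l", re.IGNORECASE)))    # True
print(bool(re.search(expanded, "öl", re.IGNORECASE)))         # True
```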

Important!
The use of characters like ö can slow the search down dramatically and should be avoided if possible.

After finishing all the substitutions, S.O.S makes a caseless search in all the HTML files. To find the documents, the program uses a special index file that keeps track of the many documents located on our server. This index file is updated daily. However, any change made to an existing file takes effect immediately.
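
The search step might look roughly like the sketch below. It is a Python illustration, not the actual Perl program. It assumes the index file is a plain list of file paths, one per line, and the name search_documents is invented. Because each file is read at search time, an edit to a listed file shows up at once, while a brand-new file is found only after the index file has been updated.

```python
import os
import re
import tempfile


def search_documents(index_path, pattern):
    """Return the listed files whose contents match the pattern, ignoring case."""
    regex = re.compile(pattern, re.IGNORECASE)
    hits = []
    with open(index_path, encoding="utf-8") as index_file:
        for line in index_file:
            path = line.strip()
            if not path:
                continue
            try:
                with open(path, encoding="utf-8", errors="replace") as doc:
                    if regex.search(doc.read()):
                        hits.append(path)
            except OSError:
                # A listed file may have been removed since the last index update.
                continue
    return hits


# Small demonstration with two throwaway files and a throwaway index file.
with tempfile.TemporaryDirectory() as tmp:
    first = os.path.join(tmp, "a.html")
    second = os.path.join(tmp, "b.html")
    with open(first, "w", encoding="utf-8") as f:
        f.write("<h1>Operativsystem</h1>")
    with open(second, "w", encoding="utf-8") as f:
        f.write("<h1>Compilers</h1>")
    index_path = os.path.join(tmp, "index.txt")
    with open(index_path, "w", encoding="utf-8") as f:
        f.write(first + "\n" + second + "\n")
    found = [os.path.basename(p)
             for p in search_documents(index_path, "operativsystem")]

print(found)   # ['a.html']
```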

An average search takes about 15-50 seconds, depending on the current load of the system and the structure of the regular expression.


Examples


Suppose you wish to find every document containing the word "operativsystem".



Are there any links to Anders Wilhelms homepage?




Are there any links to "wwwtdb" without the ".cs.umu.se"? (These links do not work outside the ".cs" domain, so if any of your documents come up on this list, please correct them.)
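
One way to express this last search as a regular expression is a negative lookahead, sketched here in Python. Whether the Perl version behind S.O.S accepts the (?!...) construct is an assumption (Perl 5 does). The two sample links are made up for the illustration.

```python
import re

# Matches "wwwtdb" only when it is NOT immediately followed by ".cs.umu.se".
SHORT_HOST = re.compile(r"wwwtdb(?!\.cs\.umu\.se)", re.IGNORECASE)

# Made-up sample links, one short and one fully qualified.
short_link = '<a href="http://wwwtdb/courses/">'
full_link = '<a href="http://wwwtdb.cs.umu.se/courses/">'

print(bool(SHORT_HOST.search(short_link)))   # True
print(bool(SHORT_HOST.search(full_link)))    # False
```

Only the first link would appear in the result list, which is the kind of link that needs correcting.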
