SWISH 1.2.1

Appendix B: Configuration File Example



# SWISH configuration file
# Miles O'Neal, meo@rru.com, 12/Aug/1998
# AutoMagic Miles O'Neal
# Lines beginning with hash marks (#) and blank lines are ignored.

#--- Index definition variables

IndexDir /Users/meo/public_html
# This is a space-separated list of files and directories to be
# recursively indexed. You can specify more than one of these directives.

IndexFile /Users/meo/public_html/index.swish
# This is the name of the generated index file.  An extension of
# ".swish" is recommended.

# FOR BETTER PERFORMANCE, REMOVE ANY OF THESE YOU DON'T NEED.
IndexOnly .htm .html .txt .text
IndexOnly .shar .z .gz .tgz .eps .ps .doc .xls
IndexOnly .gif .jpg .jpeg .xpm .xbm .tif .tiff .bmp .gray .miff .pcd
IndexOnly .pcx .pic .pict .png .pnm .rgb .rle .sun .tga .xwd
IndexOnly .aif .aiff .au .cdr .hcom .raw .sb .sf .smp
IndexOnly .sw .snd .ub .ul .uw .voc .wav
IndexOnly .avi .flc .fli .iff .jfif .mov .mpg .mpeg .qt .pfx
# Only files with these suffixes will be indexed; suffix matching is
# case-insensitive. If omitted, swish will index everything it can
# under the directories named in IndexDir.

IndexReport 2
# Verbosity while indexing (0 = silent, 1 = minimum info,
# 2 = 1 + all directories traversed, 3 = 2 + all files indexed)

FollowSymLinks no
# Put "yes" to follow symbolic links in indexing, else "no".

# FOR BETTER PERFORMANCE, REMOVE ANY OF THESE YOU DON'T NEED.
NoContents .shar .z .gz .tgz .eps .ps .doc .xls
NoContents .gif .jpg .jpeg .xpm .xbm .tif .tiff .bmp .gray .miff .pcd
NoContents .pcx .pic .pict .png .pnm .rgb .rle .sun .tga .xwd
NoContents .aif .aiff .au .cdr .hcom .raw .sb .sf .smp
NoContents .sw .snd .ub .ul .uw .voc .wav
NoContents .avi .flc .fli .iff .jfif .mov .mpg .mpeg .qt .pfx
# Files with these suffixes will have only their file names indexed,
# not their contents. Suffix matching is case-insensitive.

#ReplaceRules replace "/www/pages/" "/"
ReplaceRules replace "Users/meo/public_html" "~meo"
# ReplaceRules allow you to change pathnames as they are added to
# the index - usually into URLs.  Syntax of the variants is:
#   ReplaceRules replace "original_string" "string_you_want"
#   ReplaceRules prepend "string_to_prepend_to_paths"
#   ReplaceRules append "string_to_append_to_paths"
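#
# Example (an illustrative sketch, not part of the original setup):
# with the "replace" rule above, a hypothetical file
# /Users/meo/public_html/foo.html would be stored as /~meo/foo.html.
# Assuming a made-up host name www.example.com, and assuming rules are
# applied in the order they are listed, a prepend rule could then turn
# the stored path into a full URL:
#ReplaceRules prepend "http://www.example.com"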

FileRules pathname contains admin testing demo trash construction
FileRules pathname contains .swish /stats test/ confidential
FileRules filename contains .bak .orig .old old. .swish srchindex
FileRules filename contains , Archive
FileRules filename is .htaccess .htpasswd .htgroup
FileRules title contains construction
#FileRules directory contains .htaccess
# File names matching these criteria will not be indexed.
# Syntax:
#   FileRules pathname contains string1 string2 string3 ...
#     (ignore if any part of pathname matches one of these strings)
#   FileRules filename is filename
#     (ignore any *exactly* matching filename)
#   FileRules filename contains string1 string2 string3 ...
#     (ignore if filename contains any of these strings
#       (case-insensitive))
#   FileRules title contains string1 string2 string3 ...
#     (ignore any HTML file whose title contains any of these strings
#       (case-insensitive))
#   FileRules directory contains string1 string2 string3 ...
#     (ignore any directory containing any of these file names
#       (case-insensitive))
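#
# For illustration, here is how the rules above would treat a few
# hypothetical files:
#   /Users/meo/public_html/demo/index.html   skipped (pathname contains "demo")
#   /Users/meo/public_html/notes.html.bak    skipped (filename contains ".bak")
#   /Users/meo/public_html/.htaccess         skipped (filename is ".htaccess")
# Because "contains" matches any part of the name, a file such as
# democracy.html would presumably be skipped by the "demo" rule as well,
# so choose the strings with care.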

IgnoreLimit 50 400
# This automatically omits words that appear too often in the files
# (these words are called stopwords). Specify a whole percentage
# and a number, such as "80 256". This omits words that occur in
# over 80% of the files and appear in over 256 files. Comment out
# to turn off auto-stopwording.
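#
# Worked example for the setting above (IgnoreLimit 50 400), reading the
# description as requiring both conditions, with hypothetical counts:
#   1000 files, word in 450 files: kept    (45% is not over 50%)
#   1000 files, word in 600 files: omitted (60% and more than 400 files)
#    600 files, word in 350 files: kept    (58%, but not over 400 files)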

IgnoreWords SwishDefault
# The IgnoreWords option allows you to specify words to ignore.
# Comment out for no stopwords; the word "SwishDefault" will
# include a list of default stopwords. Words should be separated by spaces
# and may span multiple directives.

EmphasizeComments no
# Put "yes" to more heavily weight words in comments when indexing.
# Put "no" to treat words in comments the same as any other words.

EmphasizeMetaTags yes
# Put "yes" to more heavily weight words in META tags when indexing.
# Put "no" to treat words in META tags the same as any other words.

TitleTopLines 5
# Max. number of lines into file to scan for TITLE tag

DefaultRule and
# Default rule to apply to multiple words - can be "and" or "or"

MinWordLimit 4
MaxWordLimit 15
# These define the minimum and maximum word length.

AsciiEntities yes
# Set "yes" to convert named ASCII entities to closest ASCII equivalent
# (numeric entities will always be converted if possible)

IndexTags no
# Set this to yes if, for some bizarre reason, you want to index
# all HTML tags.

IgnoreAllV yes
# Set to yes to ignore words consisting of all vowels

IgnoreAllC yes
# Set to yes to ignore words consisting of all consonants

IgnoreAllN yes
# Set to yes to ignore words consisting entirely of digits

IgnoreRowV 3
# Max. number of vowels allowed in a row when indexing - more than this
# will get the word thrown out

IgnoreRowC 4
# Max. number of consonants allowed in a row when indexing - more than this
# will get the word thrown out

IgnoreRowN 3
# Max. number of digits allowed in a row when indexing - more than this
# will get the word thrown out

IgnoreSame 3
# Max. number of times the same character can appear adjacent to itself -
# more than this will get the word thrown out
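#
# For illustration, with the word-filter settings above these
# hypothetical words would be thrown out (a sketch based only on the
# descriptions given here, not on swish's actual output):
#   web        shorter than MinWordLimit 4
#   html       all consonants (IgnoreAllC yes)
#   1998       all digits (IgnoreAllN yes)
#   strengths  five consonants in a row, n-g-t-h-s (IgnoreRowC 4)
#   queueing   five vowels in a row, u-e-u-e-i (IgnoreRowV 3)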



#--- Human-useful information to place in index files

IndexName "The Searchable Miles"

IndexDescription "This is the definitive search guide to the web pages and other Internet resources of Miles O'Neal.  So what?"
# The description must be on a single line: no backslash or line break
# is allowed inside it.

IndexPointer "http://www.rru.com/~meo/"

IndexAdmin "Suzi Styrofoam (suzi@rru.com)"

DocUrl "http://www.netads.com/Doc/swish/"

# These are output by swish when it's used in search mode.  Front
# ends may wish to use them.

Last update: 18/Aug/1998