#!@@PERL_BIN@@

$version = "1.0";
$ans = "";

# Ask a yes/no question until the reply starts with Y/y or N/n.
# Returns 1 for yes and 0 for no.
sub YesNo {
    $ans = "";
    ($prtext) = @_;
    while ($ans eq "") {
        print "$prtext\n";
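        $ans = <STDIN>;
        # Stop at end of input instead of re-prompting forever.
        defined($ans) || die "Unexpected end of input.\n";
        $ans =~ s,\n,,;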
        if ($ans =~ /^[Yy]/) {
            return 1;
        } elsif ($ans =~ /^[Nn]/) {
            return 0;
        } else {
            print "Please respond with either a `yes' or `no' answer.\n";
            $ans = "";
        }
    }
}

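# Prompt (if a prompt is given) until a non-empty line is read.
# The reply is left in the global $ans.
sub GetAns {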
    $ans = "";
    ($prtext) = @_;
    while ($ans eq "") {
        if ($prtext ne "") {
            print "$prtext\n";
        }
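        $ans = <STDIN>;
        # Stop at end of input instead of looping forever.
        defined($ans) || die "Unexpected end of input.\n";
        $ans =~ s,\n,,;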
    }
}

# Read one line into the global $ans.  An empty reply is allowed and
# means "use the default".
sub GetAnsDef {
    $ans = "";
    $ans = <STDIN>;
    $ans =~ s,\n,,;
}

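# Read lines until a blank line or a line holding a single `.'.
# The lines are joined with spaces and left in the global $ans.
sub GetMultiAns {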
    $ans = "";
    $foo = "";
    while ($foo eq "") {
        $foo = <STDIN>;
        $foo =~ s,\n,,;
        if ($foo ne "" && $foo ne ".") {
            $ans .= " $foo";
            $foo = "";
        } else {
            $foo = "done";
        }
    }
}

if ($#ARGV >= 0) {
    $file = $ARGV[0];
} else {
    GetAns("Please enter the name of the file to create");
    $file = $ans;
}
if (-e $file) {
    if (-w $file) {
        if (! YesNo("Do you wish to replace $file?")) {
            exit 1;
        }
    } else {
        print "$file is not writable!\n";
        exit 2;
    }
} else {
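    # Find the last '/' so the containing directory can be checked.
    # (Scanning forward with index() from the same offset never advances.)
    $old = rindex($file, '/');
    if ($old >= 0) {
        # A file directly under the root has an empty prefix; use "/".
        $dir = $old > 0 ? substr($file, 0, $old) : "/";
        if (! -w $dir) {
            die "$dir is not writable!\n";
        }
    }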
}

print << "__EOD__";
This script will create or replace a SWISH configuration file, $file.
__EOD__

if (! YesNo("Do you wish to continue?")) {
    exit;
}

if ($ENV{"LOGNAME"} ne "") {
    $login = $ENV{"LOGNAME"};
    # Anchor the pattern so only this login's own passwd entry matches.
    $user = `grep '^${login}:' /etc/passwd`;
    @stuff = split(':', $user);
    $id = "$stuff[4]  <$login\@" . `hostname -d` . ">  " . localtime(time);
    $id =~ s,\n,,;
}

print "\n";
GetAns("What is the short title for this index?");
$short_title = $ans;

$data = 
"# SWISH configuration file
# $id
# AutoMagic $short_title
# Lines beginning with hash marks (#) and blank lines are ignored.
# Generated by mkswishconf $version

#--- Index definition variables

";

print << "__EOD__";
First we need to set up the variables that affect indexing and searching.

Enter the list of space-separated files and/or directories covered
by this index.  You may use multiple lines; the first blank line
ends the list.  If this is left blank, indexing will require a list
of directories and/or files to index on the swish command line.
__EOD__
GetMultiAns();
$data .= "
IndexDir $ans
# This is a space-separated list of files and directories to be
# recursively indexed. You can specify more than one of these directives.

";


print << "__EOD__";
Enter the path of the index to be generated.  A filename of index.swish
is suggested.  To accept index.swish as the default, simply enter a
blank line.
__EOD__
GetAnsDef();
if ($ans ne "") {
    $data .= "\nIndexFile $ans";
} else {
    $data .= "\nIndexFile index.swish";
}
$data .= "
# This is the name of the generated index file.  An extension of
# `.swish' is recommended.

";


print << "__EOD__";

Enter the verbosity level (0 - 3) for indexing.

    0 = completely silent indexing
    1 = brief summary after indexing is complete
    2 = level 1 + lists each directory as it's traversed
    3 = level 2 + lists each file as it's traversed

NOTE: swish may take a while to index large directories and/or files,
so level 0 may be disconcerting when nothing appears to be happening.
Conversely, levels 2 and 3 may be too much output when you have a lot
of files and/or directories.  You may wish to start with level 2 or 3
until you are comfortable with swish, then manually edit the configuration
file to change the value to 1 or 0.
__EOD__
GetAnsDef();
if ($ans eq 0 || $ans eq 1 || $ans eq 2 || $ans eq 3) {
    $data .= "\nIndexReport $ans";
} else {
    $data .= "\nIndexReport 3";
}
$data .= "
# Verbosity while indexing (0 = silent, 1 = minimum info,
# 2 = 1 + all directories traversed, 3 = 2 + all files indexed)

";


print << "__EOD__";

When indexing, swish ignores words that occur too frequently.  You
can define what `too frequently' means: specify a whole percentage
and a number, such as  80 256 .  (This omits words that occur in
over 80% of the files and appear in over 256 files.)  To turn off
this feature, enter a single, blank line.
__EOD__
$ans = "done";
while ($ans ne "") {
    GetAnsDef();
    if ($ans eq "") {
        $data .= "\n#IgnoreLimit 50 100";
    } elsif ($ans =~ /^\s*[0-9]+\s+[0-9]+\s*$/) {
        $data .= "\nIgnoreLimit $ans";
        $ans = "";
    } else {
        print "Invalid limits - try again!\n";
    }
}
$data .= "
# This automatically omits words that appear too often in the files
# (these words are called stopwords). Specify a whole percentage
# and a number, such as \"80 256\". This omits words that occur in
# over 80% of the files and appear in over 256 files. Comment out
# to turn off auto-stopwording.

";

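# Illustration only: a minimal sketch of the IgnoreLimit test described
# above, assuming a word is dropped only when BOTH thresholds are
# exceeded.  IsAutoStopword and its arguments are hypothetical names;
# this is not swish's own code, and nothing in this script calls it.
sub IsAutoStopword {
    my ($files_with_word, $total_files, $percent, $count) = @_;
    return 0 if $total_files <= 0;
    # Share of the indexed files that contain the word, as a percentage.
    my $share = 100 * $files_with_word / $total_files;
    return ($share > $percent && $files_with_word > $count) ? 1 : 0;
}
# Example: with limits "80 256", a word found in 900 of 1000 files
# exceeds both thresholds, so IsAutoStopword(900, 1000, 80, 256) is true.
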

print << "__EOD__";

You may also give swish a list of words to ignore when indexing.
Enter as many as you wish, separated by spaces and/or using multiple
lines.  You may also enter the word `SwishDefault' which represents
swish's built-in list of common words to ignore.  If you don't want
swish to ignore any words by default, enter a single, blank line.

NOTE: Turning this feature off means swish will attempt to index
words like `a', `an', `and', `the', etc.

Enter the words swish should ignore.
__EOD__
GetMultiAns();
if ($ans ne "") {
    $data .= "\nIgnoreWords $ans";
} else {
    $data .= "\n#IgnoreWords SwishDefault";
}
$data .= "
# The IgnoreWords option allows you to specify words to ignore.
# Comment out for no stopwords; the word \"SwishDefault\" will
# include a list of default stopwords. Words should be separated by spaces
# and may span multiple directives.
";

print << "__EOD__";

When indexing, should swish follow symbolic links?  If you are unsure,
simply enter a blank line, and the default of `no' will be used.
__EOD__
GetAnsDef();
if ($ans ne "") {
    $data .= "\nFollowSymLinks $ans";
} else {
    $data .= "\nFollowSymLinks no";
}
$data .= "
# Put `yes' to follow symbolic links in indexing, else `no'.

";


print << "__EOD__";

Enter the list of space-separated file suffixes you want indexed.
Only files with these suffixes will be indexed.  Case does not
matter.  If you want swish to index everything in the list of
directories and files from the previous question, simply enter a
blank line.
__EOD__
GetMultiAns();
if ($ans ne "") {
    $data .= "\nIndexOnly $ans";
} else {
    $data .= "\n#IndexOnly .htm .html .txt .shar .Z .z .gz .tgz .tar\n";
    $data .= "#IndexOnly .gif .jpg .jpeg .ps .xbm .xpm";
}
$data .= "
# Only files with these suffixes will be indexed.  If omitted, swish
# will index everything it can under directories named as IndexDirs.
# (case-insensitive)

";

print << "__EOD__";
Enter the list of space-separated suffixes of files for which you
want only the file name indexed.  You may use multiple lines; the
first empty line terminates the list.  To use the default list
(.shar .Z .gz .tgz .tar .gif .jpg .jpeg .pict .tiff .bmp .xbm .xpm
.ps .pdf .ram .ra .au .snd .wav .hqx .mpg .mpeg) enter a single,
blank line.

All binary files, data files, archives, compressed files, and most
PostScript files are good candidates for this.
__EOD__
GetMultiAns();
if ($ans ne "") {
    $data .= "\nNoContents $ans";
} else {
    $data .= "
NoContents .shar .Z .gz .tgz .tar .gif .jpg .jpeg .pict .tiff .bmp
NoContents .ps .pdf .ram .ra .au .snd .wav .hqx .mpg .mpeg";
}
$data .= "
# Files with these suffixes will have only their file names indexed, not
# their contents. (case-insensitive)

";


print << "__EOD__";
Now we get to a tricky part - replacement rules.  These are lists
of path components you wish to change in the index, and what you
wish to change them to, or components to prepend or append to paths
in the index.

There are three types of replacement rules - replace, append and
prepend.  You may enter any combination of rules, in any order,
one per line, terminated by a blank line.

The formats are:
    replace \"original_string\" \"string_you_want\"
    prepend \"string_to_prepend_to_paths\"
    append \"string_to_append_to_paths\"

You *must* include the double quotes!

If you enter only a single, empty line, the defaults will be used:
    replace \"/usr/local/etc/httpd/htdocs\" \"/\"
    replace \"home/$login/public_html\" \"~$login\"
   
Enter your replacement rules, followed by a blank line, or a single,
empty line to select the defaults:
__EOD__
$ans = "done";
$ans2 = "";
while ($ans ne "") {
    GetAnsDef();
    if ($ans eq "") {
        if ($ans2 eq "") {
            $ans2 = "
ReplaceRules replace \"/usr/local/etc/httpd/htdocs\" \"/\"
ReplaceRules replace \"home/$login/public_html\" \"~$login\"";
        }
        last;
    } elsif ($ans =~ /[ ]*replace[ ]*".+" ".*"/ ||
      $ans =~ /[ ]*append[ ]*".+"/ ||
      $ans =~ /[ ]*prepend[ ]*".+"/) {
        $ans2 .= "\nReplaceRules $ans";
    } else {
        print "Invalid replacement rule - try again!\n";
    }
}
$data .= "$ans2";
$data .= "
# ReplaceRules allow you to change pathnames as they are added to
# the index - usually into URLs.  Syntax of the variants is:
#   ReplaceRules replace \"original_string\" \"string_you_want\"
#   ReplaceRules prepend \"string_to_prepend_to_paths\"
#   ReplaceRules append \"string_to_append_to_paths\"

";

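# Illustration only: a minimal sketch of what the three rule types
# described above do to a path.  ApplyReplaceRule is a hypothetical
# helper, not part of swish, and nothing in this script calls it.  It
# assumes "replace" rewrites every literal occurrence of the original
# string; swish's own behavior may differ.
sub ApplyReplaceRule {
    my ($path, $type, $first, $second) = @_;
    if ($type eq "replace") {
        # \Q...\E treats the original string literally, not as a regex.
        $path =~ s/\Q$first\E/$second/g;
    } elsif ($type eq "prepend") {
        $path = $first . $path;
    } elsif ($type eq "append") {
        $path = $path . $first;
    }
    return $path;
}
# Example: ApplyReplaceRule("/docs/a.html", "prepend", "http://example.org")
# returns "http://example.org/docs/a.html".
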

print << "__EOD__";
This is the last tricky part!  Now you need to define filenames, paths,
and directory components for files which should *not* be indexed at all.
Entering a single, blank line results in swish using these defaults:

    pathname contains admin testing demo trash construction confidential
    pathname contains .swish /stats test/ /,
    filename contains .bak .orig .old old. .swish srchindex .txt
    filename is .htaccess .htpasswd .htgroup
    title contains construction

*** press [ENTER] to continue ***
__EOD__

GetAnsDef();

print << "__EOD__";
The possible file rule types are:
    pathname contains string1 string2 string3 ...
      (ignore if any part of pathname matches one of these strings)
    filename is filename
      (ignore any *exactly* matching filename)
    filename contains string1 string2 string3 ...
      (ignore if filename contains any of these strings
        (case-insensitive))
    title contains string1 string2 string3 ...
      (ignore any HTML file whose title contains any of these strings
        (case-insensitive))
    directory contains string1 string2 string3 ...
      (ignore any directory containing any of these file names
        (case-insensitive))

Enter your file rules now, one per line, terminated by a blank line,
or a blank line to accept the defaults:
__EOD__
$ans = "done";
$ans2 = "";
while ($ans ne "") {
    GetAnsDef();
    if ($ans eq "") {
        if ($ans2 eq "") {
            $ans2 = "
FileRules pathname contains admin testing demo trash construction confidential
FileRules pathname contains .swish /stats test/ /,
FileRules filename contains .bak .orig .old old. .swish srchindex .txt
FileRules filename is .htaccess .htpasswd .htgroup
FileRules title contains construction";
        }
        last;
    } elsif ($ans =~ /[ ]*pathname contains[ ]*.+/ ||
      $ans =~ /[ ]*filename is[ ]*.+/ ||
      $ans =~ /[ ]*filename contains[ ]*.+/ ||
      $ans =~ /[ ]*title contains[ ]*.+/ ||
      $ans =~ /[ ]*directory contains[ ]*.+/) {
        $ans2 .= "\nFileRules $ans";
    } else {
        print "Invalid file rule - try again!\n";
    }
}
$data .= "$ans2";
$data .= "
# File names matching these criteria will not be indexed.
# Syntax:
#   FileRules pathname contains string1 string2 string3 ...
#     (ignore if any part of pathname matches one of these strings)
#   FileRules filename is filename
#     (ignore any *exactly* matching filename)
#   FileRules filename contains string1 string2 string3 ...
#     (ignore if filename contains any of these strings
#       (case-insensitive))
#   FileRules title contains string1 string2 string3 ...
#     (ignore any HTML file whose title contains any of these strings
#       (case-insensitive))
#   FileRules directory contains string1 string2 string3 ...
#     (ignore any directory containing any of these file names
#       (case-insensitive))
";

print << "__EOD__";
Now we need to define a few informational parameters which don't
affect indexing or searching; their values are simply stored in
the configuration file and output by swish in search mode.

__EOD__
$data .= "
#--- Human-useful information to place in index files

# These are output by swish when it's used in search mode.  Front
# ends may wish to make use of them.
";

GetAns("Enter the name for this index:");
$data .= "IndexName $ans\n";

GetAns("Enter a one-line description for this index:");
$data .= "IndexDescription $ans\n";

GetAns("Enter the primary URL being indexed:");
$data .= "IndexPointer $ans\n";

GetAns("Enter an email address for the index administrator:");
$data .= "IndexAdmin $ans\n";

# Parenthesize the open() arguments so that || applies to open's result.
open(DATA, ">$file") || die "Could not open $file: $!\n";
print DATA $data;
close DATA;
