=pod

=head1 Migrating from Swish-e to Swish3

If you haven't already, read the L<Introduction to Swish3|swish_intro.7>
document first.

This document is intended for users already familiar with Swish-e
version 2.x who want to migrate to Swish3.

=head2 The Tool Chain

Swish3 is intended to be one part of a search system tool chain.
In this section we will look at how Swish-e implements each of the tool
chain features, and then compare it to Swish3.

=head3 Aggregator

Swish-e has two built-in aggregators, one for the filesystem and one for the web,
selected with the B<-S> flag to the B<swish-e> command. Swish-e also
has a third B<-S> option called B<prog>, which is short for C<program>.
The C<program> is an aggregator that you define. Swish-e ships with several
example aggregators, including a filesystem crawler called B<DirTree.pl>
and a web crawler called B<spider.pl>. There are also example aggregators
for pulling data from a database and for specific kinds of documents, like
Hypermail mail archives.

Swish3 has no built-in aggregators. Instead, Swish3 takes the B<-S prog> approach
of defining an API for external aggregators to follow.

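As a rough illustration of the B<-S prog> approach, the following shell
sketch acts as a minimal external aggregator for Swish-e 2.x. It writes each
file to standard output, preceded by the C<Path-Name> and C<Content-Length>
headers that B<swish-e -S prog> reads from the program. The function name
C<emit_doc> and the F<docs> directory are made up for this example, and the
Swish3 aggregator API may differ in its details, so treat this as a sketch of
the idea rather than a reference implementation.

```shell

    # Hypothetical helper: write one document in the header-plus-body
    # form that swish-e -S prog reads from an external program.
    emit_doc() {
        path=$1
        # Arithmetic expansion strips any padding wc adds to the count.
        size=$(( $(wc -c < "$path") ))
        printf 'Path-Name: %s\n' "$path"
        printf 'Content-Length: %s\n' "$size"
        printf '\n'    # a blank line ends the headers
        cat "$path"
    }

    # Emit every regular file below the (assumed) docs directory.
    if [ -d docs ]; then
        find docs -type f | while IFS= read -r file; do
            emit_doc "$file"
        done
    fi

```

If this were saved as F<aggregate.sh> (a hypothetical name), it could be run
with something like C<swish-e -S prog -i ./aggregate.sh -f index.swish-e>.
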
=head3 Normalizer

Swish-e has a feature called B<FileFilter> which allows you to define an external
program to call if a document's name matches a particular pattern. The
file is handed to the external program and the output of the external program
is treated as the contents of the document. For example, you can specify
that all documents that end with C<.pdf> are first filtered through
the B<pdftotext> command.

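For the PDF case just described, a Swish-e 2.x configuration could look
roughly like the one this shell sketch writes out. The file name
F<swish-pdf.conf> and the F<./docs> directory are assumptions for
illustration. The C<FileFilter> argument string follows the form used in the
Swish-e configuration documentation, where C<%p> stands for the path of the
file being filtered; check it against your installed version.

```shell

    # Write a small Swish-e 2.x config that sends PDF files through
    # pdftotext. File and directory names are assumptions for this sketch.
    conf=swish-pdf.conf
    {
        printf '%s\n' 'IndexDir ./docs'
        printf '%s\n' 'IndexOnly .pdf .html'
        # '%p' is replaced by swish-e with the path of the matched file;
        # the trailing "-" tells pdftotext to write to standard output.
        printf '%s\n' "FileFilter .pdf pdftotext \"'%p' -\""
    } > "$conf"

```

Indexing with that file would then be a matter of running
C<swish-e -c swish-pdf.conf>.
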
Swish-e also comes with a set of Perl modules bundled together as
B<SWISH::Filter>. SWISH::Filter is used by the external aggregators like
B<DirTree.pl> and B<spider.pl>, thus making those programs both aggregators
and normalizers.

Swish3 has no built-in normalizer or feature like B<FileFilter>. Instead,
Swish3 assumes that something like SWISH::Filter will be used to standardize
documents before they are handed to Swish3.

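One way to picture an external normalizer is a small wrapper that converts
each document to text before the indexer sees it. The sketch below is a shell
stand-in for the kind of conversion SWISH::Filter handles; it is not
SWISH::Filter itself. The function name C<normalize> is hypothetical, choosing
a converter by file suffix is a simplification, and the PDF branch assumes
B<pdftotext> is installed.

```shell

    # Hypothetical normalizer: print a text version of the named file
    # on standard output, choosing a converter by file suffix.
    normalize() {
        case $1 in
            *.pdf) pdftotext "$1" - ;;   # PDF: extract text to stdout
            *)     cat "$1" ;;           # anything else: pass through
        esac
    }

```

An aggregator could call C<normalize> on each file and pass the result on,
which is the same division of labor that B<DirTree.pl> and B<spider.pl>
get from SWISH::Filter.
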
=head2 Configuration

One of the biggest changes is the configuration file format. Swish3 uses
XML-style configuration files, and supports a subset of the configuration
options available in Swish-e.

This section documents the configuration options supported in Swish3.

=head2 See Also

=over

=item

L<Introduction to Swish3|swish_intro.7>

=item

L<Perl implementation of Swish3|SWISH::Prog>

=item

L<libswish3 API|libswish3.3>

=back