root/libswish3/trunk/doc/swish_migration.7.pod

Revision 2087, 2.4 kB (checked in by karpet, 4 months ago)

init some docs

=pod

=head1 Migrating from Swish-e to Swish3

If you haven't already, read the L<Introduction to Swish3|swish_intro.7>
document first.

This document is intended for users already familiar with Swish-e
version 2.x who want to migrate to Swish3.

=head2 The Tool Chain

Swish3 is intended to be one part of a search system tool chain.
This section looks at how Swish-e implements each of the tool
chain features and compares it to Swish3.

=head3 Aggregator

Swish-e has two built-in aggregators, one for the filesystem and one for the
web, selected with the B<-S> flag to the B<swish-e> command. Swish-e also
has a third B<-S> option called B<prog>, which is short for C<program>.
The C<program> is an aggregator that you define. Swish-e ships with several
example aggregators, including a filesystem crawler called B<DirTree.pl>
and a web crawler called B<spider.pl>. There are also example aggregators
for pulling data from a database and for specific kinds of documents, like
Hypermail mail archives.

Swish3 has no built-in aggregators. Instead, Swish3 takes the B<-S prog> approach
of defining an API for external aggregators to follow.

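
To make the B<-S prog> approach concrete, here is a minimal sketch of an
external aggregator written in Python. It emits the header block that
Swish-e expects from a B<-S prog> program (C<Path-Name>, C<Content-Length>
in bytes, C<Last-Mtime>, then a blank line and the document body). The
function names and the rule of crawling only C<.html> files are
assumptions made for illustration; this sketch shows the Swish-e format
only and makes no claim about the API that Swish3 defines.

```python
import os
import sys


def format_document(path: str, content: bytes, mtime: int) -> bytes:
    """Wrap one document in the header block read by swish-e -S prog."""
    headers = (
        f"Path-Name: {path}\n"
        f"Content-Length: {len(content)}\n"  # length in bytes, not characters
        f"Last-Mtime: {mtime}\n"
        "\n"  # a blank line ends the headers
    )
    return headers.encode("utf-8") + content


def crawl(root: str):
    """Yield (path, content, mtime) for each .html file under root.

    Restricting the crawl to .html files is a hypothetical policy.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if name.endswith(".html"):
                full = os.path.join(dirpath, name)
                with open(full, "rb") as fh:
                    yield full, fh.read(), int(os.path.getmtime(full))


if __name__ == "__main__":
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for path, content, mtime in crawl(root):
        sys.stdout.buffer.write(format_document(path, content, mtime))
```

Saved under a hypothetical name such as F<aggregate.py> and made
executable, a program like this would be run by Swish-e with
C<swish-e -S prog -i ./aggregate.py>.
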
=head3 Normalizer

Swish-e has a feature called B<FileFilter> which allows you to define an external
program to call if a document's name matches a particular pattern. The
file is handed to the external program and the output of the external program
is treated as the contents of the document. For example, you can specify
that all documents that end with C<.pdf> are first filtered through
the B<pdftotext> command.

Swish-e also comes with a set of Perl modules bundled together as
B<SWISH::Filter>. SWISH::Filter is used by the external aggregators like
B<DirTree.pl> and B<spider.pl>, thus making those programs both aggregators
and normalizers.

Swish3 has no built-in normalizer or feature like B<FileFilter>. Instead,
Swish3 assumes that something like SWISH::Filter will be used to standardize
documents before they are handed to Swish3.

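
Because Swish3 leaves normalization to the caller, the pattern-matching
behavior described above has to live in your own tool chain. The
following Python sketch shows one way that might look: a table maps
filename patterns to external commands, and the output of the command is
used as the document contents. The C<FILTERS> table and both function
names are hypothetical and are not part of Swish-e, SWISH::Filter, or
Swish3; the only external assumption is that B<pdftotext> writes to
standard output when its output argument is C<->.

```python
import fnmatch
import subprocess

# Hypothetical filter table: filename pattern -> command template.
# "{path}" is replaced with the document path. pdftotext writes the
# extracted text to stdout when its output argument is "-".
FILTERS = {
    "*.pdf": ["pdftotext", "{path}", "-"],
}


def filter_command(path: str):
    """Return the external command for path, or None if no pattern matches."""
    for pattern, template in FILTERS.items():
        # Match case-insensitively so that "Report.PDF" is also filtered.
        if fnmatch.fnmatch(path.lower(), pattern):
            return [arg.replace("{path}", path) for arg in template]
    return None


def normalize(path: str) -> bytes:
    """Return the document contents, filtered through an external program
    when a pattern applies, or read unchanged otherwise."""
    command = filter_command(path)
    if command is None:
        with open(path, "rb") as fh:
            return fh.read()
    # check=True raises CalledProcessError if the filter program fails.
    return subprocess.run(command, check=True, capture_output=True).stdout
```

An aggregator would call C<normalize()> on each file it finds and pass
the returned bytes on to the indexer in place of the raw file.
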
=head2 Configuration

One of the biggest changes is the configuration file format. Swish3 uses
XML-style configuration files, and supports a subset of the configuration
options available in Swish-e.

This section documents the configuration options supported in Swish3.

=head2 See Also

=over

=item

L<Introduction to Swish3|swish_intro.7>

=item

L<Perl implementation of Swish3|SWISH::Prog>

=item

L<libswish3 API|libswish3.3>

=back