|
|
|
@2189
|
[2189]
|
11/03/08 21:48:34 |
karpet |
switch another int to boolean |
|
|
|
@2188
|
[2188]
|
11/03/08 21:44:48 |
karpet |
first pass at the dom-specific property and metaname feature. |
|
|
|
@2187
|
[2187]
|
11/01/08 13:54:06 |
karpet |
change some var and sub names for clarity |
|
|
|
@2184
|
[2184]
|
10/24/08 15:05:31 |
karpet |
add raw tagstack to parser_data. this is to allow for metanames or … |
|
|
|
@2178
|
[2178]
|
09/26/08 23:59:57 |
karpet |
some versions of html parser were passing through extra whitespace.
seems … |
|
|
|
@2176
|
[2176]
|
09/22/08 22:57:38 |
karpet |
all tests passing, all (known) leaks fixed |
|
|
|
@2171
|
[2171]
|
09/22/08 00:08:00 |
karpet |
get rid of the circular reference in TokenList/Token; make swish_init() a … |
|
|
|
@2169
|
[2169]
|
09/21/08 22:28:58 |
karpet |
never mind with cpts. |
|
|
|
@2166
|
[2166]
|
09/21/08 21:59:57 |
karpet |
store codepoints instead of start_byte (since we no longer need the … |
|
|
|
@2165
|
[2165]
|
09/20/08 21:44:21 |
karpet |
simplify the ->value by pointing directly into the internal buffer, and … |
|
|
|
@2162
|
[2162]
|
09/20/08 15:37:55 |
karpet |
refactor to simplify signatures for TokenIterator. Now a TI always creates … |
|
|
|
@2159
|
[2159]
|
09/20/08 01:05:08 |
karpet |
simplify signatures |
|
|
|
@2158
|
[2158]
|
09/19/08 07:16:36 |
karpet |
yank words.c in favor of tokenizer.c -- benchmarking shows tokenizer.c is … |
|
|
|
@2155
|
[2155]
|
09/18/08 23:51:41 |
karpet |
avoid xmlStrncat because (a) it fails under linux and (b) the realloc is … |
|
|
|
@2153
|
[2153]
|
07/31/08 23:10:16 |
karpet |
fix the bump_word feature so that ->pos for Word or Token reflects … |
|
|
|
@2150
|
[2150]
|
07/29/08 21:35:42 |
karpet |
ditch SWISH_META_CONNECTOR and SWISH_PROP_CONNECTOR in favor of … |
|
|
|
@2148
|
[2148]
|
07/21/08 23:51:24 |
karpet |
change top-level tokenizer functions to use same signature so that we can … |
|
|
|
@2142
|
[2142]
|
05/05/08 23:30:12 |
karpet |
add some tokenizer tests and (doh!) include tokenizer.c |
|
|
|
@2141
|
[2141]
|
04/30/08 00:03:02 |
karpet |
port the ascii optimizations in words.c to tokenizer.c and expose some … |
|
|
|
@2140
|
[2140]
|
04/28/08 22:02:04 |
karpet |
alternate utf8-savvy tokenizer with iterator. initial naive benchmark … |
|
|
|
@2132
|
[2132]
|
04/16/08 08:14:47 |
karpet |
clarify/rename vars |
|
|
|
@2130
|
[2130]
|
04/15/08 23:12:59 |
karpet |
fix bug with XMLClassAttributes. all tests pass... for now. |
|
|
|
@2123
|
[2123]
|
04/15/08 10:01:26 |
karpet |
add prototypes for stringlist and make some constant ints into powers of 2 |
|
|
|
@2112
|
[2112]
|
04/07/08 21:48:43 |
karpet |
make output a little more swish-like |
|
|
|
@2110
|
[2110]
|
04/03/08 22:44:09 |
karpet |
Refactor duplicate id checks to use hash instead of array. Fixes bug with … |
|
|
|
@2108
|
[2108]
|
03/31/08 23:47:51 |
karpet |
add header read/write to xapian example and fix some mem leaks |
|
|
|
@2106
|
[2106]
|
03/30/08 22:55:55 |
karpet |
test that all ids are unique |
|
|
|
@2097
|
[2097]
|
03/23/08 23:49:06 |
karpet |
write header |
|
|
|
@2096
|
[2096]
|
03/21/08 14:27:54 |
karpet |
add prop and meta id auto-init; fix debug scheme to use bitwise comparison |
|
|
|
@2090
|
[2090]
|
03/18/08 23:45:43 |
karpet |
xapian example |
|
|
|
@2046
|
[2046]
|
03/07/08 22:33:11 |
karpet |
more config refactoring |
|
|
|
@2042
|
[2042]
|
03/02/08 22:52:11 |
karpet |
more refactoring of config/header |
|
|
|
@2041
|
[2041]
|
02/29/08 23:18:18 |
karpet |
major reconstruction of config object.
basically, let go of the naive idea … |
|
|
|
@2030
|
[2030]
|
02/25/08 21:58:22 |
karpet |
expand ref counting and clean up some unused code |
|
|
|
@2027
|
[2027]
|
02/23/08 22:31:16 |
karpet |
rename ParseData to ParserData; add more debug env vars; implement c ptr … |
|
|
|
@2010
|
[2010]
|
02/10/08 22:26:06 |
karpet |
simplify API with top-level swish_3 struct |
|
|
|
@2009
|
[2009]
|
02/03/08 23:29:35 |
karpet |
rename some vars for clarity |
|
|
|
@1955
|
[1955]
|
11/13/07 23:31:51 |
karpet |
doc tweak; some config work |
|
|
|
@1952
|
[1952]
|
10/26/07 00:17:00 |
karpet |
rename messaging functions and add file, line and function name to output |
|
|
|
@1934
|
[1934]
|
05/07/07 22:11:18 |
karpet |
change stdin to any filehandle pointer and add more POD |
|
|
|
@1933
|
[1933]
|
05/06/07 23:33:47 |
karpet |
perl bindings split \003 into array of strings, libswish3 pod … |
|
|
|
@1931
|
[1931]
|
05/01/07 23:53:25 |
karpet |
tweak the metanames NB to separate text chunks with ctrl char \003 and … |
|
|
|
@1930
|
[1930]
|
04/30/07 23:08:43 |
karpet |
refactor to buffer all MetaNames as well as PropertyNames in NamedBuffer |
|
|
|
@1928
|
[1928]
|
04/23/07 11:58:51 |
karpet |
refactor SWISH::3::Parser class and ref_cnt system |
|
|
|
@1927
|
[1927]
|
04/20/07 17:54:55 |
karpet |
refactoring to create Analyzer class, and the ability to do regex … |
|
|
|
@1924
|
[1924]
|
04/03/07 23:30:21 |
karpet |
global init/cleanup functions to help reduce duplication |
|
|
|
@1923
|
[1923]
|
03/19/07 11:58:02 |
karpet |
expose the tokenizer into Perl space for benchmarking |
|
|
|
@1921
|
[1921]
|
03/14/07 10:19:44 |
karpet |
reorg the perl namespaces and rename/rework some of the tokenizing to … |
|
|
|
@1913
|
[1913]
|
02/27/07 22:57:38 |
karpet |
for all the world to see |