[Dev] Defect #726

Brett Wooldridge bwooldridge at alterpoint.com
Fri Apr 18 03:30:53 CDT 2008


I'm looking at #726, a bug in which a text config search sometimes returns the ZED and sometimes does not.  I don't see how ZED searching can be made to work with Lucene without implementing a custom tokenizer.  The current tokenizer (supplied by Lucene) splits on whitespace, so many XML documents will not be indexed in a useful way.  For example:

<tag>something</tag>

...will index as a single token, so a search for "something" would fail.  However:

<tag>
   something
</tag>

...would yield a hit for a "something" search.  So, I'd like to exclude the ZED from the full-text index, if not permanently then at least in 2008.04.  I'm unclear on its utility.  It seems like the full-text search should search configurations, and anything else people want to search for (i.e. modeled content) should have a specific search like our other searches (and therefore have a column in a table in the DB).
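
To make the tokenizing problem above concrete, here is a rough sketch in plain Python.  It is not Lucene code and does not use Lucene's API; it only imitates splitting on whitespace.  The function names are hypothetical, and the second function is just one guess at what a custom tokenizer might do (treating angle brackets as delimiters as well).

```python
import re

def whitespace_tokens(text):
    """Split on runs of whitespace only (imitates a whitespace tokenizer)."""
    return text.split()

def markup_aware_tokens(text):
    """Hypothetical alternative: also treat '<' and '>' as delimiters."""
    return [t for t in re.split(r"[\s<>]+", text) if t]

inline = "<tag>something</tag>"
multiline = "<tag>\n   something\n</tag>"

# The inline form stays one token, so an exact-term lookup misses it.
print("something" in whitespace_tokens(inline))

# The multi-line form has whitespace around the text, so the term is found.
print("something" in whitespace_tokens(multiline))

# Splitting on angle brackets as well finds the term in the inline form.
print("something" in markup_aware_tokens(inline))
```

Note that the markup-aware version also emits tag names ("tag", "/tag") as tokens, so a real custom tokenizer would need to decide whether those belong in the index.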

Without objection (or a better idea), I'm going to make it so.

-Brett