Ruminations on Development
Lucene Overview
Lucene (The Website) is an open-source suite of index-based search software projects hosted by the Apache Software Foundation. Lucene (The Project) is the base project at the center of all of the other projects presented on Lucene (The Website).
What if I then tell you that you need to rank the search results by how often the word "balloon" occurs? Oh, and did I mention that we need the ability to weight the results from one column more heavily than another column? This whole searching thing becomes a much harder task...
Now, when I tell you that this should scale to include not only "balloon", but also "giraffe" and "cotton candy" and a varying number of other phrases, we've basically put the whole SQL-based searching option to bed. Yeah, you could come up with a whole code-plus-SQL framework to do all these things dynamically... But why bother? The Lucene project has already solved all these issues for you.
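To make the problem concrete, here is a minimal plain-Java sketch of the kind of hand-rolled scoring described above: count how often each phrase occurs in each column, then multiply by a per-column weight. The class name `WeightedTermScorer`, the column names, and the weights are all made up for illustration. This is not Lucene's API and not its scoring formula, just the naive approach you would otherwise have to write and maintain yourself.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Toy scorer (hypothetical): phrase frequency per column times a column weight. */
final class WeightedTermScorer {

    /** Counts non-overlapping, case-insensitive occurrences of phrase in text. */
    static int countOccurrences(String text, String phrase) {
        if (phrase.isEmpty()) {
            return 0;
        }
        String haystack = text.toLowerCase(Locale.ROOT);
        String needle = phrase.toLowerCase(Locale.ROOT);
        int count = 0;
        int from = 0;
        while ((from = haystack.indexOf(needle, from)) >= 0) {
            count++;
            from += needle.length();
        }
        return count;
    }

    /**
     * Scores one record: for every column, add up the occurrences of every
     * phrase and multiply by that column's weight (default weight is 1.0).
     */
    static double score(Map<String, String> record,
                        Map<String, Double> columnWeights,
                        List<String> phrases) {
        double total = 0.0;
        for (Map.Entry<String, String> column : record.entrySet()) {
            double weight = columnWeights.getOrDefault(column.getKey(), 1.0);
            for (String phrase : phrases) {
                total += weight * countOccurrences(column.getValue(), phrase);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, String> record = new LinkedHashMap<>();
        record.put("title", "Balloon animals");
        record.put("body", "A balloon giraffe, a balloon dog, and cotton candy.");

        // Assumed weights: a hit in the title counts twice as much as one in the body.
        Map<String, Double> weights = Map.of("title", 2.0, "body", 1.0);
        List<String> phrases = List.of("balloon", "giraffe", "cotton candy");

        System.out.println(score(record, weights, phrases)); // prints 6.0
    }
}
```

Even this toy version has to rescan every column of every record for every phrase on each search, which is exactly the work an index is meant to avoid.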
This is why I say there are two sides to search. The first side is the involved process of creating the index. The second side is actually querying against the data set in the index.
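The two sides can be sketched with a toy inverted index in plain Java. Everything here (the `TinyIndex` class, its `add` and `search` methods, the crude tokenizer) is a hypothetical stand-in that I'm using to show the shape of the idea. Lucene's real indexing and query classes are far more capable and look nothing like this.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Toy inverted index (hypothetical): one side writes postings, the other reads them. */
final class TinyIndex {
    // term -> (document id -> number of times the term occurs in that document)
    private final Map<String, Map<Integer, Integer>> postings = new HashMap<>();

    /** Indexing side: split the text into lowercase tokens and count them per document. */
    void add(int docId, String text) {
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (token.isEmpty()) {
                continue; // split() yields an empty first token if the text starts with punctuation
            }
            postings.computeIfAbsent(token, t -> new HashMap<>())
                    .merge(docId, 1, Integer::sum);
        }
    }

    /** Query side: ids of documents containing the term, most occurrences first. */
    List<Integer> search(String term) {
        Map<Integer, Integer> hits =
                postings.getOrDefault(term.toLowerCase(Locale.ROOT), Map.of());
        List<Map.Entry<Integer, Integer>> entries = new ArrayList<>(hits.entrySet());
        entries.sort((a, b) -> {
            int byFrequency = Integer.compare(b.getValue(), a.getValue());
            // Break ties by document id so the ordering is deterministic.
            return byFrequency != 0 ? byFrequency : Integer.compare(a.getKey(), b.getKey());
        });
        List<Integer> docIds = new ArrayList<>();
        for (Map.Entry<Integer, Integer> entry : entries) {
            docIds.add(entry.getKey());
        }
        return docIds;
    }

    public static void main(String[] args) {
        TinyIndex index = new TinyIndex();
        index.add(1, "A red balloon.");
        index.add(2, "Balloon, balloon, giraffe!");
        index.add(3, "Cotton candy only");

        System.out.println(index.search("balloon")); // prints [2, 1]
    }
}
```

Notice that `search` never looks at the original text. All the expensive work happens in `add`, and the query side only reads what the indexing side left behind.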
It turns out that very little needs to connect the two sides of searching, other than the index itself. You could, theoretically, build your index using Lucy (the loose C port of the Lucene Java library), and consume that index Java-style in your Java-based Web Application.
This is all I have time for right now. I plan to make several more Lucene-based posts over the next couple of weeks. Let me know if you have any questions on what I've posted so far...
Posted at 11:23AM Oct 01, 2007 by Nelson "Nelz" Carpentier in Java