[SPT/CWIS] indexing slowness

Edward Almasy ealmasy at scout.wisc.edu
Thu May 19 06:31:06 CDT 2005


On Wed, May 18, at 08:08:56PM, Aaron Krowne wrote:
> Can someone tell me what CWIS is doing that it takes a day to index
> 50,000 records?  
> Might I have something misconfigured, perhaps?  Can I assist in improving 
> the implementation?

   I assume you're talking about rebuilding the search database.  In that
   case you probably don't have anything misconfigured.  Keep in mind that
   it's not indexing 50,000 records -- it's indexing every word in every
   enabled field for each of those 50,000 records, which, if your metadata
   is anything like ours, means it's actually indexing millions of terms.
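
To make the records-versus-terms distinction concrete, here is a minimal sketch of how the number of indexed term occurrences grows with the text in each enabled field. The sample records, the field names, and the `count_index_terms` helper are all hypothetical and are not taken from the CWIS code; the tokenizer is a simple word-character split, which may differ from what CWIS actually does.

```python
import re

# Hypothetical sample records; field names are illustrative only,
# not the actual CWIS metadata schema.
records = [
    {"title": "Intro to Digital Libraries",
     "description": "A survey of metadata, search, and indexing."},
    {"title": "Search Engines",
     "description": "How inverted indexes map terms to records."},
]

# Assumed list of fields that are enabled for searching.
ENABLED_FIELDS = ("title", "description")


def count_index_terms(records, fields):
    """Count every word occurrence in every enabled field of every record."""
    total = 0
    for record in records:
        for field in fields:
            # Split on runs of word characters; a real indexer may also
            # apply stop-word removal or stemming.
            total += len(re.findall(r"\w+", record.get(field, "")))
    return total


term_count = count_index_terms(records, ENABLED_FIELDS)
print(f"{len(records)} records -> {term_count} term occurrences")
```

Even with two short records the term count is ten times the record count, which is why a collection of 50,000 records with fuller metadata can plausibly reach millions of terms.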

   That said, if you want to dig into the code and have some ideas on how
   we can speed up the rebuild process, we're certainly open to suggestions.
   We have done a fair amount of work optimizing this, and I believe the
   majority of the time is actually being spent in SQL statements that insert
   the indexed values back into the database.  This could be sped up somewhat
   by writing multiple rows in a single INSERT statement, using the syntax
   extensions for INSERT added in newer versions of MySQL, but we've tried to
   stay away from programming to particular DB software versions.  Anyway, if
   you've got specific ideas on how we might speed up the search database
   rebuild process, please e-mail me off-list, and I'd be happy to talk with
   you about them.
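
As an illustration of the multi-row INSERT idea mentioned above, the following sketch contrasts one INSERT per row with a single INSERT carrying many rows. It is not the CWIS implementation: it uses Python's built-in `sqlite3` module instead of PHP and MySQL so that it runs on its own, and the `search_terms` table, its columns, and the helper function names are assumptions made up for the example. The `VALUES (...), (...), ...` form shown here is also accepted by MySQL.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical term table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE search_terms (record_id INTEGER, term TEXT)")
    return conn


def insert_terms_one_by_one(conn, rows):
    """Baseline: issue one INSERT statement per (record_id, term) row."""
    for record_id, term in rows:
        conn.execute(
            "INSERT INTO search_terms (record_id, term) VALUES (?, ?)",
            (record_id, term),
        )
    conn.commit()


def insert_terms_batched(conn, rows, batch_size=100):
    """Write up to batch_size rows with each INSERT statement."""
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        # One "(?, ?)" placeholder group per row in this batch.
        placeholders = ", ".join(["(?, ?)"] * len(batch))
        # Flatten [(id, term), ...] into [id, term, id, term, ...].
        params = [value for row in batch for value in row]
        conn.execute(
            "INSERT INTO search_terms (record_id, term) VALUES " + placeholders,
            params,
        )
    conn.commit()


# Hypothetical data: 250 terms spread across 25 records.
rows = [(i // 10, f"term{i}") for i in range(250)]

conn = make_db()
insert_terms_batched(conn, rows)
batched_count = conn.execute("SELECT COUNT(*) FROM search_terms").fetchone()[0]
print(batched_count)
```

With a batch size of 100, the 250 rows are written in three statements instead of 250. How much time this saves depends on the database server and its configuration, so it would need to be measured against a real rebuild before drawing conclusions.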

   Ed


-- 
   Edward Almasy                                     ealmasy at scout.wisc.edu
   Co-Director                                         1210 W Dayton Street
   Internet Scout Project                                  Madison WI 53706
   Computer Sciences Department                        608-262-6606 (voice)
   University of Wisconsin - Madison                     608-265-9296 (fax)



More information about the SPT-CWIS-Users mailing list