2011-12-29

Introducing Vista Search

The first versions of the Vista Logic and Protocol bundles have been merged into the default branch.  Vista provides fast full-text indexing for content stored on the OpenGroupware Coils server.  The Vista component can currently index Contact, Document [meta-data], Enterprise, Note, and Task entities.  All the text fields of these entities are indexed, including the values of object properties and company values.  Vista searches are performed by making HTTP requests to the protocol exposed at "/vista" on the server.
curl -v -u fred -o output 'http://coils.example.com/vista?term=detroit&term=steel&archived&type=enterprise'
Authenticate as user fred and search for enterprise entities, regardless of archived status, that contain the terms "detroit" and "steel".
The results of the HTTP request are JSON-encoded Omphalos representations of the first 100 entities to match the specified criteria.  The default Omphalos detail level is 2056 [Comment + Company Values].  If an alternate detail level is desired, this default can be overridden using the "detail" URL parameter.  The following URL parameters are supported:
  • archived - Include entities in the search regardless of their archived status.  If not specified, archived entities will be excluded from the search results.
  • detail - The Omphalos detail level to use when representing entities in the response.  It is important to recognize that specifying a high detail level will reduce performance.
  • term - Specify a search term.  Any number of terms may be specified.
  • type - Limit the searched entities by type.  If no type is specified all indexed entities are searched regardless of type.  Multiple type parameters may be specified.
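The parameters above can also be assembled programmatically.  What follows is only a minimal sketch: the helper name build_vista_url and its arguments are hypothetical and not part of Coils, and it assumes that "archived" is sent as a bare flag with no value, as in the curl example.

```python
from urllib.parse import quote_plus


def build_vista_url(base_url, terms, types=(), archived=False, detail=None):
    """Compose a /vista search URL (hypothetical helper, not a Coils API)."""
    # One "term" parameter per search term; any number may be given.
    parts = ['term=' + quote_plus(term) for term in terms]
    if archived:
        # Assumption: "archived" is a valueless flag, as in the curl example.
        parts.append('archived')
    # One "type" parameter per entity type; omit to search all types.
    parts += ['type=' + quote_plus(kind) for kind in types]
    if detail is not None:
        # Overrides the default Omphalos detail level.
        parts.append('detail=%d' % detail)
    return base_url.rstrip('/') + '/vista?' + '&'.join(parts)


url = build_vista_url('http://coils.example.com',
                      ['detroit', 'steel'],
                      types=['enterprise'],
                      archived=True)
print(url)
```

With these arguments the helper produces the same URL used in the curl example; the request itself still needs HTTP authentication.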
For code local to the server, searches can be performed using the "vista::search" Logic command:
results = ctx.run_command('vista::search',
                          keywords=['detroit', 'steel'],
                          entity_types=['enterprise'],
                          include_archived=True)
Perform a Vista search via Logic for all enterprises, regardless of archived status, that contain the terms "detroit" and "steel".

The recently packaged tool coils-request-index can request the creation or update of an entity's search vector.  Normally, when an entity is modified, a re-index is requested automatically [search vector generation happens in the background and is performed by the coils.vista.index component].  However, if large changes are made to the database, or for the initial index generation, the use of coils-request-index may expedite the process.
coils-request-index --contact --enterprises --notes --documents --tasks
Request an index/reindex of all the entities of the specified types.
coils-request-index --objectid=10100
Request an index/reindex of the entity with objectId 10100.
If the index is already current for the entity, the vector generation request will be discarded.
This new feature does require a schema update to existing OpenGroupware databases.  This schema update will be required for version 0.1.45.
CREATE TABLE vista_vector (
  object_id  INT PRIMARY KEY,
  version    INT DEFAULT 0,
  edition    INT,
  entity     VARCHAR(25) NOT NULL,
  event_date DATE DEFAULT 'TODAY', 
  archived   BOOL DEFAULT FALSE,
  keywords   VARCHAR(128)[],
  vector     tsvector);
CREATE INDEX vista_idx_i0 ON vista_vector (entity);
CREATE INDEX vista_idx_i1 ON vista_vector (event_date);
CREATE INDEX vista_idx_i2 ON vista_vector USING gin(vector);
Vista search utilizes PostgreSQL's powerful tsearch text indexing module.  tsearch provides lexeme-oriented indexing - so the server knows, for example, that "rats" and "rat" share the same stem.  Thanks to tsearch, Vista searches are not only fast - they're clever!
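
To make the role of the vista_vector table more concrete, here is a rough sketch of how a tsearch lookup against it could be composed.  This is an illustration only, not the query Vista actually issues: the function name build_vector_query is hypothetical, the values stored in the entity column are assumed to match the type names passed in, and the SQL uses PostgreSQL's "@@" match operator with to_tsquery and psycopg2-style %s placeholders.

```python
def build_vector_query(terms, entity_types=None, include_archived=False,
                       limit=100):
    """Return (sql, params) for a full-text lookup against vista_vector.

    Hypothetical sketch; not the query Vista itself generates.
    """
    # to_tsquery treats '&' as AND, so every term must match the vector.
    params = [' & '.join(terms)]
    clauses = ['vector @@ to_tsquery(%s)']
    if entity_types:
        # Assumption: the entity column holds names comparable to these.
        clauses.append('entity = ANY(%s)')
        params.append(list(entity_types))
    if not include_archived:
        clauses.append('archived = FALSE')
    sql = ('SELECT object_id FROM vista_vector WHERE '
           + ' AND '.join(clauses)
           + ' LIMIT %s')
    params.append(limit)
    return sql, params


sql, params = build_vector_query(['detroit', 'steel'],
                                 entity_types=['enterprise'],
                                 include_archived=True)
print(sql)
print(params)
```

Because the match runs against the gin index on the vector column, stemming comes for free: a query for "rat" also matches text that was indexed as "rats".  For raw user input, plainto_tsquery is a more forgiving alternative to to_tsquery, since it does not require operator syntax.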

1 comment:

  1. Wonder if it'd be worth adding appointments to the types of things it could search?
