An Inverse Inference Engine for
High-Precision Web Search
RES790 DARPA Inverse II
The Phase I work has proved the precision and scalability
of the inverse inference algorithm, and its ability to perform
latent semantic analysis. In Phase II, we will extend the
functionality of the algorithm to encompass cross-language
document retrieval, tracking of document clusters in time,
and fast hierarchical clustering of large document databases.
The indexing structure will evolve from an information matrix
to an information tensor. The information tensor will accommodate
multidimensional term attributes like word position, part
of speech, and taxonomical and syntactic tags. We will embed
this richer indexing structure and all search functionality
in the Oracle interMedia cartridge. New query operators will
provide support for word n-grams, ordered phrases, term broadening,
cross-document entity tracking, and extraction of entity relationships.
We will also improve the performance of the soft hyperlink
navigation tool. We will validate the precision of our search
technology by participating in the TREC and CLEF competitions
on a regular basis throughout the duration of the contract.
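
The move from an information matrix to an information tensor can be illustrated with a minimal sketch. This is not the project's actual index layout, which the summary does not specify. The function names, the toy documents, and the choice of part of speech and word position as the extra attributes are assumptions made for illustration only.

```python
from collections import defaultdict

# Each toy document is a list of (term, part_of_speech) pairs.
# The tags are hand-assigned here; a real system would get them from a tagger.
docs = [
    [("search", "NOUN"), ("engines", "NOUN"), ("search", "VERB"), ("documents", "NOUN")],
    [("documents", "NOUN"), ("cluster", "VERB")],
]

def build_information_matrix(docs):
    """Two-dimensional index: term x document -> occurrence count."""
    matrix = defaultdict(lambda: defaultdict(int))
    for doc_id, tokens in enumerate(docs):
        for term, _pos_tag in tokens:
            matrix[term][doc_id] += 1
    return matrix

def build_information_tensor(docs):
    """Higher-order index: (term, document, part of speech) -> word positions.

    The extra axes keep attributes that the plain matrix collapses away.
    """
    tensor = defaultdict(list)
    for doc_id, tokens in enumerate(docs):
        for position, (term, pos_tag) in enumerate(tokens):
            tensor[(term, doc_id, pos_tag)].append(position)
    return tensor

matrix = build_information_matrix(docs)
tensor = build_information_tensor(docs)

# The matrix only knows that "search" occurs twice in document 0.
print(matrix["search"][0])
# The tensor separates the noun use from the verb use and records where each occurs.
print(tensor[("search", 0, "NOUN")], tensor[("search", 0, "VERB")])
```

Keeping word position in the index is what would let operators such as ordered phrases and word n-grams be answered from the index itself, since adjacency can be checked by comparing stored positions.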