Traditional search relies on guessing the keywords that might find what you want, then manually inspecting the results. It fails when the data doesn’t match your keyword guesses, or when you don’t yet know what you’re looking for (but would recognize it on sight).
Concept-based search approaches allow the user to explore unknown data and to retrieve results which match concepts rather than exact keywords (so a query for ‘Car’ can return a document that only contains the word ‘Automobile’).
Entity extraction allows a system to detect references to special terms, producing structure that a Faceted Search system can use to let a user browse or drill down through the dataset.
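As a rough illustration of the idea, the sketch below detects known terms in a piece of text and groups them by type, which is the kind of structure a faceted interface can browse. It assumes a tiny hand-written lookup table (the hypothetical `GAZETTEER`); real extraction systems use far richer resources and disambiguation, and none of the names here come from any particular product.

```python
import re

# Hypothetical gazetteer: surface term -> entity type (illustrative only).
GAZETTEER = {
    "london": "Location",
    "paris": "Location",
    "acme": "Organization",
    "aspirin": "Drug",
}

def extract_entities(text):
    """Return {entity_type: sorted list of matched terms} for one document."""
    found = {}
    # Lowercase the text and split it into simple alphabetic tokens.
    for token in re.findall(r"[a-z]+", text.lower()):
        entity_type = GAZETTEER.get(token)
        if entity_type:
            found.setdefault(entity_type, set()).add(token)
    return {etype: sorted(terms) for etype, terms in found.items()}

record = extract_entities("Acme shipped aspirin from London to Paris.")
print(record)
```

The resulting record maps each entity type to the terms found in the document, so a browsing interface could group documents by `Location`, `Organization`, or `Drug`.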
A Semantic Text-Annotation system such as Ontotext’s KIM platform extends simple Entity extraction by matching terms against a Semantic Ontology (or Vocabulary) of related terms and indexing documents according to these relationships. The resulting system provides enhanced search by understanding that a query for ‘Car’ should also find ‘Automobile’, and vice versa. This improves both the findability of documents and the relevance of the documents that are found.
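The ‘Car’ and ‘Automobile’ behaviour can be sketched in a few lines. This is only a minimal illustration under stated assumptions, not how KIM or any other product is implemented: the vocabulary `TERM_TO_CONCEPT`, the concept IDs, and the function names are all hypothetical, and a real ontology would also carry broader, narrower, and related-term relationships.

```python
import re

# Hypothetical mini-vocabulary: each surface term maps to one concept ID.
TERM_TO_CONCEPT = {
    "car": "concept:Car",
    "automobile": "concept:Car",
    "lorry": "concept:Truck",
    "truck": "concept:Truck",
}

def concepts_in(text):
    """Return the set of concept IDs whose terms appear in the text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return {TERM_TO_CONCEPT[t] for t in tokens if t in TERM_TO_CONCEPT}

def build_index(docs):
    """Index documents by concept rather than by literal keyword."""
    index = {}
    for doc_id, text in docs.items():
        for concept in concepts_in(text):
            index.setdefault(concept, set()).add(doc_id)
    return index

def search(index, query):
    """Map the query to concepts, then return the matching document IDs."""
    hits = set()
    for concept in concepts_in(query):
        hits |= index.get(concept, set())
    return sorted(hits)

docs = {
    "d1": "The automobile was parked outside.",
    "d2": "A lorry blocked the road.",
    "d3": "Nothing relevant here.",
}
index = build_index(docs)
print(search(index, "car"))    # ['d1']
print(search(index, "truck"))  # ['d2']
```

Because both the documents and the query are reduced to concept IDs before matching, a query for ‘car’ retrieves `d1` even though that document only contains ‘automobile’.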
Tailoring the Ontology to suit the content domain produces a system with enhanced recall. Ontotext’s Biomedical Semantic Tagger is one such system: a text processing pipeline tuned to understand and classify biomedical text.
Faceted search systems such as TopQuadrant’s Faceted Search capability or KIM’s search UI expose the extracted terms so that the user can drill down and discover information that they would not have known to query for.
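The drill-down interaction can be sketched as counting facet values over the current result set and then filtering on the value the user picks. The example below is an assumption-laden toy, not a description of TopQuadrant’s or KIM’s implementation: the `DOCS` records, the facet names, and both functions are hypothetical.

```python
from collections import Counter

# Hypothetical documents, already annotated with extracted terms per facet.
DOCS = [
    {"id": "d1", "Location": ["london"], "Drug": ["aspirin"]},
    {"id": "d2", "Location": ["london", "paris"], "Drug": []},
    {"id": "d3", "Location": ["paris"], "Drug": ["aspirin"]},
]

def facet_counts(docs, facet):
    """Count how many documents carry each value of the given facet."""
    counts = Counter()
    for doc in docs:
        # Use a set so a value repeated within one document counts once.
        counts.update(set(doc.get(facet, [])))
    return dict(counts)

def drill_down(docs, facet, value):
    """Keep only the documents that have the chosen facet value."""
    return [doc for doc in docs if value in doc.get(facet, [])]

print(facet_counts(DOCS, "Location"))
in_paris = drill_down(DOCS, "Location", "paris")
print([doc["id"] for doc in in_paris])
print(facet_counts(in_paris, "Drug"))
```

Each drill-down narrows the result set, and recomputing the counts on what remains shows the user which values are still available without their having to guess a keyword.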