Visual Information Retrieval
This interesting addition to the Oracle tool suite is very similar to the interMedia tool.
Both have a Java client that lets you retrieve an image file, modify it on your desktop, and then return it to the database.
Both Visual Information Retrieval and interMedia use object types with methods as the primary definitions for tables storing the images.
Both allow you to update the image record with attributes for size, format, compression, comments, and the like.
The difference between the tools lies in the added query capability of Visual Information Retrieval. This tool can compare two images and return a score for how closely they match: 0 (zero) means the images are a perfect match, and 100 means the images share no common traits.
Visual Information Retrieval also provides a "similar" method, which compares two images and rates how similar they are to each other according to specified criteria. Technology similar to these Visual Information Retrieval methods is used in face-recognition software.
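The 0-to-100 scoring idea can be pictured as a distance between image feature vectors. The sketch below is a purely hypothetical illustration of that concept; the function name, the feature-vector model, and the distance formula are all invented here and are not Oracle's actual scoring algorithm.

```python
# Hypothetical sketch of a 0-100 image score: 0 = perfect match,
# 100 = nothing in common. The feature vectors (imagine, say,
# normalized color-histogram buckets in [0, 1]) and the scaled
# absolute-difference metric are invented for illustration only.

def score(features_a, features_b):
    """Return a 0-100 dissimilarity score for two equal-length
    feature vectors whose components lie in [0, 1]."""
    assert len(features_a) == len(features_b)
    diff = sum(abs(a - b) for a, b in zip(features_a, features_b))
    return 100.0 * diff / len(features_a)
```

Identical vectors score 0 (a perfect match), while vectors that disagree completely on every feature score 100, mirroring the scale described above.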
Oracle interMedia Text (IMT)
System-Computed Relevance and Ranking
An information retrieval system that ranks and orders the records (or their surrogates) in a retrieved set needs a mechanism for calculating the closeness of the match between a user query and a document. The result of this calculation can be used to determine the order in which members of the set are presented to the searcher. In other words, this calculation provides the system's estimate of the relevance of the document, and the goal is that this estimate be strongly correlated with the user's own judgment of the document's relevance.
The value this calculation gives to the closeness of the match between the query and the document has been called the retrieval status value, or rsv.
In a strict Boolean query system, one that specifies attribute values that must be present if a record is to be selected, each term present in the query or document can have a weight of only 0 or 1, and the resulting rsv[2] of a document can only be 1 (accept) or 0 (reject), producing the traditional unranked, but assumed relevant, subset of the database. If weighted terms are used, a document's rsv, computed from those weights, can take any value from 0 to 1 and is therefore potentially much more useful.
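The contrast between a binary and a weighted rsv can be made concrete with a small sketch. The two functions below are illustrative assumptions, not the scoring used by any particular system: the Boolean version requires every query term to be present, while the weighted version (with term weights in [0, 1]) returns the weight-matched fraction of the query.

```python
def boolean_rsv(query_terms, doc_terms):
    # Strict Boolean: rsv is 1 (accept) only if every query term
    # appears in the document, otherwise 0 (reject).
    return 1 if set(query_terms) <= set(doc_terms) else 0

def weighted_rsv(query_weights, doc_weights):
    # Weighted terms: rsv is the fraction of total query weight
    # matched by the document, so it can fall anywhere in [0, 1].
    total = sum(query_weights.values())
    matched = sum(query_weights[t] * doc_weights.get(t, 0.0)
                  for t in query_weights)
    return matched / total if total else 0.0
```

A document missing one low-weight query term gets a weighted rsv just below 1 rather than the outright 0 a strict Boolean match would assign, which is what makes weighted scores rankable.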
Ranking
Since the purpose of the retrieval status value (rsv) is to provide a mechanism for evaluating the match between a document and a query, it allows the system to rank documents in descending order of rsv.
This means the system can go down the ranked list and present the user either with a complete, ordered list of all documents that have a positive rsv, or with the top-ranking n documents of the list, where n can be set by the user.
These are the documents the system judges most likely to be deemed relevant by the user.
This is what is called mathematically a
weak ordering[3], meaning that ties are allowed.
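Such a ranking can be sketched in a few lines. The helper below is a hypothetical illustration (the function name and input shape are invented): it sorts documents by descending rsv, keeps only those with a positive score, and optionally truncates to the top n, with tied scores sitting side by side as the weak ordering allows.

```python
def rank(docs_with_rsv, n=None):
    """Rank (document, rsv) pairs in descending rsv order.

    Returns all documents with a positive rsv, or just the top n
    if n is given. Ties are allowed (a weak ordering): documents
    with equal rsv keep their original relative order.
    """
    ordered = sorted(docs_with_rsv, key=lambda pair: pair[1], reverse=True)
    positive = [(doc, rsv) for doc, rsv in ordered if rsv > 0]
    return positive if n is None else positive[:n]
```

Because Python's sort is stable, documents with equal rsv stay in their input order, which is one simple way of presenting the ties a weak ordering permits.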
If the rsv is binary, there is no choice but to present all documents that meet the formal requirements of the query, an option users often find frustrating. Increasingly, IR systems are providing relevance-ranking options, and on the Web, where precise queries may not be possible and document attributes are not explicit, search engines rely on such rankings.
Challenge with Ranking
A difficulty with ranking is that users are not usually told what the basis of the system's calculation is.
Where users have been polled for their reactions, they seem to like it. Would it make any significant difference if they were told the basis of the ranking, or were given an opportunity to contribute to the method, perhaps to emphasize words occurring in the text, the name of the author, or the source? There is no research on this question to date, although systems exist that let the user supply terms to be used for ranking separately from those used in the search, e.g., the AltaVista "Sort By" box. Asking users to make such choices calls for more involvement on their part and more knowledge of the system, an investment not all users are willing to make. But it could lead to better retrieval outcomes.
[1]
Relevance ranking: the method used to order the result set so that the records most likely to be of interest to a user appear at the top. This makes searching easier for users, as they will not have to spend as much time looking through records for the information that interests them.
[2]
Retrieval status value (rsv): the retrieved documents are ranked according to their retrieval status values, provided these are monotonically increasing with the probability of relevance of the documents.
[3]
Weak ordering: a weak ordering is a mathematical formalization of the intuitive notion of a ranking of a set, some of whose members may be tied with each other. Weak orders are a generalization of totally ordered sets (rankings without ties) and are in turn generalized by partially ordered sets (posets) and preorders.