LEVAN Uses Images to Provide Encyclopaedic Knowledge

July 02, 2014

An intelligent search programme identifies all the concepts linked to a word and presents them as an exhaustive collection of images.

The service offered by search engines, at least the most commonly used ones, is today limited to presenting a mass of unsorted content. However, search engine models are now being designed from novel angles. One example is Random, a model developed by a Finnish startup which still bases web browsing on observed user behaviour but makes suggestions that are less cognitively driven, leaving room for a random, unexpected element. More recently, research scientists at the University of Washington (UW) and the Allen Institute for Artificial Intelligence (AI2) in Seattle have focused their work on how to search for precise content when the search term is vague, and conversely how to ensure that a given search returns exhaustive results. The invention they have come up with is a fully automated software programme called LEVAN (Learning Everything about Anything).

Technology for global searches

The LEVAN programme draws on the vast resources of books available online (Google Books Ngrams) to find all possible variants of a given term. The search across these libraries of books focuses on occurrences, i.e. the frequency with which a phrase appears in the texts. Ideas which cannot be associated with a visual image are excluded by the algorithm as irrelevant to an image search, and synonymous phrases are grouped together under a single ‘concept’. The programme then searches for images corresponding to the concepts it has identified, and an image recognition algorithm classifies each image under the relevant concept. “It’s all about discovering associations between textual and visual data,” explains Ali Farhadi, an assistant professor of computer science and engineering at the University of Washington. LEVAN’s results are displayed as a visual archive of the concepts, in as many groupings as it can provide, so that the user can see at a glance all the concepts associated with a word search.
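To make the textual half of that pipeline concrete, the sketch below walks through the steps described above: mine n-grams containing a query term, keep the frequent variants, discard those judged non-visual, and merge synonyms into shared concepts. This is a minimal illustration, not LEVAN's actual code: the sample n-grams, the MIN_FREQUENCY threshold, the NON_VISUAL_WORDS set and the SYNONYMS map are all placeholder assumptions, and the real system works on the full Google Books Ngrams corpus before going on to gather and classify images for each concept.

```python
from collections import defaultdict

# Placeholder data standing in for Google Books Ngrams results for the
# query term "horse": (phrase, corpus frequency). Purely illustrative.
NGRAMS = [
    ("jumping horse", 1200),
    ("rocking horse", 950),
    ("horse racing", 3100),
    ("dark horse", 400),        # idiomatic, hard to depict as an image
    ("equestrian horse", 870),
    ("horse riding", 2900),
]

MIN_FREQUENCY = 500                              # illustrative rarity cut-off
NON_VISUAL_WORDS = {"dark"}                      # stand-in for a visualness filter
SYNONYMS = {"equestrian horse": "horse riding"}  # merge synonymous phrases


def extract_concepts(ngrams):
    """Group frequent, visually depictable phrases into concepts."""
    concepts = defaultdict(int)
    for phrase, freq in ngrams:
        if freq < MIN_FREQUENCY:
            continue  # too rare to count as a reliable variant of the term
        if any(word in NON_VISUAL_WORDS for word in phrase.split()):
            continue  # cannot be associated with a visual image, so excluded
        concepts[SYNONYMS.get(phrase, phrase)] += freq
    return dict(concepts)


if __name__ == "__main__":
    for concept, freq in sorted(extract_concepts(NGRAMS).items()):
        # Each surviving concept would next be used to fetch and classify images.
        print(f"{concept}: {freq}")
```

Run as-is, the sketch keeps "horse racing", "horse riding" (absorbing "equestrian horse" and its counts) and "jumping horse" and "rocking horse", while dropping the idiomatic "dark horse"; in the full system each surviving concept would then feed the image-search and image-recognition stage.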

New tool for visual data

This kind of intelligent search engine is not new, but what is absolutely novel is that the LEVAN programme needs no human supervision whatsoever, which represents a considerable advance for searches across encyclopaedic data. Santosh Divvala, a research scientist at AI2 and affiliate scientist at UW’s computer science and engineering department, underlines that “major information resources such as dictionaries and encyclopaedias are moving in the direction of showing users visual information because it’s easier to comprehend and much faster to browse through concepts.” However, these resources offer limited coverage because they are most often manually curated, whereas the newly created system requires no human curator or supervision at all to combine textual and visual data covering every instance of a word. LEVAN, which was launched in March, is today able to annotate 13 million images via 65,000 different variations of concepts. The research team is aiming to make LEVAN both a standard learning tool and an image information bank. Their next step will be to create a smartphone app.
