Finding Data Relationships to Build Better Recommendation Engines

July 26, 2010

The calculation methods used by recommendation systems, which place the products deemed most relevant to a user’s profile at the top, need to be simplified, according to researchers at the University of Utah. The most common approach, called “multidimensional,” consists of analyzing multiple types of user data simultaneously: age, tastes, contacts and previous purchases, for example. The more criteria, the better the results. But this also makes the task more complex. With each new criterion, the number of calculations the system must perform grows exponentially.

To address this, the University of Utah researchers created a method that groups the data into the broad outlines that characterize an individual. Rather than analyzing all of the data, the system focuses on these outlines.

“Each piece of data has a numeric value,” Suresh Venkatasubramanian, one of the researchers, told L’Atelier. “If you look at this data like points on a graph, each with its own coordinates, the distance between the points allows us to find certain similarities.”
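The idea of treating user data as points and comparing distances can be sketched in a few lines of Python. This is only an illustration: the user names, the three features and their values are invented for the example, and plain Euclidean distance is an assumption, not necessarily the measure the researchers use.

```python
import math

# Hypothetical users, each described by made-up numeric features:
# (age, films rated, purchases last month)
users = {
    "alice": (34.0, 120.0, 3.0),
    "bob": (36.0, 110.0, 4.0),
    "carol": (19.0, 15.0, 12.0),
}


def distance(a, b):
    """Euclidean distance between two points with the same number of coordinates."""
    return math.dist(a, b)


def most_similar(name):
    """Return the other user whose point lies closest to the given user's point."""
    others = (other for other in users if other != name)
    return min(others, key=lambda other: distance(users[name], users[other]))


print(most_similar("alice"))
```

In practice the features would usually be rescaled first, so that one large-valued criterion (here, films rated) does not dominate the distance.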

The researchers are more interested in the relationships between data than in the data itself. Take a person’s height and weight: there’s a good chance that a taller person is heavier than a shorter one. Thus, rather than treating height and weight as independent variables, you need to look at the correlation between them.
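The height and weight example can be made concrete with a correlation coefficient. The sketch below computes the standard Pearson coefficient on invented sample values; it illustrates what “correlated” means here and is not taken from the researchers’ work.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up sample: taller people in this list also tend to be heavier.
heights_cm = [155, 162, 170, 178, 185, 192]
weights_kg = [52, 60, 66, 75, 83, 90]

r = pearson(heights_cm, weights_kg)
print(f"correlation between height and weight: {r:.3f}")
```

A coefficient close to 1 means the two measurements carry largely the same information, which is why a system can hope to summarize them with fewer numbers.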

“Our approach is not interested only in the relationship of data represented by points on a graph,” Venkatasubramanian said. “We do this by decreasing the respective coordinates of each of them, while preserving the same distance.” The objective? Reducing the “dimensionality” of data.
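One well-known way to cut the number of coordinates while roughly preserving distances is a Gaussian random projection. The sketch below uses that technique purely as a stand-in to show the principle; it is not the Utah researchers’ method, and the dimensions, the number of points and the random data are all assumptions chosen for the example.

```python
import math
import random


def random_projection_matrix(original_dim, reduced_dim, rng):
    """Matrix of Gaussian entries, scaled so distances are preserved on average."""
    scale = 1.0 / math.sqrt(reduced_dim)
    return [
        [rng.gauss(0.0, 1.0) * scale for _ in range(original_dim)]
        for _ in range(reduced_dim)
    ]


def project(point, matrix):
    """Map a point to fewer coordinates by multiplying it with the matrix."""
    return [sum(w * x for w, x in zip(row, point)) for row in matrix]


rng = random.Random(0)  # fixed seed so the run is repeatable
original_dim, reduced_dim = 1000, 200  # illustrative sizes

# Five made-up points with 1,000 coordinates each.
points = [[rng.random() for _ in range(original_dim)] for _ in range(5)]
matrix = random_projection_matrix(original_dim, reduced_dim, rng)
reduced = [project(p, matrix) for p in points]

# Compare each pairwise distance after and before the reduction.
ratios = [
    math.dist(reduced[i], reduced[j]) / math.dist(points[i], points[j])
    for i in range(len(points))
    for j in range(i + 1, len(points))
]
print([round(ratio, 2) for ratio in ratios])
```

Each ratio should come out near 1, meaning the points keep roughly the same distances from one another even though each now has a fifth as many coordinates.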

“Prior methods on modern computers struggle with data from more than 5,000 people,” the researcher said. “Our method smoothly handles well above 50,000 people.”

The advantage of this method is not only an increase in calculation speed, but also a reduction in the amount of memory required for recommendation systems.

© L’Atelier BNP Paribas