Applying search engine techniques to physics data, Yandex partners up with CERN

The European Organization for Nuclear Research, or CERN, is perhaps best known for two things: (probably) locating the elusive Higgs boson, and being the birthplace of the web. Now that second claim to fame is circling back, with Yandex, Russia’s answer to Google, offering up its core search technology for testing in CERN’s physics work.

The technology in question is called MatrixNet. It’s a machine learning tool that helps Yandex rank search results for relevance, but in this case CERN is testing it alongside other proprietary Yandex data-processing technologies in order to pick rare but important events out of the masses of data produced in CERN’s experiments. Essentially, it is being used to establish statistical significance more accurately, so that CERN can call something a ‘major discovery’ with more certainty.

The research facility is already using MatrixNet to sift through data for B-meson decay analysis (this is part of the quest for supersymmetry, which tests the boundaries of the Standard Model of physics), but now CERN will also see whether it can help across other aspects of its work.

To accomplish that, Yandex has been brought in as an associate member of CERN openlab, the mechanism CERN uses to collaborate with private-sector IT firms such as Oracle and HP. This is actually an expansion of Yandex’s work with the research facility: the company has been supplying CERN with computing resources since 2011, and last year it came up with a specialized search tool for the data emanating from one of the Large Hadron Collider experiments.

CERN openlab head Bob Jones said in a statement:

“Today’s physics deals with large datasets that need to be properly processed and interpreted. New discoveries are all but impossible without meticulous data analysis. To this respect we are very interested in using and testing the services and technologies Yandex is developing.”

This is a sponsorship deal of sorts, giving Yandex added respectability, but it’s also a win-win from a technological standpoint: after all, we’re talking about machine learning here. As Yandex’s Andrey Ustyuzhanin put it: “Being a CERN openlab associate member opens up new opportunities for contributing to the organization’s projects and gives us a wide access to the LHC experiment data. These datasets allow us both to improve our technologies and contribute to advances in physics.”

The Russian web giant previously had to supply CERN with specialists in order to work with MatrixNet, but it said in Monday’s statement that it is developing “a user-friendly service that will give physicists an opportunity to use Yandex’s machine learning technology on their own”. It’s not clear whether this is to do with the fact that MatrixNet is proprietary, but it is worth pointing out that Yandex’s willingness to even let outsiders near its secret search sauce sets it apart from chief rival Google.


GigaOM