
Section: Bilateral Contracts and Grants with Industry

The LearnClues LabComm

The LearnClues LabComm was granted on October 2.

Statistical learning is a field of mathematics and computer science that enables the extraction of predictive models from data with a weak signal-to-noise ratio. These techniques are behind the successes of Google and the progress of automated medical diagnosis. Combined with knowledge of the application domain, they open the door to optimal decisions. Tinyclues is a start-up that applies statistical learning to e-commerce, adapting marketing practice based on customer databases. Parietal is an Inria research team that develops statistical learning for neuroscience and is the driving force behind the software tool "scikit-learn", which is a standard in statistical learning.
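To illustrate the kind of task described above, here is a minimal scikit-learn sketch of extracting a predictive model from data with a weak signal-to-noise ratio; the synthetic dataset, feature counts, and noise level are illustrative assumptions, not drawn from the lab's actual work.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic data with a weak signal-to-noise ratio: only the first
# of 20 features carries predictive information, drowned in noise.
rng = np.random.default_rng(0)
n_samples, n_features = 1000, 20
X = rng.normal(size=(n_samples, n_features))
y = (X[:, 0] + 2.0 * rng.normal(size=n_samples) > 0).astype(int)

# Fit on the first 800 samples, evaluate on the 200 held-out ones.
model = LogisticRegression().fit(X[:800], y[:800])
accuracy = model.score(X[800:], y[800:])
```

Even with this much noise, the fitted model predicts the labels well above chance, which is precisely what makes such techniques useful on weak-signal customer data.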

The goal of this proposed joint lab is to transfer Parietal's expertise on big data and to improve statistical-learning techniques and their implementation on distributed systems, opening the door to faster analysis of very large datasets. Indeed, processing more data makes it possible to detect smaller effects in the signals. Tinyclues already uses the tools developed by Parietal in the "cloud", and thus in distributed computing environments. Parietal's practical experience lets us plan substantial improvements both in computational performance and in the amount of information extracted from big data.
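The distributed processing mentioned above typically follows a map-style pattern: shard the data, process shards in parallel, and combine the results. A minimal standard-library sketch follows; the shard layout and the trivial per-shard task are illustrative, and a real deployment would use process pools or a cluster backend rather than threads for CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor

def partial_sum(shard):
    """Process one shard of the data; stands in for per-shard model fitting."""
    return sum(shard)

# Split the data into 4 disjoint shards and process them in parallel.
shards = [range(i, 1_000_000, 4) for i in range(4)]
with ThreadPoolExecutor(max_workers=4) as pool:
    total = sum(pool.map(partial_sum, shards))
# total equals the sum over the full, unsharded dataset
```

The same split/map/combine structure underlies cloud-scale analyses: only the per-shard function and the execution backend change.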

From a strategic standpoint for Tinyclues, such progress is important to broaden the range of domain scenarios it can address, by jointly analyzing more data of a wider variety, and to fully automate the data-analysis platform it offers its customers, replacing challenging tasks currently performed by experts. These developments are particularly important given that Tinyclues is growing very fast and is processing ever larger datasets and an increasing number of different problems.

The project partners are: