Workshop 19/06/2015

Date: 19/06/2015
9:30 – 13:30
Orange Labs
Room S012, 38-40 rue du Général Leclerc
Issy-les-Moulineaux (Metro line 12, Mairie d’Issy station)
Access map

  • Text analytics in advertising and finance, Laurent El Ghaoui, University of California, Berkeley.
  • Seriation algorithm for ranking, Alexandre d’Aspremont, Ecole Normale Supérieure.
  • Community detection based on opinion dynamics, Jérémie Jakubowicz, Telecom SudParis.
  • Relational structures and optimal transport models applied to big graphs and big networks clustering, Jean-François Marcotorchino, Thales.

Abstracts:
Sparse machine learning in text analytics in advertising and finance, Laurent El Ghaoui (University of California, Berkeley) [Slides]
Sparse machine learning has recently emerged as a powerful tool to obtain models of high-dimensional data with a high degree of interpretability, at low computational cost. We will review some recent progress in text analytics based on sparse machine learning models, and explore applications in advertising and finance, with a focus on risk management.

Seriation algorithm for ranking, Alexandre d’Aspremont (Ecole Normale Supérieure, France) [Slides]
We describe a seriation algorithm for ranking a set of n items given pairwise comparisons between these items. Intuitively, the algorithm assigns similar rankings to items that compare similarly with all others. It does so by constructing a similarity matrix from pairwise comparisons, then using seriation methods to reorder this matrix and construct a ranking. We first show that this spectral seriation algorithm recovers the true ranking when all pairwise comparisons are observed and consistent with a total order. We then show that ranking reconstruction is still exact even when some pairwise comparisons are corrupted or missing, and that seriation-based spectral ranking is more robust to noise than other scoring methods. An additional benefit of the seriation formulation is that it allows us to solve semi-supervised ranking problems. Experiments on both synthetic and real datasets demonstrate that seriation-based spectral ranking achieves competitive and in some cases superior performance compared to classical ranking methods.
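
The pipeline described in this abstract (comparisons, then a similarity matrix, then a spectral reordering) can be illustrated with a minimal Python sketch. It is not the speaker's implementation. It assumes a comparison matrix C with C[i, j] = 1 when item i beats item j, -1 when it loses, 0 when the comparison is missing and 1 on the diagonal; it also assumes the similarity S = (n 11^T + C C^T) / 2 and a reordering by the Fiedler vector of the Laplacian of S. The function name serial_rank and the sign-resolution step are choices made for this illustration.

```python
import numpy as np

def serial_rank(C):
    """Rank items from a pairwise comparison matrix (illustrative sketch).

    Assumed encoding: C[i, j] = 1 if item i beats item j, -1 if it loses,
    0 if the comparison is missing, and C[i, i] = 1.
    Returns item indices ordered from best to worst.
    """
    C = np.asarray(C, dtype=float)
    n = C.shape[0]

    # Similarity: large when i and j compare the same way against the others.
    S = 0.5 * (n * np.ones((n, n)) + C @ C.T)

    # Graph Laplacian of the similarity matrix.
    L = np.diag(S.sum(axis=1)) - S

    # eigh returns eigenvalues in ascending order, so column 1 holds the
    # eigenvector of the second-smallest eigenvalue (the Fiedler vector).
    _, vecs = np.linalg.eigh(L)
    order = np.argsort(vecs[:, 1])

    def agreement(perm):
        # Position of each item in the candidate ordering.
        pos = np.empty(n, dtype=int)
        pos[perm] = np.arange(n)
        # +1 where i is placed before j, -1 where it is placed after.
        placed_before = np.sign(pos[None, :] - pos[:, None])
        return float(np.sum(placed_before * C))

    # An eigenvector is defined up to sign: keep the orientation that
    # agrees with more of the observed comparisons.
    if agreement(order[::-1]) > agreement(order):
        order = order[::-1]
    return order
```

With all comparisons observed and consistent with a total order, the returned ordering should match that order, in line with the exact-recovery claim in the abstract.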

Community detection based on opinion dynamics, Jérémie Jakubowicz (Telecom SudParis, France) [Slides]
After having introduced the problem of community detection, and presented the main tools used to address it in the large scale setting, we will introduce a new community detection algorithm based on an underlying opinion dynamics model. We will analyze this algorithm using stochastic approximation. Then we will present a distributed implementation of it and comment on numerical experiments.

Relational structures and optimal transport models applied to big graphs and big networks clustering, Jean-François Marcotorchino (Thales, France)
We present a summary of the principal results which can be found in a Thales internal paper written by the author in 2013. Starting with the seminal works of G. Monge and L. Kantorovich on transportation theory, and revisiting the works of Maurice Fréchet, we will introduce direct derivations of the optimal transport problem such as the so-called “Alan Wilson’s Entropy Model” and the “Minimal Trade Problem”. We will show that optimal solutions of those models are mainly based on two dual principles: “statistical independence” on the one hand and “logical indetermination” on the other. Using the Mathematical Relational Analysis representation, which we will introduce on that occasion, together with the works of Antoine Caritat (Condorcet) on relational consensus, we will give a mathematical interpretation of the “logical indetermination structure” and thereby underline the duality between the “deviation from independence” and “deviation from indetermination” criteria. Finally, these results will lead us to a new criterion for the modularization of huge graphs, generalizing the Girvan-Newman criterion. Relying on a new and generic version (Thales/Paris VI) of the well-known and powerful “Louvain algorithm” for clustering huge graphs, developed by the Complex Networks Lab at LIP6 and the Computer Science Department of the University of Louvain, we will conclude with some results on decomposing social networks for large values of N (the number of nodes).
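
For readers unfamiliar with the Girvan-Newman criterion mentioned above, the following minimal Python sketch computes the standard Newman-Girvan modularity of a partition, Q = (1/2m) Σ_ij (A_ij − k_i k_j / 2m) δ(c_i, c_j), which measures how far the within-community edge weight deviates from what independent degrees would give. It illustrates only this classical criterion, not the generalized criterion of the talk; the function name modularity and the dense adjacency-matrix input are assumptions made for this illustration.

```python
import numpy as np

def modularity(A, labels):
    """Newman-Girvan modularity of a partition (illustrative sketch).

    A      : symmetric adjacency matrix of an undirected graph (assumed dense).
    labels : community label of each node.
    """
    A = np.asarray(A, dtype=float)
    labels = np.asarray(labels)

    k = A.sum(axis=1)      # node degrees
    two_m = k.sum()        # twice the number of edges

    # Deviation from the independence term k_i * k_j / 2m.
    B = A - np.outer(k, k) / two_m

    # Keep only pairs of nodes that share a community.
    same = labels[:, None] == labels[None, :]
    return float((B * same).sum() / two_m)
```

For example, on two triangles joined by a single edge, splitting the graph into the two triangles gives Q = 5/14, while putting every node in one community gives Q = 0.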