2015 Winter School
Search for Latent Variables: ICA, Tensors, and NMF

Program

 

The presentations are scheduled from Monday to Wednesday (2-4 February). Participants are expected to arrive on Sunday, 1 February 2015.

 

Monday, 2 February 2015
09:00 - 09:45 Welcome drink
09:45 - 10:00 Christian JUTTEN and Pierre COMON
10:00 - 12:00 Lek-Heng LIM
Lunch break  
14:00 - 16:00 Giorgio OTTAVIANI
16:15 - 18:15 Rasmus BRO

 

Tuesday, 3 February 2015
09:00 - 10:00 Cedric RICHARD
10:15 - 12:15 Remi GRIBONVAL
Lunch break  
Free afternoon  
17:00 - 19:00 David BRIE

 

Wednesday, 4 February 2015
09:00 - 10:00 Nicolas DOBIGEON
10:15 - 12:15 Guillaume BOUCHARD
Lunch break  
14:00 - 14:30 Ciro CATTUTO
14:30 - 15:00 Ignat DOMANOV
The program in PDF.

 

Titles and slides

 

Christian Jutten and Pierre Comon: Introduction Slides

 

In alphabetical order:
  • Guillaume Bouchard: Binary matrix and tensor factorization - application to the probabilistic modeling of relational databases Slides
  • David Brie: I. Uniqueness of NMF Slides. II. Tensors in antenna array processing Slides
  • Rasmus Bro: Tensor analysis in chemistry Slides
  • Ciro Cattuto: Presentation of ISI Foundation Slides
  • Nicolas Dobigeon: Bayesian fusion of multi-band images Slides
  • Ignat Domanov: Uniqueness issues Slides
  • Remi Gribonval: Learning low-dimensional models with penalized matrix factorization Slides
  • Lek-Heng Lim: The Role of Tensors in Numerical Mathematics Slides
  • Giorgio Ottaviani: On uniqueness of tensor decomposition Slides
  • Cedric Richard: Geometrical approaches in NMF, applications to hyperspectral data unmixing Slides

 

Abstracts

 

Guillaume Bouchard Binary matrix and tensor factorization - application to the probabilistic modeling of relational databases

Many applications involve multiple interlinked data sources, but existing approaches to handling them are often based on latent factor models (i.e. distributed representations), which are difficult to learn. At the same time, recent advances in convex analysis, mainly based on the nuclear norm (a relaxation of the matrix rank) and sparse structured approximations, have shown great theoretical and practical performance in handling very large matrix factorization problems with non-Gaussian noise (in particular binary matrices) and missing data. In this tutorial, we will show how multiple matrices or tensors can be jointly factorized using a convex formulation of the problem, with a particular focus on:
  • Multi-view learning: A popular approach is to assume that, both, the correlations between the views and the view-specific correlations have low-rank structure, leading to a model closely related to canonical correlation analysis called inter-battery factor analysis. We propose a convex relaxation of this model, based on a structured nuclear norm regularization.
  • Collective matrix factorization: When multiple matrices are related, they share common latent factors, leading to a simple yet powerful way of handling complex data structures, such as relational databases. Again, a convex formulation of this approach is proposed. We also show that the Bayesian version of this model can be used to tune the multiple regularization parameters involved in such models, avoiding costly cross-validation.

Another contribution to knowledge-base modeling relates to binary tensor and matrix factorization with many zeros. We present a new learning approach for binary data that scales linearly with the number of positive examples. It is based on an iterative split of the tensor (or matrix), on which the binary loss is approximated by a Gaussian loss that can itself be efficiently minimized. Experiments on popular tasks such as data imputation, multi-label prediction, link prediction in graphs and item recommendation illustrate the benefit of the proposed approaches.
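As background for the nuclear-norm relaxation this abstract builds on, here is a minimal sketch (not code from the tutorial) of convex low-rank matrix completion by proximal gradient with singular-value thresholding, assuming numpy; the function and parameter names are hypothetical:

```python
import numpy as np

def svt_complete(Y, mask, lam=0.5, step=1.0, iters=200):
    # Proximal gradient on 0.5*||mask*(X - Y)||_F^2 + lam*||X||_*,
    # the nuclear-norm convex relaxation of low-rank matrix completion.
    X = np.zeros_like(Y)
    for _ in range(iters):
        G = mask * (X - Y)                                    # gradient of the data term
        U, s, Vt = np.linalg.svd(X - step * G, full_matrices=False)
        X = U @ np.diag(np.maximum(s - step * lam, 0.0)) @ Vt # shrink singular values
    return X

# Toy usage: recover a rank-2 matrix from ~60% of its entries.
rng = np.random.default_rng(0)
M = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 30))
mask = (rng.random(M.shape) < 0.6).astype(float)
Xhat = svt_complete(mask * M, mask)
print(np.linalg.norm(Xhat - M) / np.linalg.norm(M))           # relative error
```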

David Brie 1. Uniqueness of NMF.    2. Tensors in antenna array processing

Part 1: Uniqueness and admissible solutions of NMF
Non-negative Matrix Factorisation (NMF) has a wide range of applications, including chemometrics, data mining, hyperspectral imaging, speech processing, and musical signal processing. However, in general, such a factorization is not unique. To better understand this, we will review some conditions under which the NMF is unique and, when it is not, how to characterize the set of admissible solutions. A special focus will be given to the geometrical interpretation of the NMF uniqueness conditions. Using this geometrical interpretation, we will also see how additional constraints (minimal volume enclosing simplex, sparseness) may ensure the identifiability of the NMF.
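To make the non-uniqueness concrete, here is a small numpy demonstration (my illustration, not material from the talk): any invertible Q that keeps both factors non-negative produces another admissible NMF of the same matrix.

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.random((6, 2))
H = rng.uniform(0.5, 1.0, (2, 8))   # entries bounded away from 0 on purpose
X = W @ H

# If W @ Q >= 0 and inv(Q) @ H >= 0, then (W @ Q, inv(Q) @ H) is another NMF of X.
Q = np.array([[1.0, 0.3],
              [0.2, 1.0]])          # a hypothetical mixing chosen so both hold here
W2, H2 = W @ Q, np.linalg.inv(Q) @ H
print(np.allclose(X, W2 @ H2))           # True: the product is unchanged
print((W2 >= 0).all(), (H2 >= 0).all())  # True, True: an admissible solution
```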

Part 2: CP based array processing
Array processing is a fundamental problem in signal processing, for which ESPRIT-based approaches exploit the notion of rotational invariance. Starting from the ESPRIT method, we will present the key ideas of the Candecomp/Parafac (CP) based array processing approach, which provides a very elegant way to handle multiple invariances. We will show how this approach can be extended to arrays having multiple scales of invariance and to vector sensor array processing.
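For readers unfamiliar with ESPRIT, here is a minimal numpy sketch (my illustration, not from the talk) of the rotational-invariance idea on a uniform linear array:

```python
import numpy as np

rng = np.random.default_rng(0)
m, t = 8, 200                                   # sensors, snapshots
mu = np.array([-1.1, 0.5])                      # true inter-sensor phase shifts
A = np.exp(1j * np.outer(np.arange(m), mu))     # steering matrix of the array
S = rng.standard_normal((2, t)) + 1j * rng.standard_normal((2, t))
N = 0.01 * (rng.standard_normal((m, t)) + 1j * rng.standard_normal((m, t)))
X = A @ S + N

# Rotational invariance: the subarray of sensors 2..m equals the subarray of
# sensors 1..m-1 up to the diagonal phase factor diag(exp(1j*mu)).
U = np.linalg.svd(X)[0][:, :2]                  # signal subspace
Psi = np.linalg.pinv(U[:-1]) @ U[1:]            # least-squares invariance fit
print(np.sort(np.angle(np.linalg.eigvals(Psi))))  # ~ sorted true mu
```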

Rasmus Bro Tensor analysis in chemistry and biology

We will discuss commonly used multi-way methods and show how they can be useful in solving real-world problems in chemistry, medicine and biology. Focus will be on understanding the PARAFAC model (also known as CANDECOMP); variants such as PARAFAC2 and the Tucker3 model will also be mentioned, as well as multi-way regression methods. It will be shown how to validate and use tensor analysis in real-world examples.
 
Throughout the presentations, MATLAB will be used to illustrate practical aspects of the analysis as well as common problems. If you bring your own data, it will be possible to analyze them.
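Since the course itself uses MATLAB, the following is only a language-neutral sketch in Python/numpy (my illustration, not course material) of the PARAFAC/CP model fitted by alternating least squares:

```python
import numpy as np

def khatri_rao(P, Q):
    # Column-wise Kronecker product.
    return np.einsum('ir,jr->ijr', P, Q).reshape(-1, P.shape[1])

def cp_als(X, r, iters=100, seed=0):
    # Minimal 3-way PARAFAC (CP) fit: x_ijk ~ sum_r A[i,r] B[j,r] C[k,r].
    I, J, K = X.shape
    rng = np.random.default_rng(seed)
    A, B, C = (rng.standard_normal((d, r)) for d in (I, J, K))
    for _ in range(iters):
        # Each factor solves a linear least-squares problem against an unfolding.
        A = X.reshape(I, -1) @ np.linalg.pinv(khatri_rao(B, C).T)
        B = np.moveaxis(X, 1, 0).reshape(J, -1) @ np.linalg.pinv(khatri_rao(A, C).T)
        C = np.moveaxis(X, 2, 0).reshape(K, -1) @ np.linalg.pinv(khatri_rao(A, B).T)
    return A, B, C

# Synthetic rank-3 tensor; the relative residual should be driven near zero.
rng = np.random.default_rng(1)
A0, B0, C0 = (rng.standard_normal((d, 3)) for d in (5, 6, 7))
X = np.einsum('ir,jr,kr->ijk', A0, B0, C0)
A, B, C = cp_als(X, 3)
print(np.linalg.norm(X - np.einsum('ir,jr,kr->ijk', A, B, C)) / np.linalg.norm(X))
```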

Nicolas Dobigeon Bayesian fusion of multi-band images

In this talk, we address the problem of fusing remotely sensed multi-band images. The observed images are related to the high spectral and high spatial resolution image to be recovered through physical degradations, e.g., spatial and spectral blurring and/or subsampling defined by the sensor characteristics. The fusion problem will be formulated and solved within a Bayesian estimation framework. Three different prior models will be introduced, as well as the associated algorithms required to approximate the Bayesian estimators of the scene of interest. In particular, one of the proposed models allows the spectral blurring operator to be estimated jointly with the high spectral and high spatial resolution image.
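To fix ideas, here is a hypothetical version of the forward (degradation) model described above, assuming numpy and scipy; the shapes, blur width and spectral responses are illustrative only:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

# Target scene X (bands x rows x cols) observed through two degraded channels:
# a spatially blurred and subsampled hyperspectral image, and a spectrally
# degraded multispectral image. Bayesian fusion estimates X from both.
rng = np.random.default_rng(0)
X = rng.random((8, 32, 32))                                 # 8 bands, 32x32 pixels
R = rng.random((3, 8)); R /= R.sum(axis=1, keepdims=True)   # spectral responses

Y_hs = gaussian_filter(X, sigma=(0, 2, 2))[:, ::4, ::4]     # spatial blur + 4x subsampling
Y_ms = np.einsum('mb,bij->mij', R, X)                       # spectral degradation
print(Y_hs.shape, Y_ms.shape)                               # (8, 8, 8) (3, 32, 32)
```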

Remi Gribonval Learning low-dimensional models with penalized matrix factorization

Sparse modeling has become a very popular tool in signal processing and machine learning, where many tasks can be expressed as underdetermined linear inverse problems. Together with a growing family of related low-dimensional signal models, sparse models expressed with signal dictionaries have given rise to algorithms combining provably good performance with bounded complexity.
Through data-driven principles to choose the model, dictionary learning leverages these algorithms to address applications ranging from denoising to inpainting and super-resolution. This tutorial is meant to draw a panorama of dictionary learning and related (sparse, non-negative, …) matrix factorization approaches for low-dimensional modeling. The talk will cover principles, connections with blind signal separation, algorithms, as well as an overview of recent theoretical results guaranteeing the robust identifiability of overcomplete dictionaries in the presence of noise and outliers.
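As a concrete instance of penalized matrix factorization (my generic sketch, not the tutorial's algorithm), the following alternates l1-penalized sparse coding (ISTA) with a least-squares dictionary update, assuming numpy; all names are hypothetical:

```python
import numpy as np

def ista(D, x, lam=0.1, iters=100):
    # Sparse coding: min_z 0.5*||x - D z||^2 + lam*||z||_1 via ISTA.
    L = np.linalg.norm(D, 2) ** 2            # Lipschitz constant of the gradient
    z = np.zeros(D.shape[1])
    for _ in range(iters):
        u = z - D.T @ (D @ z - x) / L        # gradient step
        z = np.sign(u) * np.maximum(np.abs(u) - lam / L, 0.0)  # soft threshold
    return z

def dictionary_learning(X, n_atoms, lam=0.1, outer=20):
    # Alternate sparse coding and a least-squares dictionary update.
    rng = np.random.default_rng(0)
    D = rng.standard_normal((X.shape[0], n_atoms))
    D /= np.linalg.norm(D, axis=0)
    for _ in range(outer):
        Z = np.stack([ista(D, x, lam) for x in X.T], axis=1)
        D = X @ np.linalg.pinv(Z)
        D /= np.maximum(np.linalg.norm(D, axis=0), 1e-12)  # renormalize atoms
    return D, Z

# Toy usage on random data (illustration only).
X = np.random.default_rng(1).standard_normal((8, 100))
D, Z = dictionary_learning(X, n_atoms=12)
print(np.linalg.norm(X - D @ Z) / np.linalg.norm(X))
```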

Lek-Heng Lim The Role of Tensors in Numerical Mathematics

A tensor of order d or d-tensor is a multilinear map. It extends linear operators and bilinear forms, which are 2-tensors. We discuss how studies of higher-order tensors (i.e., d > 2) can shed light on questions in numerical linear algebra and optimization.
We start from a classical result: The arithmetic complexity of matrix multiplication is determined by the rank of a 3-tensor. This insight of Strassen, when further developed, permits us to answer more "modern" questions: (i) Can one extend the popular techniques of L^1-norm sparse vector recovery and nuclear norm low-rank matrix completion to tensors of order 3 or higher? (ii) Can one multiply two 3-tensors in a way that generalizes the usual matrix-matrix multiplication? The answer to both questions is no.
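Strassen's classical result can be checked directly: his seven products are exactly a rank-7 decomposition of the 2x2 matrix multiplication tensor (the naive algorithm uses 8). A small numpy verification (my illustration, not from the talk):

```python
import numpy as np

def strassen_2x2(A, B):
    # Strassen's 7 multiplications for the product of two 2x2 matrices.
    m1 = (A[0,0] + A[1,1]) * (B[0,0] + B[1,1])
    m2 = (A[1,0] + A[1,1]) * B[0,0]
    m3 = A[0,0] * (B[0,1] - B[1,1])
    m4 = A[1,1] * (B[1,0] - B[0,0])
    m5 = (A[0,0] + A[0,1]) * B[1,1]
    m6 = (A[1,0] - A[0,0]) * (B[0,0] + B[0,1])
    m7 = (A[0,1] - A[1,1]) * (B[1,0] + B[1,1])
    return np.array([[m1 + m4 - m5 + m7, m3 + m5],
                     [m2 + m4,           m1 - m3 + m6 - m7]])

A, B = np.random.default_rng(0).standard_normal((2, 2, 2))
print(np.allclose(strassen_2x2(A, B), A @ B))   # True
```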
Our next question comes from nonlinear optimization. Can one extend the 1st-order (KKT) and 2nd-order conditions for optimality of constrained optimization problems to higher orders? We will derive 3rd- and 4th-order conditions, involving 3- and 4-tensors, for constrained optimality, and show that 5th- and higher-order conditions do not in general exist. One might notice a parallel with results on algebraic expressions for polynomial roots, but the two are unrelated.
We end with a curious nugget from convex optimization. Nesterov and Nemirovskii famously showed that conic programming problems with self-concordant barriers may be solved in polynomial time. Casting self-concordance as a statement about 6-tensors, we deduce that deciding self-concordance is itself an NP-hard problem.

Giorgio Ottaviani On uniqueness of tensor decomposition

Uniqueness of the CP tensor decomposition (PARAFAC or CANDECOMP) is a crucial property, needed in many applications, which allows one to recover the individual summands from a tensor. A tensor which has a unique minimal tensor decomposition is called identifiable. The well-known Kruskal criterion, dating from 1977, gives a sufficient condition guaranteeing identifiability of a specific tensor. A related question is generic k-identifiability, which asks whether the general tensor of rank k is identifiable. The Kruskal criterion answers this question affirmatively when the rank is relatively small: precisely, for n*n*n tensors, it guarantees generic k-identifiability for k <= 3n/2 - 1. This result has a large number of applications, but it is still unsatisfactory because it covers a range of ranks only up to a bound linear in n, while general n*n*n tensors have a rank that grows quadratically with n. The picture is similar for a*b*c formats and for more than three modes.
There has been much recent activity on this topic using tools from algebraic geometry. Chiantini and Ciliberto discovered in 2001 that a classical paper by Terracini from 1911 contained a clever idea which makes it possible to treat identifiability by infinitesimal computations. This is counter-intuitive, because two different decompositions of the same tensor can be very different from each other. Terracini's method ("weak defectivity") detects whether a second decomposition may exist by infinitesimal computations "in a neighborhood of the first one". This idea can be implemented in an algorithm which detects generic identifiability by linear algebra alone, that is, by computing the rank of certain (large) Jacobian and Hessian-like matrices. Running this algorithm, in several collaborations with Bocci, Chiantini and Vannieuwenhoven, it has been discovered that generic k-identifiability holds for all subgeneric ranks k, apart from a list of exceptions (the weakly defective cases).
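The Jacobian part of this computation is easy to sketch. By Terracini's lemma, the dimension of the cone over the k-th secant variety of rank-one n*n*n tensors equals the rank of the Jacobian of the parametrization at a generic point; a rank below the naive count flags a defective case (the Hessian-like test for weak defectivity is not shown). A minimal numpy version, my illustration only:

```python
import numpy as np

def secant_cone_dim(n, k, seed=0):
    # Rank of the Jacobian of (A, B, C) -> sum_i a_i (x) b_i (x) c_i
    # at a random point: the dimension of the cone over the k-th secant variety.
    rng = np.random.default_rng(seed)
    A, B, C = rng.standard_normal((3, n, k))
    cols = []
    for i in range(k):
        for j in range(n):
            e = np.zeros(n); e[j] = 1.0
            cols.append(np.einsum('a,b,c->abc', e, B[:, i], C[:, i]).ravel())
            cols.append(np.einsum('a,b,c->abc', A[:, i], e, C[:, i]).ravel())
            cols.append(np.einsum('a,b,c->abc', A[:, i], B[:, i], e).ravel())
    return np.linalg.matrix_rank(np.stack(cols, axis=1))

n, k = 3, 4
print(secant_cone_dim(n, k), min(k * (3 * n - 2), n ** 3))
# For 3x3x3 tensors with k = 4 (Strassen's classical defective example) the
# computed rank should fall one short of the naive count: 26 versus 27.
```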
These results have a rigorous theoretical confirmation in the symmetric case, while in the general case there is still a gap between the state of the art of theoretical proofs and the numerical experiments. A corollary of these results is the computation of the dimension of secant varieties for tensors. In particular, this gives a different approach to the Alexander-Hirschowitz theorem; this approach was already used in particular cases by Strassen in 1981. The algorithm also gives a sufficient condition to check identifiability of specific tensors. There is an additional difficulty for this task, related to our ignorance about the equations of secant varieties. Nevertheless, in many cases this algorithm can detect identifiability of specific tensors beyond the Kruskal bound, though still in a range linear in n. In the last part of the talk I will report on recent results on the identifiability of general tensors, and on their canonical forms, obtained by means of homotopy continuation techniques, in collaboration with Hauenstein, Oeding and Sommese.

Cedric Richard Geometrical approaches in NMF, applications to hyperspectral data unmixing

The emergence of hyperspectral imaging sensors in recent years has brought new opportunities and challenges in image analysis. Hyperspectral images are cubes of data, measuring the spectral composition within a spatial field of view. In hyperspectral imaging, spectral unmixing is one of the most challenging and fundamental problems. It consists of breaking down the spectrum of a mixed pixel into a set of pure spectra, called endmembers, and their contributions, called abundances, by solving a non-negative matrix factorization problem. Many endmember extraction techniques have been proposed in the literature. In this presentation, we review those based on a geometrical formulation.
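One simple member of this geometrical family (shown here as my illustration, not necessarily one covered in the talk) is the Successive Projection Algorithm, which picks endmembers as the extreme points of the data under a pure-pixel assumption:

```python
import numpy as np

def spa(X, r):
    # Successive Projection Algorithm: greedily pick the pixel farthest from
    # the span of those already chosen; under the pure-pixel assumption the
    # selected columns are the endmembers.
    R = X.astype(float).copy()
    idx = []
    for _ in range(r):
        j = int(np.argmax(np.linalg.norm(R, axis=0)))   # most extreme pixel
        idx.append(j)
        u = R[:, j] / np.linalg.norm(R[:, j])
        R -= np.outer(u, u @ R)                         # project out its direction
    return idx

# Toy usage: 200 mixed pixels are convex combinations of 3 endmembers, and the
# 3 pure pixels are included as the first columns; SPA should recover them.
rng = np.random.default_rng(0)
E = rng.random((20, 3))                                 # 20 bands, 3 endmembers
W = rng.dirichlet(np.ones(3), size=200).T               # abundances (sum to 1)
X = np.hstack([E, E @ W])
print(sorted(spa(X, 3)))                                # should be [0, 1, 2]
```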