Publications
- M. M. Bronstein, A. M. Bronstein, Biometrics was no match for hair-raising tricks, Nature Vol. 420, 2002
M. M. Bronstein, A. M. Bronstein, M. Zibulevsky, H. Azhari, Reconstruction in ultrasound diffraction tomography using non-uniform FFT, IEEE Trans. on Medical Imaging, Vol. 21(11), 2002
We show an iterative reconstruction framework for diffraction ultrasound tomography. The use of broadband illumination allows a significant reduction of the number of projections compared to straight ray tomography. The proposed algorithm makes use of the forward nonuniform fast Fourier transform (NUFFT) for iterative Fourier inversion. Incorporation of total variation regularization allows the reduction of noise and Gibbs phenomena while preserving the edges. The complexity of the NUFFT-based reconstruction is comparable to that of the frequency domain interpolation (gridding) algorithm, whereas the reconstruction accuracy (in the sense of the L2 and the L∞ norms) is better.
M. M. Bronstein, A. M. Bronstein, M. Zibulevsky, Y. Y. Zeevi, Iterative reconstruction in diffraction tomography using non-uniform fast Fourier transform, Proc. Int'l Symposium on Biomedical Imaging (ISBI), 2002
We show an iterative reconstruction framework for diffraction ultrasound tomography. The use of broadband illumination allows the number of projections to be reduced significantly compared to straight ray tomography. The proposed algorithm makes use of the fast forward non-uniform Fourier transform (NUFFT) for iterative Fourier inversion. Incorporation of total variation regularization allows noise and Gibbs phenomena to be reduced whilst preserving the edges.
A. M. Bronstein, M. M. Bronstein, M. Zibulevsky, Y. Y. Zeevi, Optimal nonlinear estimation of photon coordinates in PET, Proc. Int'l Symposium on Biomedical Imaging (ISBI), 2002
We consider detection of high-energy photons in PET using thick scintillation crystals. The parallax effect and multiple Compton interactions in this type of crystal significantly reduce the accuracy of conventional detection methods. In order to estimate the scintillation point coordinates based on photomultiplier responses, we use asymptotically optimal nonlinear techniques, implemented by feed-forward neural networks, radial basis function (RBF) networks, and neuro-fuzzy systems. Incorporation of information about angles of incidence of photons significantly improves the accuracy of estimation. The proposed estimators are fast enough to perform detection using conventional computers.
- A. M. Bronstein, M. M. Bronstein, M. Zibulevsky, Y. Y. Zeevi, Optimal nonlinear line-of-flight estimation in positron emission tomography, IEEE Trans. on Nuclear Science, Vol. 50(3), 2003
We consider detection of high-energy photons in PET using thick scintillation crystals. The parallax effect and multiple Compton interactions in such crystals significantly reduce the accuracy of conventional detection methods. In order to estimate the photon line of flight based on photomultiplier responses, we use asymptotically optimal nonlinear techniques, implemented by feedforward and radial basis function (RBF) neural networks. Incorporation of information about angles of incidence of photons significantly improves the accuracy of estimation. The proposed estimators are fast enough to perform detection using conventional computers. Monte-Carlo simulation results show that our approach significantly outperforms the conventional Anger algorithm.
A. M. Bronstein, M. M. Bronstein, M. Zibulevsky, Y. Y. Zeevi, Separation of semireflective layers using Sparse ICA, Proc. Int'l Conf. on Acoustics Speech and Signal Processing (ICASSP), 2003
We address the problem of Blind Source Separation (BSS) of superimposed images and, in particular, consider the recovery of a scene recorded through a semi-reflective medium (e.g. glass windshield) from its mixture with a virtual reflected image. We extend the Sparse ICA (SPICA) approach to BSS and apply it to the separation of the desired image from the superimposed images, without having any a priori knowledge about its structure and/or statistics. Advances in the SPICA approach are discussed. Simulations and experimental results illustrate the efficiency of the proposed approach, and of its specific implementation in a simple algorithm of low computational cost. The approach and the algorithm are generic in that they can be adapted and applied to a wide range of BSS problems involving one-dimensional signals or images.
A. M. Bronstein, M. M. Bronstein, R. Kimmel, Expression-invariant 3D face recognition, Proc. Audio- and Video-based Biometric Person Authentication (AVBPA), Lecture Notes in Comp. Science No. 2688, Springer, 2003
We present a novel 3D face recognition approach based on geometric invariants introduced by Elad and Kimmel. The key idea of the proposed algorithm is a representation of the facial surface, invariant to isometric deformations, such as those resulting from different expressions and postures of the face. The obtained geometric invariants allow mapping 2D facial texture images into special images that incorporate the 3D geometry of the face. These signature images are then decomposed into their principal components. The result is an efficient and accurate face recognition algorithm that is robust to facial expressions. We demonstrate the results of our method and compare it to existing 2D and 3D face recognition algorithms.
- A. M. Bronstein, M. M. Bronstein, E. Gordon, R. Kimmel, Fusion of 2D and 3D data in three-dimensional face recognition, Proc. Int'l Conf. on Image Processing (ICIP), 2004
We discuss the synthesis between the 3D and the 2D data in three-dimensional face recognition. We show how to compensate for illumination and facial expressions using the 3D facial geometry and present the approach of canonical images, which allows geometric information to be incorporated into standard face recognition approaches.
M. M. Bronstein, A. M. Bronstein, M. Zibulevsky, Y. Y. Zeevi, Optimal sparse representations for blind source separation and blind deconvolution: a learning approach, Proc. Int'l Conf. on Image Processing (ICIP), 2004
We present a generic approach that allows sparse blind deconvolution and blind source separation algorithms to be adapted to arbitrary sources. The key idea is to bring the problem to the case in which the underlying sources are sparse by applying a sparsifying transformation to the mixtures. We present simulation results and show that such a transformation can be found by training. Properties of the optimal sparsifying transformation are highlighted by an example with aerial images.
A. M. Bronstein, M. M. Bronstein, M. Zibulevsky, Y. Y. Zeevi, Fast relative Newton algorithm for blind deconvolution of images, Proc. Int'l Conf. on Image Processing (ICIP), 2004
We present an efficient Newton-like algorithm for quasi-maximum likelihood (QML) blind deconvolution of images. This algorithm exploits the sparse structure of the Hessian. An optimal distribution-shaping approach by means of sparsification allows one to use a simple and convenient sparsity prior for processing a wide range of natural images. Simulation results demonstrate the efficiency of the proposed method.
A. M. Bronstein, M. M. Bronstein, R. Kimmel, A. Spira, Face recognition from facial surface metric, Proc. European Conf. on Computer Vision (ECCV), 2004
Recently, a 3D face recognition approach based on geometric invariant signatures has been proposed. The key idea is a representation of the facial surface, invariant to isometric deformations, such as those resulting from facial expressions. One important stage in the construction of the geometric invariants involves measuring geodesic distances on triangulated surfaces, which is carried out by the fast marching on triangulated domains algorithm. Proposed here is a method that uses only the metric tensor of the surface for geodesic distance computation. That is, the explicit integration of the surface in 3D from its gradients is not needed for the recognition task. This enables the use of simple and cost-efficient 3D acquisition techniques such as photometric stereo. Avoiding the explicit surface reconstruction stage saves computational time and reduces numerical errors.
A. M. Bronstein, M. M. Bronstein, M. Zibulevsky, Blind source separation using block-coordinate relative Newton method, Signal Processing, Vol. 84(8), 2004
Presented here is a generalization of the relative Newton method, recently proposed for quasi-maximum likelihood blind source separation. The special structure of the Hessian matrix allows performing block-coordinate Newton descent, which significantly reduces the algorithm's computational complexity and boosts its performance. Simulations based on artificial and real data showed that the separation quality of the proposed algorithm is superior to that of other accepted blind source separation methods.
A. M. Bronstein, M. M. Bronstein, M. Zibulevsky, Blind source separation using the block-coordinate relative Newton method, Proc. Int'l Conf. on Independent Component Analysis and Blind Signal Separation, Lecture Notes in Comp. Science No. 3195, Springer, 2004
Presented here is a generalization of the modified relative Newton method, recently proposed by Zibulevsky for quasi-maximum likelihood blind source separation. The special structure of the Hessian matrix allows performing block-coordinate Newton descent, which significantly reduces the algorithm's computational complexity and boosts its performance. Simulations based on artificial and real data show that the proposed algorithm outperforms other accepted blind source separation methods in separation quality.
A. M. Bronstein, M. M. Bronstein, M. Zibulevsky, Y. Y. Zeevi, QML blind deconvolution: asymptotic analysis, Proc. Int'l Conf. on Independent Component Analysis and Blind Signal Separation, 2004
Blind deconvolution is considered as a problem of quasi-maximum likelihood (QML) estimation of the restoration kernel. Simple closed-form expressions for the asymptotic estimation error are derived. The asymptotic performance bounds coincide with the Cramér-Rao bounds when the true ML estimator is used. Conditions for asymptotic stability of the QML estimator are derived. Special cases when the estimator is super-efficient are discussed.
A. M. Bronstein, M. M. Bronstein, M. Zibulevsky, Y. Y. Zeevi, Optimal sparse representations for blind deconvolution of images, Proc. Int'l Conf. on Independent Component Analysis and Blind Signal Separation, 2004
The relative Newton algorithm, previously proposed for quasi-maximum likelihood blind source separation and blind deconvolution of one-dimensional signals, is generalized for blind deconvolution of images. A smooth approximation of the absolute value is used in modeling the log probability density function, which is suitable for sparse sources. We propose a method of sparsification, which allows blind deconvolution of sources with arbitrary distribution, and show how to find optimal sparsifying transformations by training.
A. M. Bronstein, M. M. Bronstein, M. Zibulevsky, Y. Y. Zeevi, Quasi maximum likelihood blind deconvolution of images acquired through scattering media, Proc. Int'l Symposium on Biomedical Imaging (ISBI), 2004
We address the problem of restoration of images obtained through a scattering medium. We present an efficient quasi-maximum likelihood blind deconvolution approach based on the fast relative Newton algorithm and an optimal distribution shaping approach (sparsification), which allows the use of a simple and convenient sparsity prior for a wide class of images. Simulation results prove the efficiency of the proposed method.
- A. M. Bronstein, M. M. Bronstein, R. Kimmel, Three-dimensional face recognition, Int'l Journal of Computer Vision (IJCV), Vol. 64(1), 2005
An expression-invariant 3D face recognition approach is presented. Our basic assumption is that facial expressions can be modeled as isometries of the facial surface. This allows constructing expression-invariant representations of faces using the canonical forms approach. The result is an efficient and accurate face recognition algorithm, robust to facial expressions, that can distinguish between identical twins (the first two authors). We demonstrate a prototype system based on the proposed algorithm and compare its performance to classical face recognition methods. The numerical methods employed by our approach do not require the facial surface explicitly. The surface gradient field, or the surface metric, is sufficient for constructing the expression-invariant representation of any given face. This allows us to perform the 3D face recognition task while avoiding the surface reconstruction stage.
A. M. Bronstein, M. M. Bronstein, M. Zibulevsky, Quasi maximum likelihood blind deconvolution: super- and sub-Gaussianity versus consistency, IEEE Trans. Signal Processing, Vol. 53(7), 2005
In this note we consider the problem of MIMO quasi maximum likelihood (QML) blind deconvolution. We examine two classes of estimators, which are commonly believed to be suitable for super- and sub-Gaussian sources. We state the consistency conditions and demonstrate a distribution for which the studied estimators are unsuitable, in the sense that they are asymptotically unstable.
A. M. Bronstein, M. M. Bronstein, M. Zibulevsky, Relative optimization for blind deconvolution, IEEE Trans. on Signal Processing, Vol. 53(6), 2005
We propose a relative optimization framework for quasi-maximum likelihood (QML) blind deconvolution and the relative Newton method as its particular instance. The special Hessian structure allows fast Newton system construction and solution, resulting in a fast-convergent algorithm with iteration complexity comparable to that of gradient methods. We also propose the use of rational IIR restoration kernels, which constitute a richer family of filters than the traditionally used FIR kernels. We discuss different choices of non-linear functions suitable for deconvolution of super- and sub-Gaussian sources and formulate the conditions under which the QML estimation is stable. Simulation results demonstrate the efficiency of the proposed methods.
M. M. Bronstein, A. M. Bronstein, M. Zibulevsky, Y. Y. Zeevi, Blind deconvolution of images using optimal sparse representations, IEEE Trans. on Image Processing, Vol. 14(6), 2005
The relative Newton algorithm, previously proposed for quasi-maximum likelihood blind source separation and blind deconvolution of one-dimensional signals, is generalized for blind deconvolution of images. A smooth approximation of the absolute value is used in modeling the log probability density function, which is suitable for sparse sources. In addition, we propose a method of sparsification, which allows blind deconvolution of sources with arbitrary distribution, and show how to find optimal sparsifying transformations by training.
A. M. Bronstein, M. M. Bronstein, R. Kimmel, Expression-invariant face recognition via spherical embedding, Proc. Int'l Conf. on Image Processing (ICIP), 2005
Recently, it was proven empirically that facial expressions can be modeled as isometries, that is, geodesic distances on the facial surface were shown to be significantly less sensitive to facial expressions compared to Euclidean ones. Based on this assumption, the 3DFACE face recognition system was built. The system efficiently computes expression-invariant signatures based on an isometry-invariant representation of the facial surface. One of the crucial steps in the recognition system was embedding of the face geometric structure into a Euclidean (flat) space. Here, we propose to replace the flat embedding by a spherical one to construct isometry-invariant representations of the facial image. We refer to these new invariants as spherical canonical images. Compared to its Euclidean counterpart, spherical embedding leads to notably smaller metric distortion. We demonstrate experimentally that representations with lower embedding error lead to better recognition. In order to efficiently compute the invariants, we introduce a dissimilarity measure between the spherical canonical images based on the spherical harmonic transform.
A. M. Bronstein, M. M. Bronstein, M. Zibulevsky, Y. Y. Zeevi, Unmixing tissues: sparse component analysis in multi-contrast MRI, Proc. Int'l Conf. on Image Processing (ICIP), 2005
We pose the problem of tissue classification in MRI as a blind source separation (BSS) problem and solve it by means of sparse component analysis (SCA). Assuming that most MR images can be sparsely represented, we consider their optimal sparse representation. Sparse components define a physically-meaningful feature space for classification. We demonstrate our approach on simulated and real multi-contrast MRI data. The proposed framework is general in that it is applicable to other modalities of medical imaging as well, whenever the linear mixing model is applicable.
A. M. Bronstein, M. M. Bronstein, R. Kimmel, Isometric embedding of facial surfaces into S^3, Proc. Int'l Conf. on Scale Space and PDE Methods in Computer Vision (SSVM), 2005
The problem of isometry-invariant representation and comparison of surfaces is of cardinal importance in pattern recognition applications dealing with deformable objects. Particularly, in three-dimensional face recognition, treating facial expressions as isometries of the facial surface allows robust recognition that is insensitive to expressions. Isometry-invariant representation of surfaces can be constructed by isometrically embedding them into some convenient space, and carrying out the comparison in that space. Presented here is a discussion on isometric embedding into S^3, which appears to be superior to the previously used Euclidean space in the sense of representation accuracy.
M. M. Bronstein, A. M. Bronstein, R. Kimmel, I. Yavneh, A multigrid approach for multi-dimensional scaling, Proc. Copper Mountain Conf. Multigrid Methods, 2005 (Best Paper Award)
A multigrid approach for the efficient solution of large-scale multidimensional scaling (MDS) problems is presented. The main motivation is a recent application of MDS to isometry-invariant representation of surfaces, in particular, for expression-invariant recognition of human faces. Simulation results show that the proposed approach significantly outperforms conventional MDS algorithms.
A. M. Bronstein, M. M. Bronstein, M. Zibulevsky, Y. Y. Zeevi, Sparse ICA for blind separation of transmitted and reflected images, Int'l Journal of Imaging Science and Technology (IJIST), Vol. 15(1), 2005
We address the problem of recovering a scene recorded through a semi-reflecting medium (i.e. planar lens), with a virtual reflected image being superimposed on the image of the scene transmitted through the semi-reflective lens. Recent studies propose imaging through a linear polarizer at several orientations to estimate the reflected and the transmitted components in the scene. In this study, we extend the sparse ICA (SPICA) technique and apply it to the problem of separating the image of the scene without having any a priori knowledge about its structure or statistics. Recent advances in the SPICA approach are discussed. Simulation and experimental results demonstrate the efficacy of the proposed methods.
- A. M. Bronstein, M. M. Bronstein, R. Kimmel, Robust expression-invariant face recognition from partially missing data, Proc. European Conf. on Computer Vision (ECCV), 2006
Recent studies on three-dimensional face recognition proposed to model facial expressions as isometries of the facial surface. Based on this model, expression-invariant signatures of the face were constructed by means of approximate isometric embedding into flat spaces. Here, we apply a new method for measuring isometry-invariant similarity between faces by embedding one facial surface into another. We demonstrate that our approach has several significant advantages, one of which is the ability to handle partially missing data. Promising face recognition results are obtained in numerical experiments even when the facial surfaces are severely occluded.
A. M. Bronstein, M. M. Bronstein, A. M. Bruckstein, R. Kimmel, Matching two-dimensional articulated shapes using generalized multidimensional scaling, Proc. Conf. on Articulated Motion and Deformable Objects (AMDO), 2006
We present a theoretical and computational framework for matching of two-dimensional articulated shapes. Assuming that articulations can be modeled as near-isometries, we show an axiomatic construction of an articulation-invariant distance between shapes, formulated as a generalized multidimensional scaling (GMDS) problem and solved efficiently. Some numerical results demonstrating the accuracy of our method are presented.
A. M. Bronstein, M. M. Bronstein, R. Kimmel, Face2Face: an isometric model for facial animation, Proc. Conf. on Articulated Motion and Deformable Objects (AMDO), 2006
A geometric framework for finding intrinsic correspondence between animated 3D faces is presented. We model facial expressions as isometries of the facial surface and find the correspondence between two faces as the minimum-distortion mapping. Generalized multidimensional scaling is used for this goal. We apply our approach to texture mapping onto 3D video, expression exaggeration and morphing between faces.
A. M. Bronstein, M. M. Bronstein, R. Kimmel, Efficient computation of isometry-invariant distances between surfaces, SIAM J. Scientific Computing, Vol. 28(5), 2006
A. M. Bronstein, M. M. Bronstein, M. Zibulevsky, On separation of semitransparent dynamic images from static background, Proc. Int'l Conf. on Independent Component Analysis and Blind Signal Separation, 2006
Presented here is the problem of recovering a dynamic image superimposed on a static background. Such a problem is ill-posed and may arise, e.g., in imaging through semireflective media, in separation of an illumination image from a reflectance image, in imaging with diffraction phenomena, etc. In this work we study regularization of this problem in the spirit of Total Variation and general sparsifying transformations.
M. M. Bronstein, A. M. Bronstein, R. Kimmel, I. Yavneh, Multigrid multidimensional scaling, Numerical Linear Algebra with Applications (NLAA), Vol. 13(2), 2006 (Special issue on multigrid methods)
Multidimensional scaling (MDS) is a generic name for a family of algorithms that construct a configuration of points in a target metric space from information about inter-point distances measured in some other metric space. Large-scale MDS problems often occur in data analysis, representation, and visualization. Solving such problems efficiently is of key importance in many applications. In this paper, we present a multigrid framework for MDS problems. We demonstrate the performance of our algorithm on dimensionality reduction and isometric embedding problems, two classical problems requiring efficient large-scale MDS. Simulation results show that the proposed approach significantly outperforms conventional MDS algorithms.
A. M. Bronstein, M. M. Bronstein, R. Kimmel, Generalized multidimensional scaling: a framework for isometry-invariant partial surface matching, Proc. US National Academy of Sciences (PNAS), Vol. 103(5), 2006
An efficient algorithm for isometry-invariant matching of surfaces is presented. The key idea is computing the minimum-distortion mapping between two surfaces. For this purpose, we introduce generalized multidimensional scaling, a computationally efficient continuous optimization algorithm for finding the least-distortion embedding of one surface into another. The generalized multidimensional scaling algorithm allows for both full and partial surface matching. As an example, it is applied to the problem of expression-invariant three-dimensional face recognition.
A. M. Bronstein, M. M. Bronstein, R. Kimmel, Expression invariant face recognition: faces as isometric surfaces, Chapter in Face Processing: Advanced Modeling and Methods (Rama Chellappa, Wenyi Zhao Eds.), Academic Press, 2006
One of the hardest problems in face recognition is dealing with facial expressions. Finding an expression-invariant representation of the face could be a remedy for this problem. We suggest treating faces as deformable surfaces in the context of Riemannian geometry, and propose to approximate facial expressions as isometries of the facial surface. This way, we can define geometric invariants of a given face under different expressions. One such invariant is constructed by isometrically embedding the facial surface structure into a low-dimensional flat space. Based on this approach, we built an accurate three-dimensional face recognition system that is able to distinguish between identical twins under various facial expressions. In this chapter we show how, under the near-isometric model assumption, the difficult problem of face recognition in the presence of facial expressions can be solved in a relatively simple way.
- A. M. Bronstein, M. M. Bronstein, R. Kimmel, Calculus of non-rigid surfaces for geometry and texture manipulation, IEEE Trans. Visualization and Computer Graphics, Vol. 13(5), 2007
We present a geometric framework for automatically finding intrinsic correspondence between three-dimensional nonrigid objects. We model object deformations as near-isometries and find the correspondence as the minimum-distortion mapping. A generalization of multidimensional scaling is used as the numerical core of our approach. As a result, we can manipulate the extrinsic geometry and the texture of the objects as vectors in a linear space. We demonstrate our method on the problems of expression-invariant texture mapping onto an animated three-dimensional face, expression exaggeration, morphing between faces, and virtual body painting.
A. M. Bronstein, M. M. Bronstein, R. Kimmel, Rock, Paper, and Scissors: extrinsic vs. intrinsic similarity of non-rigid shapes, Proc. Int'l Conf. Computer Vision (ICCV), 2007
Rock, Paper, and Scissors: extrinsic vs. intrinsic similarity of non-rigid shapes
A. M. Bronstein, M. M. Bronstein, R. Kimmel
Proc. Int'l Conf. Computer Vision (ICCV), 2007
This paper explores similarity criteria between non-rigid shapes. Broadly speaking, such criteria are divided into intrinsic and extrinsic, the first referring to the metric structure of the objects and the latter to the geometry of the shapes in the Euclidean space. Both criteria have their advantages and disadvantages; extrinsic similarity is sensitive to non-rigid deformations of the shapes, while intrinsic similarity is sensitive to topological noise. Here, we present an approach unifying both criteria in a single distance. Numerical results demonstrate the robustness of our approach in cases where using only extrinsic or intrinsic criteria fails.
A. M. Bronstein, M. M. Bronstein, R. Kimmel, Weighted distance maps computation on parametric three-dimensional manifolds, Journal of Computational Physics, Vol. 225(1), 2007
Weighted distance maps computation on parametric three-dimensional manifolds
A. M. Bronstein, M. M. Bronstein, R. Kimmel
Journal of Computational Physics, Vol. 225(1), 2007
We propose an efficient computational solver for the eikonal equations on parametric three-dimensional manifolds. Our approach is based on the fast marching method for solving the eikonal equation in O(n log n) steps by numerically simulating wavefront propagation. The obtuse angle splitting problem is reformulated as a set of small integer linear programs that can be solved in O(n). Numerical simulations demonstrate the accuracy of the proposed algorithm.
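For readers unfamiliar with fast marching, the O(n log n) wavefront propagation mentioned in the abstract above can be sketched on the simplest possible setting. The code below is only an illustration under strong assumptions: a flat Cartesian grid with unit speed and spacing `h`, not the weighted parametric manifolds treated in the paper. The function name `fast_marching` and its parameters are hypothetical.

```python
import heapq
import math


def fast_marching(n_rows, n_cols, sources, h=1.0):
    """First-order fast marching for |grad T| = 1 on a regular grid.

    Assumption: flat grid, unit speed. Returns arrival times T[r][c].
    """
    T = [[math.inf] * n_cols for _ in range(n_rows)]
    done = [[False] * n_cols for _ in range(n_rows)]
    heap = []
    for r, c in sources:
        T[r][c] = 0.0
        heapq.heappush(heap, (0.0, r, c))

    def accepted(r, c):
        # Value of an already accepted (upwind) neighbour, else infinity.
        if 0 <= r < n_rows and 0 <= c < n_cols and done[r][c]:
            return T[r][c]
        return math.inf

    def update(r, c):
        a = min(accepted(r - 1, c), accepted(r + 1, c))
        b = min(accepted(r, c - 1), accepted(r, c + 1))
        if abs(a - b) >= h:
            # The front arrives from one axis only (also covers a or b = inf).
            return min(a, b) + h
        # Two-sided update: solve (T - a)^2 + (T - b)^2 = h^2 for the larger root.
        return 0.5 * (a + b + math.sqrt(2.0 * h * h - (a - b) ** 2))

    while heap:
        _, r, c = heapq.heappop(heap)
        if done[r][c]:
            continue  # stale heap entry, the point was accepted earlier
        done[r][c] = True
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < n_rows and 0 <= cc < n_cols and not done[rr][cc]:
                t_new = update(rr, cc)
                if t_new < T[rr][cc]:
                    T[rr][cc] = t_new
                    heapq.heappush(heap, (t_new, rr, cc))
    return T
```

Calling `fast_marching(5, 5, [(0, 0)])` propagates a front from the corner of a 5 by 5 grid. The binary heap is what gives the log n factor: each grid point is accepted once, and each acceptance costs a heap operation.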
A. M. Bronstein, M. M. Bronstein, A. M. Bruckstein, R. Kimmel, Paretian similarity for partial comparison of non-rigid objects, Proc. Scale Space and Variational Methods in Computer Vision (SSVM), 2007
Paretian similarity for partial comparison of non-rigid objects
A. M. Bronstein, M. M. Bronstein, A. M. Bruckstein, R. Kimmel
Proc. Scale Space and Variational Methods in Computer Vision (SSVM), 2007
In this paper, we address the problem of partial comparison of non-rigid objects. We introduce a new class of set-valued distances, related to the concept of Pareto optimality in economics. Such distances make it possible to capture intrinsic geometric similarity between parts of non-rigid objects, obtaining semantically meaningful comparison results. The numerical implementation of our method is computationally efficient and is similar to GMDS, a multidimensional scaling-like continuous optimization problem.
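The notion of a set-valued distance built on Pareto optimality can be illustrated with a small sketch. It assumes that each candidate pair of parts has already been scored by two numbers to be minimized, here called `partiality` (how much of the objects was discarded) and `dissimilarity` (how different the remaining parts are). Both names, and the function `pareto_frontier`, are hypothetical, and the sketch does not show how the paper computes these scores.

```python
def pareto_frontier(candidates):
    """Keep the (partiality, dissimilarity) pairs that no other pair dominates.

    A pair dominates another if it is no worse in both criteria and
    strictly better in at least one (both criteria are minimized).
    """
    frontier = []
    best_dissimilarity = float("inf")
    # After sorting by partiality, a pair is non-dominated exactly when its
    # dissimilarity is strictly lower than everything seen so far.
    for partiality, dissimilarity in sorted(set(candidates)):
        if dissimilarity < best_dissimilarity:
            frontier.append((partiality, dissimilarity))
            best_dissimilarity = dissimilarity
    return frontier
```

The returned list, rather than a single number, plays the role of the distance: it records the best achievable dissimilarity for each level of partiality.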
A. M. Bronstein, M. M. Bronstein, A. M. Bruckstein, R. Kimmel, Partial similarity of objects and text sequences, Proc. Information Theory and Applications Workshop, 2007
Partial similarity of objects and text sequences
A. M. Bronstein, M. M. Bronstein, A. M. Bruckstein, R. Kimmel
Proc. Information Theory and Applications Workshop, 2007
Similarity is one of the most important abstract concepts in the human perception of the world. In computer vision, numerous applications deal with comparing objects observed in a scene with some a priori known patterns. Often, it happens that while two objects are not similar, they have large similar parts, that is, they are partially similar. Here, we present a novel approach to quantify this semantic definition of partial similarity using the notion of Pareto optimality. We exemplify our approach on the problems of recognizing non-rigid objects and analyzing text sequences.
A. M. Bronstein, M. M. Bronstein, R. Kimmel, Expression-invariant representation of faces, IEEE Trans. Image Processing, Vol. 16(1), 2007
Expression-invariant representation of faces
A. M. Bronstein, M. M. Bronstein, R. Kimmel
IEEE Trans. Image Processing, Vol. 16(1), 2007
We present an efficient computational framework for isometry-invariant comparison of smooth surfaces. We formulate the Gromov-Hausdorff distance as a multidimensional scaling (MDS)-like continuous optimization problem. In order to construct an efficient optimization scheme, we develop a numerical tool for interpolating geodesic distances on a sampled surface from precomputed geodesic distances between the samples. For isometry-invariant comparison of surfaces in the case of partially missing data, we present the partial embedding distance, which is computed using a similar scheme. The main idea is finding a minimum-distortion mapping from one surface to another while considering only relevant geodesic distances. We discuss numerical implementation issues and present experimental results that demonstrate its accuracy and efficiency.
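To make the idea of a minimum-distortion mapping concrete, here is a toy sketch for two tiny sampled surfaces given by their pairwise distance matrices. It searches all injective point-to-point mappings by brute force, which is only feasible for a handful of points and is not the continuous MDS-like optimization described in the abstract. The function names `distortion` and `min_distortion_mapping` are hypothetical.

```python
from itertools import permutations


def distortion(dx, dy, mapping):
    """Largest change in pairwise distance when point i of X goes to mapping[i] in Y."""
    n = len(dx)
    return max(
        abs(dx[i][j] - dy[mapping[i]][mapping[j]])
        for i in range(n)
        for j in range(n)
    )


def min_distortion_mapping(dx, dy):
    """Brute-force search over injective mappings of X into Y.

    Assumption: X has no more points than Y, and both are small.
    """
    best = min(
        permutations(range(len(dy)), len(dx)),
        key=lambda m: distortion(dx, dy, m),
    )
    return list(best), distortion(dx, dy, best)
```

If `dx` and `dy` describe the same three-point path with the points listed in a different order, the search recovers a relabeling with zero distortion, which is the discrete analogue of the two surfaces being isometric.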
A. M. Bronstein, M. M. Bronstein, R. Kimmel, Story of Cinderella: biometrics and isometry-invariant distances, Chapter in 3D Imaging for Safety and Security (A. Koschan, M. Pollefeys, M. Abidi Eds.), Springer, 2007
Story of Cinderella: biometrics and isometry-invariant distances
A. M. Bronstein, M. M. Bronstein, R. Kimmel
Chapter in 3D Imaging for Safety and Security (A. Koschan, M. Pollefeys, M. Abidi Eds.), Springer, 2007
In this chapter, we address the question of what are the facial measures one could use in order to distinguish between people. Our starting point is the fact that the expressions of our face can, in most cases, be modeled as isometries, which we validate empirically. Then, based on this observation, we introduce a technique that enables us to distinguish between people based on the intrinsic geometry of their faces. We provide empirical evidence that the proposed geometric measures are invariant to facial expressions and relate our findings to the broad context of biometric methods, ranging from modern face recognition technologies to fairy tales and biblical stories.
D. Raviv, A. M. Bronstein, M. M. Bronstein, R. Kimmel, Symmetries of non-rigid shapes, Proc. Workshop on Non-rigid Registration and Tracking through Learning (NRTL), 2007
Symmetries of non-rigid shapes
D. Raviv, A. M. Bronstein, M. M. Bronstein, R. Kimmel
Proc. Workshop on Non-rigid Registration and Tracking through Learning (NRTL), 2007
Symmetry and self-similarity is the cornerstone of Nature, exhibiting itself through the shapes of natural creations and ubiquitous laws of physics. Since many natural objects are symmetric, the absence of symmetry can often be an indication of some anomaly or abnormal behavior. Therefore, detection of asymmetries is important in numerous practical applications, including crystallography, medical imaging, and face recognition, to mention a few. Conversely, the assumption of underlying shape symmetry can facilitate solutions to many problems in shape reconstruction and analysis. Traditionally, symmetries are described as extrinsic geometric properties of the shape. While being adequate for rigid shapes, such a description is inappropriate for non-rigid ones. Extrinsic symmetry can be broken as a result of shape deformations, while its intrinsic symmetry is preserved. In this paper, we pose the problem of finding intrinsic symmetries of non-rigid shapes and propose an efficient method for their computation.
- O. Weber, Y. Devir, A. M. Bronstein, M. M. Bronstein, R. Kimmel, Parallel algorithms for approximation of distance maps on parametric surfaces, ACM Trans. on Graphics, Vol. 27(4), 2008
Parallel algorithms for approximation of distance maps on parametric surfaces
O. Weber, Y. Devir, A. M. Bronstein, M. M. Bronstein, R. Kimmel
ACM Trans. on Graphics, Vol. 27(4), 2008
We present an efficient O(n) numerical algorithm for first-order approximation of geodesic distances on geometry images, where n is the number of points on the surface. The structure of our algorithm allows efficient implementation on parallel architectures. Two implementations on a SIMD processor and on a GPU are discussed. Numerical results demonstrate up to four orders of magnitude improvement in execution time compared to the state-of-the-art algorithms.
A. M. Bronstein, M. M. Bronstein, Regularized partial matching of rigid shapes, Proc. European Conf. on Computer Vision (ECCV), 2008
Regularized partial matching of rigid shapes
A. M. Bronstein, M. M. Bronstein
Proc. European Conf. on Computer Vision (ECCV), 2008
Matching of rigid shapes is an important problem in numerous applications across the boundary of computer vision, pattern recognition and computer graphics communities. A particularly challenging setting of this problem is partial matching, where the two shapes are dissimilar in general but have significant similar parts. In this paper, we show a rigorous approach that makes it possible to find matching parts of rigid shapes with controllable size and regularity. The regularity term we use is similar in spirit to the Mumford-Shah functional, extended to non-Euclidean spaces. Numerical experiments show that the regularized partial matching produces better results compared to the non-regularized one.
A. M. Bronstein, M. M. Bronstein, A. M. Bruckstein, R. Kimmel, Analysis of two-dimensional non-rigid shapes, Int'l Journal of Computer Vision (IJCV), Vol. 78(1), 2008
Analysis of two-dimensional non-rigid shapes
A. M. Bronstein, M. M. Bronstein, A. M. Bruckstein, R. Kimmel
Int'l Journal of Computer Vision (IJCV), Vol. 78(1), 2008
Analysis of deformable two-dimensional shapes is an important problem, encountered in numerous pattern recognition, computer vision, and computer graphics applications. In this paper, we address three major problems in the analysis of non-rigid shapes: similarity, partial similarity, and correspondence. We present an axiomatic construction of similarity criteria for deformation-invariant shape comparison, based on intrinsic geometric properties of the shapes, and show that such criteria are related to the Gromov-Hausdorff distance. Next, we extend the problem of similarity computation to shapes which have similar parts but are dissimilar when considered as a whole and present a construction of set-valued distances, based on the notion of Pareto optimality. Finally, we show that the correspondence between non-rigid shapes can be obtained as a byproduct of the non-rigid similarity problem. As a numerical framework, we use the generalized multidimensional scaling (GMDS) method, which is the numerical core of the three problems addressed in this paper.
A. M. Bronstein, M. M. Bronstein, Not only size matters: regularized partial matching of nonrigid shapes, Proc. Workshop on Nonrigid Shape Analysis and Deformable Image Registration (NORDIA), 2008
Not only size matters: regularized partial matching of nonrigid shapes
A. M. Bronstein, M. M. Bronstein
Proc. Workshop on Nonrigid Shape Analysis and Deformable Image Registration (NORDIA), 2008
Partial matching is probably one of the most challenging problems in nonrigid shape analysis. The problem consists of matching similar parts of shapes that are dissimilar on the whole and can assume different forms by undergoing nonrigid deformations. Conceptually, two shapes can be considered partially matching if they have significant similar parts, with the simplest definition of significance being the size of the parts. Thus, partial matching can be defined as a multicriterion optimization problem trying to simultaneously maximize the similarity and the size of these parts. In this paper, we propose a different definition of significance, taking into account the regularity of parts besides their size. The regularity term proposed here is similar in spirit to the Mumford-Shah functional. Numerical experiments show that the regularized partial matching produces semantically better results compared to the non-regularized one.
A. M. Bronstein, M. M. Bronstein, R. Kimmel, Numerical geometry of non-rigid shapes, Springer, 2008, ISBN: 978-0387733005
Numerical geometry of non-rigid shapes
A. M. Bronstein, M. M. Bronstein, R. Kimmel
Springer, 2008, ISBN: 978-0387733005
Deformable objects are ubiquitous in the world surrounding us, on all levels from micro to macro. The need to study such shapes and model their behavior arises in a wide spectrum of applications, ranging from medicine to security. In recent years, non-rigid shapes have attracted growing interest, which has led to rapid development of the field, where state-of-the-art results from very different sciences – theoretical and numerical geometry, optimization, linear algebra, graph theory, machine learning and computer graphics, to mention several – are applied to find solutions.
This book gives an overview of the current state of science in analysis and synthesis of non-rigid shapes. Everyday examples are used to explain concepts and to illustrate different techniques. The presentation unfolds systematically and numerous figures enrich the engaging exposition. Practice problems follow at the end of each chapter, with detailed solutions to selected problems in the appendix. A gallery of colored images enhances the text.
This book will be of interest to graduate students, researchers and professionals in different fields of mathematics, computer science and engineering. It may be used for courses in computer vision, numerical geometry and geometric modeling and computer graphics or for self-study.
G. Rosman, A. M. Bronstein, M. M. Bronstein, R. Kimmel, Topologically constrained isometric embedding, Human Motion Understanding, Modeling, Capture, and Animation, Computational Imaging and Vision, Vol. 36, Springer, 2008
Topologically constrained isometric embedding
G. Rosman, A. M. Bronstein, M. M. Bronstein, R. Kimmel
Human Motion Understanding, Modeling, Capture, and Animation, Computational Imaging and Vision, Vol. 36, Springer, 2008
We present a new algorithm for nonlinear dimensionality reduction that consistently uses global information, which enables understanding the intrinsic geometry of non-convex manifolds. Compared to methods that consider only local information, our method appears to be more robust to noise. We demonstrate the performance of our algorithm and compare it to state-of-the-art methods on synthetic as well as real data.
R. Giryes, A. M. Bronstein, Y. Moshe, M. M. Bronstein, Embedded system for 3D shape reconstruction, Proc. European DSP Education and Research Symposium (EDERS), 2008
Embedded system for 3D shape reconstruction
R. Giryes, A. M. Bronstein, Y. Moshe, M. M. Bronstein
Proc. European DSP Education and Research Symposium (EDERS), 2008
Many applications that use three-dimensional scanning require a low-cost, accurate and fast solution. This paper presents a fixed-point implementation of a real-time active stereo three-dimensional acquisition system on a Texas Instruments DM6446 EVM board which meets these requirements. A time-multiplexed structured light reconstruction technique is described and a fixed-point algorithm for its implementation is proposed. This technique uses a standard camera and a standard projector. The fixed-point reconstruction algorithm runs on the DSP core while the ARM controls the DSP and is responsible for communication with the camera and projector. The ARM uses the projector to project coded light and the camera to capture a series of images. The captured data is sent to the DSP. The DSP, in turn, performs the 3D reconstruction and returns the results to the ARM for storing. The inter-core communication is performed using the xDM interface and VISA API. Performance evaluation of a fully working prototype proves the feasibility of a fixed-point embedded implementation of a real-time three-dimensional scanner, and the suitability of the DM6446 chip for such a system.
- O. Rubinstein, Y. Honen, A. M. Bronstein, M. M. Bronstein, R. Kimmel, 3D color video camera, Proc. Workshop on 3D Digital Imaging and Modeling (3DIM), 2009
3D color video camera
O. Rubinstein, Y. Honen, A. M. Bronstein, M. M. Bronstein, R. Kimmel
Proc. Workshop on 3D Digital Imaging and Modeling (3DIM), 2009
We introduce a design of a coded light-based 3D color video camera optimized for build up cost as well as accuracy in depth reconstruction and acquisition speed. The components of the system include a monochromatic camera and an off-the-shelf LED projector synchronized by a miniature circuit. The projected patterns are captured and processed at a rate of 200 fps and allow for real-time reconstruction of both depth and color at video rates. The reconstruction and display are performed at around 30 depth profiles and color texture per second using a graphics processing unit (GPU).
M. Ovsjanikov, A. M. Bronstein, M. M. Bronstein, L. Guibas, ShapeGoogle: a computer vision approach for invariant shape retrieval, Proc. Workshop on Nonrigid Shape Analysis and Deformable Image Alignment (NORDIA), 2009
ShapeGoogle: a computer vision approach for invariant shape retrieval
M. Ovsjanikov, A. M. Bronstein, M. M. Bronstein, L. Guibas
Proc. Workshop on Nonrigid Shape Analysis and Deformable Image Alignment (NORDIA), 2009
Feature-based methods have recently gained popularity in computer vision and pattern recognition communities, in applications such as object recognition and image retrieval. In this paper, we explore analogous approaches in the 3D world applied to the problem of non-rigid shape search and retrieval in large databases.
Y. Devir, G. Rosman, A. M. Bronstein, M. M. Bronstein, R. Kimmel, On reconstruction of non-rigid shapes with intrinsic regularization, Proc. Workshop on Nonrigid Shape Analysis and Deformable Image Alignment (NORDIA), 2009
On reconstruction of non-rigid shapes with intrinsic regularization
Y. Devir, G. Rosman, A. M. Bronstein, M. M. Bronstein, R. Kimmel
Proc. Workshop on Nonrigid Shape Analysis and Deformable Image Alignment (NORDIA), 2009
Shape-from-X is a generic type of inverse problem in computer vision, in which a shape is reconstructed from some measurements. An especially challenging setting of this problem is the case in which the reconstructed shapes are non-rigid. In this paper, we propose a framework for intrinsic regularization of such problems. The assumption is that we have the geometric structure of a shape which is intrinsically (up to bending) similar to the one we would like to reconstruct. For that goal, we formulate a variation with respect to vertex coordinates of a triangulated mesh approximating the continuous shape. The numerical core of the proposed method is based on differentiating the fast marching update step for geodesic distance computation.
A. M. Bronstein, M. M. Bronstein, R. Kimmel, Topology-invariant similarity of nonrigid shapes, Int'l Journal of Computer Vision (IJCV), Vol. 81(3), 2009
Topology-invariant similarity of nonrigid shapes
A. M. Bronstein, M. M. Bronstein, R. Kimmel
Int'l Journal of Computer Vision (IJCV), Vol. 81(3), 2009
This paper explores the problem of similarity criteria between nonrigid shapes. Broadly speaking, such criteria are divided into intrinsic and extrinsic, the first referring to the metric structure of the object and the latter to how it is laid out in the Euclidean space. Both criteria have their advantages and disadvantages: extrinsic similarity is sensitive to nonrigid deformations, while intrinsic similarity is sensitive to topological noise. In this paper, we approach the problem from the perspective of metric geometry. We show that by unifying the extrinsic and intrinsic similarity criteria, it is possible to obtain a stronger topology-invariant similarity, suitable for comparing deformed shapes with different topology. We construct this new joint criterion as a tradeoff between the extrinsic and intrinsic similarity and use it as a set-valued distance. Numerical results demonstrate the efficiency of our approach in cases where using either extrinsic or intrinsic criteria alone would fail.
A. M. Bronstein, M. M. Bronstein, A. M. Bruckstein, R. Kimmel, Partial similarity of objects, or how to compare a centaur to a horse, Int'l Journal of Computer Vision (IJCV), Vol. 84(2), 2009
Partial similarity of objects, or how to compare a centaur to a horse
A. M. Bronstein, M. M. Bronstein, A. M. Bruckstein, R. Kimmel
Int'l Journal of Computer Vision (IJCV), Vol. 84(2), 2009
Similarity is one of the most important abstract concepts in human perception of the world. In computer vision, numerous applications deal with comparing objects observed in a scene with some a priori known patterns. Often, it happens that while two objects are not similar, they have large similar parts, that is, they are partially similar. Here, we present a novel approach to quantify partial similarity using the notion of Pareto optimality. We exemplify our approach on the problems of recognizing non-rigid geometric objects, images, and analyzing text sequences.
A. M. Bronstein, M. M. Bronstein, Y. Carmon, R. Kimmel, Partial similarity of shapes using a statistical significance measure, IPSJ Trans. Computer Vision and Application, Vol. 1, 2009
Partial similarity of shapes using a statistical significance measure
A. M. Bronstein, M. M. Bronstein, Y. Carmon, R. Kimmel
IPSJ Trans. Computer Vision and Application, Vol. 1, 2009
Partial matching of geometric structures is important in computer vision, pattern recognition and shape analysis applications. The problem consists of matching similar parts of shapes that may be dissimilar as a whole. Recently, it was proposed to consider partial similarity as a multi-criterion optimization problem trying to simultaneously maximize the similarity and the significance of the matching parts. A major challenge in that framework is providing a quantitative measure of the significance of a part of an object. Here, we define the significance of a part of a shape by its discriminative power with respect to a given shape database, that is, the uniqueness of the part. We define a point-wise significance density using a statistical weighting approach similar to the term frequency-inverse document frequency (tf-idf) weighting employed in search engines. The significance measure of a given part is obtained by integrating over this density. Numerical experiments show that the proposed approach produces intuitive significant parts, and demonstrate an improvement in the performance of partial matching between shapes.
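The tf-idf weighting that the abstract borrows from search engines can be sketched in a few lines. The example assumes that each shape has already been quantized into a list of discrete "visual word" labels, which is an assumption made here for illustration. The function name `tfidf_weights` is hypothetical, and the paper's point-wise significance density is not reproduced.

```python
import math
from collections import Counter


def tfidf_weights(part_words, database):
    """Weight each word of a part by term frequency times inverse document frequency.

    part_words: list of word labels occurring in the part.
    database:   list of shapes, each a list of word labels.
    """
    tf = Counter(part_words)
    n_shapes = len(database)
    weights = {}
    for word, count in tf.items():
        # Document frequency: in how many database shapes the word occurs.
        df = sum(1 for shape in database if word in shape)
        idf = math.log(n_shapes / df) if df else 0.0
        weights[word] = (count / len(part_words)) * idf
    return weights
```

A word that occurs in every shape of the database gets weight zero, while a word that is rare in the database gets a high weight, which mirrors the "uniqueness of the part" reading of significance. Summing the weights over a part is a discrete stand-in for integrating the density.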
- D. Raviv, A. M. Bronstein, M. M. Bronstein, R. Kimmel, N. Sochen, Affine-invariant geodesic geometry of deformable 3D shapes, arXiv:1012.5936, 2010
Affine-invariant geodesic geometry of deformable 3D shapes
D. Raviv, A. M. Bronstein, M. M. Bronstein, R. Kimmel, N. Sochen
arXiv:1012.5936, 2010
Natural objects can be subject to various transformations yet still preserve properties that we refer to as invariants. Here, we use definitions of affine invariant arclength for surfaces in R^3 in order to extend the set of existing non-rigid shape analysis tools. In fact, we show that by re-defining the surface metric as its equi-affine version, the surface with its modified metric tensor can be treated as a canonical Euclidean object on which most classical Euclidean processing and analysis tools can be applied. The new definition of a metric is used to extend the fast marching method technique for computing geodesic distances on surfaces, where now, the distances are defined with respect to an affine invariant arclength. Applications of the proposed framework demonstrate its invariance, efficiency, and accuracy in shape analysis.
D. Raviv, A. M. Bronstein, M. M. Bronstein, R. Kimmel, N. Sochen, Affine-invariant diffusion geometry for the analysis of deformable 3D shapes, arXiv:1012.5933, 2010
Affine-invariant diffusion geometry for the analysis of deformable 3D shapes
D. Raviv, A. M. Bronstein, M. M. Bronstein, R. Kimmel, N. Sochen
arXiv:1012.5933, 2010
We introduce an (equi-)affine invariant diffusion geometry by which surfaces that go through squeeze and shear transformations can still be properly analyzed. The definition of an affine invariant metric enables us to construct an invariant Laplacian from which local and global geometric structures are extracted. Applications of the proposed framework demonstrate its power in generalizing and enriching the existing set of tools for shape analysis.
D. Raviv, A. M. Bronstein, M. M. Bronstein, R. Kimmel, Full and partial symmetries of non-rigid shapes, Int'l Journal of Computer Vision (IJCV), Vol. 89(1), 2010
Full and partial symmetries of non-rigid shapes
D. Raviv, A. M. Bronstein, M. M. Bronstein, R. Kimmel
Int'l Journal of Computer Vision (IJCV), Vol. 89(1), 2010
Symmetry and self-similarity is the cornerstone of Nature, exhibiting itself through the shapes of natural creations and ubiquitous laws of physics. Since many natural objects are symmetric, the absence of symmetry can often be an indication of some anomaly or abnormal behavior. Therefore, detection of asymmetries is important in numerous practical applications, including crystallography, medical imaging, and face recognition, to mention a few. Conversely, the assumption of underlying shape symmetry can facilitate solutions to many problems in shape reconstruction and analysis. Traditionally, symmetries are described as extrinsic geometric properties of the shape. While being adequate for rigid shapes, such a description is inappropriate for non-rigid ones: extrinsic symmetry can be broken as a result of shape deformations, while its intrinsic symmetry is preserved. In this paper, we present a generalization of symmetries for non-rigid shapes and a numerical framework for their analysis, addressing the problems of full and partial exact and approximate symmetry detection and classification.
A. M. Bronstein, M. M. Bronstein, R. Kimmel, M. Mahmoudi, G. Sapiro, A Gromov-Hausdorff framework with diffusion geometry for topologically-robust non-rigid shape matching, Int'l Journal of Computer Vision (IJCV), Vol. 89(2), 2010
A Gromov-Hausdorff framework with diffusion geometry for topologically-robust non-rigid shape matching
A. M. Bronstein, M. M. Bronstein, R. Kimmel, M. Mahmoudi, G. Sapiro
Int'l Journal of Computer Vision (IJCV), Vol. 89(2), 2010
In this paper, the problem of non-rigid shape recognition is viewed from the perspective of metric geometry, and the applicability of diffusion distances within the Gromov-Hausdorff framework is explored. While the commonly used geodesic distance exploits the shortest path between points on the surface, the diffusion distance averages all paths connecting between the points. The diffusion distance provides an intrinsic distance measure which is robust, in particular to topological changes. Such changes may be a result of natural non-rigid deformations, as well as acquisition noise, in the form of holes or missing data, and representation noise due to inaccurate mesh construction. The presentation of the proposed framework is complemented with numerous examples demonstrating that in addition to the relatively low complexity involved in the computation of the diffusion distances between surface points, its recognition and matching performances favorably compare to the classical geodesic distances in the presence of topological changes between the non-rigid shapes.
N. Mitra, A. M. Bronstein, M. M. Bronstein, Intrinsic regularity detection in 3D geometry, Proc. European Conf. Computer Vision (ECCV), 2010
Intrinsic regularity detection in 3D geometry
N. Mitra, A. M. Bronstein, M. M. Bronstein
Proc. European Conf. Computer Vision (ECCV), 2010
Automatic detection of symmetries, regularity, and repetitive structures in 3D geometry is a fundamental problem in shape analysis and pattern recognition with applications in computer vision and graphics. Especially challenging is to detect intrinsic regularity, where the repetitions are on an intrinsic grid, without any apparent Euclidean pattern to describe the shape, but rising out of (near) isometric deformation of the underlying surface. In this paper, we employ multidimensional scaling to reduce the problem of intrinsic structure detection to a simpler problem of 2D grid detection. Potential 2D grids are then identified using an autocorrelation analysis, refined using local fitting, validated, and finally projected back to the spatial domain. We test the detection algorithm on a variety of scanned plaster models in presence of imperfections like missing data, noise and outliers. We also present a range of applications including scan completion, shape editing, super-resolution, and structural correspondence.
A. M. Bronstein, M. M. Bronstein, Spatially-sensitive affine-invariant image descriptors, Proc. European Conf. Computer Vision (ECCV), 2010
Spatially-sensitive affine-invariant image descriptors
A. M. Bronstein, M. M. Bronstein
Proc. European Conf. Computer Vision (ECCV), 2010
Invariant image descriptors play an important role in many computer vision and pattern recognition problems such as image search and retrieval. A dominant paradigm today is that of “bags of features”, a representation of images as distributions of primitive visual elements. The main disadvantage of this approach is the loss of spatial relations between features, which often carry important information about the image. In this paper, we show how to construct spatially-sensitive image descriptors in which both the features and their relation are affine-invariant. Our construction is based on a vocabulary of pairs of features coupled with a vocabulary of invariant spatial relations between the features. Experimental results show the advantage of our approach in image retrieval applications.
M. M. Bronstein, A. M. Bronstein, F. Michel, N. Paragios, Data fusion through cross-modality metric learning using similarity-sensitive hashing, Proc. Computer Vision and Pattern Recognition (CVPR), 2010
Data fusion through cross-modality metric learning using similarity-sensitive hashing
M. M. Bronstein, A. M. Bronstein, F. Michel, N. Paragios
Proc. Computer Vision and Pattern Recognition (CVPR), 2010
Visual understanding is often based on measuring similarity between observations. Learning similarities specific to a certain perception task from a set of examples has been shown advantageous in various computer vision and pattern recognition problems. In many important applications, the data that one needs to compare come from different representations or modalities, and the similarity between such data operates on objects that may have different and often incommensurable structure and dimensionality. In this paper, we propose a framework for supervised similarity learning based on embedding the input data from two arbitrary spaces into the Hamming space. The mapping is expressed as a binary classification problem with positive and negative examples, and can be efficiently learned using boosting algorithms. The utility and efficiency of such a generic approach is demonstrated on several challenging applications including cross-representation shape retrieval and alignment of multi-modal medical images.
D. Raviv, M. M. Bronstein, A. M. Bronstein, R. Kimmel, Volumetric heat kernel signatures, Proc. Int'l Workshop on 3D Object Retrieval (3DOR), ACM Multimedia, 2010
Volumetric heat kernel signatures
D. Raviv, M. M. Bronstein, A. M. Bronstein, R. Kimmel
Proc. Int'l Workshop on 3D Object Retrieval (3DOR), ACM Multimedia, 2010
Invariant shape descriptors are instrumental in numerous shape analysis tasks including deformable shape comparison, registration, classification, and retrieval. Most existing constructions model a 3D shape as a two-dimensional surface describing the shape boundary, typically represented as a triangular mesh or a point cloud. Using intrinsic properties of the surface, invariant descriptors can be designed. One such example is the recently introduced heat kernel signature, based on the Laplace-Beltrami operator of the surface. In many applications, however, a volumetric shape model is more natural and convenient. Moreover, modeling shape deformations as approximate isometries of the volume of an object, rather than its boundary, better captures natural behavior of non-rigid deformations in many cases. Here, we extend the idea of heat kernel signature to robust isometry-invariant volumetric descriptors, and show their utility in shape retrieval. The proposed approach achieves state-of-the-art results on the SHREC 2010 large-scale shape retrieval benchmark.
D. Raviv, A. M. Bronstein, M. M. Bronstein, R. Kimmel, G. Sapiro, Diffusion symmetries of non-rigid shapes, Proc. Int'l Symposium on 3D Data Processing, Visualization and Transmission (3DPVT), 2010
Diffusion symmetries of non-rigid shapes
D. Raviv, A. M. Bronstein, M. M. Bronstein, R. Kimmel, G. Sapiro
Proc. Int'l Symposium on 3D Data Processing, Visualization and Transmission (3DPVT), 2010
Detection and modeling of self-similarity and symmetry is important in shape recognition, matching, synthesis, and reconstruction. While the detection of rigid shape symmetries is well-established, the study of symmetries in non-rigid shapes is a much less researched problem. A particularly challenging setting is the detection of symmetries in non-rigid shapes affected by topological noise and asymmetric connectivity. In this paper, we treat shapes as metric spaces, with the metric induced by heat diffusion properties, and define non-rigid symmetries as self-isometries with respect to the diffusion metric. Experimental results show the advantage of the diffusion metric over the previously proposed geodesic metric for exploring intrinsic symmetries of bendable shapes with possible topological irregularities.
G. Rosman, M. M. Bronstein, A. M. Bronstein, R. Kimmel, Nonlinear dimensionality reduction by topologically constrained isometric embedding, Intl. Journal of Computer Vision (IJCV), Vol. 89(1), 2010
Nonlinear dimensionality reduction by topologically constrained isometric embedding
G. Rosman, M. M. Bronstein, A. M. Bronstein, R. Kimmel
Intl. Journal of Computer Vision (IJCV), Vol. 89(1), 2010
Many manifold learning procedures try to embed given feature data into a flat space of low dimensionality while preserving as much as possible the metric in the natural feature space. The embedding process usually relies on distances between neighboring features, mainly since distances between features that are far apart from each other often provide an unreliable estimation of the true distance on the feature manifold due to its non-convexity. Distortions resulting from using long geodesics indiscriminately lead to a known limitation of the Isomap algorithm when used to map nonconvex manifolds. Presented is a framework for nonlinear dimensionality reduction that uses both local and global distances in order to learn the intrinsic geometry of flat manifolds with boundaries. The resulting algorithm filters out potentially problematic distances between distant feature points based on the properties of the geodesics connecting those points and their relative distance to the boundary of the feature manifold, thus avoiding an inherent limitation of the Isomap algorithm. Since the proposed algorithm matches non-local structures, it is robust to strong noise. We show experimental results demonstrating the advantages of the proposed approach over conventional dimensionality reduction techniques, both global and local in nature.
A. M. Bronstein, M. M. Bronstein, U. Castellani, B. Falcidieno, A. Fusiello, A. Godil, L. J. Guibas, I. Kokkinos, Z. Lian, M. Ovsjanikov, G. Patané, M. Spagnuolo, R. Toldo, SHREC 2010: robust large-scale shape retrieval benchmark, Proc. EUROGRAPHICS Workshop on 3D Object Retrieval (3DOR), 2010
SHREC 2010: robust large-scale shape retrieval benchmark
A. M. Bronstein, M. M. Bronstein, U. Castellani, B. Falcidieno, A. Fusiello, A. Godil, L. J. Guibas, I. Kokkinos, Z. Lian, M. Ovsjanikov, G. Patané, M. Spagnuolo, R. Toldo
Proc. EUROGRAPHICS Workshop on 3D Object Retrieval (3DOR), 2010
The SHREC’10 robust large-scale shape retrieval benchmark simulates a retrieval scenario in which the queries include multiple modifications and transformations of the same shape. The benchmark allows evaluating how algorithms cope with certain classes of transformations and what strength of transformation they can handle. The present paper is a report of the SHREC’10 robust large-scale shape retrieval benchmark results.
A. M. Bronstein, M. M. Bronstein, B. Bustos, U. Castellani, M. Crisani, B. Falcidieno, L. J. Guibas, I. Kokkinos, V. Murino, M. Ovsjanikov, G. Patané, I. Sipiran, M. Spagnuolo, J. Sun, SHREC 2010: robust feature detection and description benchmark, Proc. EUROGRAPHICS Workshop on 3D Object Retrieval (3DOR), 2010
SHREC 2010: robust feature detection and description benchmark
A. M. Bronstein, M. M. Bronstein, B. Bustos, U. Castellani, M. Crisani, B. Falcidieno, L. J. Guibas, I. Kokkinos, V. Murino, M. Ovsjanikov, G. Patané, I. Sipiran, M. Spagnuolo, J. Sun
Proc. EUROGRAPHICS Workshop on 3D Object Retrieval (3DOR), 2010
Feature-based approaches have recently become very popular in computer vision and image analysis applications, and are becoming a promising direction in shape retrieval applications. The SHREC’10 robust feature detection and description benchmark simulates the feature detection and description stage of feature-based shape retrieval algorithms. The benchmark tests the performance of shape feature detectors and descriptors under a wide variety of transformations. It allows evaluating how algorithms cope with certain classes of transformations and what strength of transformation they can handle. The present paper is a report of the SHREC’10 robust feature detection and description benchmark results.
A. M. Bronstein, M. M. Bronstein, U. Castellani, A. Dubrovina, L. J. Guibas, R. P. Horaud, R. Kimmel, D. Knossow, E. von Lavante, D. Mateus, M. Ovsjanikov, A. Sharma, SHREC 2010: robust correspondence benchmark, Proc. EUROGRAPHICS Workshop on 3D Object Retrieval (3DOR), 2010
SHREC 2010: robust correspondence benchmark
A. M. Bronstein, M. M. Bronstein, U. Castellani, A. Dubrovina, L. J. Guibas, R. P. Horaud, R. Kimmel, D. Knossow, E. von Lavante, D. Mateus, M. Ovsjanikov, A. Sharma
Proc. EUROGRAPHICS Workshop on 3D Object Retrieval (3DOR), 2010
The SHREC’10 robust correspondence benchmark simulates a one-to-one shape matching scenario in which one of the shapes undergoes multiple modifications and transformations. The benchmark allows evaluating how correspondence algorithms cope with certain classes of transformations and what strength of transformation they can handle. The present paper is a report of the SHREC’10 robust correspondence benchmark results.
- R. Kimmel, C. Zhang, A. M. Bronstein, M. M. Bronstein, Are MSER features really interesting?, IEEE Trans. Pattern Analysis and Machine Intelligence (PAMI), Vol. 33(11), 2011
Are MSER features really interesting?
R. Kimmel, C. Zhang, A. M. Bronstein, M. M. Bronstein
IEEE Trans. Pattern Analysis and Machine Intelligence (PAMI), Vol. 33(11), 2011
Detection and description of affine-invariant features is a cornerstone component in numerous computer vision applications. In this note, we analyze the notion of maximally stable extremal regions (MSER) through the prism of the curvature scale space, and conclude that in its original definition, MSER prefers regular (round) regions. Arguing that interesting features in natural images usually have irregular shapes, we propose alternative definitions of MSER which are free of this bias, yet maintain their invariance properties.
A. M. Bronstein, Spectral descriptors for deformable shapes, arXiv:1110.5015, 2011
Spectral descriptors for deformable shapes
A. M. Bronstein
arXiv:1110.5015, 2011
Informative and discriminative feature descriptors play a fundamental role in deformable shape analysis. For example, they have been successfully employed in correspondence, registration, and retrieval tasks. In recent years, significant attention has been devoted to descriptors obtained from the spectral decomposition of the Laplace-Beltrami operator associated with the shape. Notable examples in this family are the heat kernel signature (HKS) and the wave kernel signature (WKS). Laplacian-based descriptors achieve state-of-the-art performance in numerous shape analysis tasks; they are computationally efficient, isometry-invariant by construction, and can gracefully cope with a variety of transformations. In this paper, we formulate a generic family of parametric spectral descriptors. We argue that in order to be optimal for a specific task, the descriptor should take into account the statistics of the corpus of shapes to which it is applied (the “signal”) and those of the class of transformations to which it is made insensitive (the “noise”). While such statistics are hard to model axiomatically, they can be learned from examples. Following the spirit of the Wiener filter in signal processing, we show a learning scheme for the construction of optimal spectral descriptors and relate it to Mahalanobis metric learning. The superiority of the proposed approach is demonstrated on the SHREC’10 benchmark.
D. Raviv, A. M. Bronstein, M. M. Bronstein, R. Kimmel, N. Sochen, Affine-invariant diffusion geometry for the analysis of deformable 3D shapes, Proc. Computer Vision and Pattern Recognition (CVPR), 2011
Affine-invariant diffusion geometry for the analysis of deformable 3D shapes
D. Raviv, A. M. Bronstein, M. M. Bronstein, R. Kimmel, N. Sochen
Proc. Computer Vision and Pattern Recognition (CVPR), 2011
We introduce an (equi-)affine invariant diffusion geometry by which surfaces that go through squeeze and shear transformations can still be properly analyzed. The definition of an affine invariant metric enables us to construct an invariant Laplacian from which local and global geometric structures are extracted. Applications of the proposed framework demonstrate its power in generalizing and enriching the existing set of tools for shape analysis.
M. M. Bronstein, A. M. Bronstein, Shape recognition with spectral distances, IEEE Trans. Pattern Analysis and Machine Intelligence (PAMI), Vol. 33(5), 2011
Shape recognition with spectral distances
M. M. Bronstein, A. M. Bronstein
IEEE Trans. Pattern Analysis and Machine Intelligence (PAMI), Vol. 33(5), 2011
Recent works have shown the use of diffusion geometry for various pattern recognition applications, including non-rigid shape analysis. In this paper, we introduce spectral shape distance as a general framework for distribution-based shape similarity and show that two recent methods for shape similarity due to Rustamov and Mahmoudi & Sapiro are particular cases thereof.
J. Pokrass, A. M. Bronstein, M. M. Bronstein, A correspondence-less approach to matching of deformable shapes, Proc. Scale Space and Variational Methods (SSVM), 2011
A correspondence-less approach to matching of deformable shapes
J. Pokrass, A. M. Bronstein, M. M. Bronstein
Proc. Scale Space and Variational Methods (SSVM), 2011
Finding a match between partially available deformable shapes is a challenging problem with numerous applications. The problem is usually approached by computing local descriptors on a pair of shapes and then establishing a point-wise correspondence between the two. In this paper, we introduce an alternative correspondence-less approach to matching fragments to an entire shape undergoing a non-rigid deformation. We use diffusion geometric descriptors and optimize over the integration domains on which the integral descriptors of the two parts match. The problem is regularized using the Mumford-Shah functional. We show an efficient discretization based on the Ambrosio-Tortorelli approximation generalized to triangular meshes. Experiments demonstrating the success of the proposed method are presented.
A. Kovnatsky, M. M. Bronstein, A. M. Bronstein, R. Kimmel, Photometric heat kernel signatures, Proc. Scale Space and Variational Methods (SSVM), 2011
Photometric heat kernel signatures
A. Kovnatsky, M. M. Bronstein, A. M. Bronstein, R. Kimmel
Proc. Scale Space and Variational Methods (SSVM), 2011
In this paper, we explore the use of the diffusion geometry framework for the fusion of geometric and photometric information in local heat kernel signature shape descriptors. Our construction is based on the definition of a diffusion process on the shape manifold embedded into a high-dimensional space where the embedding coordinates represent the photometric information. Experimental results show that such data fusion is useful in coping with different challenges of shape analysis where pure geometric and pure photometric methods fail.
J. Aflalo, A. M. Bronstein, M. M. Bronstein, R. Kimmel, Deformable shape retrieval by learning diffusion kernels, Proc. Scale Space and Variational Methods (SSVM), 2011
Deformable shape retrieval by learning diffusion kernels
J. Aflalo, A. M. Bronstein, M. M. Bronstein, R. Kimmel
Proc. Scale Space and Variational Methods (SSVM), 2011
In classical signal processing, it is common to analyze and process signals in the frequency domain, by representing the signal in the Fourier basis, and filtering it by applying a transfer function on the Fourier coefficients. In some applications, it is possible to design an optimal filter. A classical example is the Wiener filter that achieves a minimum mean squared error estimate for signal denoising. Here, we adopt similar concepts to construct optimal diffusion geometric shape descriptors. The analogue of the Fourier basis is the set of eigenfunctions of the Laplace-Beltrami operator, in which many geometric constructions, such as diffusion metrics, can be represented. By designing a filter of the Laplace-Beltrami eigenvalues, it is theoretically possible to achieve invariance to different shape transformations, like scaling. Given a set of shape classes with different transformations, we learn the optimal filter by minimizing the ratio between knowingly similar and knowingly dissimilar diffusion distances it induces. The output of the proposed framework is a filter that is optimally tuned to handle transformations that characterize the training set.
G. Rosman, M. M. Bronstein, A. M. Bronstein, A. Wolf, R. Kimmel, Group-valued regularization framework for motion segmentation of dynamic non-rigid shapes, Proc. Scale Space and Variational Methods (SSVM), 2011
Group-valued regularization framework for motion segmentation of dynamic non-rigid shapes
G. Rosman, M. M. Bronstein, A. M. Bronstein, A. Wolf, R. Kimmel
Proc. Scale Space and Variational Methods (SSVM), 2011
Understanding of articulated shape motion plays an important role in many applications in the mechanical engineering, movie industry, graphics, and vision communities. In this paper, we study motion-based segmentation of articulated 3D shapes into rigid parts. We pose the problem as finding a group-valued map between the shapes describing the motion, forcing it to favor piecewise rigid motions. Our computation follows the spirit of the Ambrosio-Tortorelli scheme for Mumford-Shah segmentation, with a diffusion component suited for the group nature of the motion model. Experimental results demonstrate the effectiveness of the proposed method in non-rigid motion segmentation.
C. Wang, M. M. Bronstein, A. M. Bronstein, N. Paragios, Discrete minimum distortion correspondence problems for non-rigid shape matching, Proc. Scale Space and Variational Methods (SSVM), 2011
Discrete minimum distortion correspondence problems for non-rigid shape matching
C. Wang, M. M. Bronstein, A. M. Bronstein, N. Paragios
Proc. Scale Space and Variational Methods (SSVM), 2011
Similarity and correspondence are two fundamental archetype problems in shape analysis, encountered in numerous applications in computer vision and pattern recognition. Many methods for shape similarity and correspondence boil down to the minimum-distortion correspondence problem, in which two shapes are endowed with certain structure, and one attempts to find the matching with smallest structure distortion between them. Defining structures invariant to some class of shape transformations results in an invariant minimum-distortion correspondence or similarity. In this paper, we model shapes using local and global structures, formulate the invariant correspondence problem as binary graph labeling, and show how different choices of structure result in invariance under various classes of deformations.
A. Hooda, M. M. Bronstein, A. M. Bronstein, R. Horaud, Shape palindromes: analysis of intrinsic symmetries in 2D articulated shapes, Proc. Scale Space and Variational Methods (SSVM), 2011
Shape palindromes: analysis of intrinsic symmetries in 2D articulated shapes
A. Hooda, M. M. Bronstein, A. M. Bronstein, R. Horaud
Proc. Scale Space and Variational Methods (SSVM), 2011
Analysis of intrinsic symmetries of non-rigid and articulated shapes is an important problem in pattern recognition with numerous applications ranging from medicine to computational aesthetics. Considering articulated planar shapes as closed curves, we show how to represent their extrinsic and intrinsic symmetries as self-similarities of local descriptor sequences, which in turn have a simple interpretation in the frequency domain. The problem of symmetry detection and analysis thus boils down to analysis of descriptor sequence patterns. For that purpose, we show two efficient computational methods: one based on Fourier analysis, and another on dynamic programming. Metaphorically, the latter can be compared to finding palindromes in text sequences.
F. Michel, M. M. Bronstein, A. M. Bronstein, N. Paragios, Boosted metric learning for 3D multi-modal deformable registration, Proc. Int'l Symposium on Biomedical Imaging (ISBI), 2011
Boosted metric learning for 3D multi-modal deformable registration
F. Michel, M. M. Bronstein, A. M. Bronstein, N. Paragios
Proc. Int'l Symposium on Biomedical Imaging (ISBI), 2011
Defining a suitable metric is one of the biggest challenges in deformable image fusion from different modalities. In this paper, we propose a novel approach for multi-modal metric learning in the deformable registration framework that consists of embedding data from both modalities into a common metric space whose metric is used to parametrize the similarity. Specifically, we use image representation in the Fourier/Gabor space which introduces invariance to the local pose parameters, and the Hamming metric as the target embedding space, which allows constructing the embedding using boosted learning algorithms. The resulting metric is incorporated into a discrete optimization framework. Very promising results demonstrate the potential of the proposed method.
D. Raviv, A. M. Bronstein, M. M. Bronstein, R. Kimmel, N. Sochen, Affine-invariant geodesic geometry of deformable 3D shapes, Computers and Graphics (CAG), Vol. 35(3), 2011
Affine-invariant geodesic geometry of deformable 3D shapes
D. Raviv, A. M. Bronstein, M. M. Bronstein, R. Kimmel, N. Sochen
Computers and Graphics (CAG), Vol. 35(3), 2011
Natural objects can be subject to various transformations yet still preserve properties that we refer to as invariants. Here, we use definitions of affine invariant arclength for surfaces in R^3 in order to extend the set of existing non-rigid shape analysis tools. We show that by re-defining the surface metric as its equi-affine version, the surface with its modified metric tensor can be treated as a canonical Euclidean object on which most classical Euclidean processing and analysis tools can be applied. The new definition of a metric is used to extend the fast marching method for computing geodesic distances on surfaces, where now the distances are defined with respect to an affine invariant arclength. Applications of the proposed framework demonstrate its invariance, efficiency, and accuracy in shape analysis.
R. Litman, A. M. Bronstein, M. M. Bronstein, Diffusion-geometric maximally stable component detection in deformable shapes, Computers and Graphics (CAG), Vol. 35(3), 2011
Diffusion-geometric maximally stable component detection in deformable shapes
R. Litman, A. M. Bronstein, M. M. Bronstein
Computers and Graphics (CAG), Vol. 35(3), 2011
Maximally stable component detection is a very popular method for feature analysis in images, mainly due to its low computation cost and high repeatability. With the recent advance of feature-based methods in geometric shape analysis, there is significant interest in finding analogous approaches in the 3D world. In this paper, we formulate a diffusion-geometric framework for stable component detection in non-rigid 3D shapes, which can be used for geometric feature detection and description. A quantitative evaluation of our method on the SHREC’10 feature detection benchmark shows its potential as a source of high-quality features.
A. M. Bronstein, M. M. Bronstein, M. Ovsjanikov, L. J. Guibas, Shape Google: geometric words and expressions for invariant shape retrieval, ACM Trans. Graphics (TOG), Vol. 30(1), 2011
Shape Google: geometric words and expressions for invariant shape retrieval
A. M. Bronstein, M. M. Bronstein, M. Ovsjanikov, L. J. Guibas
ACM Trans. Graphics (TOG), Vol. 30(1), 2011
The computer vision and pattern recognition communities have recently witnessed a surge of feature-based methods in object recognition and image retrieval applications. These methods allow representing images as collections of “visual words” and treat them using text search approaches following the “bag of features” paradigm. In this paper, we explore analogous approaches in the 3D world applied to the problem of non-rigid shape retrieval in large databases. Using multiscale diffusion heat kernels as “geometric words”, we construct compact and informative shape descriptors by means of the “bag of features” approach. We also show that considering pairs of geometric words (“geometric expressions”) allows creating spatially-sensitive bags of features with better discriminativity. Finally, adopting metric learning approaches, we show that shapes can be efficiently represented as binary codes. Our approach achieves state-of-the-art results on the SHREC 2010 large-scale shape retrieval benchmark.
A. M. Bronstein, M. M. Bronstein, Metric approaches to invariant shape similarity, Chapter in Handbook of Mathematical Methods in Imaging (O. Scherzer Ed.), Springer, 2011
Metric approaches to invariant shape similarity
A. M. Bronstein, M. M. Bronstein
Chapter in Handbook of Mathematical Methods in Imaging (O. Scherzer Ed.), Springer, 2011
Non-rigid shapes are ubiquitous in nature and are encountered at all levels of life, from macro to nano. The need to model such shapes and understand their behavior arises in many applications in imaging sciences, pattern recognition, computer vision, and computer graphics. Of particular importance is understanding which properties of the shape are attributed to deformations and which are invariant, i.e., remain unchanged. This chapter presents an approach to non-rigid shapes from the point of view of metric geometry. Modeling shapes as metric spaces, one can pose the problem of shape similarity as the similarity of metric spaces and harness tools from theoretical metric geometry for the computation of such a similarity.
- P. Sprechmann, A. M. Bronstein, G. Sapiro, Real-time online singing voice separation from monaural recordings using robust low-rank modeling, Proc. Annual Conference of the Int'l Society for Music Information Retrieval (ISMIR), 2012 (Best poster presentation award)
Real-time online singing voice separation from monaural recordings using robust low-rank modeling
P. Sprechmann, A. M. Bronstein, G. Sapiro
Proc. Annual Conference of the Int'l Society for Music Information Retrieval (ISMIR), 2012 (Best poster presentation award)
Separating the leading vocals from the musical accompaniment is a challenging task that appears naturally in several music processing applications. Robust principal component analysis (RPCA) has recently been applied to this problem, producing very successful results. The method decomposes the signal into a low-rank component corresponding to the accompaniment with its repetitive structure, and a sparse component corresponding to the voice with its quasi-harmonic structure. In this paper, we first introduce a non-negative variant of RPCA, termed robust low-rank non-negative matrix factorization (RNMF). This new framework better suits audio applications. We then propose two efficient feed-forward architectures that approximate the RPCA and RNMF with low latency and a fraction of the complexity of the original optimization method. These approximants allow incorporating elements of unsupervised, semi- and fully-supervised learning into the RPCA and RNMF frameworks. Our basic implementation shows several orders of magnitude speedup compared to the exact solvers with no performance degradation, and allows online and faster-than-real-time processing. Evaluation on the MIR-1K dataset demonstrates state-of-the-art performance.
O. Litany, A. M. Bronstein, M. M. Bronstein, Putting the pieces together: regularized multi-shape partial matching, Proc. Workshop on Nonrigid Shape Analysis and Deformable Image Alignment (NORDIA), 2012
Putting the pieces together: regularized multi-shape partial matching
O. Litany, A. M. Bronstein, M. M. Bronstein
Proc. Workshop on Nonrigid Shape Analysis and Deformable Image Alignment (NORDIA), 2012
Multi-part shape matching is an important class of problems, arising in many fields such as computational archaeology, biology, geometry processing, computer graphics and vision. In this paper, we address the problem of simultaneous matching and segmentation of multiple shapes. We assume that we are given a reference shape and multiple parts partially matching the reference. Each of these parts can have additional clutter or overlap with other parts, and some parts might be missing. We show experimental results of efficient and accurate assembly of fractured synthetic and real objects.
A. Kovnatsky, A. M. Bronstein, M. M. Bronstein, Stable spectral mesh filtering, Proc. Workshop on Nonrigid Shape Analysis and Deformable Image Alignment (NORDIA), 2012
Stable spectral mesh filtering
A. Kovnatsky, A. M. Bronstein, M. M. Bronstein
Proc. Workshop on Nonrigid Shape Analysis and Deformable Image Alignment (NORDIA), 2012
The rapid development of 3D acquisition technology has brought with it the need to perform standard signal processing operations such as filtering on 3D data. It has been shown that the eigenfunctions of the Laplace-Beltrami operator (manifold harmonics) of a surface play the role of the Fourier basis in the Euclidean space; it is thus possible to formulate signal analysis and synthesis in the manifold harmonics basis. In particular, geometry filtering can be carried out in the manifold harmonics domain by decomposing the embedding coordinates of the shape in this basis. However, since the basis functions depend on the shape itself, such filtering is valid only for weak (near all-pass) filters, and produces severe artifacts otherwise. In this paper, we analyze this problem and propose the fractional filtering approach, wherein we iteratively apply weak fractional powers of the filter, each followed by an update of the basis functions. Experimental results show that such a process produces more plausible and meaningful results.
I. Kokkinos, M. M. Bronstein, R. Litman, A. M. Bronstein, Intrinsic shape context descriptors for deformable shapes, Proc. Computer Vision and Pattern Recognition (CVPR), 2012
Intrinsic shape context descriptors for deformable shapes
I. Kokkinos, M. M. Bronstein, R. Litman, A. M. Bronstein
Proc. Computer Vision and Pattern Recognition (CVPR), 2012
In this work, we present intrinsic shape context (ISC) descriptors for 3D shapes. We generalize to surfaces the polar sampling of the image domain used in shape contexts; for this purpose, we chart the surface by shooting geodesics outwards from the point being analyzed; ‘angle’ is treated as tantamount to geodesic shooting direction, and radius as geodesic distance. To deal with orientation ambiguity, we exploit properties of the Fourier transform. Our charting method is intrinsic, i.e., invariant to isometric shape transformations. The resulting descriptor is a meta-descriptor that can be applied to any photometric or geometric property field defined on the shape; in particular, we can leverage recent developments in intrinsic shape analysis and construct ISC based on state-of-the-art dense shape descriptors such as heat kernel signatures. Our experiments demonstrate a notable improvement in shape matching on standard benchmarks.
E. Rodolà, A. M. Bronstein, A. Albarelli, F. Bergamasco, A. Torsello, A game-theoretic approach to deformable shape matching, Proc. Computer Vision and Pattern Recognition (CVPR), 2012
A game-theoretic approach to deformable shape matching
E. Rodolà, A. M. Bronstein, A. Albarelli, F. Bergamasco, A. Torsello
Proc. Computer Vision and Pattern Recognition (CVPR), 2012
We consider the problem of minimum distortion intrinsic correspondence between deformable shapes, many useful formulations of which give rise to the NP-hard quadratic assignment problem (QAP). Previous attempts to use the spectral relaxation have had limited success due to the lack of sparsity of the obtained “fuzzy” solution. In this paper, we adopt the recently introduced alternative L1 relaxation of the QAP based on the principles of game theory. We relate it to the Gromov and Lipschitz metrics between metric spaces and demonstrate on state-of-the-art benchmarks that the proposed approach is capable of finding very accurate sparse correspondences between deformable shapes.
M. Spagnuolo, M. M. Bronstein, A. M. Bronstein, A. Ferreira (Eds.), Eurographics Workshop on 3D Object Retrieval, Eurographics Association, 2012, ISBN: 978-3-905674-36-1
Eurographics Workshop on 3D Object Retrieval
M. Spagnuolo, M. M. Bronstein, A. M. Bronstein, A. Ferreira (Eds.)
Eurographics Association, 2012, ISBN: 978-3-905674-36-1
This book contains the research work presented at the fifth Eurographics Workshop on 3D Object Retrieval (3DOR) held in Cagliari, Italy on May 13, 2012. The 3DOR workshop series was started in Crete (2008), and then held in Munich (2009), Norrkoping (2010) and Llandudno (2011), always as a co-event of the Annual Conference of the European Association for Computer Graphics (Eurographics). All five workshops are successful examples of international cooperation, and the attendance demonstrates the relevance of focused topics. Demonstrating the increasing importance of the workshop, a record number of 23 papers were submitted this year. These papers were reviewed by an international Program Committee of 35 external experts in the area. Based on their recommendations, a selection of nine long papers was accepted for presentation at the workshop, giving an acceptance rate below 40%. Additionally, six poster presentations describing timely research results of high quality were included in the workshop program. As in the previous editions of the 3DOR workshop, this year’s event hosted the seventh Shape Retrieval Contest (SHREC’12). The goal of the contest is to evaluate the effectiveness of 3D-shape retrieval algorithms, thus playing an important role in the evolution of 3D Object Retrieval research. SHREC’12 contributes to the proceedings with four additional papers that detail the results of the competition. We are grateful to the Eurographics association for their support, and to all reviewers for ensuring a high-quality program despite the tight schedule. Special thanks also go to Stefanie Behnke for her constant and timely attention. Finally, we hope that this workshop proves useful to all participants and sets the ground for long-term interaction, collaboration, and identification of future directions and potential problems in the field.
R. Litman, A. M. Bronstein, M. M. Bronstein, Stable volumetric features in deformable shapes, Computers and Graphics (CAG), Vol. 36(5), 2012 detailsStable volumetric features in deformable shapes
Region feature detectors and descriptors have become a successful and popular alternative to point descriptors in image analysis due to their high robustness and repeatability, leading to a significant interest in the shape analysis community in finding analogous approaches in the 3D world. Recent works have successfully extended the maximally stable extremal region (MSER) detection algorithm to surfaces. In many applications, however, a volumetric shape model is more appropriate, and modeling shape deformations as approximate isometries of the volume of an object, rather than its boundary, better captures natural behavior of non-rigid deformations. In this paper, we formulate a diffusion-geometric framework for volumetric stable component detection and description in deformable shapes. An evaluation of our method on the SHREC’11 feature detection benchmark and SCAPE human body scans shows its potential as a source of high-quality features. Examples demonstrating the drawbacks of surface stable components and the advantage of their volumetric counterparts are also presented.
G. Rosman, A. M. Bronstein, M. M. Bronstein, X.-C. Tai, R. Kimmel, Group-valued regularization for analysis of articulated motion, Proc. Workshop on Nonrigid Shape Analysis and Deformable Image Alignment (NORDIA), 2012
We present a novel method for estimation of articulated motion in depth scans. The method is based on a framework for regularization of vector- and matrix-valued functions on parametric surfaces. We extend augmented-Lagrangian total variation regularization to smooth rigid motion cues on the scanned 3D surface obtained from a range scanner. We demonstrate the resulting smoothed motion maps to be a powerful tool in articulated scene understanding, providing a basis for rigid parts segmentation, with few prior assumptions on the scene, despite the noisy depth measurements that often appear in commodity depth scanners.
P. Sprechmann, A. M. Bronstein, G. Sapiro, Learning efficient structured sparse models, Proc. Int'l Conf. on Machine Learning (ICML), 2012
We present a comprehensive framework for structured sparse coding and modeling, extending the recent ideas of using learnable fast regressors to approximate exact sparse codes. For this purpose, we propose an efficient feed-forward architecture derived from the iteration of the block-coordinate algorithm. This architecture approximates the exact structured sparse codes with a fraction of the complexity of the standard optimization methods. We also show that by using different training objective functions, the proposed learnable sparse encoders are not restricted to being approximants of the exact sparse code for a pre-given dictionary, but can rather be used as full-featured sparse encoders or even modelers. A simple implementation shows several orders of magnitude speedup compared to the state-of-the-art exact optimization algorithms at minimal performance degradation, making the proposed framework suitable for real-time and large-scale applications.
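As a rough illustration of the idea of turning an iterative sparse solver into a fixed-depth feed-forward encoder, the sketch below unrolls plain ISTA (iterative soft thresholding) for the unstructured lasso problem. This is an assumption made for brevity: the paper works with a block-coordinate algorithm for structured sparsity and learns the encoder parameters, whereas here the matrices `W` and `S` are simply set from a given dictionary `D` and nothing is trained. The names `unrolled_ista`, `lasso_objective`, and all parameter values are hypothetical.

```python
import numpy as np

def soft_threshold(v, t):
    """Elementwise shrinkage: the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def unrolled_ista(x, D, lam, n_layers):
    """Run n_layers ISTA steps written as a feed-forward network.

    Each layer computes z <- soft_threshold(W x + S z, lam / L).
    In a learned encoder W, S and the thresholds would be trained;
    here they are fixed from the dictionary D (illustrative only).
    """
    L = np.linalg.norm(D, 2) ** 2            # Lipschitz constant of the gradient
    W = D.T / L                              # input filter
    S = np.eye(D.shape[1]) - D.T @ D / L     # lateral (recurrent) matrix
    z = soft_threshold(W @ x, lam / L)       # first layer, starting from z = 0
    for _ in range(n_layers - 1):
        z = soft_threshold(W @ x + S @ z, lam / L)
    return z

def lasso_objective(x, D, z, lam):
    """0.5 * ||x - D z||^2 + lam * ||z||_1."""
    return 0.5 * np.sum((x - D @ z) ** 2) + lam * np.sum(np.abs(z))

# Toy data: a random dictionary with unit-norm atoms and a 3-sparse signal.
rng = np.random.default_rng(0)
D = rng.standard_normal((20, 50))
D /= np.linalg.norm(D, axis=0)
z_true = np.zeros(50)
z_true[[3, 17, 41]] = 1.0
x = D @ z_true

z_shallow = unrolled_ista(x, D, lam=0.1, n_layers=5)
z_deep = unrolled_ista(x, D, lam=0.1, n_layers=50)
```

A shallow network gives a coarse code at a fixed cost; adding layers moves the output toward the exact lasso solution, which is the complexity/accuracy trade-off that a trained encoder is meant to improve on.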
A. Zabatani, A. M. Bronstein, Parallelized algorithms for rigid surface alignment on GPU, Proc. EUROGRAPHICS Workshop on 3D Object Retrieval (3DOR), 2012
Alignment and registration of rigid surfaces is a fundamental computational geometric problem with applications ranging from medical imaging to automated target recognition and robot navigation, to mention just a few. The family of the iterative closest point (ICP) algorithms introduced by Chen and Medioni and Besl and McKay and improved over the three decades that followed constitutes a classical approach to the problem. However, with the advent of geometry acquisition technologies and the applications they enable, it has become necessary to align in real time dense surfaces containing millions of points. The classical ICP algorithms, being essentially sequential procedures, are unable to address the need. In this study, we follow the recent work by Mitra et al. considering ICP from the point of view of point-to-surface Euclidean distance map approximation. We propose a variant of a k-d tree data structure to store the approximation, and show its efficient parallelization on modern graphics processors. The flexibility of our implementation allows using different distance approximation schemes with controllable trade-off between accuracy and complexity. It also allows almost straightforward adaptation to richer transformation groups. Experimental evaluation of the proposed approaches on a state-of-the-art GPU on very large datasets containing around 10^6 vertices shows real-time performance superior by up to three orders of magnitude compared to an efficient CPU-based version.
G. Rosman, A. M. Bronstein, M. M. Bronstein, R. Kimmel, Articulated motion segmentation of point clouds by group-valued regularization, Proc. EUROGRAPHICS Workshop on 3D Object Retrieval (3DOR), 2012
Motion segmentation for articulated objects is an important topic of research. Yet such a segmentation should be as free as possible from underlying assumptions so as to fit general scenes and objects. In this paper we demonstrate an algorithm for articulated motion segmentation of 3D point clouds, free of any assumptions on the underlying model and yet firmly set in a well-defined variational framework. Results on scanned images show the generality of the proposed technique and its robustness to scanning artifacts and noise.
A. Kovnatsky, M. M. Bronstein, A. M. Bronstein, D. Raviv, R. Kimmel, Affine-invariant photometric heat kernel signatures, Proc. EUROGRAPHICS Workshop on 3D Object Retrieval (3DOR), 2012
In this paper, we explore the use of the diffusion geometry framework for the fusion of geometric and photometric information in local shape descriptors. Our construction is based on the definition of a modified metric, which combines geometric and photometric information; the diffusion process on the shape manifold is then simulated under this metric. Experimental results show that such data fusion is useful in coping with shape retrieval experiments where pure geometric and pure photometric methods fail. Apart from the retrieval task, the proposed diffusion process may be employed in other applications.
C. Strecha, A. M. Bronstein, M. M. Bronstein, P. Fua, LDAHash: improved matching with smaller descriptors, IEEE Trans. Pattern Analysis and Machine Intelligence (PAMI), Vol. 34(1), 2012
SIFT-like local feature descriptors are ubiquitously employed in such computer vision applications as content-based retrieval, video analysis, copy detection, object recognition, photo-tourism, and 3D reconstruction from multiple views. Feature descriptors can be designed to be invariant to certain classes of photometric and geometric transformations, in particular, affine and intensity scale transformations. However, real transformations that an image can undergo can only be approximately modeled in this way, and thus most descriptors are only approximately invariant in practice. Secondly, descriptors are usually high-dimensional (e.g. SIFT is represented as a 128-dimensional vector). In large-scale retrieval and matching problems, this can pose challenges in storing and retrieving descriptor data. We propose mapping the descriptor vectors into the Hamming space, in which the Hamming metric is used to compare the resulting representations. This way, we reduce the size of the descriptors by representing them as short binary strings and learn descriptor invariance from examples. We show extensive experimental validation, demonstrating the advantage of the proposed approach.
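The mapping into the Hamming space described above can be pictured with a minimal sketch: each bit of the code is the sign of an affine projection of the descriptor, and codes are compared by counting differing bits. The projection here is random, which is an assumption standing in for the projection and thresholds that the paper learns from examples; `binarize`, `hamming_distance`, and the sizes used are hypothetical names and values chosen for illustration.

```python
import numpy as np

def binarize(descriptors, projection, thresholds):
    """Map real-valued descriptors (one per row) to boolean codes.

    Bit k of a code is 1 when projection[k] . x + thresholds[k] > 0.
    """
    return (descriptors @ projection.T + thresholds) > 0

def hamming_distance(code_a, code_b):
    """Number of positions in which two binary codes differ."""
    return int(np.count_nonzero(code_a != code_b))

rng = np.random.default_rng(0)
dim, bits = 128, 64                              # SIFT-sized input, 64-bit code
projection = rng.standard_normal((bits, dim))    # placeholder for a learned projection
thresholds = np.zeros(bits)                      # placeholder for learned thresholds

x = rng.standard_normal(dim)                     # a descriptor
x_near = x + 0.05 * rng.standard_normal(dim)     # slightly perturbed copy of it
x_far = rng.standard_normal(dim)                 # an unrelated descriptor

codes = binarize(np.stack([x, x_near, x_far]), projection, thresholds)
d_near = hamming_distance(codes[0], codes[1])
d_far = hamming_distance(codes[0], codes[2])
```

With 64 bits each descriptor occupies 8 bytes once packed (for example with `np.packbits`), and comparison reduces to XOR and bit counting, which is what makes such codes attractive for large-scale matching.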
A. M. Bruckstein, B. ter Haar Romeny, A. M. Bronstein, M. M. Bronstein (Eds.), Scale Space and Variational Methods in Computer Vision, Lecture Notes in Computer Science (LNCS) No. 6667, Springer, 2012, ISBN: 978-3-642-24784-2
The International Conference on Scale Space and Variational Methods in Computer Vision (SSVM 2011) is the third edition of the conference born in 2007 as the joint edition of the Scale-Space Conferences (since 1997, Utrecht) and the Workshop on Variational, Geometric, and Level Set Methods (VLSM) that first took place in Vancouver in 2001. Previous editions in Ischia, Italy (2007) and Voss, Norway (2009) were very successful, materializing the hope of the first SSVM organizers, Profs. Sgallari, Murli, and Paragios, that the conference would ‘become a reference in the domain’. This year, SSVM was held in Kibbutz Ein-Gedi, Israel, a unique place on the shores of the Dead Sea, the global minimum on earth. Despite its small size, Israel plays an important role in the worldwide scientific arena, and in particular in the fields of computer vision and image processing. Following the tradition of the previous SSVM conferences, we invited outstanding scientists to give keynote presentations. This year, it was our pleasure to welcome Prof. Haim Brezis (Université Pierre et Marie Curie, France), Dr. Remco Duits (Eindhoven University, The Netherlands), Prof. Stéphane Mallat (École Polytechnique, France), and Prof. Joachim Weickert (Saarland University, Germany). Additionally, we had six review lectures on topics of broad interest, given by experts in the field, Profs. Philip Rosenau (Tel Aviv University, Israel), Jing Yuan (University of Western Ontario, Canada), Patrizio Frosini (University of Bologna, Italy), Radu Horaud (INRIA, France), Gérard Medioni (University of Southern California, USA), and Elisabetta Carlini (La Sapienza, Italy). Out of 78 submitted papers, 24 were selected to be presented orally and 44 as posters.
Over 100 people attended the conference, representing countries from all over the world, including Austria, China, France, Germany, Hong Kong, Israel, Italy, Japan, Korea, the Netherlands, Norway, Singapore, Slovakia, Switzerland, Turkey, and the USA. We would like to thank the authors for their contributions, the members of the Program Committee for their dedication and timely review process, and Yana Katz and Boris Princ for the local arrangements and organization, without which this conference would not have been possible. Finally, our special thanks go to the Technion Department of Computer Science, HP Laboratories Israel, Haifa, Rafael Ltd., Israel, BBK Technologies Ltd., Israel, and the European Community’s FP7 ERC/FIRST programs for their generous sponsorship.
R. Litman, A. M. Bronstein, M. M. Bronstein, Stable semi-local features for non-rigid shapes, Chapter in Innovations for Shape Analysis: Models and Algorithms (M. Breuss, A. M. Bruckstein, P. Maragos Eds.), Springer, 2012
Feature-based analysis is becoming a very popular approach for geometric shape analysis. Following the success of this approach in image analysis, there is a growing interest in finding analogous methods in the 3D world. Maximally stable component detection is a method for feature detection in images with low computational cost and high repeatability. In this study, a diffusion-geometry based framework for stable component detection is presented, which can be used for geometric feature detection in deformable shapes. The vast majority of studies of deformable 3D shapes model them as the two-dimensional boundary of the volume of the shape. Recent works have shown that a volumetric shape model is advantageous in numerous ways as it better captures the natural behavior of non-rigid deformations. We show that our framework easily adapts to this volumetric approach, and even demonstrates superior performance. A quantitative evaluation of our methods on the SHREC’10 and SHREC’11 feature detection benchmarks as well as qualitative tests on the SCAPE dataset show its potential as a source of high-quality features. Examples demonstrating the drawbacks of surface stable components and the advantage of their volumetric counterparts are also presented.
G. Rosman, M. M. Bronstein, A. M. Bronstein, A. Wolf, R. Kimmel, Group-valued regularization for motion segmentation of articulated shapes, Chapter in Innovations for Shape Analysis: Models and Algorithms (M. Breuss, A. M. Bruckstein, P. Maragos Eds.), Springer, 2012
Motion-based segmentation is an important tool for the analysis of articulated shapes. As such, it plays an important role in mechanical engineering, computer graphics, and computer vision. In this chapter, we study motion-based segmentation of 3D articulated shapes. We formulate motion-based surface segmentation as a piecewise-smooth regularization problem for the transformations between several poses. Using Lie-group representation for the transformation at each surface point, we obtain a simple regularized fitting problem. An Ambrosio-Tortorelli scheme of a generalized Mumford-Shah model gives us the segmentation functional without assuming prior knowledge on the number of parts or even the articulated nature of the object. Experiments on several standard datasets compare the results of the proposed method to state-of-the-art algorithms.
A. M. Bronstein, M. M. Bronstein, M. Ovsjanikov, 3D features, surface descriptors, and object descriptors, Chapter in 3D Imaging, Analysis and Applications (N. Pears, Y. Liu, P. Bunting, Eds.), Springer, 2012
The computer vision and pattern recognition communities have recently witnessed a surge of feature-based methods in numerous applications including object recognition and image retrieval. Similar concepts and analogous approaches are penetrating the world of 3D shape analysis, in a variety of areas including non-rigid shape retrieval and matching. In this chapter, we present the state of the art in feature-based approaches to 3D shape analysis.
- P. Sprechmann, R. Litman, T. Ben Yakar, A. M. Bronstein, G. Sapiro, Efficient supervised sparse analysis and synthesis operators, Proc. Neural Information Proc. Systems (NIPS), 2013
In this paper, we propose a new and computationally efficient framework for learning sparse models. We formulate a unified approach that contains as particular cases models promoting sparse synthesis and analysis type of priors, and mixtures thereof. The supervised training of the proposed model is formulated as a bilevel optimization problem, in which the operators are optimized to achieve the best possible performance on a specific task, e.g., reconstruction or classification. By restricting the operators to be shift invariant, our approach can be thought of as a way of learning analysis+synthesis sparsity-promoting convolutional operators. Leveraging recent ideas on fast trainable regressors designed to approximate exact sparse codes, we propose a way of constructing feed-forward neural networks capable of approximating the learned models at a fraction of the computational cost of exact solvers. In the shift-invariant case, this leads to a principled way of constructing task-specific convolutional networks. We illustrate the proposed models on several experiments in music analysis and image processing applications.
T. Ben Yakar, R. Litman, P. Sprechmann, A. M. Bronstein, G. Sapiro, Bilevel sparse models for polyphonic music transcription, Proc. Annual Conf. of the Int'l Society for Music Info. Retrieval (ISMIR), 2013
In this work, we propose a trainable sparse model for automatic polyphonic music transcription, which incorporates several successful approaches into a unified optimization framework. Our model combines unsupervised synthesis models similar to latent component analysis and nonnegative factorization with metric learning techniques that allow supervised discriminative learning. We develop efficient stochastic gradient training schemes allowing unsupervised, semi-, and fully supervised training of the model as well as its adaptation to test data. We show an efficient fixed-complexity and fixed-latency approximation that can replace iterative minimization algorithms in time-critical applications. Experimental evaluation on synthetic and real data shows promising initial results.
J. Pokrass, A. M. Bronstein, M. M. Bronstein, P. Sprechmann, G. Sapiro, Sparse modeling of intrinsic correspondences, Computer Graphics Forum (CGF), Vol. 32(2), 2013
We present a novel sparse modeling approach to non-rigid shape matching using only the ability to detect repeatable regions. As the input to our algorithm, we are given only two sets of regions in two shapes; no descriptors are provided, so the correspondence between the regions is not known, nor do we know how many regions correspond in the two shapes. We show that even with such scarce information, it is possible to establish very accurate correspondence between the shapes by using methods from the field of sparse modeling, this being the first non-trivial use of sparse models in shape correspondence. We formulate the problem of permuted sparse coding, in which we solve simultaneously for an unknown permutation ordering the regions on two shapes and for an unknown correspondence in functional representation. We also propose a robust variant capable of handling incomplete matches. Numerically, the problem is solved efficiently by alternating the solution of a linear assignment and a sparse coding problem. The proposed methods are evaluated qualitatively and quantitatively on standard benchmarks containing both synthetic and scanned objects.
A. Kovnatsky, M. M. Bronstein, A. M. Bronstein, K. Glashoff, R. Kimmel, Coupled quasi-harmonic bases, Computer Graphics Forum (CGF), Vol. 32(2), 2013
State-of-the-art approaches to shape analysis, synthesis, and correspondence rely on natural harmonic bases (Laplacian eigenbases) that allow using classical tools from harmonic analysis on manifolds. However, many applications involving multiple shapes are hindered by the fact that Laplacian eigenbases computed independently on different shapes are often incompatible with each other. In this paper, we propose the construction of common approximate eigenbases for multiple shapes using approximate joint diagonalization algorithms, taking as input a set of corresponding functions (e.g. indicator functions of stable regions) on the two shapes. We illustrate the benefits of the proposed approach on tasks from shape editing, pose transfer, correspondence, and similarity.
P. Sprechmann, A. M. Bronstein, J.-M. Morel, G. Sapiro, Audio restoration from multiple copies, Proc. Int'l Conf. on Acoustics, Speech, and Signal Processing (ICASSP), 2013
A method for removing impulse noise from audio signals by fusing multiple copies of the same recording is introduced in this paper. The proposed algorithm exploits the fact that while in general multiple copies of a given recording are available, all sharing the same master, most degradations in audio signals are record-dependent. Our method first seeks the optimal non-rigid alignment of the signals that is robust to the presence of sparse outliers with arbitrary magnitude. Unlike previous approaches, we simultaneously find the optimal alignment of the signals and impulsive degradation. This is obtained via continuous dynamic time warping computed by solving an Eikonal equation. We propose to use our approach in the derivative domain, reconstructing the signal by solving an inverse problem that resembles the Poisson image editing technique. The proposed framework is illustrated and tested here in the restoration of old gramophone recordings, showing promising results; however, it can be used in other applications where different copies of the signal of interest are available and the degradations are copy-dependent.
P. Sprechmann, A. M. Bronstein, M. M. Bronstein, G. Sapiro, Learnable low rank sparse models for speech denoising, Proc. Int'l Conf. on Acoustics, Speech, and Signal Processing (ICASSP), 2013
In this paper we present a framework for real-time enhancement of speech signals. Our method leverages a new process-centric approach for sparse and parsimonious models, where the representation pursuit is obtained by applying a deterministic function or process rather than solving an optimization problem. We first propose a rank-regularized robust version of non-negative matrix factorization (NMF) for modeling time-frequency representations of speech signals in which the spectral frames are decomposed as sparse linear combinations of atoms of a low-rank dictionary. Then, a parametric family of pursuit processes is derived from the iteration of the proximal descent method for solving this model. We present several experiments showing successful results and the potential of the proposed framework. Incorporating discriminative learning makes the proposed method significantly outperform exact NMF algorithms, with fixed latency and at a fraction of their computational complexity.
A. Kovnatsky, D. Raviv, A. M. Bronstein, M. M. Bronstein, R. Kimmel, Geometric and photometric data fusion in non-rigid shape analysis, Numerical Mathematics: Theory, Methods and Applications (NM-TMA), Vol. 6(1), 2013
In this paper, we explore the use of the diffusion geometry framework for the fusion of geometric and photometric information in local and global shape descriptors. Our construction is based on the definition of a diffusion process on the shape manifold embedded into a high-dimensional space where the embedding coordinates represent the photometric information. Experimental results show that such data fusion is useful in coping with different challenges of shape analysis where pure geometric and pure photometric methods fail.
J. Pokrass, A. M. Bronstein, M. M. Bronstein, Partial shape matching without point-wise correspondence, Numerical Mathematics: Theory, Methods and Applications (NM-TMA), Vol. 6(1), 2013
Partial similarity of shapes is a challenging problem arising in many important applications in computer vision, shape analysis, and graphics, e.g. when one has to deal with partial information and acquisition artifacts. The problem is especially hard when the underlying shapes are non-rigid and are given up to a deformation. Partial matching is usually approached by computing local descriptors on a pair of shapes and then establishing a point-wise non-bijective correspondence between the two, taking into account possibly different parts. In this paper, we introduce an alternative correspondence-less approach to matching fragments to an entire shape undergoing a non-rigid deformation. We use diffusion geometric descriptors and optimize over the integration domains on which the integral descriptors of the two parts match. The problem is regularized using the Mumford-Shah functional. We show an efficient discretization based on the Ambrosio-Tortorelli approximation generalized to triangular meshes and point clouds, and present experiments demonstrating the success of the proposed method.
R. Litman, A. M. Bronstein, Learning spectral descriptors for deformable shape correspondence, IEEE Trans. Pattern Analysis and Machine Intelligence (PAMI), Vol. 36(1), 2013
Informative and discriminative feature descriptors play a fundamental role in deformable shape analysis. For example, they have been successfully employed in correspondence, registration, and retrieval tasks. In recent years, significant attention has been devoted to descriptors obtained from the spectral decomposition of the Laplace-Beltrami operator associated with the shape. Notable examples in this family are the heat kernel signature (HKS) and the recently introduced wave kernel signature (WKS). Laplacian-based descriptors achieve state-of-the-art performance in numerous shape analysis tasks; they are computationally efficient, isometry-invariant by construction, and can gracefully cope with a variety of transformations. In this paper, we formulate a generic family of parametric spectral descriptors. We argue that in order to be optimized for a specific task, the descriptor should take into account the statistics of the corpus of shapes to which it is applied (the “signal”) and those of the class of transformations to which it is made insensitive (the “noise”). While such statistics are hard to model axiomatically, they can be learned from examples. Following the spirit of the Wiener filter in signal processing, we show a learning scheme for the construction of optimized spectral descriptors and relate it to Mahalanobis metric learning. The superiority of the proposed approach in generating correspondences is demonstrated on synthetic and scanned human figures. We also show that the learned descriptors are robust enough to be learned on synthetic data and transferred successfully to scanned shapes.
- Q. Qiu, G. Sapiro, A. M. Bronstein, Random forests can hash, arXiv:1412.5083, 2014
Hash codes are a very efficient data representation needed to be able to cope with the ever growing amounts of data. We introduce a random forest semantic hashing scheme with information-theoretic code aggregation, showing for the first time how random forest, a technique that together with deep learning has shown spectacular results in classification, can also be extended to large-scale retrieval. Traditional random forest fails to enforce the consistency of hashes generated from each tree for the same class data, i.e., to preserve the underlying similarity, and it also lacks a principled way for code aggregation across trees. We start with a simple hashing scheme, where independently trained random trees in a forest act as hashing functions. We then propose a subspace model as the splitting function, and show that it enforces the hash consistency in a tree for data from the same class. We also introduce an information-theoretic approach for aggregating codes of individual trees into a single hash code, producing a near-optimal unique hash for each class. Experiments on large-scale public datasets are presented, showing that the proposed approach significantly outperforms state-of-the-art hashing methods for retrieval tasks.
P. Sprechmann, A. M. Bronstein, G. Sapiro, Supervised non-Euclidean sparse NMF via bilevel optimization with applications to speech enhancement, Proc. Joint Workshop on Hands-free Speech Communication and Microphone Arrays (HSCMA), 2014
Traditionally, NMF algorithms consist of two separate stages: a training stage, in which a generative model is learned; and a testing stage in which the pre-learned model is used in a high level task such as enhancement, separation, or classification. As an alternative, we propose a task-supervised NMF method for the adaptation of the basis spectra learned in the first stage to enhance the performance on the specific task used in the second stage. We cast this problem as a bilevel optimization program that can be efficiently solved via stochastic gradient descent. The proposed approach is general enough to handle sparsity priors on the activations, and allows non-Euclidean data terms such as beta-divergences. The framework is evaluated on single-channel speech enhancement tasks.
S. Korman, R. Litman, S. Avidan, A. M. Bronstein, Probably approximately symmetric: Fast rigid symmetry detection with global guarantees, Computer Graphics Forum (CGF), Vol. 34(1), 2014
We present a fast algorithm for global 3D symmetry detection with approximation guarantees. The algorithm is guaranteed to find the best approximate symmetry of a given shape, to within a user-specified threshold, with very high probability. Our method uses a carefully designed sampling of the transformation space, where each transformation is efficiently evaluated using a sub-linear algorithm. We prove that the density of the sampling depends on the total variation of the shape, allowing us to derive formal bounds on the algorithm’s complexity and approximation quality. We further investigate different volumetric shape representations (in the form of truncated distance transforms), and in such a way control the total variation of the shape and hence the sampling density and the runtime of the algorithm. A comprehensive set of experiments assesses the proposed method, including an evaluation on the eight categories of the COSEG data-set. This is the first large-scale evaluation of any symmetry detection technique that we are aware of.
R. Litman, A. M. Bronstein, M. M. Bronstein, U. Castellani, Supervised learning of bag-of-features shape descriptors using sparse coding, Computer Graphics Forum (CGF), Vol. 33(5), 2014
Supervised learning of bag-of-features shape descriptors using sparse coding
R. Litman, A. M. Bronstein, M. M. Bronstein, U. Castellani
Computer Graphics Forum (CGF), Vol. 33(5), 2014
We present a method for supervised learning of shape descriptors for shape retrieval applications. Many content-based shape retrieval approaches follow the bag-of-features (BoF) paradigm commonly used in text and image retrieval by first computing local shape descriptors, and then representing them in a ‘geometric dictionary’ using vector quantization. A major drawback of such approaches is that the dictionary is constructed in an unsupervised manner using clustering, unaware of the last stage of the process (pooling of the local descriptors into a BoF, and comparison of the latter using some metric). In this paper, we replace the clustering with dictionary learning, where every atom acts as a feature, followed by sparse coding and pooling to get the final BoF descriptor. Both the dictionary and the sparse codes can be learned in the supervised regime via bi-level optimization using a task-specific objective that promotes invariance desired in the specific application. We show significant performance improvement on several standard shape retrieval benchmarks.
O. Menashe, A. M. Bronstein, Real-time compressed imaging of scattering volumes, Proc. Int'l Conf. on Image Processing (ICIP), 2014
Real-time compressed imaging of scattering volumes
O. Menashe, A. M. BronsteinProc. Int'l Conf. on Image Processing (ICIP), 2014We propose a method and a prototype imaging system for real-time reconstruction of volumetric piecewise-smooth scattering media. The volume is illuminated by a sequence of structured binary patterns emitted from a fan beam projector, and the scattered light is collected by a two-dimensional sensor, thus creating an under-complete set of compressed measurements. We show a fixed-complexity and latency reconstruction algorithm capable of estimating the scattering coefficients in real-time. We also show a simple greedy algorithm for learning the optimal illumination patterns. Our results demonstrate faithful reconstruction from highly compressed measurements. Furthermore, a method for compressed registration of the measured volume to a known template is presented, showing excellent alignment with just a single projection. Though our prototype system operates in visible light, the presented methodology is suitable for fast x-ray scattering imaging, in particular in real-time vascular medical imaging.
S. Biasotti, A. Cerri, A. M. Bronstein, M. M. Bronstein, Quantifying 3D shape similarity using maps: Recent trends, applications and perspectives, Proc. EUROGRAPHICS STARS, 2014
Quantifying 3D shape similarity using maps: Recent trends, applications and perspectives
S. Biasotti, A. Cerri, A. M. Bronstein, M. M. Bronstein
Proc. EUROGRAPHICS STARS, 2014
Shape similarity is an acute issue in Computer Vision and Computer Graphics that involves many aspects of human perception of the real world, including judged and perceived similarity concepts, deterministic and probabilistic decisions and their formalization. 3D models carry multiple kinds of information with them (e.g., geometry, topology, texture, time evolution, appearance), which can be thought of as the filter that drives the recognition process. Assessing and quantifying the similarity between 3D shapes is necessary to explore large datasets of shapes, and to tune the analysis framework to the user's needs. Much effort has been made in this direction, including several attempts to formalize suitable notions of similarity and distance among 3D objects and their shapes. In recent years, 3D shape analysis has seen rapidly growing interest in a number of challenging issues, ranging from deformable shape similarity to partial matching and view-point selection. In this panorama, we focus on methods which quantify shape similarity (between two objects and sets of models) and compare these shapes in terms of their properties (i.e., global and local, geometric, differential and topological) conveyed by (sets of) maps. After presenting in detail the theoretical foundations underlying these methods, we review their usage in a number of 3D shape application domains, ranging from matching and retrieval to annotation and segmentation. Particular emphasis will be given to analyzing the suitability of the different methods for specific classes of shapes (e.g. rigid or isometric shapes), as well as the flexibility of the various methods at the different stages of the shape comparison process. Finally, the most promising directions for future research developments are discussed.
J. Masci, M. M. Bronstein, A. M. Bronstein, J. Schmidhuber, Multimodal similarity-preserving hashing, IEEE Trans. Pattern Analysis and Machine Intelligence (PAMI), Vol. 36(4), 2014
Multimodal similarity-preserving hashing
J. Masci, M. M. Bronstein, A. M. Bronstein, J. SchmidhuberIEEE Trans. Pattern Analysis and Machine Intelligence (PAMI), Vol. 36(4), 2014We introduce an efficient computational framework for hashing data belonging to multiple modalities into a single representation space where they become mutually comparable. The proposed approach is based on a novel coupled siamese neural network architecture and allows unified treatment of intra- and inter-modality similarity learning. Unlike existing cross-modality similarity learning approaches, our hashing functions are not limited to binarized linear projections and can assume arbitrarily complex forms. We show experimentally that our method significantly outperforms state-of-the-art hashing approaches on multimedia retrieval tasks.
J. Masci, A. M. Bronstein, M. M. Bronstein, P. Sprechmann, G. Sapiro, Sparse similarity-preserving hashing, Proc. Int'l Conf. on Learning Representations (ICLR), 2014
Sparse similarity-preserving hashing
J. Masci, A. M. Bronstein, M. M. Bronstein, P. Sprechmann, G. Sapiro
Proc. Int'l Conf. on Learning Representations (ICLR), 2014
In recent years, a lot of attention has been devoted to efficient nearest neighbor search by means of similarity-preserving hashing. One of the plights of existing hashing techniques is the intrinsic trade-off between performance and computational complexity: while longer hash codes allow for lower false positive rates, it is very difficult to increase the embedding dimensionality without incurring very high false negative rates or prohibitive computational costs. In this paper, we propose a way to overcome this limitation by enforcing the hash codes to be sparse. Sparse high-dimensional codes enjoy the low false positive rates typical of long hashes, while keeping the false negative rates similar to those of a shorter dense hashing scheme with an equal number of degrees of freedom. We use a tailored feed-forward neural network for the hashing function. Extensive experimental evaluation involving visual and multi-modal data shows the benefits of the proposed method.
D. Raviv, A. M. Bronstein, M. M. Bronstein, R. Kimmel, N. Sochen, Equi-affine invariant intrinsic geometries for bendable shapes analysis, Journal of Mathematical Imaging and Vision (JMIV), Vol. 50(1), 2014
Equi-affine invariant intrinsic geometries for bendable shapes analysis
D. Raviv, A. M. Bronstein, M. M. Bronstein, R. Kimmel, N. Sochen
Journal of Mathematical Imaging and Vision (JMIV), Vol. 50(1), 2014
Traditional models of bendable surfaces are based on the exact or approximate invariance to deformations that do not tear or stretch the shape, leaving intact an intrinsic geometry associated with it. Intrinsic geometries are typically defined using either the shortest path length (geodesic distance), or properties of heat diffusion (diffusion distance) on the surface. Both ways are implicitly derived from the metric induced by the ambient Euclidean space. In this paper, we depart from this restrictive assumption by observing that a different choice of the metric results in a richer set of geometric invariants. We extend the classic equi-affine arclength, defined on convex surfaces, to arbitrary shapes with non-vanishing Gaussian curvature. As a result, a family of affine-invariant intrinsic geometries is obtained. The potential of this novel framework is explored in a wide range of applications such as shape matching and retrieval, symmetry detection, and computation of Voronoi tessellation. We show that in some shape analysis tasks, our affine-invariant intrinsic geometries often outperform their Euclidean-based counterparts.
D. Pickup, X. Sun, P. L. Rosin, R. R. Martin, Z. Cheng, Z. Lian, M. Aono, A. Ben Hamza, A. M. Bronstein, M. M. Bronstein, S. Bu, U. Castellani, S. Cheng, V. Garro, A. Giachetti, A. Godil, J. Han, H. Johan, L. Lai, B. Li, C. Li, H. Li, R. Litman, X. Liu, Z. Liu, Y. Lu, A. Tatsuma, J. Ye, Shape retrieval of non-rigid 3D human models, Proc. EUROGRAPHICS Workshop on 3D Object Retrieval (3DOR), 2014
Shape retrieval of non-rigid 3D human models
D. Pickup, X. Sun, P. L. Rosin, R. R. Martin, Z. Cheng, Z. Lian, M. Aono, A. Ben Hamza, A. M. Bronstein, M. M. Bronstein, S. Bu, U. Castellani, S. Cheng, V. Garro, A. Giachetti, A. Godil, J. Han, H. Johan, L. Lai, B. Li, C. Li, H. Li, R. Litman, X. Liu, Z. Liu, Y. Lu, A. Tatsuma, J. YeProc. EUROGRAPHICS Workshop on 3D Object Retrieval (3DOR), 2014We have created a new dataset for non-rigid 3D shape retrieval, one that is much more challenging than existing datasets. Our dataset features exclusively human models, in a variety of body shapes and poses. 3D models of humans are commonly used within computer graphics and vision, therefore the ability to distinguish between body shapes is an important feature for shape retrieval methods. In this track nine groups have submitted the results of a total of 22 different methods which have been tested on our new dataset.
- O. Litany, T. Remez, A. M. Bronstein, Image reconstruction from dense binary pixels, arXiv:1512.01774, 2015
D. Eynard, A. Kovnatsky, M. M. Bronstein, K. Glashoff, A. M. Bronstein, Multimodal manifold analysis using simultaneous diagonalization of Laplacians, IEEE Trans. Pattern Analysis and Machine Intelligence (PAMI), Vol. 37(12), 2015
Multimodal manifold analysis using simultaneous diagonalization of Laplacians
D. Eynard, A. Kovnatsky, M. M. Bronstein, K. Glashoff, A. M. Bronstein
IEEE Trans. Pattern Analysis and Machine Intelligence (PAMI), Vol. 37(12), 2015
We construct an extension of spectral and diffusion geometry to multiple modalities through simultaneous diagonalization of Laplacian matrices. This naturally extends classical data analysis tools based on spectral geometry, such as diffusion maps and spectral clustering. We provide several synthetic and real examples of manifold learning, retrieval, and clustering demonstrating that the joint spectral geometry frequently better captures the inherent structure of multi-modal data. We also show the relation of many previous approaches to multimodal manifold analysis to our framework, of which they can be seen as particular cases.
T. Remez, O. Litany, A. M. Bronstein, A Picture is Worth a Billion Bits: Real-time image reconstruction from dense binary pixels, arXiv:1510.04601, 2015
A Picture is Worth a Billion Bits: Real-time image reconstruction from dense binary pixels
T. Remez, O. Litany, A. M. Bronstein
arXiv:1510.04601, 2015
The pursuit of smaller pixel sizes at ever-increasing resolution in digital image sensors is mainly driven by the stringent price and form-factor requirements of sensors and optics in the cellular phone market. Recently, Eric Fossum proposed a novel concept of an image sensor with dense sub-diffraction limit one-bit pixels (jots), which can be considered a digital emulation of silver halide photographic film. This idea has been recently embodied as the EPFL Gigavision camera. A major bottleneck in the design of such sensors is the image reconstruction process, producing a continuous high dynamic range image from oversampled binary measurements. The extreme quantization of the Poisson statistics is incompatible with the assumptions of most standard image processing and enhancement frameworks. The recently proposed maximum-likelihood (ML) approach addresses this difficulty, but suffers from image artifacts and has impractically high computational complexity. In this work, we study a variant of a sensor with binary threshold pixels and propose a reconstruction algorithm combining an ML data fitting term with a sparse synthesis prior. We also show an efficient hardware-friendly real-time approximation of this inverse operator. Promising results are shown on synthetic data as well as on HDR data emulated using multiple exposures of a regular CMOS sensor.
A. M. Bronstein, New dimensions of media, Universidad La Salle, Revista de ciencias de la computación, Vol. 3(1), 2015
H. Haim, A. M. Bronstein, E. Marom, Computational all-in-focus imaging using an optical phase mask, OSA Optics Express, Vol. 23(19), 2015
Computational all-in-focus imaging using an optical phase mask
H. Haim, A. M. Bronstein, E. Marom
OSA Optics Express, Vol. 23(19), 2015
A method for extended depth of field imaging based on image acquisition through a thin binary phase plate followed by fast automatic computational post-processing is presented. By placing a wavelength dependent optical mask inside the pupil of a conventional camera lens, one acquires a unique response for each of the three main color channels, which adds valuable information that allows blind reconstruction of blurred images without the need of an iterative search process for estimating the blurring kernel. The presented simulation, as well as the capture of a real-life scene, shows how acquiring a one-shot image focused at a single plane enables generating a de-blurred scene over an extended range in space.
R. Litman, S. Korman, A. M. Bronstein, S. Avidan, GMD: Global model detection via inlier rate estimation, Proc. Computer Vision and Pattern Recognition (CVPR), 2015
GMD: Global model detection via inlier rate estimation
R. Litman, S. Korman, A. M. Bronstein, S. AvidanProc. Computer Vision and Pattern Recognition (CVPR), 2015This work presents a novel approach for detecting inliers in a given set of correspondences (matches). It does so without explicitly identifying any consensus set, based on a method for inlier rate estimation (IRE). Given such an estimator for the inlier rate, we also present an algorithm that detects a globally optimal transformation. We provide a theoretical analysis of the IRE method using a stochastic generative model on the continuous spaces of matches and transformations. This model allows rigorous investigation of the limits of our IRE method for the case of 2D translation, further giving bounds and insights for the more general case. Our theoretical analysis is validated empirically and is shown to hold in practice for the more general case of 2D affinities. In addition, we show that the combined framework works on challenging cases of 2D homography estimation, with very few and possibly noisy inliers, where RANSAC generally fails.
I. Sipiran, B. Bustos, T. Schreck, A. M. Bronstein, M. M. Bronstein, U. Castellani, S. Choi, L. Lai, H. Li, R. Litman, L. Sun, SHREC'15 Track: Scalability of non-rigid 3D shape retrieval, Proc. EUROGRAPHICS Workshop on 3D Object Retrieval (3DOR), 2015
SHREC'15 Track: Scalability of non-rigid 3D shape retrieval
I. Sipiran, B. Bustos, T. Schreck, A. M. Bronstein, M. M. Bronstein, U. Castellani, S. Choi, L. Lai, H. Li, R. Litman, L. Sun
Proc. EUROGRAPHICS Workshop on 3D Object Retrieval (3DOR), 2015
Due to recent advances in 3D acquisition and modeling, increasingly large amounts of 3D shape data are becoming available in many application domains. This raises the need not only for effective methods for 3D shape retrieval, but also for efficient retrieval and robust implementations. Previous 3D retrieval challenges have mainly considered data sets in the range of a few thousand queries. In the 2015 SHREC track on Scalability of 3D Shape Retrieval we provide a benchmark with more than 96 thousand shapes. The data set is based on a non-rigid retrieval benchmark enhanced by other existing shape benchmarks. From the baseline models, a large set of partial objects were automatically created by simulating a range-image acquisition process. Four teams have participated in the track, with most methods providing very good to near-perfect retrieval results, and one less complex baseline method providing fair performance. Timing results indicate that three of the methods, including the latter baseline one, provide near-interactive time query execution. Generally, the cost of data pre-processing varies depending on the method.
X. Bian, H. Krim, A. M. Bronstein, L. Dai, Sparse null space basis pursuit and analysis dictionary learning for high-dimensional data analysis, Proc. Int'l Conf. on Acoustics, Speech, and Signal Processing (ICASSP), 2015
Sparse null space basis pursuit and analysis dictionary learning for high-dimensional data analysis
X. Bian, H. Krim, A. M. Bronstein, L. Dai
Proc. Int'l Conf. on Acoustics, Speech, and Signal Processing (ICASSP), 2015
Sparse models in dictionary learning have been successfully applied in a wide variety of machine learning and computer vision problems, and have also recently been of increasing research interest. Another interesting related problem based on a linear equality constraint, namely the sparse null space problem (SNS), first appeared in 1986, and has since inspired results on sparse basis pursuit. In this paper, we investigate the relation between the SNS problem and the analysis dictionary learning problem, and show that the SNS problem plays a central role, and may be utilized to solve dictionary learning problems. Moreover, we propose an efficient algorithm of sparse null space basis pursuit, and extend it to a solution of analysis dictionary learning. Experimental results on numerical synthetic data and real-world data are further presented to validate the performance of our method.
Y. Aflalo, A. M. Bronstein, R. Kimmel, On convex relaxation of graph isomorphism, Proc. US National Academy of Sciences (PNAS), 2015
On convex relaxation of graph isomorphism
Y. Aflalo, A. M. Bronstein, R. Kimmel
Proc. US National Academy of Sciences (PNAS), 2015
We consider the problem of exact and inexact matching of weighted undirected graphs, in which a bijective correspondence is sought to minimize a quadratic weight disagreement. This computationally challenging problem is often relaxed as a convex quadratic program, in which the space of permutations is replaced by the space of doubly stochastic matrices. However, the applicability of such a relaxation is poorly understood. We define a broad class of friendly graphs characterized by an easily verifiable spectral property. We prove that for friendly graphs, the convex relaxation is guaranteed to find the exact isomorphism or certify its inexistence. This result is further extended to approximately isomorphic graphs, for which we develop an explicit bound on the amount of weight disagreement under which the relaxation is guaranteed to find the globally optimal approximate isomorphism. We also show that in many cases, the graph matching problem can be further harmlessly relaxed to a convex quadratic program with only n separable linear equality constraints, which is substantially more efficient than the standard relaxation involving 2n equality and n² inequality constraints. Finally, we show that our results are still valid for unfriendly graphs if additional information in the form of seeds or attributes is allowed, with the latter satisfying an easy to verify spectral characteristic.
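For orientation, the standard relaxation mentioned in this abstract is commonly written as below. This is a sketch in conventional notation, not a formula taken from the entry: the symbols A and B (weighted adjacency matrices of the two graphs), P (the correspondence matrix) and D (the set of doubly stochastic matrices) are assumptions introduced here.

```latex
\min_{P \in \mathcal{D}} \; \| A P - P B \|_F^2 ,
\qquad
\mathcal{D} = \left\{ P \in \mathbb{R}^{n \times n} \;:\;
P \mathbf{1} = \mathbf{1}, \;\;
P^{\top} \mathbf{1} = \mathbf{1}, \;\;
P \ge 0 \right\}
```

The exact matching problem restricts P to permutation matrices; replacing that set by D gives a convex quadratic program. The row-sum and column-sum conditions account for the 2n equality constraints, and entrywise non-negativity for the $n^2$ inequality constraints.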
P. Sprechmann, A. M. Bronstein, G. Sapiro, Learning efficient sparse and low-rank models, IEEE Trans. Pattern Analysis and Machine Intelligence (PAMI), Vol. 37(9), 2015
Learning efficient sparse and low-rank models
P. Sprechmann, A. M. Bronstein, G. Sapiro
IEEE Trans. Pattern Analysis and Machine Intelligence (PAMI), Vol. 37(9), 2015
Parsimony, including sparsity and low rank, has been shown to successfully model data in numerous machine learning and signal processing tasks. Traditionally, parsimonious modeling approaches rely on an iterative algorithm that minimizes an objective function with parsimony-promoting terms. The inherently sequential structure and data-dependent complexity and latency of iterative optimization constitute a major limitation in many applications requiring real-time performance or involving large-scale data. Another limitation encountered by these models is the difficulty of their inclusion in supervised learning scenarios, where the higher-level training objective would depend on the solution of the lower-level pursuit problem. The resulting bilevel optimization problems are in general notoriously difficult to solve. In this paper, we propose to move the emphasis from the model to the pursuit algorithm, and develop a process-centric view of parsimonious modeling, in which a deterministic fixed-complexity pursuit process is used in lieu of iterative optimization. We show a principled way to construct learnable pursuit process architectures for structured sparse and robust low rank models from the iteration of proximal descent algorithms. These architectures approximate the exact parsimonious representation with a fraction of the complexity of the standard optimization methods. We also show that carefully chosen training regimes allow parsimonious models to be naturally extended to discriminative settings. State-of-the-art results are demonstrated on several challenging problems in image and audio processing with several orders of magnitude speedup compared to the exact optimization algorithms.
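To make the idea of a fixed-complexity pursuit process concrete, here is a minimal sketch of a proximal-descent (ISTA) iteration truncated to a fixed number of steps. It is plain, untrained ISTA for an l1-regularized least-squares problem, not the learned architecture of the paper; in a learnable variant the matrices and thresholds would presumably become trained parameters. The names `unrolled_ista`, `step` and `n_layers` are illustrative assumptions.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |.|, applied entrywise."""
    return [max(abs(a) - t, 0.0) * (1.0 if a > 0 else -1.0) for a in v]


def matvec(M, v):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(m * a for m, a in zip(row, v)) for row in M]


def unrolled_ista(x, D, lam, step, n_layers):
    """Run exactly n_layers proximal-gradient steps for
    min_z 0.5 * ||x - D z||^2 + lam * ||z||_1, starting from z = 0.

    The cost is fixed by n_layers, independent of the data.
    `step` should not exceed 1 / ||D||_2^2 (assumed, not checked here).
    """
    Dt = [list(col) for col in zip(*D)]  # transpose of D
    z = [0.0] * len(Dt)
    for _ in range(n_layers):
        residual = [xi - yi for xi, yi in zip(x, matvec(D, z))]
        grad_step = [zi + step * gi for zi, gi in zip(z, matvec(Dt, residual))]
        z = soft_threshold(grad_step, step * lam)
    return z
```

With an identity dictionary the iteration reduces to a single soft-thresholding of the input, which makes a convenient sanity check.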
P. Sprechmann, A. M. Bronstein, G. Sapiro, Supervised non-negative matrix factorization for audio source separation, Chapter in Excursions in Harmonic Analysis (R. Balan, M. Begue, J. J. Benedetto, W. Czaja, K. Okoudjou Eds.), Birkhaeuser, 2015
Supervised non-negative matrix factorization for audio source separation
P. Sprechmann, A. M. Bronstein, G. Sapiro
Chapter in Excursions in Harmonic Analysis (R. Balan, M. Begue, J. J. Benedetto, W. Czaja, K. Okoudjou Eds.), Birkhaeuser, 2015
Source separation is a widely studied problem in signal processing. Despite the steady progress reported in the literature, it is still considered a significant challenge. This chapter first reviews the use of non-negative matrix factorization (NMF) algorithms for solving source separation problems, and proposes a new approach to supervised training in NMF. Matrix factorization methods have received a lot of attention in recent years in the audio processing community, producing particularly good results in source separation. Traditionally, NMF algorithms consist of two separate stages: a training stage, in which a generative model is learned; and a testing stage in which the pre-learned model is used in a high level task such as enhancement, separation, or classification. As an alternative, we propose a task-supervised NMF method for the adaptation of the basis spectra learned in the first stage to enhance the performance on the specific task used in the second stage. We cast this problem as a bilevel optimization program efficiently solved via stochastic gradient descent. The proposed approach is general enough to handle sparsity priors of the activations, and allow non-Euclidean data terms such as beta-divergences. The framework is evaluated on speech enhancement.
- Y. Choukroun, A. Shtern, A. M. Bronstein, R. Kimmel, Hamiltonian operator for spectral shape analysis, arXiv:1611.01990, 2016
Hamiltonian operator for spectral shape analysis
Y. Choukroun, A. Shtern, A. M. Bronstein, R. Kimmel
arXiv:1611.01990, 2016
Many shape analysis methods treat the geometry of an object as a metric space that can be captured by the Laplace-Beltrami operator. In this paper, we propose to adapt the classical Hamiltonian operator from quantum mechanics to the field of shape analysis. To this end we study the addition of a potential function to the Laplacian as a generator for dual spaces in which shape processing is performed. We present a general optimization approach for solving variational problems involving the basis defined by the Hamiltonian using perturbation theory for its eigenvectors. The suggested operator is shown to produce better functional spaces to operate with, as demonstrated on different shape analysis tasks.
A. M. Bronstein, Y. Choukroun, R. Kimmel, M. Sela, Consistent discretization and minimization of the L1 norm on manifolds, Proc. 3D Vision (3DV), 2016
Consistent discretization and minimization of the L1 norm on manifolds
A. M. Bronstein, Y. Choukroun, R. Kimmel, M. Sela
Proc. 3D Vision (3DV), 2016
The L1 norm has been tremendously popular in signal and image processing in the past two decades due to its sparsity-promoting properties. More recently, its generalization to non-Euclidean domains has been found useful in shape analysis applications. For example, in conjunction with the minimization of the Dirichlet energy, it was shown to produce a compactly supported quasi-harmonic orthonormal basis, dubbed compressed manifold modes. The continuous L1 norm on the manifold is often replaced by the vector l1 norm applied to sampled functions. We show that such an approach is incorrect in the sense that it does not consistently discretize the continuous norm and warn against its sensitivity to the specific sampling. We propose two alternative discretizations resulting in an iteratively reweighted l2 norm. We demonstrate the proposed strategy on the compressed modes problem, which reduces to a sequence of simple eigendecomposition problems not requiring non-convex optimization on Stiefel manifolds and producing more stable and accurate results.
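The sampling sensitivity that this abstract warns about can be illustrated with a small sketch. It uses a 1D interval as a stand-in for a manifold and cell lengths as a stand-in for vertex area elements; both the setup and the function names are assumptions made for illustration, and the weighted sum shown is a generic quadrature, not either of the two discretizations proposed in the paper.

```python
def l1_unweighted(values):
    """Plain vector l1 norm of the sampled function."""
    return sum(abs(v) for v in values)


def l1_weighted(values, weights):
    """Quadrature-style l1 norm: each sample is weighted by the
    length (in 2D, the area) of the cell it represents."""
    return sum(w * abs(v) for v, w in zip(values, weights))


def sample_interval(n):
    """Midpoints and cell lengths of n equal cells on [0, 1]."""
    h = 1.0 / n
    return [(i + 0.5) * h for i in range(n)], [h] * n


def f(x):
    """Test function whose continuous L1 norm on [0, 1] is 0.25."""
    return x - 0.5


for n in (10, 100):
    points, lengths = sample_interval(n)
    samples = [f(p) for p in points]
    print(n, l1_unweighted(samples), l1_weighted(samples, lengths))
```

The unweighted sum grows with the number of samples, while the weighted sum stays close to the continuous value of 0.25 at both resolutions.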
R. Litman, A. M. Bronstein, SpectroMeter: Amortized sublinear spectral approximation of distance on graphs, Proc. 3D Vision (3DV), 2016
SpectroMeter: Amortized sublinear spectral approximation of distance on graphs
R. Litman, A. M. BronsteinProc. 3D Vision (3DV), 2016We present a method to approximate pairwise distance on a graph, having an amortized sub-linear complexity in its size. The proposed method follows the so-called heat method due to Crane et al. The only additional input is the values of the eigenfunctions of the graph Laplacian at a subset of the vertices. Using these values we estimate a random walk from the source points, and normalize the result into a unit gradient function. The eigenfunctions are then used to synthesize distance values abiding by these constraints at desired locations. We show that this method works in practice on different types of inputs ranging from triangular meshes to general graphs. We also demonstrate that the resulting approximate distance is accurate enough to be used as the input to a recent method for intrinsic shape correspondence computation.
T. Remez, O. Litany, S. Yoseff, H. Haim, A. M. Bronstein, FPGA system for real-time computational extended depth of field imaging using phase aperture coding, arXiv:1608.01074, 2016
FPGA system for real-time computational extended depth of field imaging using phase aperture coding
T. Remez, O. Litany, S. Yoseff, H. Haim, A. M. Bronstein
arXiv:1608.01074, 2016
We present a proof-of-concept end-to-end system for computational extended depth of field (EDOF) imaging. The acquisition is performed through a phase-coded aperture implemented by placing a thin wavelength-dependent optical mask inside the pupil of a conventional camera lens, as a result of which each color channel is focused at a different depth. The reconstruction process receives the raw Bayer image as the input, and performs blind estimation of the output color image in focus at an extended range of depths using a patch-wise sparse prior. We present a fast non-iterative reconstruction algorithm operating with constant latency in fixed-point arithmetic and achieving real-time performance in a prototype FPGA implementation. The output of the system, on simulated and real-life scenes, is qualitatively and quantitatively better than the result of clear-aperture imaging followed by state-of-the-art blind deblurring.
R. Giryes, G. Sapiro, A. M. Bronstein, Deep neural networks with random Gaussian weights: A universal classification strategy?, IEEE Trans. Signal Processing, Vol. 64(13), 2016
Deep neural networks with random Gaussian weights: A universal classification strategy?
R. Giryes, G. Sapiro, A. M. BronsteinIEEE Trans. Signal Processing, Vol. 64(13), 2016Three important properties of a classification machinery are: (i) the system preserves the important information of the input data; (ii) the training examples convey information for unseen data; and (iii) the system is able to treat differently points from different classes. In this work, we show that these fundamental properties are inherited by the architecture of deep neural networks. We formally prove that these networks with random Gaussian weights perform a distance-preserving embedding of the data, with a special treatment for in-class and out-of-class data. Similar points at the input of the network are likely to have the same The theoretical analysis of deep networks here presented exploits tools used in the compressed sensing and dictionary learning literature, thereby making a formal connection between these important topics. The derived results allow drawing conclusions on the metric learning properties of the network and their relation to its structure; and provide bounds on the required size of the training set such that the training examples would represent faithfully the unseen data. The results are validated with state-of-the-art trained networks.
O. Litany, E. Rodolà, A. M. Bronstein, M. M. Bronstein, D. Cremers, Non-rigid puzzles, Computer Graphics Forum, Vol. 35(5), 2016 (SGP Best Paper Award)
Non-rigid puzzles
O. Litany, E. Rodolà, A. M. Bronstein, M. M. Bronstein, D. CremersComputer Graphics Forum, Vol. 35(5), 2016 (SGP Best Paper Award)Shape correspondence is a fundamental problem in computer graphics and vision, with applications in various problems including animation, texture mapping, robotic vision, medical imaging, archaeology and many more. In settings where the shapes are allowed to undergo non-rigid deformations and only partial views are available, the problem becomes very challenging. To this end, we present a non-rigid multi-part shape matching algorithm. We assume to be given a reference shape and its multiple parts undergoing a non-rigid deformation. Each of these query parts can be additionally contaminated by clutter, may overlap with other parts, and there might be missing parts or redundant ones. Our method simultaneously solves for the segmentation of the reference model, and for a dense correspondence to (subsets of) the parts. Experimental results on synthetic as well as real scans demonstrate the effectiveness of our method in dealing with this challenging matching scenario.
X. Bian, H. Krim, A. M. Bronstein, L. Dai, Sparsity and nullity: paradigms for analysis dictionary learning, SIAM J. Imaging Sci., Vol. 9(3), 2016
Sparsity and nullity: paradigms for analysis dictionary learning
X. Bian, H. Krim, A. M. Bronstein, L. DaiSIAM J. Imaging Sci., Vol. 9(3), 2016Sparse models in dictionary learning have been successfully applied in a wide variety of machine learning and computer vision problems, and as a result, have recently attracted increased research interest. Another interesting related problem based on linear equality constraints, namely the sparse null space (SNS) problem, first appeared in 1986 and has since inspired results on sparse basis pursuit. In this paper, we investigate the relation between the SNS problem and the analysis dictionary learning (ADL) problem, and show that the SNS problem plays a central role, and may be utilized to solve dictionary learning problems. Moreover, we propose an efficient algorithm of sparse null space basis pursuit (SNS-BP) and extend it to a solution of ADL. Experimental results on numerical synthetic data and real-world data are further presented to validate the performance of our method.
D. Pickup, X. Sun, P. L. Rosin, R. R. Martin, Z. Cheng, Z. Lian, M. Aono, A. Ben Hamza, A. M. Bronstein, M. M. Bronstein, S. Bu, U. Castellani, S. Cheng, V. Garro, A. Giachetti, A. Godil, J. Han, H. Johan, L. Lai, B. Li, C. Li, H. Li, R. Litman, X. Liu, Z. Liu, Y. Lu, A. Tatsuma, J. Ye, Shape retrieval of non-rigid 3D human models, Intl. Journal of Computer Vision (IJCV), 2016 detailsShape retrieval of non-rigid 3D human models
D. Pickup, X. Sun, P. L. Rosin, R. R. Martin, Z. Cheng, Z. Lian, M. Aono, A. Ben Hamza, A. M. Bronstein, M. M. Bronstein, S. Bu, U. Castellani, S. Cheng, V. Garro, A. Giachetti, A. Godil, J. Han, H. Johan, L. Lai, B. Li, C. Li, H. Li, R. Litman, X. Liu, Z. Liu, Y. Lu, A. Tatsuma, J. Ye
Intl. Journal of Computer Vision (IJCV), 2016
3D models of humans are commonly used within computer graphics and vision, and so the ability to distinguish between body shapes is an important shape retrieval problem. We extend our recent paper which provided a benchmark for testing non-rigid 3D shape retrieval algorithms on 3D human models. This benchmark provided a far stricter challenge than previous shape benchmarks. We have added 145 new models for use as a separate training set, in order to standardise the training data used and provide a fairer comparison. We have also included experiments with the FAUST dataset of human scans. All participants of the previous benchmark study have taken part in the new tests reported here, many providing updated results using the new data. In addition, further participants have also taken part, and we provide extra analysis of the retrieval results. A total of 25 different shape retrieval methods are compared.
- S. Vedula, O. Senouf, A. M. Bronstein, O. V. Michailovich, M. Zibulevsky, Towards CT-quality ultrasound imaging using deep learning, arXiv:1710.06304, 2017 details
Towards CT-quality ultrasound imaging using deep learning
S. Vedula, O. Senouf, A. M. Bronstein, O. V. Michailovich, M. Zibulevsky
arXiv:1710.06304, 2017
The cost-effectiveness and practical harmlessness of ultrasound imaging have made it one of the most widespread tools for medical diagnosis. Unfortunately, the beam-forming based image formation produces granular speckle noise, blurring, shading and other artifacts. To overcome these effects, the ultimate goal would be to reconstruct the tissue acoustic properties by solving a full wave propagation inverse problem. In this work, we make a step towards this goal, using Multi-Resolution Convolutional Neural Networks (CNN). As a result, we are able to reconstruct CT-quality images from the reflected ultrasound radio-frequency (RF) data obtained by simulation from real CT scans of a human body. We also show that CNN is able to imitate existing computationally heavy despeckling methods, thereby saving orders of magnitude in computations and making them amenable to real-time applications.
O. Litany, T. Remez, E. Rodolà, A. M. Bronstein, M. M. Bronstein, Deep Functional Maps: Structured prediction for dense shape correspondence, Proc. Int'l Conf. on Computer Vision (ICCV), 2017 detailsDeep Functional Maps: Structured prediction for dense shape correspondence
O. Litany, T. Remez, E. Rodolà, A. M. Bronstein, M. M. BronsteinProc. Int'l Conf. on Computer Vision (ICCV), 2017We introduce a new framework for learning dense correspondence between deformable 3D shapes. Existing learning based approaches model shape correspondence as a labelling problem, where each point of a query shape receives a label identifying a point on some reference domain; the correspondence is then constructed a posteriori by composing the label predictions of two input shapes. We propose a paradigm shift and design a structured prediction model in the space of functional maps, linear operators that provide a compact representation of the correspondence. We model the learning process via a deep residual network which takes dense descriptor fields defined on two shapes as input, and outputs a soft map between the two given objects. The resulting correspondence is shown to be accurate on several challenging benchmarks comprising multiple categories, synthetic models, real scans with acquisition artifacts, topological noise, and partiality.
Z. Laehner, M. Vestner, A. Boyarski, O. Litany, R. Slossberg, T. Remez, E. Rodolà, A. M. Bronstein, M. M. Bronstein, R. Kimmel, D. Cremers, Efficient deformable shape correspondence via kernel matching, Proc. 3D Vision (3DV), 2017 detailsEfficient deformable shape correspondence via kernel matching
Z. Laehner, M. Vestner, A. Boyarski, O. Litany, R. Slossberg, T. Remez, E. Rodolà, A. M. Bronstein, M. M. Bronstein, R. Kimmel, D. CremersProc. 3D Vision (3DV), 2017We present a method to match three dimensional shapes under non-isometric deformations, topology changes and partiality. We formulate the problem as matching between a set of pair-wise and point-wise descriptors, imposing a continuity prior on the mapping, and propose a projected descent optimization procedure inspired by difference of convex functions (DC) programming. Surprisingly, in spite of the highly non-convex nature of the resulting quadratic assignment problem, our method converges to a semantically meaningful and continuous mapping in most of our experiments, and scales well. We provide preliminary theoretical analysis and several interpretations of the method.
G. Alexandroni, Y. Podolsky, H. Greenspan, T. Remez, O. Litany, A. M. Bronstein, R. Giryes, White matter fiber representation using continuous dictionary learning, Proc. Int'l Conf. Medical Image Computing & Computer Assisted Intervention (MICCAI), 2017 detailsWhite matter fiber representation using continuous dictionary learning
G. Alexandroni, Y. Podolsky, H. Greenspan, T. Remez, O. Litany, A. M. Bronstein, R. GiryesProc. Int'l Conf. Medical Image Computing & Computer Assisted Intervention (MICCAI), 2017With increasingly sophisticated Diffusion Weighted MRI acquisition methods and modelling techniques, very large sets of streamlines (fibers) are presently generated per imaged brain. These reconstructions of white matter architecture, which are important for human brain research and pre-surgical planning, require a large amount of storage and are often unwieldy and difficult to manipulate and analyze. This work proposes a novel continuous parsimonious framework in which signals are sparsely represented in a dictionary with continuous atoms. The significant innovation in our new methodology is the ability to train such continuous dictionaries, unlike previous approaches that either used pre-fixed continuous transforms or training with finite atoms. This leads to an innovative fiber representation method, which uses Continuous Dictionary Learning to sparsely code each fiber with high accuracy. This method is tested on numerous tractograms produced from the Human Connectome Project data and achieves state-of-the-art performances in compression ratio and reconstruction error.
M. Vestner, R. Litman, E. Rodolà, A. M. Bronstein, D. Cremers, Product Manifold Filter: Non-rigid shape correspondence via kernel density estimation in the product space, Proc. Computer Vision and Pattern Recognition (CVPR), 2017 detailsProduct Manifold Filter: Non-rigid shape correspondence via kernel density estimation in the product space
M. Vestner, R. Litman, E. Rodolà, A. M. Bronstein, D. CremersProc. Computer Vision and Pattern Recognition (CVPR), 2017Many algorithms for the computation of correspondences between deformable shapes rely on some variant of nearest neighbor matching in a descriptor space. Such are, for example, various point-wise correspondence recovery algorithms used as a post-processing stage in the functional correspondence framework. Such frequently used techniques implicitly make restrictive assumptions (e.g., near-isometry) on the considered shapes and in practice suffer from a lack of accuracy and result in poor surjectivity. We propose an alternative recovery technique capable of guaranteeing a bijective correspondence and producing significantly higher accuracy and smoothness. Unlike other methods, our approach does not depend on the assumption that the analyzed shapes are isometric. We derive the proposed method from the statistical framework of kernel density estimation and demonstrate its performance on several challenging deformable 3D shape matching datasets.
O. Litany, E. Rodolà, A. M. Bronstein, M. M. Bronstein, Fully spectral partial shape matching, Computer Graphics Forum, Vol. 36(2), 2017 detailsFully spectral partial shape matching
O. Litany, E. Rodolà, A. M. Bronstein, M. M. BronsteinComputer Graphics Forum, Vol. 36(2), 2017We propose an efficient procedure for calculating partial dense intrinsic correspondence between deformable shapes performed entirely in the spectral domain. Our technique relies on the recently introduced partial functional maps formalism and on the joint approximate diagonalization (JAD) of the Laplace-Beltrami operators previously introduced for matching non-isometric shapes. We show that a variant of the JAD problem with an appropriately modified coupling term (surprisingly) allows to construct quasi-harmonic bases localized on the latent corresponding parts. This circumvents the need to explicitly compute the unknown parts by means of the cumbersome alternating minimization used in the previous approaches, and allows performing all the calculations in the spectral domain with constant complexity independent of the number of shape vertices. We provide an extensive evaluation of the proposed technique on standard non-rigid correspondence benchmarks and show state-of-the-art performance in various settings, including partiality and the presence of topological noise.
A. Boyarski, A. M. Bronstein, M. M. Bronstein, Subspace least squares multidimensional scaling, Proc. Scale Space and Variational Methods (SSVM), 2017 detailsSubspace least squares multidimensional scaling
A. Boyarski, A. M. Bronstein, M. M. BronsteinProc. Scale Space and Variational Methods (SSVM), 2017Multidimensional Scaling (MDS) is one of the most popular methods for dimensionality reduction and visualization of high dimensional data. Apart from these tasks, it also found applications in the field of geometry processing for the analysis and reconstruction of non-rigid shapes. In this regard, MDS can be thought of as a shape from metric algorithm, consisting of finding a configuration of points in the Euclidean space that realize, as isometrically as possible, some given distance structure. In the present work we cast the least squares variant of MDS (LS-MDS) in the spectral domain. This uncovers a multiresolution property of distance scaling which speeds up the optimization by a significant amount, while producing comparable, and sometimes even better, embeddings.
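For readers unfamiliar with LS-MDS, the quantity being minimized (usually called stress) can be written down in a few lines. The following is a minimal sketch of the plain, unweighted stress only; it does not reproduce the spectral formulation described in the abstract, and the function name `ls_mds_stress` and the toy data are illustrative assumptions.

```python
import math

def ls_mds_stress(points, dist):
    """Unweighted LS-MDS stress: sum over i < j of (||x_i - x_j|| - d_ij)^2."""
    n = len(points)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += (math.dist(points[i], points[j]) - dist[i][j]) ** 2
    return total

# Toy target distances: three collinear points spaced one unit apart.
target = [[0, 1, 2],
          [1, 0, 1],
          [2, 1, 0]]

exact = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]     # realizes the distances exactly
squashed = [(0.0, 0.0), (1.0, 0.0), (1.0, 0.0)]  # last two points collapsed

print(ls_mds_stress(exact, target))
print(ls_mds_stress(squashed, target))
```

An embedding that realizes the given distances exactly has zero stress; any distortion increases it, which is what an MDS solver tries to reduce.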
T. Remez, O. Litany, R. Giryes, A. M. Bronstein, Deep class-aware image denoising, Proc. Int'l Conf. on Image Processing (ICIP), 2017 detailsDeep class-aware image denoising
T. Remez, O. Litany, R. Giryes, A. M. BronsteinProc. Int'l Conf. on Image Processing (ICIP), 2017The increasing demand for high image quality in mobile devices brings forth the need for better computational enhancement techniques, and image denoising in particular. To this end, we propose a new fully convolutional deep neural network architecture which is simple yet powerful and achieves state-of-the-art performance for additive Gaussian noise removal. Furthermore, we claim that the personal photo-collections can usually be categorized into a small set of semantic classes. However simple, this observation has not been exploited in image denoising until now. We show that a significant boost in performance of up to 0.4dB PSNR can be achieved by making our network class-aware, namely, by fine-tuning it for images belonging to a specific semantic class. Relying on the hugely successful existing image classifiers, this research advocates for using a class-aware approach in all image enhancement tasks.
O. Litany, T. Remez, A. M. Bronstein, Cloud Dictionary: Sparse coding and modeling for point clouds, arXiv:1612.04956, 2017 detailsCloud Dictionary: Sparse coding and modeling for point clouds
O. Litany, T. Remez, A. M. BronsteinarXiv:1612.04956, 2017With the development of range sensors such as LIDAR and time-of-flight cameras, 3D point cloud scans have become ubiquitous in computer vision applications, the most prominent ones being gesture recognition and autonomous driving. Parsimony-based algorithms have shown great success on images and videos where data points are sampled on a regular Cartesian grid. We propose an adaptation of these techniques to irregularly sampled signals by using continuous dictionaries. We present an example application in the form of point cloud denoising.
T. Remez, O. Litany, R. Giryes, A. M. Bronstein, Deep class-aware denoising, arXiv:1701.01698, 2017 detailsDeep class-aware denoising
T. Remez, O. Litany, R. Giryes, A. M. Bronstein
arXiv:1701.01698, 2017
The increasing demand for high image quality in mobile devices brings forth the need for better computational enhancement techniques, and image denoising in particular. At the same time, the images captured by these devices can be categorized into a small set of semantic classes. However simple, this observation has not been exploited in image denoising until now. In this paper, we demonstrate how the reconstruction quality improves when a denoiser is aware of the type of content in the image. To this end, we first propose a new fully convolutional deep neural network architecture which is simple yet powerful as it achieves state-of-the-art performance even without being class-aware. We further show that a significant boost in performance of up to 0.4 dB PSNR can be achieved by making our network class-aware, namely, by fine-tuning it for images belonging to a specific semantic class. Relying on the hugely successful existing image classifiers, this research advocates for using a class-aware approach in all image enhancement tasks.
T. Remez, O. Litany, R. Giryes, A. M. Bronstein, Deep convolutional denoising of low-light images, arXiv:1701.01687, 2017 detailsDeep convolutional denoising of low-light images
T. Remez, O. Litany, R. Giryes, A. M. BronsteinarXiv:1701.01687, 2017Poisson distribution is used for modeling noise in photon-limited imaging. While canonical examples include relatively exotic types of sensing like spectral imaging or astronomy, the problem is relevant to regular photography now more than ever due to the booming market for mobile cameras. Restricted form factor limits the amount of absorbed light, thus computational post-processing is called for. In this paper, we make use of the powerful framework of deep convolutional neural networks for Poisson denoising. We demonstrate how by training the same network with images having a specific peak value, our denoiser outperforms previous state-of-the-art by a large margin both visually and quantitatively. Being flexible and data-driven, our solution resolves the heavy ad hoc engineering used in previous methods and is an order of magnitude faster. We further show that by adding a reasonable prior on the class of the image being processed, another significant boost in performance is achieved.
O. Litany, T. Remez, D. Freedman, L. Shapira, A. M. Bronstein, R. Gal, ASIST: Automatic Semantically Invariant Scene Transformation, Computer Vision and Image Understanding, Vol. 157, 2017 detailsASIST: Automatic Semantically Invariant Scene Transformation
O. Litany, T. Remez, D. Freedman, L. Shapira, A. M. Bronstein, R. GalComputer Vision and Image Understanding, Vol. 157, 2017We present ASIST, a technique for transforming point clouds by replacing objects with their semantically equivalent counterparts. Transformations of this kind have applications in virtual reality, repair of fused scans, and robotics. ASIST is based on a unified formulation of semantic labeling and object replacement; both result from minimizing a single objective. We present numerical tools for the efficient solution of this optimization problem. The method is experimentally assessed on new datasets of both synthetic and real point clouds, and is additionally compared to two recent works on object replacement on data from the corresponding papers.
M. Ovsjanikov, E. Corman, M. M. Bronstein, E. Rodolà, M. Ben-Chen, L. Guibas, F. Chazal, A. M. Bronstein, Computing and processing correspondences with functional maps, SIGGRAPH Courses, 2017 detailsComputing and processing correspondences with functional maps
M. Ovsjanikov, E. Corman, M. M. Bronstein, E. Rodolà, M. Ben-Chen, L. Guibas, F. Chazal, A. M. BronsteinSIGGRAPH Courses, 2017Notions of similarity and correspondence between geometric shapes and images are central to many tasks in geometry processing, computer vision, and computer graphics. The goal of this course is to familiarize the audience with a set of recent techniques that greatly facilitate the computation of mappings or correspondences between geometric datasets, such as 3D shapes or 2D images by formulating them as mappings between functions rather than points or triangles. Methods based on the functional map framework have recently led to state-of-the-art results in problems as diverse as non-rigid shape matching, image co-segmentation and even some aspects of tangent vector field design. One challenge in adopting these methods in practice, however, is that their exposition often assumes a significant amount of background in geometry processing, spectral methods and functional analysis, which can make it difficult to gain an intuition about their performance or about their applicability to real-life problems. In this course, we try to provide all the tools necessary to appreciate and use these techniques, while assuming very little background knowledge. We also give a unifying treatment of these techniques, which may be difficult to extract from the individual publications and, at the same time, hint at the generality of this point of view, which can help tackle many problems in the analysis and creation of visual content. This course is structured as a half day course. We will assume that the participants have knowledge of basic linear algebra and some knowledge of differential geometry, to the extent of being familiar with the concepts of a manifold and a tangent vector space. 
We will discuss in detail the functional approach to finding correspondences between non-rigid shapes, the design and analysis of tangent vector fields on surfaces, consistent map estimation in networks of shapes and applications to shape and image segmentation, shape variability analysis, and other areas.
- E. Schwartz, L. Karlinsky, J. Shtok, S. Harary, M. Marder, R. Feris, A. Kumar, R. Giryes, A. M. Bronstein, ∆-encoder: an effective sample synthesis method for few-shot object recognition, Proc. Neural Information Processing Systems (NIPS), 2018 details
∆-encoder: an effective sample synthesis method for few-shot object recognition
E. Schwartz, L. Karlinsky, J. Shtok, S. Harary, M. Marder, R. Feris, A. Kumar, R. Giryes, A. M. BronsteinProc. Neural Information Processing Systems (NIPS), 2018Learning to classify new categories based on just one or a few examples is a long-standing challenge in modern computer vision. In this work, we propose a simple yet effective method for few-shot (and one-shot) object recognition. Our approach is based on a modified auto-encoder, denoted ∆-encoder, that learns to synthesize new samples for an unseen category just by seeing few examples from it. The synthesized samples are then used to train a classifier. The proposed approach learns to both extract transferable intra-class deformations, or “deltas”, between same-class pairs of training examples, and to apply those deltas to the few provided examples of a novel class (unseen during training) in order to efficiently synthesize samples from that new class. The proposed method improves over the state-of-the-art in one-shot object-recognition and compares favorably in the few-shot case.
E. Rodolà, Z. Lähner, A. M. Bronstein, M. M. Bronstein, J. Solomon, Functional maps representation on product manifolds, arXiv:1809.10940, 2018 detailsFunctional maps representation on product manifolds
E. Rodolà, Z. Lähner, A. M. Bronstein, M. M. Bronstein, J. SolomonarXiv:1809.10940, 2018We consider the tasks of representing, analyzing and manipulating maps between shapes. We model maps as densities over the product manifold of the input shapes; these densities can be treated as scalar functions and therefore are manipulable using the language of signal processing on manifolds. Being a manifold itself, the product space endows the set of maps with a geometry of its own, which we exploit to define map operations in the spectral domain; we also derive relationships with other existing representations (soft maps and functional maps). To apply these ideas in practice, we discretize product manifolds and their Laplace-Beltrami operators, and we introduce localized spectral analysis of the product manifold as a novel tool for map processing. Our framework applies to maps defined between and across 2D and 3D shapes without requiring special adjustment, and it can be implemented efficiently with simple operations on sparse matrices.
C. Baskin, N. Liss, Y. Chai, E. Zheltonozhskii, E. Schwartz, R. Giryes, A. Mendelson, A. M. Bronstein, NICE: noise injection and clamping estimation for neural network quantization, arXiv:1810.00162, 2018 detailsNICE: noise injection and clamping estimation for neural network quantization
C. Baskin, N. Liss, Y. Chai, E. Zheltonozhskii, E. Schwartz, R. Giryes, A. Mendelson, A. M. Bronstein
arXiv:1810.00162, 2018
Convolutional Neural Networks (CNN) are very popular in many fields including computer vision, speech recognition, natural language processing, to name a few. Though deep learning leads to groundbreaking performance in these domains, the networks used are very demanding computationally and are far from real-time even on a GPU, which is not power efficient and therefore does not suit low power systems such as mobile devices. To overcome this challenge, some solutions have been proposed for quantizing the weights and activations of these networks, which accelerate the runtime significantly. Yet, this acceleration comes at the cost of a larger error. The NICE method proposed in this work trains quantized neural networks by noise injection and a learned clamping, which improve the accuracy. This leads to state-of-the-art results on various regression and classification tasks, e.g., ImageNet classification with architectures such as ResNet-18/34/50 with as low as 3-bit weights and activations. We implement the proposed solution on an FPGA to demonstrate its applicability for low power real-time applications.
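The two ingredients named in the abstract, clamping and noise injection, can be illustrated with a generic uniform quantizer. The sketch below is an assumption-laden illustration, not the scheme from the paper: the clamp value is a fixed argument here rather than a learned parameter, the noise model (uniform noise one quantization step wide) is a common stand-in for rounding error, and the names `quantize` and `noisy_surrogate` are hypothetical.

```python
import random

def quantize(x, clamp, bits):
    """Clamp x to [0, clamp] and round it to one of 2**bits uniform levels."""
    step = clamp / (2 ** bits - 1)
    x = min(max(x, 0.0), clamp)
    return round(x / step) * step

def noisy_surrogate(x, clamp, bits, rng):
    """Training-time stand-in (assumed): clamp, then add uniform noise
    one quantization step wide instead of rounding."""
    step = clamp / (2 ** bits - 1)
    x = min(max(x, 0.0), clamp)
    return x + rng.uniform(-step / 2, step / 2)

rng = random.Random(0)
print(quantize(0.3, 1.0, 2))            # nearest of the four levels 0, 1/3, 2/3, 1
print(noisy_surrogate(0.5, 1.0, 3, rng))
```

The noisy version stays within half a step of the clamped input, so it perturbs values by the same magnitude as rounding would while remaining smooth in `x`.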
Q. Qiu, J. Lezama, A. M. Bronstein, G. Sapiro, ForestHash: Semantic hashing with shallow random forests and tiny convolutional networks, Proc. European Conf. on Computer Vision (ECCV), 2018 detailsForestHash: Semantic hashing with shallow random forests and tiny convolutional networks
Q. Qiu, J. Lezama, A. M. Bronstein, G. Sapiro
Proc. European Conf. on Computer Vision (ECCV), 2018
Hash codes are efficient data representations for coping with the ever growing amounts of data. In this paper, we introduce a random forest semantic hashing scheme that embeds tiny convolutional neural networks (CNN) into shallow random forests, with near-optimal information-theoretic code aggregation among trees. We start with a simple hashing scheme, where random trees in a forest act as hashing functions by setting '1' for the visited tree leaf, and '0' for the rest. We show that traditional random forests fail to generate hashes that preserve the underlying similarity between the trees, rendering the random forests approach to hashing challenging. To address this, we propose to first randomly group arriving classes at each tree split node into two groups, obtaining a significantly simplified two-class classification problem, which can be handled using a light-weight CNN weak learner. Such random class grouping scheme enables code uniqueness by enforcing each class to share its code with different classes in different trees. A non-conventional low-rank loss is further adopted for the CNN weak learners to encourage code consistency by minimizing intra-class variations and maximizing inter-class distance for the two random class groups. Finally, we introduce an information-theoretic approach for aggregating codes of individual trees into a single hash code, producing a near-optimal unique hash for each class. The proposed approach significantly outperforms state-of-the-art hashing methods for image retrieval tasks on large-scale public datasets, and performs at the level of other state-of-the-art image classification techniques while utilizing a more compact and efficient scalable representation. This work proposes a principled and robust procedure to train and deploy in parallel an ensemble of light-weight CNNs, instead of simply going deeper.
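The "simple hashing scheme" that the abstract starts from can be sketched as follows. This covers only that starting point: each tree contributes a block with a 1 at the leaf the sample reached and 0 elsewhere. Concatenating the blocks into one code, the function names, and the toy leaf indices are assumptions made for illustration; the random class grouping, CNN weak learners, and information-theoretic aggregation are not shown.

```python
def forest_code(leaf_indices, leaves_per_tree):
    """Build one block per tree (1 at the visited leaf, 0 elsewhere)
    and concatenate the blocks into a single binary code."""
    code = []
    for leaf, n_leaves in zip(leaf_indices, leaves_per_tree):
        block = [0] * n_leaves
        block[leaf] = 1
        code.extend(block)
    return code

def hamming(a, b):
    """Number of positions where two equal-length codes differ."""
    return sum(x != y for x, y in zip(a, b))

# Two samples routed through a hypothetical forest of three 4-leaf trees.
code_a = forest_code([2, 0, 1], [4, 4, 4])
code_b = forest_code([2, 3, 1], [4, 4, 4])
print(code_a)
print(hamming(code_a, code_b))
```

Two samples that land in the same leaf of a tree agree on that tree's block, and each disagreeing tree adds 2 to the Hamming distance.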
T. Remez, O. Litany, R. Giryes, A. M. Bronstein, Class-aware fully-convolutional Gaussian and Poisson denoising, IEEE Trans. Image Processing, Vol. 27(11), 2018 detailsClass-aware fully-convolutional Gaussian and Poisson denoising
T. Remez, O. Litany, R. Giryes, A. M. BronsteinIEEE Trans. Image Processing, Vol. 27(11), 2018We propose a fully-convolutional neural-network architecture for image denoising which is simple yet powerful. Its structure allows to exploit the gradual nature of the denoising process, in which shallow layers handle local noise statistics, while deeper layers recover edges and enhance textures. Our method advances the state-of-the-art when trained for different noise levels and distributions (both Gaussian and Poisson). In addition, we show that making the denoiser class-aware by exploiting semantic class information boosts performance, enhances textures and reduces artifacts.
A. Tsitsulin, D. Mottin, P. Karras, A. M. Bronstein, E. Mueller, NetLSD: Hearing the shape of a graph, Proc. ACM Special Interest Group on Knowledge Discovery and Data Mining (SIGKDD), 2018 details
NetLSD: Hearing the shape of a graph
A. Tsitsulin, D. Mottin, P. Karras, A. M. Bronstein, E. Mueller
Proc. ACM Special Interest Group on Knowledge Discovery and Data Mining (SIGKDD), 2018
Comparison among graphs is ubiquitous in graph analytics. However, it is a hard task in terms of the expressiveness of the employed similarity measure and the efficiency of its computation. Ideally, graph comparison should be invariant to the order of nodes and the sizes of compared graphs, adaptive to the scale of graph patterns, and scalable. Unfortunately, these properties have not been addressed together. Graph comparisons still rely on direct approaches, graph kernels, or representation-based methods, which are all inefficient and impractical for large graph collections. In this paper, we propose the Network Laplacian Spectral Descriptor (NetLSD): the first, to our knowledge, permutation- and size-invariant, scale-adaptive, and efficiently computable graph representation method that allows for straightforward comparisons of large graphs. NetLSD extracts a compact signature that inherits the formal properties of the Laplacian spectrum, specifically its heat or wave kernel; thus, it hears the shape of a graph. Our evaluation on a variety of real-world graphs demonstrates that it outperforms previous works in both expressiveness and efficiency.
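A heat-kernel signature of the kind the abstract describes can be sketched as the heat trace of the graph Laplacian, h(t) = Σ_j exp(-t λ_j), sampled at several time scales. The code below is a minimal sketch under stated assumptions: it uses the symmetric normalized Laplacian, requires a graph without isolated nodes, computes the full spectrum (so it does not address scalability or size normalization), and the function name `heat_trace_signature` is hypothetical.

```python
import numpy as np

def heat_trace_signature(adjacency, times):
    """Heat trace h(t) = sum_j exp(-t * lambda_j), where lambda_j are the
    eigenvalues of the normalized Laplacian I - D^(-1/2) A D^(-1/2).
    Assumes an undirected graph with no isolated nodes."""
    a = np.asarray(adjacency, dtype=float)
    inv_sqrt_deg = 1.0 / np.sqrt(a.sum(axis=1))
    lap = np.eye(len(a)) - inv_sqrt_deg[:, None] * a * inv_sqrt_deg[None, :]
    eigvals = np.linalg.eigvalsh(lap)
    return np.array([np.exp(-t * eigvals).sum() for t in times])

# The same 3-node path graph under two different node orderings.
path = [[0, 1, 0],
        [1, 0, 1],
        [0, 1, 0]]
relabeled = [[0, 1, 1],
             [1, 0, 0],
             [1, 0, 0]]

times = [0.0, 0.5, 1.0]
print(heat_trace_signature(path, times))
print(heat_trace_signature(relabeled, times))
```

Because the signature depends only on the Laplacian eigenvalues, relabeling the nodes leaves it unchanged, which is the permutation invariance mentioned in the abstract.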
O. Senouf, S. Vedula, G. Zurakhov, A. M. Bronstein, M. Zibulevsky, O. Michailovich, D. Adam, D. Blondheim, High frame-rate cardiac ultrasound imaging with deep learning, Proc. Int'l Conf. Medical Image Computing & Computer Assisted Intervention (MICCAI), 2018 detailsHigh frame-rate cardiac ultrasound imaging with deep learning
O. Senouf, S. Vedula, G. Zurakhov, A. M. Bronstein, M. Zibulevsky, O. Michailovich, D. Adam, D. BlondheimProc. Int'l Conf. Medical Image Computing & Computer Assisted Intervention (MICCAI), 2018Cardiac ultrasound imaging requires a high frame rate in order to capture rapid motion. This can be achieved by multi-line acquisition (MLA), where several narrow-focused received lines are obtained from each wide-focused transmitted line. This shortens the acquisition time at the expense of introducing block artifacts. In this paper, we propose a data-driven learning-based approach to improve the MLA image quality. We train an end-to-end convolutional neural network on pairs of real ultrasound cardiac data, acquired through MLA and the corresponding single-line acquisition (SLA). The network achieves a significant improvement in image quality for both 5- and 7-line MLA resulting in a decorrelation measure similar to that of SLA while having the frame rate of MLA.
S. Vedula, O. Senouf, G. Zurakhov, A. M. Bronstein, M. Zibulevsky, O. Michailovich, D. Adam, D. Gaitini, High quality ultrasonic multi-line transmission through deep learning, Proc. Machine Learning for Medical Image Reconstruction (MLMIR), 2018 detailsHigh quality ultrasonic multi-line transmission through deep learning
S. Vedula, O. Senouf, G. Zurakhov, A. M. Bronstein, M. Zibulevsky, O. Michailovich, D. Adam, D. Gaitini
Proc. Machine Learning for Medical Image Reconstruction (MLMIR), 2018
Frame rate is a crucial consideration in cardiac ultrasound imaging and 3D sonography. Several methods have been proposed in the medical ultrasound literature aiming at accelerating the image acquisition. In this paper, we consider one such method called multi-line transmission (MLT), in which several evenly separated focused beams are transmitted simultaneously. While MLT reduces the acquisition time, it comes at the expense of a heavy loss of contrast due to the interactions between the beams (cross-talk artifact). In this paper, we introduce a data-driven method to reduce the artifacts arising in MLT. To this end, we propose to train an end-to-end convolutional neural network consisting of correction layers followed by a constant apodization layer. The network is trained on pairs of raw data obtained through MLT and the corresponding single-line transmission (SLT) data. Experimental evaluation demonstrates significant improvement both in the visual image quality and in objective measures such as contrast ratio and contrast-to-noise ratio, while preserving resolution unlike traditional apodization-based methods. We show that the proposed method is able to generalize well across different patients and anatomies on real and phantom data.
A. Tsitsulin, D. Mottin, P. Karras, A. M. Bronstein, E. Mueller, SGR: Self-supervised spectral graph representation learning, Proc. KDD Deep Learning Day, 2018 details
SGR: Self-supervised spectral graph representation learning
A. Tsitsulin, D. Mottin, P. Karras, A. M. Bronstein, E. Mueller
Proc. KDD Deep Learning Day, 2018
Representing a graph as a vector is a challenging task; ideally, the representation should be easily computable and conducive to efficient comparisons among graphs, tailored to the particular data and an analytical task at hand. Unfortunately, a “one-size-fits-all” solution is unattainable, as different analytical tasks may require different attention to global or local graph features. We develop SGR, the first, to our knowledge, method for learning graph representations in a self-supervised manner. Grounded on spectral graph analysis, SGR seamlessly combines all aforementioned desirable properties. In extensive experiments, we show how our approach works on large graph collections, facilitates self-supervised representation learning across a variety of application domains, and performs competitively to state-of-the-art methods without re-training.
E. Schwartz, R. Giryes, A. M. Bronstein, DeepISP: Towards learning an end-to-end image processing pipeline, IEEE Trans. on Image Processing, 2018 detailsDeepISP: Towards learning an end-to-end image processing pipeline
E. Schwartz, R. Giryes, A. M. BronsteinIEEE Trans. on Image Processing, 2018We present DeepISP, a full end-to-end deep neural model of the camera image signal processing (ISP) pipeline. Our model learns a mapping from the raw low-light mosaiced image to the final visually compelling image and encompasses low-level tasks such as demosaicing and denoising as well as higher-level tasks such as color correction and image adjustment. The training and evaluation of the pipeline were performed on a dedicated dataset containing pairs of low-light and well-lit images captured by a Samsung S7 smartphone camera in both raw and processed JPEG formats. The proposed solution achieves state-of-the-art performance in the objective evaluation of PSNR on the subtask of joint denoising and demosaicing. For the full end-to-end pipeline, it achieves better visual quality compared to the manufacturer ISP, in both a subjective human assessment and when rated by a deep model trained for assessing image quality.
H. Haim, S. Elmalem, R. Giryes, A. M. Bronstein, E. Marom, Depth estimation from a single image using deep learned phase coded mask, IEEE Trans. Computational Imaging, Vol. 2(3), 2018 (Winner of the OSA Student Grand Challenge The Optical System of the Future) detailsDepth estimation from a single image using deep learned phase coded mask
H. Haim, S. Elmalem, R. Giryes, A. M. Bronstein, E. MaromIEEE Trans. Computational Imaging, Vol. 2(3), 2018 (Winner of the OSA Student Grand Challenge The Optical System of the Future)Depth estimation from a single image is a well-known challenge in computer vision. With the advent of deep learning, several approaches for monocular depth estimation have been proposed, all of which have inherent limitations due to the scarce depth cues that exist in a single image. Moreover, these methods are very demanding computationally, which makes them inadequate for systems with limited processing power. In this paper, a phase-coded aperture camera for depth estimation is proposed. The camera is equipped with an optical phase mask that provides unambiguous depth-related color characteristics for the captured image. These are used for estimating the scene depth map using a fully convolutional neural network. The phase-coded aperture structure is learned jointly with the network weights using backpropagation. The strong depth cues (encoded in the image by the phase mask, designed together with the network weights) allow a much simpler neural network architecture for faster and more accurate depth estimation. Performance achieved on simulated images as well as on a real optical setup is superior to the state-of-the-art monocular depth estimation methods (both with respect to the depth accuracy and required processing power), and is competitive with more complex and expensive depth estimation methods such as light-field cameras.
E. Tsitsin, A. M. Bronstein, T. Hendler, M. Medvedovsky, Passive electric impedance tomography, Proc. Electric Impedance Tomography (EIT), 2018 detailsPassive electric impedance tomography
E. Tsitsin, A. M. Bronstein, T. Hendler, M. MedvedovskyProc. Electric Impedance Tomography (EIT), 2018We introduce an electric impedance tomography modality without any active current injection. By loading the probe electrodes with a time-varying network of impedances, the proposed technique exploits electrical fields existing in the medium due to biological activity or EM interference from the environment or an implantable device. A phantom validation of the technique is presented.
E. Tsitsin, T. Mund, A. M. Bronstein, Printable anisotropic phantom for EEG with distributed current sources, Proc. IEEE Int'l Symposium on Biomedical Imaging (ISBI), 2018 detailsPrintable anisotropic phantom for EEG with distributed current sources
E. Tsitsin, T. Mund, A. M. BronsteinProc. IEEE Int'l Symposium on Biomedical Imaging (ISBI), 2018Presented is a phantom mimicking the electromagnetic properties of the human head. The fabrication is based on additive manufacturing (3D printing) technology combined with an electrically conductive gel. The novel key features of the phantom are the controllable anisotropic electrical conductivity of the skull and the densely packed actively multiplexed monopolar current sources permitting interpolation of the measured gain function to any dipolar current source position and orientation within the head. The phantom was tested in a realistic environment, successfully simulating the possible signals from neural activations situated at any depth within the brain as well as EMI and motion artifacts. The proposed design can be readily repeated in any lab with access to a standard 100 micron precision 3D printer. The meshes of the phantom are available from the corresponding author.
E. Tsitsin, M. Medvedovsky, A. M. Bronstein, VibroEEG: Improved EEG source reconstruction by combined acoustic-electric imaging, Proc. IEEE Int'l Symposium on Biomedical Imaging (ISBI), 2018 detailsVibroEEG: Improved EEG source reconstruction by combined acoustic-electric imaging
E. Tsitsin, M. Medvedovsky, A. M. BronsteinProc. IEEE Int'l Symposium on Biomedical Imaging (ISBI), 2018Electroencephalography (EEG) is the electrical neural activity recording modality with high temporal and low spatial resolution. Here we propose a novel technique that we call vibroEEG improving significantly the source localization accuracy of EEG. Our method combines electric potential acquisition in concert with acoustic excitation of the vibrational modes of the electrically active cerebral cortex which displace periodically the sources of the low frequency neural electrical activity. The sources residing on the maxima of the induced modes will be maximally weighted in the corresponding spectral components of the broadband signals measured on the noninvasive electrodes. In vibroEEG, for the first time the rich internal geometry of the cerebral cortex can be utilized to separate sources of neural activity lying close in the sense of the Euclidean metric. When the modes are excited locally using phased arrays the neural activity can essentially be probed at any cortical location. When a single transducer is used to induce the excitations, the EEG gain matrix is still being enriched with numerous independent gain vectors increasing its rank. We show theoretically and on numerical simulation that in both cases the source localization accuracy improves substantially.
C. Baskin, N. Liss, E. Zheltonozhskii, A. M. Bronstein, A. Mendelson, Streaming architectures for large-scale quantized neural networks on an FPGA-based dataflow platform, IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 2018 detailsStreaming architectures for large-scale quantized neural networks on an FPGA-based dataflow platform
C. Baskin, N. Liss, E. Zheltonozhskii, A. M. Bronstein, A. MendelsonIEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 2018Deep neural networks (DNNs) are used by different applications that are executed on a range of computer architectures, from IoT devices to supercomputers. The footprint of these networks is huge, as are their computational and communication needs. In order to ease the pressure on resources, research indicates that in many cases a low precision representation (1-2 bit per parameter) of weights and other parameters can achieve similar accuracy while requiring fewer resources. Using quantized values enables the use of FPGAs to run NNs, since FPGAs are well fitted to these primitives; e.g., FPGAs provide efficient support for bitwise operations and can work with arbitrary-precision representation of numbers. This paper presents a new streaming architecture for running QNNs on FPGAs. The proposed architecture scales out better than alternatives, allowing us to take advantage of systems with multiple FPGAs. We also include support for skip connections, which are used in state-of-the-art NNs, and show that our architecture allows those connections to be added almost for free. All this allowed us to implement an 18-layer ResNet for 224×224 image classification, achieving 57.5% top-1 accuracy. In addition, we implemented a full-sized quantized AlexNet. In contrast to previous works, we use 2-bit activations instead of 1-bit ones, which improves AlexNet’s top-1 accuracy from 41.8% to 51.03% for the ImageNet classification. Both AlexNet and ResNet can handle 1000-class real-time classification on an FPGA. Our implementation of ResNet-18 consumes 5× less power and is 4× slower for ImageNet, when compared to the same NN on the latest Nvidia GPUs. Smaller NNs that fit a single FPGA run faster than on GPUs on small (32×32) inputs, while consuming up to 20× less energy and power.
R. Giryes, Y. C. Eldar, A. M. Bronstein, G. Sapiro, Tradeoffs between convergence speed and reconstruction accuracy in inverse problems, IEEE Trans. on Signal Processing, Vol. 66(7), 2018 detailsTradeoffs between convergence speed and reconstruction accuracy in inverse problems
R. Giryes, Y. C. Eldar, A. M. Bronstein, G. SapiroIEEE Trans. on Signal Processing, Vol. 66(7), 2018Solving inverse problems with iterative algorithms is popular, especially for large data. Due to time constraints, the number of possible iterations is usually limited, potentially affecting the achievable accuracy. Given an error one is willing to tolerate, an important question is whether it is possible to modify the original iterations to obtain faster convergence to a minimizer achieving the allowed error without increasing the computational cost of each iteration considerably. Relying on recent recovery techniques developed for settings in which the desired signal belongs to some low-dimensional set, we show that using a coarse estimate of this set may lead to faster convergence at the cost of an additional reconstruction error related to the accuracy of the set approximation. Our theory ties to recent advances in sparse recovery, compressed sensing, and deep learning. Particularly, it may provide a possible explanation to the successful approximation of the L1-minimization solution by neural networks with layers representing iterations, as practiced in the learned iterative shrinkage-thresholding algorithm.
- A. Boyarski, S. Vedula, A. M. Bronstein, Deep matrix factorization with spectral geometric regularization, arXiv: 1911.07255, 2019 details
Deep matrix factorization with spectral geometric regularization
A. Boyarski, S. Vedula, A. M. BronsteinarXiv: 1911.07255, 2019We address the problem of reconstructing a matrix from a subset of its entries. Current methods, branded as geometric matrix completion, augment classical rank regularization techniques by incorporating geometric information into the solution. This information is usually provided as graphs encoding relations between rows/columns. In this work, we propose a simple spectral approach for solving the matrix completion problem, via the framework of functional maps. We introduce the zoomout loss, a multiresolution spectral geometric loss inspired by recent advances in shape correspondence, whose minimization leads to state-of-the-art results on various recommender systems datasets. Surprisingly, for some datasets, we were able to achieve comparable results even without incorporating geometric information. This puts into question both the quality of such information and current methods’ ability to use it in a meaningful and efficient way.
Code is available either as a Google Colab notebook or at https://github.com/amitboy/SGMC
Y. Nahshan, B. Chmiel, C. Baskin, E. Zheltonozhskii, R. Banner, A. M. Bronstein, A. Mendelson, Loss aware post-training quantization, arXiv: 1911.07190, 2019 detailsLoss aware post-training quantization
Y. Nahshan, B. Chmiel, C. Baskin, E. Zheltonozhskii, R. Banner, A. M. Bronstein, A. MendelsonarXiv: 1911.07190, 2019Neural network quantization enables the deployment of large models on resource-constrained devices. Current post-training quantization methods fall short in terms of accuracy for INT4 (or lower) but provide reasonable accuracy for INT8 (or above). In this work, we study the effect of quantization on the structure of the loss landscape. We show that the structure is flat and separable for mild quantization, enabling straightforward post-training quantization methods to achieve good results. On the other hand, we show that with more aggressive quantization, the loss landscape becomes highly non-separable with sharp minima points, making the selection of quantization parameters more challenging. Armed with this understanding, we design a method that quantizes the layer parameters jointly, enabling significant accuracy improvement over current post-training quantization methods. Reference implementation accompanies the paper.
Y. Nemcovsky, E. Zheltonozhskii, C. Baskin, B. Chmiel, A. M. Bronstein, A. Mendelson, Smoothed inference for adversarially-trained models, arXiv: 1911.07198, 2019 detailsSmoothed inference for adversarially-trained models
Y. Nemcovsky, E. Zheltonozhskii, C. Baskin, B. Chmiel, A. M. Bronstein, A. MendelsonarXiv: 1911.07198, 2019Deep neural networks are known to be vulnerable to inputs with maliciously constructed adversarial perturbations aimed at forcing misclassification. We study randomized smoothing as a way to both improve performance on unperturbed data as well as increase robustness to adversarial attacks. Moreover, we extend the method proposed by arXiv:1811.09310 by adding low-rank multivariate noise, which we then use as a base model for smoothing. The proposed method achieves 58.5% top-1 accuracy on CIFAR-10 under PGD attack and outperforms previous works by 4%. In addition, we consider a family of attacks, which were previously used for training purposes in the certified robustness scheme. We demonstrate that the proposed attacks are more effective than PGD against both smoothed and non-smoothed models. Since our method is based on sampling, it lends itself well for trading-off between the model inference complexity and its performance. A reference implementation of the proposed techniques is provided.
S. Doveh, E. Schwartz, C. Xue, R. Feris, A. M. Bronstein, R. Giryes, L. Karlinsky, MetAdapt: Meta-learned task-adaptive architecture for few-shot classification, arXiv: 1912.00412, 2019 detailsMetAdapt: Meta-learned task-adaptive architecture for few-shot classification
S. Doveh, E. Schwartz, C. Xue, R. Feris, A. M. Bronstein, R. Giryes, L. KarlinskyarXiv: 1912.00412, 2019Few-Shot Learning (FSL) is a topic of rapidly growing interest. Typically, in FSL a model is trained on a dataset consisting of many small tasks (meta-tasks) and learns to adapt to novel tasks that it will encounter during test time. This is also referred to as meta-learning. So far, meta-learning FSL methods have focused on optimizing parameters of pre-defined network architectures, in order to make them easily adaptable to novel tasks. Moreover, it was observed that, in general, larger architectures perform better than smaller ones up to a certain saturation point (and even degrade due to over-fitting). However, little attention has been given to explicitly optimizing the architectures for FSL, nor to an adaptation of the architecture at test time to particular novel tasks. In this work, we propose to employ tools borrowed from the Differentiable Neural Architecture Search (D-NAS) literature in order to optimize the architecture for FSL without over-fitting. Additionally, to make the architecture task adaptive, we propose the concept of ‘MetAdapt Controller’ modules. These modules are added to the model and are meta-trained to predict the optimal network connections for a given novel task. Using the proposed approach we observe state-of-the-art results on two popular few-shot benchmarks: miniImageNet and FC100.
E. Rozenberg, D. Freedman, A. M. Bronstein, Localization with limited annotation for chest X-rays, Proc. ML4H, NeurIPS, 2019 detailsLocalization with limited annotation for chest X-rays
E. Rozenberg, D. Freedman, A. M. BronsteinProc. ML4H, NeurIPS, 2019Localization of an object within an image is a common task in medical imaging. Learning to localize or detect objects typically requires the collection of data which has been labelled with bounding boxes or similar annotations, which can be very time consuming and expensive. A technique which could perform such learning with much less annotation would, therefore, be quite valuable. We present such a technique for localization with limited annotation, in which the number of images with bounding boxes can be a small fraction of the total dataset (e.g. less than 1%); all other images only possess a whole image label and no bounding box. We propose a novel loss function for tackling this problem; the loss is a continuous relaxation of a well-defined discrete formulation of weakly supervised learning and is numerically well-posed. Furthermore, we propose a new architecture which accounts for both patch dependence and shift-invariance, through the inclusion of CRF layers and anti-aliasing filters, respectively. We apply our technique to the localization of thoracic diseases in chest X-ray images and demonstrate state-of-the-art localization performance on the ChestX-ray14 dataset.
S. Vedula, O. Senouf, G. Zurakov, A. M. Bronstein, O. Michailovich, M. Zibulevsky, Learning beamforming in ultrasound imaging, Proc. Medical Imaging with Deep Learning (MIDL), 2019 detailsLearning beamforming in ultrasound imaging
S. Vedula, O. Senouf, G. Zurakov, A. M. Bronstein, O. Michailovich, M. ZibulevskyProc. Medical Imaging with Deep Learning (MIDL), 2019Medical ultrasound (US) is a widespread imaging modality owing its popularity to cost-efficiency, portability, speed, and lack of harmful ionizing radiation. In this paper, we demonstrate that replacing the traditional ultrasound processing pipeline with a data-driven, learnable counterpart leads to significant improvement in image quality. Moreover, we demonstrate that greater improvement can be achieved through a learning-based design of the transmitted beam patterns simultaneously with learning an image reconstruction pipeline. We evaluate our method on an in-vivo first-harmonic cardiac ultrasound dataset acquired from volunteers and demonstrate the significance of the learned pipeline and transmit beam patterns on the image quality when compared to standard transmit and receive beamformers used in high frame-rate US imaging. We believe that the presented methodology provides a fundamentally different perspective on the classical problem of ultrasound beam pattern design.
E. Schwartz, L. Karlinsky, J. Shtok, S. Harary, M. Marder, R. Feris, A. Kumar, R. Giryes, A. M. Bronstein, RepMet: Representative-based metric learning for classification and one-shot object detection, Proc. Computer Vision and Pattern Recognition (CVPR), 2019 detailsRepMet: Representative-based metric learning for classification and one-shot object detection
E. Schwartz, L. Karlinsky, J. Shtok, S. Harary, M. Marder, R. Feris, A. Kumar, R. Giryes, A. M. BronsteinProc. Computer Vision and Pattern Recognition (CVPR), 2019Distance metric learning (DML) has been successfully applied to object classification, both in the standard regime of rich training data and in the few-shot scenario, where each category is represented by only few examples. In this work, we propose a new method for DML, featuring a joint learning of the embedding space and of the data distribution of the training categories, in a single training process. Our method improves upon leading algorithms for DML-based object classification. Furthermore, it opens the door for a new task in computer vision — a few-shot object detection, since the proposed DML architecture can be naturally embedded as the classification head of any standard object detector. In numerous experiments, we achieve state-of-the-art classification results on a variety of fine-grained datasets, and offer the community a benchmark on the few-shot detection task, performed on the Imagenet-LOC dataset.
O. Halimi, O. Litany, E. Rodolà, A. M. Bronstein, R. Kimmel, Self-supervised learning of dense shape correspondence, Proc. Computer Vision and Pattern Recognition (CVPR), 2019 detailsSelf-supervised learning of dense shape correspondence
O. Halimi, O. Litany, E. Rodolà, A. M. Bronstein, R. KimmelProc. Computer Vision and Pattern Recognition (CVPR), 2019We introduce the first completely unsupervised correspondence learning approach for deformable 3D shapes. Key to our model is the understanding that natural deformations (such as changes in the pose) approximately preserve the metric structure of the surface, yielding a natural criterion to drive the learning process toward distortion-minimizing predictions. On this basis, we overcome the need for annotated data and replace it with a purely geometric criterion. The resulting learning model is class-agnostic and is able to leverage any type of deformable geometric data for the training phase. In contrast to existing supervised approaches which specialize in the class seen at training time, we demonstrate stronger generalization as well as applicability to a variety of challenging settings. We showcase our method on a wide selection of correspondence benchmarks, where we outperform other methods in terms of accuracy, generalization, and efficiency.
A. Alfassy, L. Karlinsky, A. Aides, J. Shtok, S. Harary, R. Feris, R. Giryes, A. M. Bronstein, LaSO: Label-Set Operations networks for multi-label few-shot learning, Proc. Computer Vision and Pattern Recognition (CVPR), 2019 detailsLaSO: Label-Set Operations networks for multi-label few-shot learning
A. Alfassy, L. Karlinsky, A. Aides, J. Shtok, S. Harary, R. Feris, R. Giryes, A. M. BronsteinProc. Computer Vision and Pattern Recognition (CVPR), 2019Example synthesis is one of the leading methods to tackle the problem of few-shot learning, where only a small number of samples per class are available. However, current synthesis approaches only address the scenario of a single category label per image. In this work, we propose a novel technique for synthesizing samples with multiple labels for the (yet unhandled) multi-label few-shot classification scenario. We propose to combine pairs of given examples in feature space, so that the resulting synthesized feature vectors will correspond to examples whose label sets are obtained through certain set operations on the label sets of the corresponding input pairs. Thus, our method is capable of producing a sample containing the intersection, union or set-difference of labels present in two input samples. As we show, these set operations generalize to labels unseen during training. This enables performing augmentation on examples of novel categories, thus, facilitating multi-label few-shot classifier learning. We conduct numerous experiments showing promising results for the label-set manipulation capabilities of the proposed approach, both directly (using the classification and retrieval metrics), and in the context of performing data augmentation for multi-label few-shot learning. We propose a benchmark for this new and challenging task and show that our method compares favorably to all the common baselines.
A. Zabatani, V. Surazhsky, E. Sperling, S. Ben Moshe, O. Menashe, D. H. Silver, Z. Karni, A. M. Bronstein, M. M. Bronstein, R. Kimmel, Intel RealSense SR300 Coded light depth Camera, IEEE Trans. Pattern Analysis and Machine Intelligence (PAMI), 2019 detailsIntel RealSense SR300 Coded light depth Camera
A. Zabatani, V. Surazhsky, E. Sperling, S. Ben Moshe, O. Menashe, D. H. Silver, Z. Karni, A. M. Bronstein, M. M. Bronstein, R. KimmelIEEE Trans. Pattern Analysis and Machine Intelligence (PAMI), 2019Intel RealSense SR300 is a depth camera capable of providing a VGA-size depth map at 60 fps and 0.125mm depth resolution. In addition, it outputs an infrared VGA-resolution image and a 1080p color texture image at 30 fps.
The SR300 form factor enables it to be integrated into small consumer products and as a front-facing camera in laptops and Ultrabooks. The SR300 depth camera is based on a coded-light technology where triangulation between projected patterns and images captured by a dedicated sensor is used to produce the depth map. Each projected line is coded by a special temporal optical code that enables a dense depth map reconstruction from its reflection. The solid mechanical assembly of the camera allows it to stay calibrated throughout temperature and pressure changes, drops, and hits. In addition, active dynamic control maintains a calibrated depth output. An extended API LibRS released with the camera allows developers to integrate the camera in various applications. Algorithms for 3D scanning, facial analysis, hand gesture recognition, and tracking are within reach for applications using the SR300. In this paper, we describe the underlying technology, hardware, and algorithms of the SR300, as well as its calibration procedure, and outline some use cases. We believe that this paper will provide a full case study of a mass-produced depth sensing product and technology.
Y. Zur, C. Baskin, E. Zheltonozhskii, B. Chmiel, I. Evron, A. M. Bronstein, A. Mendelson, Towards learning of filter-level heterogeneous compression of convolutional neural networks, Proc. AutoML Workshop, Int'l Conf. on Machine Learning (ICML), 2019 detailsTowards learning of filter-level heterogeneous compression of convolutional neural networks
Y. Zur, C. Baskin, E. Zheltonozhskii, B. Chmiel, I. Evron, A. M. Bronstein, A. MendelsonProc. AutoML Workshop, Int'l Conf. on Machine Learning (ICML), 2019Recently, deep learning has become a de facto standard in machine learning with convolutional neural networks (CNNs) demonstrating spectacular success on a wide variety of tasks. However, CNNs are typically very demanding computationally at inference time. One of the ways to alleviate this burden on certain hardware platforms is quantization relying on the use of low-precision arithmetic representation for the weights and the activations. Another popular method is the pruning of the number of filters in each layer. While mainstream deep learning methods train the neural networks weights while keeping the network architecture fixed, the emerging neural architecture search (NAS) techniques make the latter also amenable to training. In this paper, we formulate optimal arithmetic bit length allocation and neural network pruning as a NAS problem, searching for the configurations satisfying a computational complexity budget while maximizing the accuracy. We use a differentiable search method based on the continuous relaxation of the search space proposed by Liu et al. (2019a). We show, by grid search, that heterogeneous quantized networks suffer from a high variance which renders the benefit of the search questionable. For pruning, improvement over homogeneous cases is possible, but it is still challenging to find those configurations with the proposed method. The code is publicly available at https://github.com/yochaiz/Slimmable and https://github.com/yochaiz/darts-UNIQ.
E. Schwartz, L. Karlinsky, R. Feris, R. Giryes, A. M. Bronstein, Baby steps towards few-shot learning with multiple semantics, arXiv:1906.01905, 2019 detailsBaby steps towards few-shot learning with multiple semantics
E. Schwartz, L. Karlinsky, R. Feris, R. Giryes, A. M. BronsteinarXiv:1906.01905, 2019Learning from one or few visual examples is one of the key capabilities of humans since early infancy, but is still a significant challenge for modern AI systems. While considerable progress has been achieved in few-shot learning from a few image examples, much less attention has been given to the verbal descriptions that are usually provided to infants when they are presented with a new object. In this paper, we focus on the role of additional semantics that can significantly facilitate few-shot visual learning. Building upon recent advances in few-shot learning with additional semantic information, we demonstrate that further improvements are possible using richer semantics and multiple semantic sources. Using these ideas, we offer the community a new result on the one-shot test of the popular miniImageNet benchmark, comparing favorably to the previous state-of-the-art results for both visual only and visual plus semantics-based approaches. We also performed an ablation study investigating the components and design choices of our approach.
A. Rampini, I. Tallini, M. Ovsjanikov, A. M. Bronstein, E. Rodola, Correspondence-free region localization for partial shape similarity via Hamiltonian spectrum alignment, Proc. 3D Vision (3DV), 2019 (Best paper award) detailsCorrespondence-free region localization for partial shape similarity via Hamiltonian spectrum alignment
A. Rampini, I. Tallini, M. Ovsjanikov, A. M. Bronstein, E. RodolaProc. 3D Vision (3DV), 2019 (Best paper award)We consider the problem of localizing relevant subsets of non-rigid geometric shapes given only a partial 3D query as the input. Such problems arise in several challenging tasks in 3D vision and graphics, including partial shape similarity, retrieval, and non-rigid correspondence. We phrase the problem as one of alignment between short sequences of eigenvalues of basic differential operators, which are constructed upon a scalar function defined on the 3D surfaces. Our method therefore seeks for a scalar function that entails this alignment. Differently from existing approaches, we do not require solving for a correspondence between the query and the target, therefore greatly simplifying the optimization process; our core technique is also descriptor-free, as it is driven by the geometry of the two objects as encoded in their operator spectra. We further show that our spectral alignment algorithm provides a remarkably simple alternative to the recent shape-from-spectrum reconstruction approaches. For both applications, we demonstrate improvement over the state-of-the-art either in terms of accuracy or computational cost.
O. Senouf, S. Vedula, T. Weiss, A. M. Bronstein, O. Michailovich, M. Zibulevsky, Self-supervised learning of inverse problem solvers in medical imaging, Proc. Medical Image Learning with Less Labels and Imperfect Data, MICCAI 2019 detailsSelf-supervised learning of inverse problem solvers in medical imaging
O. Senouf, S. Vedula, T. Weiss, A. M. Bronstein, O. Michailovich, M. ZibulevskyProc. Medical Image Learning with Less Labels and Imperfect Data, MICCAI 2019In the past few years, deep learning-based methods have demonstrated enormous success for solving inverse problems in medical imaging. In this work, we address the following question: Given a set of measurements obtained from real imaging experiments, what is the best way to use a learnable model and the physics of the modality to solve the inverse problem and reconstruct the latent image? Standard supervised learning based methods approach this problem by collecting data sets of known latent images and their corresponding measurements. However, these methods are often impractical due to the lack of availability of appropriately sized training sets, and, more generally, due to the inherent difficulty in measuring the “ground truth” latent image. In light of this, we propose a self-supervised approach to training inverse models in medical imaging in the absence of aligned data. Our method only requires access to the measurements and the forward model at training. We showcase its effectiveness on inverse problems arising in accelerated magnetic resonance imaging (MRI).
R. M. Dyke, C. Stride, Y.-K. Lai, P. L. Rosin, M. Aubry, A. Boyarski, A. M. Bronstein, M. M. Bronstein, D. Cremers, M. Fisher, T. Groueix, D. Guo, V. G. Kim, R. Kimmel, Z. Lähner, K. Li, O. Litany, T. Remez, E. Rodolà, B. C. Russell, Y. Sahillioglu, R. Slossberg, M. Vestner, Z. Wu, J. Yang, Gary Tam, Shape Correspondence with Isometric and Non-Isometric Deformations, Eurographics Workshop on 3D Object Retrieval, 2019
Shape Correspondence with Isometric and Non-Isometric Deformations
R. M. Dyke, C. Stride, Y.-K. Lai, P. L. Rosin, M. Aubry, A. Boyarski, A. M. Bronstein, M. M. Bronstein, D. Cremers, M. Fisher, T. Groueix, D. Guo, V. G. Kim, R. Kimmel, Z. Lähner, K. Li, O. Litany, T. Remez, E. Rodolà, B. C. Russell, Y. Sahillioglu, R. Slossberg, M. Vestner, Z. Wu, J. Yang, Gary Tam
Eurographics Workshop on 3D Object Retrieval, 2019
The registration of surfaces with non-rigid deformation, especially non-isometric deformations, is a challenging problem. When applying such techniques to real scans, the problem is compounded by topological and geometric inconsistencies between shapes. In this paper, we capture a benchmark dataset of scanned 3D shapes undergoing various controlled deformations (articulating, bending, stretching and topologically changing), along with ground truth correspondences. With the aid of this tiered benchmark of increasingly challenging real scans, we explore this problem and investigate how robustly current state-of-the-art methods perform in different challenging registration and correspondence scenarios. We discover that changes in topology are a challenging problem for some methods and that machine learning-based approaches prove to be more capable of handling non-isometric deformations on shapes that are moderately similar to the training set.
N. Diamant, D. Zadok, C. Baskin, E. Schwartz, A. M. Bronstein, Beholder-GAN: Generation and beautification of facial images with conditioning on their beauty level, Proc. Int'l Conf. on Image Processing (ICIP), 2019
Beholder-GAN: Generation and beautification of facial images with conditioning on their beauty level
N. Diamant, D. Zadok, C. Baskin, E. Schwartz, A. M. Bronstein
Proc. Int'l Conf. on Image Processing (ICIP), 2019
Beauty is in the eye of the beholder. This maxim, emphasizing the subjectivity of the perception of beauty, has enjoyed a wide consensus since ancient times. In the digital era, data-driven methods have been shown to be able to predict human-assigned beauty scores for facial images. In this work, we augment this ability and train a generative model that generates faces conditioned on a requested beauty score. In addition, we show how this trained generator can be used to beautify an input face image. By doing so, we achieve an unsupervised beautification model, in the sense that it relies on no ground truth target images.
G. Pai, R. Talmon, A. M. Bronstein, R. Kimmel, DIMAL: Deep isometric manifold learning using sparse geodesic sampling, Proc. IEEE Winter Conf. on Applications of Computer Vision (WACV), 2019
DIMAL: Deep isometric manifold learning using sparse geodesic sampling
G. Pai, R. Talmon, A. M. Bronstein, R. Kimmel
Proc. IEEE Winter Conf. on Applications of Computer Vision (WACV), 2019
This paper explores a fully unsupervised deep learning approach for computing distance-preserving maps that generate low-dimensional embeddings for a certain class of manifolds. We use the Siamese configuration to train a neural network to solve the problem of least squares multidimensional scaling for generating maps that approximately preserve geodesic distances. By training with only a few landmarks, we show a significantly improved local and nonlocal generalization of the isometric mapping as compared to analogous non-parametric counterparts. Importantly, the combination of a deep-learning framework with a multidimensional scaling objective enables a numerical analysis of network architectures to aid in understanding their representation power. This provides a geometric perspective to the generalizability of deep learning.
O. Litany, E. Rodolà, A. M. Bronstein, M. M. Bronstein, D. Cremers, Partial single- and multi-shape dense correspondence using functional maps, Chapter in The Handbook of Numerical Analysis - Processing, Analyzing and Learning of Images, Shapes, and Forms, Elsevier, 2019
Partial single- and multi-shape dense correspondence using functional maps
O. Litany, E. Rodolà, A. M. Bronstein, M. M. Bronstein, D. Cremers
Chapter in The Handbook of Numerical Analysis - Processing, Analyzing and Learning of Images, Shapes, and Forms, Elsevier, 2019
Shape correspondence is a fundamental problem in computer graphics and vision, with applications in various problems including animation, texture mapping, robotic vision, medical imaging, archaeology and many more. In settings where the shapes are allowed to undergo non-rigid deformations and only partial views are available, the problem becomes very challenging. In this chapter we describe recent techniques designed to tackle such problems. Specifically, we explain how the renowned functional maps framework can be extended to tackle the partial setting. We then present a further extension to the multi-part case in which one tries to establish correspondence between a collection of shapes. Finally, we focus on improving the technique's efficiency, by disposing of its spatial ingredient and thus keeping the computation in the spectral domain. Extensive experimental results are provided along with the theoretical explanations, to demonstrate the effectiveness of the described methods in these challenging scenarios.
A. Boyarski, A. M. Bronstein, Multidimensional scaling, Computer Vision: A Reference Guide, (Katsushi Ikeuchi, Ed.)
Multidimensional scaling
A. Boyarski, A. M. Bronstein
Computer Vision: A Reference Guide, (Katsushi Ikeuchi, Ed.)
The various multidimensional scaling models can be broadly classified into metric vs. non-metric, and strain (classical scaling) vs. stress (distance scaling) based MDS models. In metric MDS the goal is to maintain the distances in the embedding space as close as possible to the given dissimilarities, while in non-metric MDS only the order relations between the dissimilarities are important. Strain-based MDS is an algebraic version of the problem that can be solved by eigenvalue decomposition. Stress-based MDS uses a geometric distortion criterion which results in a non-linear and non-convex optimization problem. Each of these models has its own merits and drawbacks, both numerically and application-wise. On top of these basic models, there exist numerous generalizations, including embedding into non-Euclidean domains, working with different stress models, working in different subspaces, and incorporating machine learning approaches to obtain faster, more accurate and more robust embeddings. This chapter reviews these models, with emphasis on their role in computer vision applications.
- C. Baskin, E. Schwartz, E. Zheltonozhskii, N. Liss, R. Giryes, A. M. Bronstein, A. Mendelson, UNIQ: Uniform noise injection for non-uniform quantization of neural networks, ACM Transactions on Computer Systems (TOCS), 2020
UNIQ: Uniform noise injection for non-uniform quantization of neural networks
C. Baskin, E. Schwartz, E. Zheltonozhskii, N. Liss, R. Giryes, A. M. Bronstein, A. Mendelson
ACM Transactions on Computer Systems (TOCS), 2020
We present a novel method for training a neural network amenable to inference in low-precision arithmetic with quantized weights and activations. The training is performed in full precision with random noise injection emulating quantization noise. In order to circumvent the need to simulate realistic quantization noise distributions, the weight distributions are uniformized by a non-linear transformation, and uniform noise is injected. This procedure emulates a non-uniform k-quantile quantizer at inference time, which adapts to the specific distribution of the quantized parameters. As a by-product of injecting noise to weights, we find that activations can also be quantized to as low as 8-bit with only a minor accuracy degradation. The method achieves state-of-the-art results for training low-precision networks on ImageNet. In particular, we observe no degradation in accuracy for MobileNet and ResNet-18/34/50 on ImageNet with as low as 4-bit quantization of weights. Our solution achieves the state-of-the-art results in accuracy, in the low computational budget regime, compared to similar models.
J. Alush-Aben, L. Ackerman-Schraier, T. Weiss, S. Vedula, O. Senouf, A. M. Bronstein, 3D FLAT: Feasible Learned Acquisition Trajectories for Accelerated MRI, Proc. Machine Learning for Medical Image Reconstruction, MICCAI 2020
3D FLAT: Feasible Learned Acquisition Trajectories for Accelerated MRI
J. Alush-Aben, L. Ackerman-Schraier, T. Weiss, S. Vedula, O. Senouf, A. M. Bronstein
Proc. Machine Learning for Medical Image Reconstruction, MICCAI 2020
Magnetic Resonance Imaging (MRI) has long been considered to be among the gold standards of today’s diagnostic imaging. The most significant drawback of MRI is long acquisition times, prohibiting its use in standard practice for some applications. Compressed sensing (CS) proposes to subsample the k-space (the Fourier domain dual to the physical space of spatial coordinates) leading to significantly accelerated acquisition. However, the benefit of compressed sensing has not been fully exploited; most of the sampling densities obtained through CS do not produce a trajectory that obeys the stringent constraints of the MRI machine imposed in practice. Inspired by recent success of deep learning-based approaches for image reconstruction and ideas from computational imaging on learning-based design of imaging systems, we introduce 3D FLAT, a novel protocol for data-driven design of 3D non-Cartesian accelerated trajectories in MRI. Our proposal leverages the entire 3D k-space to simultaneously learn a physically feasible acquisition trajectory with a reconstruction method. Experimental results, performed as a proof-of-concept, suggest that 3D FLAT achieves higher image quality for a given readout time compared to standard trajectories such as radial, stack-of-stars, or 2D learned trajectories (trajectories that evolve only in the 2D plane while fully sampling along the third dimension). Furthermore, we demonstrate evidence supporting the significant benefit of performing MRI acquisitions using non-Cartesian 3D trajectories over 2D non-Cartesian trajectories acquired slice-wise.
T. Weiss, S. Vedula, O. Senouf, O. Michailovich, A. M. Bronstein, Towards learned optimal q-space sampling in diffusion MRI, Proc. Computational Diffusion MRI, MICCAI 2020
Towards learned optimal q-space sampling in diffusion MRI
T. Weiss, S. Vedula, O. Senouf, O. Michailovich, A. M. Bronstein
Proc. Computational Diffusion MRI, MICCAI 2020
Fiber tractography is an important tool of computational neuroscience that enables reconstructing the spatial connectivity and organization of white matter of the brain. Fiber tractography takes advantage of diffusion Magnetic Resonance Imaging (dMRI) which allows measuring the apparent diffusivity of cerebral water along different spatial directions. Unfortunately, collecting such data comes at the price of reduced spatial resolution and substantially elevated acquisition times, which limits the clinical applicability of dMRI. This problem has been thus far addressed using two principal strategies. Most of the efforts have been extended towards improving the quality of signal estimation for any, yet fixed, sampling scheme (defined through the choice of diffusion encoding gradients). On the other hand, optimization over the sampling scheme has also proven to be effective. Inspired by the previous results, the present work consolidates the above strategies into a unified estimation framework, in which the optimization is carried out with respect to both estimation model and sampling design concurrently. The proposed solution offers substantial improvements in the quality of signal estimation as well as the accuracy of ensuing analysis by means of fiber tractography. While proving the optimality of the learned estimation models would probably need more extensive evaluation, we nevertheless claim that the learned sampling schemes can be of immediate use, offering a way to improve the dMRI analysis without the necessity of deploying the neural network used for their estimation. We present a comprehensive comparative analysis based on the Human Connectome Project data.
E. Zheltonozhskii, C. Baskin, A. M. Bronstein, A. Mendelson, Self-supervised learning for large-scale unsupervised image clustering, NeurIPS 2020 Workshop: Self-Supervised Learning - Theory and Practice, 2020
Self-supervised learning for large-scale unsupervised image clustering
E. Zheltonozhskii, C. Baskin, A. M. Bronstein, A. Mendelson
NeurIPS 2020 Workshop: Self-Supervised Learning - Theory and Practice, 2020
Unsupervised learning has always been appealing to machine learning researchers and practitioners, allowing them to avoid an expensive and complicated process of labeling the data. However, unsupervised learning of complex data is challenging, and even the best approaches show much weaker performance than their supervised counterparts. Self-supervised deep learning has become a strong instrument for representation learning in computer vision. However, those methods have not been evaluated in a fully unsupervised setting.
In this paper, we propose a simple scheme for unsupervised classification based on self-supervised representations. We evaluate the proposed approach with several recent self-supervised methods showing that it achieves competitive results for ImageNet classification (39% accuracy on ImageNet with 1000 clusters and 46% with overclustering). We suggest adding the unsupervised evaluation to a set of standard benchmarks for self-supervised learning.
G. Mariani, L. Cosmo, A. M. Bronstein, E. Rodolà, Generating adversarial surfaces via band-limited perturbations, Computer Graphics Forum, 2020
Generating adversarial surfaces via band-limited perturbations
G. Mariani, L. Cosmo, A. M. Bronstein, E. Rodolà
Computer Graphics Forum, 2020
Adversarial attacks have demonstrated remarkable efficacy in altering the output of a learning model by applying a minimal perturbation to the input data. While increasing attention has been placed on the image domain, the study of adversarial perturbations for geometric data has been notably lagging behind. In this paper, we show that effective adversarial attacks can be concocted for surfaces embedded in 3D, under weak smoothness assumptions on the perceptibility of the attack. We address the case of deformable 3D shapes in particular, and introduce a general model that is not tailored to any specific surface representation, nor does it assume access to a parametric description of the 3D object. In this context, we consider targeted and untargeted variants of the attack, demonstrating compelling results in either case. We further show how discovering adversarial examples, and then using them for adversarial training, leads to an increase in both robustness and accuracy. Our findings are confirmed empirically over multiple datasets spanning different semantic classes and deformations.
B. Chmiel, C. Baskin, R. Banner, E. Zheltonozhskii, Y. Yermolin, A. Karbachevsky, A. M. Bronstein, A. Mendelson, Feature map transform coding for energy-efficient CNN inference, Proc. Intl. Joint Conf. on Neural Networks (IJCNN), 2020
Feature map transform coding for energy-efficient CNN inference
B. Chmiel, C. Baskin, R. Banner, E. Zheltonozhskii, Y. Yermolin, A. Karbachevsky, A. M. Bronstein, A. Mendelson
Proc. Intl. Joint Conf. on Neural Networks (IJCNN), 2020
Convolutional neural networks (CNNs) achieve state-of-the-art accuracy in a variety of tasks in computer vision and beyond. One of the major obstacles hindering the ubiquitous use of CNNs for inference on low-power edge devices is their relatively high computational complexity and memory bandwidth requirements. The latter often dominates the energy footprint on modern hardware. In this paper, we introduce a lossy transform coding approach, inspired by image and video compression, designed to reduce the memory bandwidth due to the storage of intermediate activation calculation results. Our method exploits the high correlations between feature maps and adjacent pixels and allows halving the data transfer volumes to the main memory without re-training. We analyze the performance of our approach on a variety of CNN architectures and demonstrate that an FPGA implementation of ResNet18 with our approach results in a reduction of around 40% in the memory energy footprint compared to a quantized network, with negligible impact on accuracy. A reference implementation accompanies the paper.
E. Amrani, R. Ben-Ari, T. Hakim, A. M. Bronstein, Self-Supervised Object Detection and Retrieval Using Unlabeled Videos, CVPR workshop, 2020
Self-Supervised Object Detection and Retrieval Using Unlabeled Videos
E. Amrani, R. Ben-Ari, T. Hakim, A. M. Bronstein
CVPR workshop, 2020
Unlabeled video in the wild presents a valuable, yet so far unharnessed, source of information for learning vision tasks. We present the first attempt at fully self-supervised learning of object detection from subtitled videos without any manual object annotation. To this end, we use the How2 multi-modal collection of instructional videos with English subtitles. We pose the problem as learning with weakly- and noisily-labeled data, and propose a novel training model that can confront high noise levels, and yet train a classifier to localize the object of interest in the video frames, without any manual labeling involved. We evaluate our approach on a set of 11 manually annotated objects in over 5000 frames and compare it to an existing weakly-supervised approach as baseline. Benchmark data and code will be released upon acceptance of the paper.
D. H. Silver, M. Feder, Y. Gold-Zamir, A. L. Polsky, S. Rosentraub, E. Shachor, A. Weinberger, P. Mazur, V. D. Zukin, A. M. Bronstein, Data-driven prediction of embryo implantation probability using IVF time-lapse imaging, Proc. MIDL, 2020
Data-driven prediction of embryo implantation probability using IVF time-lapse imaging
D. H. Silver, M. Feder, Y. Gold-Zamir, A. L. Polsky, S. Rosentraub, E. Shachor, A. Weinberger, P. Mazur, V. D. Zukin, A. M. Bronstein
Proc. MIDL, 2020
The process of fertilizing a human egg outside the body in order to help those suffering from infertility to conceive is known as in vitro fertilization (IVF). Despite being the most effective method of assisted reproductive technology (ART), the average success rate of IVF is a mere 20-40%. One step that is critical to the success of the procedure is selecting which embryo to transfer to the patient, a process typically conducted manually and without any universally accepted and standardized criteria. In this paper, we describe a novel data-driven system trained to directly predict embryo implantation probability from embryogenesis time-lapse imaging videos. Using retrospectively collected videos from 272 embryos, we demonstrate that, when compared to an external panel of embryologists, our algorithm results in a 12% increase of positive predictive value and a 29% increase of negative predictive value.
T. Weiss, S. Vedula, O. Senouf, A. M. Bronstein, O. Michailovich, M. Zibulevsky, Joint learning of Cartesian undersampling and reconstruction for accelerated MRI, Proc. Int’l Conf. on Acoustics, Speech and Signal Processing (ICASSP), 2020
Joint learning of Cartesian undersampling and reconstruction for accelerated MRI
T. Weiss, S. Vedula, O. Senouf, A. M. Bronstein, O. Michailovich, M. Zibulevsky
Proc. Int’l Conf. on Acoustics, Speech and Signal Processing (ICASSP), 2020
Magnetic Resonance Imaging (MRI) is considered today the gold-standard modality for soft tissues. The long acquisition times, however, make it more prone to motion artifacts as well as contribute to the relatively high costs of this examination. Over the years, multiple studies concentrated on designing reduced measurement schemes and image reconstruction schemes for MRI; however, these problems have been so far addressed separately. On the other hand, recent works in optical computational imaging have demonstrated growing success of the simultaneous learning-based design of the acquisition and reconstruction schemes manifesting significant improvement in the reconstruction quality with a constrained time budget. Inspired by these successes, in this work, we propose to learn accelerated MR acquisition schemes (in the form of Cartesian trajectories) jointly with the image reconstruction operator. To this end, we propose an algorithm for training the combined acquisition-reconstruction pipeline end-to-end in a differentiable way. We demonstrate the significance of using the learned Cartesian trajectories at different speed-up rates.
S. Sommer, A. M. Bronstein, Horizontal flows and manifold stochastics in geometric deep learning, IEEE Trans. Pattern Analysis and Machine Intelligence (PAMI), 2020
Horizontal flows and manifold stochastics in geometric deep learning
S. Sommer, A. M. Bronstein
IEEE Trans. Pattern Analysis and Machine Intelligence (PAMI), 2020
We introduce two constructions in geometric deep learning for 1) transporting orientation-dependent convolutional filters over a manifold in a continuous way and thereby defining a convolution operator that naturally incorporates the rotational effect of holonomy; and 2) allowing efficient evaluation of manifold convolution layers by sampling manifold valued random variables that center around a weighted Brownian motion maximum likelihood mean. Both methods are inspired by stochastics on manifolds and geometric statistics, and provide examples of how stochastic methods, here horizontal frame bundle flows and non-linear bridge sampling schemes, can be used in geometric deep learning. We outline the theoretical foundation of the two methods, discuss their relation to Euclidean deep networks and existing methodology in geometric deep learning, and establish important properties of the proposed constructions.
K. Rotker, D. Ben-Bashat, A. M. Bronstein, Over-parameterized models for vector fields, SIAM Journal on Imaging Sciences (SIIMS), 2020
Over-parameterized models for vector fields
K. Rotker, D. Ben-Bashat, A. M. Bronstein
SIAM Journal on Imaging Sciences (SIIMS), 2020
Vector fields arise in a variety of quantity measure and visualization techniques such as fluid flow imaging, motion estimation, deformation measures, and color imaging, leading to a better understanding of physical phenomena. Recent progress in vector field imaging technologies has emphasized the need for efficient noise removal and reconstruction algorithms. A key ingredient in the success of extracting signals from noisy measurements is prior information, which can often be represented as a parameterized model. In this work, we extend the over-parameterization variational framework in order to perform model-based reconstruction of vector fields. The over-parameterization methodology combines local modeling of the data with global model parameter regularization. By considering the vector field as a linear combination of basis vector fields and appropriate scale and rotation coefficients, the denoising problem reduces to a simpler form of coefficient recovery. We introduce two versions of the over-parameterization framework: a total variation-based method and a sparsity-based method, relying on the co-sparse analysis model. We demonstrate the efficiency of the proposed frameworks for two- and three-dimensional vector fields with linear and quadratic over-parameterization models.
A. Tsitsulin, M. Munkhoeva, D. Mottin, P. Karras, A. M. Bronstein, I. Oseledets, E. Müller, Intrinsic multi-scale evaluation of generative models, Proc. ICLR, 2020
Intrinsic multi-scale evaluation of generative models
A. Tsitsulin, M. Munkhoeva, D. Mottin, P. Karras, A. M. Bronstein, I. Oseledets, E. Müller
Proc. ICLR, 2020
Generative models are often used to sample high-dimensional data points from a manifold with small intrinsic dimension. Existing techniques for comparing generative models focus on global data properties such as mean and covariance; in that sense, they are extrinsic and uni-scale. We develop the first, to our knowledge, intrinsic and multi-scale method for characterizing and comparing underlying data manifolds, based on comparing all data moments by lower-bounding the spectral notion of the Gromov-Wasserstein distance between manifolds. In a thorough experimental study, we demonstrate that our method effectively evaluates the quality of generative models; further, we showcase its efficacy in discerning the disentanglement process in neural networks.
A. Karbachevsky, C. Baskin, E. Zheltonozhskii, Y. Yermolin, F. Gabbay, A. M. Bronstein, A. Mendelson, HCM: Hardware-aware complexity metric for neural network architectures, arXiv:2004.08906, 2020
HCM: Hardware-aware complexity metric for neural network architectures
A. Karbachevsky, C. Baskin, E. Zheltonozhskii, Y. Yermolin, F. Gabbay, A. M. Bronstein, A. Mendelson
arXiv:2004.08906, 2020
Convolutional Neural Networks (CNNs) have become common in many fields including computer vision, speech recognition, and natural language processing. Although CNN hardware accelerators are already included as part of many SoC architectures, the task of achieving high accuracy on resource-restricted devices is still considered challenging, mainly due to the vast number of design parameters that need to be balanced to achieve an efficient solution. Quantization techniques, when applied to the network parameters, lead to a reduction of power and area and may also change the ratio between communication and computation. As a result, some algorithmic solutions may suffer from lack of memory bandwidth or computational resources and fail to achieve the expected performance due to hardware constraints. Thus, the system designer and the micro-architect need to understand at early development stages the impact of their high-level decisions (e.g., the architecture of the CNN and the number of bits used to represent its parameters) on the final product (e.g., the expected power saving, area, and accuracy). Unfortunately, existing tools fall short of supporting such decisions. This paper introduces a hardware-aware complexity metric that aims to assist the system designer of the neural network architectures, through the entire project lifetime (especially at its early stages) by predicting the impact of architectural and micro-architectural decisions on the final product. We demonstrate how the proposed metric can help evaluate different design alternatives of neural network models on resource-restricted devices such as real-time embedded systems, and avoid making design mistakes at early stages.
E. Zheltonozhskii, C. Baskin, Y. Nemcovsky, B. Chmiel, A. Mendelson, A. M. Bronstein, Colored noise injection for training adversarially robust neural networks, arXiv:2003.02188, 2020
Colored noise injection for training adversarially robust neural networks
E. Zheltonozhskii, C. Baskin, Y. Nemcovsky, B. Chmiel, A. Mendelson, A. M. Bronstein
arXiv:2003.02188, 2020
Even though deep learning has shown unmatched performance on various tasks, neural networks have been shown to be vulnerable to small adversarial perturbations of the input, which lead to significant performance degradation. In this work we extend the idea of adding independent Gaussian noise to weights and activations during adversarial training (PNI) to injection of colored noise for defense against common white-box and black-box attacks. We show that our approach outperforms PNI and various previous approaches in terms of adversarial accuracy on the CIFAR-10 dataset. In addition, we provide an extensive ablation study of the proposed method justifying the chosen configurations.
A. Livne, A. M. Bronstein, R. Kimmel, Z. Aviv, S. Grofit, Do we need depth in state-of-the-art face authentication?, Proc. IEEE Int'l Conf. on 3D Vision (3DV), 2020
Do we need depth in state-of-the-art face authentication?
A. Livne, A. M. Bronstein, R. Kimmel, Z. Aviv, S. Grofit
Proc. IEEE Int'l Conf. on 3D Vision (3DV), 2020
Some face recognition methods are designed to utilize geometric features extracted from depth sensors to handle the challenges of single-image based recognition technologies. However, calculating the geometrical data is an expensive and challenging process. Here, we introduce a novel method that learns distinctive geometric features from stereo camera systems without the need to explicitly compute the facial surface or depth map. The raw face stereo images along with coordinate maps allow a CNN to learn geometric features. This way, we keep the simplicity and cost-efficiency of recognition from a single image, while enjoying the benefits of geometric data without explicitly reconstructing it. We demonstrate that the suggested method outperforms both existing single-image and explicit depth-based methods on large-scale benchmarks. We also provide an ablation study to show that the suggested method uses the coordinate maps to encode more informative features.
M. Shkolnik, B. Chmiel, R. Banner, G. Shomron, Y. Nahshan, A. M. Bronstein, U. Weiser, Robust Quantization: One Model to Rule Them All, Proc. NeurIPS, 2020
Robust Quantization: One Model to Rule Them All
M. Shkolnik, B. Chmiel, R. Banner, G. Shomron, Y. Nahshan, A. M. Bronstein, U. Weiser
Proc. NeurIPS, 2020
Neural network quantization methods often involve simulating the quantization process during training. This makes the trained model highly dependent on the precise way quantization is performed. Since low-precision accelerators differ in their quantization policies and their supported mix of data-types, a model trained for one accelerator may not be suitable for another. To address this issue, we propose KURE, a method that provides intrinsic robustness to the model against a broad range of quantization implementations. We show that KURE yields a generic model that may be deployed on numerous inference accelerators without a significant loss in accuracy.
Y. Choukroun, A. Shtern, A. M. Bronstein, R. Kimmel, Hamiltonian operator for spectral shape analysis, IEEE Trans. Vis. and Comp. Graphics, vol. 26(2), 2020
Hamiltonian operator for spectral shape analysis
Y. Choukroun, A. Shtern, A. M. Bronstein, R. Kimmel
IEEE Trans. Vis. and Comp. Graphics, vol. 26(2), 2020
Many shape analysis methods treat the geometry of an object as a metric space that can be captured by the Laplace-Beltrami operator. In this paper, we propose to adapt the classical Hamiltonian operator from quantum mechanics to the field of shape analysis. To this end, we study the addition of a potential function to the Laplacian as a generator for dual spaces in which shape processing is performed. We present general optimization approaches for solving variational problems involving the basis defined by the Hamiltonian using perturbation theory for its eigenvectors. The suggested operator is shown to produce better functional spaces to operate with, as demonstrated on different shape analysis tasks.
- A. Arbelle, S. Doveh, A. Alfassy, J. Shtok, G. Lev, E. Schwartz, H. Kuehne, H. Barak Levi, P. Sattigeri, R. Panda, C.-F. Chen, A. M. Bronstein, K. Saenko, S. Ullman, R. Giryes, R. Feris, L. Karlinsky, Detector-free weakly supervised grounding by separation, Proc. CVPR, 2022
Detector-free weakly supervised grounding by separation
A. Arbelle, S. Doveh, A. Alfassy, J. Shtok, G. Lev, E. Schwartz, H. Kuehne, H. Barak Levi, P. Sattigeri, R. Panda, C.-F. Chen, A. M. Bronstein, K. Saenko, S. Ullman, R. Giryes, R. Feris, L. Karlinsky
Proc. CVPR, 2022
Nowadays, there is an abundance of data involving images and surrounding free-form text weakly corresponding to those images. Weakly Supervised phrase-Grounding (WSG) deals with the task of using this data to learn to localize (or to ground) arbitrary text phrases in images without any additional annotations. However, most recent SotA methods for WSG assume the existence of a pre-trained object detector, relying on it to produce the ROIs for localization. In this work, we focus on the task of Detector-Free WSG (DF-WSG) to solve WSG without relying on a pre-trained detector. We directly learn everything from the images and associated free-form text pairs, thus potentially gaining an advantage on the categories unsupported by the detector. The key idea behind our proposed Grounding by Separation (GbS) method is synthesizing ‘text to image-regions’ associations by random alpha-blending of arbitrary image pairs and using the corresponding texts of the pair as conditions to recover the alpha map from the blended image via a segmentation network. At test time, this allows using the query phrase as a condition for a non-blended query image, thus interpreting the test image as a composition of a region corresponding to the phrase and the complement region. Using this approach we demonstrate a significant accuracy improvement, of up to 8.5% over previous DF-WSG SotA, for a range of benchmarks including Flickr30K, Visual Genome, and ReferIt, as well as a significant complementary improvement (above 7%) over the detector-based approaches for WSG.
S. Doveh, E. Schwartz, C. Xue, R. Feris, A. M. Bronstein, R. Giryes, L. Karlinsky, MetAdapt: Meta-learned task-adaptive architecture for few-shot classification, Pattern Recognition Letters, 2021
MetAdapt: Meta-learned task-adaptive architecture for few-shot classification
S. Doveh, E. Schwartz, C. Xue, R. Feris, A. M. Bronstein, R. Giryes, L. Karlinsky
Pattern Recognition Letters, 2021
Few-Shot Learning (FSL) is a topic of rapidly growing interest. Typically, in FSL a model is trained on a dataset consisting of many small tasks (meta-tasks) and learns to adapt to novel tasks that it will encounter during test time. This is also referred to as meta-learning. Another topic closely related to meta-learning with a lot of interest in the community is Neural Architecture Search (NAS), automatically finding optimal architecture instead of engineering it manually. In this work we combine these two aspects of meta-learning. So far, meta-learning FSL methods have focused on optimizing parameters of pre-defined network architectures, in order to make them easily adaptable to novel tasks. Moreover, it was observed that, in general, larger architectures perform better than smaller ones up to a certain saturation point (where they start to degrade due to over-fitting). However, little attention has been given to explicitly optimizing the architectures for FSL, nor to an adaptation of the architecture at test time to particular novel tasks. In this work, we propose to employ tools inspired by the Differentiable Neural Architecture Search (D-NAS) literature in order to optimize the architecture for FSL without over-fitting. Additionally, to make the architecture task adaptive, we propose the concept of ‘MetAdapt Controller’ modules. These modules are added to the model and are meta-trained to predict the optimal network connections for a given novel task. Using the proposed approach we observe state-of-the-art resu
T. Weiss, N. Peretz, S. Vedula, A. Feuer, A. M. Bronstein, Joint optimization of system design and reconstruction in MIMO radar imaging, Proc. IEEE Int'l Workshop on Machine Learning for Signal Processing, 2021
Joint optimization of system design and reconstruction in MIMO radar imaging
T. Weiss, N. Peretz, S. Vedula, A. Feuer, A. M. Bronstein
Proc. IEEE Int'l Workshop on Machine Learning for Signal Processing, 2021
Multiple-input multiple-output (MIMO) radar is one of the leading depth sensing modalities. However, the use of multiple receive channels leads to relatively high costs and prevents the penetration of MIMOs in many areas such as the automotive industry. Over the last years, few studies have concentrated on designing reduced measurement schemes and image reconstruction schemes for MIMO radars; however, these problems have so far been addressed separately. On the other hand, recent works in optical computational imaging have demonstrated growing success of simultaneous learning-based design of the acquisition and reconstruction schemes, manifesting significant improvement in the reconstruction quality. Inspired by these successes, in this work, we propose to learn MIMO acquisition parameters in the form of receive (Rx) antenna element locations jointly with a neural-network-based image reconstruction. To this end, we propose an algorithm for training the combined acquisition-reconstruction pipeline end-to-end in a differentiable way. We demonstrate the significance of using our learned acquisition parameters with and without the neural-network reconstruction.
Y. Nahshan, B. Chmiel, C. Baskin, E. Zheltonozhskii, R. Banner, A. M. Bronstein, A. Mendelson, Loss aware post-training quantization, Machine Learning, 2021
Loss aware post-training quantization
Y. Nahshan, B. Chmiel, C. Baskin, E. Zheltonozhskii, R. Banner, A. M. Bronstein, A. Mendelson
Machine Learning, 2021
Neural network quantization enables the deployment of large models on resource-constrained devices. Current post-training quantization methods fall short in terms of accuracy for INT4 (or lower) but provide reasonable accuracy for INT8 (or above). In this work, we study the effect of quantization on the structure of the loss landscape. We show that the structure is flat and separable for mild quantization, enabling straightforward post-training quantization methods to achieve good results. We show that with more aggressive quantization, the loss landscape becomes highly non-separable with steep curvature, making the selection of quantization parameters more challenging. Armed with this understanding, we design a method that quantizes the layer parameters jointly, enabling significant accuracy improvement over current post-training quantization methods.
C. Baskin, B. Chmiel, E. Zheltonozhskii, R. Banner, A. M. Bronstein, A. Mendelson, CAT: Compression-aware training for bandwidth reduction, JMLR, 2021
CAT: Compression-aware training for bandwidth reduction
C. Baskin, B. Chmiel, E. Zheltonozhskii, R. Banner, A. M. Bronstein, A. Mendelson
JMLR, 2021
Convolutional neural networks (CNNs) have become the dominant neural network architecture for solving visual processing tasks. One of the major obstacles hindering the ubiquitous use of CNNs for inference is their relatively high memory bandwidth requirements, which can be a main energy consumer and throughput bottleneck in hardware accelerators. Accordingly, an efficient feature map compression method can result in substantial performance gains. Inspired by quantization-aware training approaches, we propose a compression-aware training (CAT) method that involves training the model in a way that allows better compression of feature maps during inference. Our method trains the model to achieve low-entropy feature maps, which enables efficient compression at inference time using classical transform coding methods. CAT significantly improves the state-of-the-art results reported for quantization. For example, on ResNet-34 we achieve 73.1% accuracy (0.2% degradation from the baseline) with an average representation of only 1.79 bits per value.
E. Rozenberg, D. Freedman, A. M. Bronstein, Learning to localize objects using limited annotation with applications to thoracic diseases, IEEE Access Vol. 9, 2021
Learning to localize objects using limited annotation with applications to thoracic diseases
E. Rozenberg, D. Freedman, A. M. Bronstein
IEEE Access Vol. 9, 2021
Motivation: The localization of objects in images is a longstanding objective within the field of image processing. Most current techniques are based on machine learning approaches, which typically require careful annotation of training samples in the form of expensive bounding box labels. The need for such large-scale annotation has only been exacerbated by the widespread adoption of deep learning techniques within the image processing community: deep learning is notoriously data-hungry. Method: In this work, we attack this problem directly by providing a new method for learning to localize objects with limited annotation: most training images can simply be annotated with their whole image labels (and no bounding box), with only a small fraction marked with bounding boxes. The training is driven by a novel loss function, which is a continuous relaxation of a well-defined discrete formulation of weakly supervised learning. Care is taken to ensure that the loss is numerically well-posed. Additionally, we propose a neural network architecture which accounts for both patch dependence, through the use of Conditional Random Field layers, and shift-invariance, through the inclusion of anti-aliasing filters. Results: We demonstrate our method on the task of localizing thoracic diseases in chest X-ray images, achieving state-of-the-art performance on the ChestX-ray14 dataset. We further show that with a modicum of additional effort our technique can be extended from object localization to object detection, attaining high quality results on the Kaggle RSNA Pneumonia Detection Challenge. Conclusion: The technique presented in this paper has the potential to enable high accuracy localization in regimes in which annotated data is either scarce or expensive to acquire. Future work will focus on applying the ideas presented in this paper to the realm of semantic segmentation.
T. Weiss, O. Senouf, S. Vedula, O. Michailovich, M. Zibulevsky, A. M. Bronstein, PILOT: Physics-Informed Learned Optimal Trajectories for accelerated MRI, Journal of Machine Learning for Biomedical Imaging (MELBA), 2021
PILOT: Physics-Informed Learned Optimal Trajectories for accelerated MRI
T. Weiss, O. Senouf, S. Vedula, O. Michailovich, M. Zibulevsky, A. M. Bronstein
Journal of Machine Learning for Biomedical Imaging (MELBA), 2021
Magnetic Resonance Imaging (MRI) has long been considered to be among “the gold standards” of diagnostic medical imaging. The long acquisition times, however, render MRI prone to motion artifacts, let alone their adverse contribution to the relatively high costs of MRI examination. Over the last few decades, multiple studies have focused on the development of both physical and post-processing methods for accelerated acquisition of MRI scans. These two approaches, however, have so far been addressed separately. On the other hand, recent works in optical computational imaging have demonstrated growing success of the concurrent learning-based design of data acquisition and image reconstruction schemes. Such schemes have already demonstrated substantial effectiveness, leading to considerably shorter acquisition times and improved quality of image reconstruction. Inspired by this initial success, in this work, we propose a novel approach to the learning of optimal schemes for conjoint acquisition and reconstruction of MRI scans, with the optimization carried out simultaneously with respect to the time-efficiency of data acquisition and the quality of resulting reconstructions. To be of practical value, the schemes are encoded in the form of general k-space trajectories, whose associated magnetic gradients are constrained to obey a set of predefined hardware requirements (as defined in terms of, e.g., peak currents and maximum slew rates of magnetic gradients). With this proviso in mind, we propose a novel algorithm for the end-to-end training of a combined acquisition-reconstruction pipeline using a deep neural network with differentiable forward- and backpropagation operators. We also demonstrate the effectiveness of the proposed solution in application to both image reconstruction and image segmentation, reporting substantial improvements in terms of acceleration factors as well as the quality of these end tasks.
Y. Elul, A. Rosenberg, A. Schuster, A. M. Bronstein, Y. Yaniv, Meeting the unmet needs of clinicians from AI systems showcased for cardiology with deep-learning-based ECG analysis, Proc. US National Academy of Sciences (PNAS), 2021
Meeting the unmet needs of clinicians from AI systems showcased for cardiology with deep-learning-based ECG analysis
Y. Elul, A. Rosenberg, A. Schuster, A. M. Bronstein, Y. Yaniv
Proc. US National Academy of Sciences (PNAS), 2021
Despite their great promise, artificial intelligence (AI) systems have yet to become ubiquitous in the daily practice of medicine largely due to several crucial unmet needs of healthcare practitioners. These include lack of explanations in clinically meaningful terms, handling the presence of unknown medical conditions, and transparency regarding the system’s limitations, both in terms of statistical performance as well as recognizing situations for which the system’s predictions are irrelevant. We articulate these unmet clinical needs as machine-learning (ML) problems and systematically address them with cutting-edge ML techniques. We focus on electrocardiogram (ECG) analysis as an example domain in which AI has great potential and tackle two challenging tasks: the detection of a heterogeneous mix of known and unknown arrhythmias from ECG and the identification of underlying cardio-pathology from segments annotated as normal sinus rhythm recorded in patients with an intermittent arrhythmia. We validate our methods by simulating a screening for arrhythmias in a large-scale population while adhering to statistical significance requirements. Specifically, our system 1) visualizes the relative importance of each part of an ECG segment for the final model decision; 2) upholds specified statistical constraints on its out-of-sample performance and provides uncertainty estimation for its predictions; 3) handles inputs containing unknown rhythm types; and 4) handles data from unseen patients while also flagging cases in which the model’s outputs are not usable for a specific patient. This work represents a significant step toward overcoming the limitations currently impeding the integration of AI into clinical practice in cardiology and medicine in general.
L. Karlinsky, J. Shtok, A. Alfassy, M. Lichtenstein, S. Harary, E. Schwartz, S. Doveh, P. Sattigeri, R. Feris, A. M. Bronstein, R. Giryes, StarNet: towards weakly supervised few-shot detection and explainable few-shot classification, Proc. AAAI, 2021
StarNet: towards weakly supervised few-shot detection and explainable few-shot classification
L. Karlinsky, J. Shtok, A. Alfassy, M. Lichtenstein, S. Harary, E. Schwartz, S. Doveh, P. Sattigeri, R. Feris, A. M. Bronstein, R. Giryes
Proc. AAAI, 2021
In this paper, we propose a new few-shot learning method called StarNet, which is an end-to-end trainable non-parametric star-model few-shot classifier. While being meta-trained using only image-level class labels, StarNet learns not only to predict the class labels for each query image of a few-shot task, but also to localize (via a heatmap) what it believes to be the key image regions supporting its prediction, thus effectively detecting the instances of the novel categories. The localization is enabled by StarNet’s ability to find large, arbitrarily shaped, semantically matching regions between all pairs of support and query images of a few-shot task. We evaluate StarNet on multiple few-shot classification benchmarks, attaining significant state-of-the-art improvement on CUB and ImageNetLOC-FS, and smaller improvements on other benchmarks. At the same time, in many cases, StarNet provides plausible explanations for its class label predictions, by highlighting the correctly paired novel category instances on the query and on its best matching support (for the predicted class). In addition, we test the proposed approach on the previously unexplored and challenging task of Weakly Supervised Few-Shot Object Detection (WS-FSOD), obtaining significant improvements over the baselines.
E. Amrani, R. Ben-Ari, D. Rotman, A. M. Bronstein, Noise estimation using density estimation for self-supervised multimodal learning, Proc. AAAI, 2021
Noise estimation using density estimation for self-supervised multimodal learning
E. Amrani, R. Ben-Ari, D. Rotman, A. M. Bronstein
Proc. AAAI, 2021
One of the key factors of enabling machine learning models to comprehend and solve real-world tasks is to leverage multimodal data. Unfortunately, the annotation of multimodal data is challenging and expensive. Recently, self-supervised multimodal methods that combine vision and language were proposed to learn multimodal representations without annotation. However, these methods choose to ignore the presence of high levels of noise and thus yield sub-optimal results. In this work, we show that the problem of noise estimation for multimodal data can be reduced to a multimodal density estimation task. Using multimodal density estimation, we propose a noise estimation building block for multimodal representation learning that is based strictly on the inherent correlation between different modalities. We demonstrate how our noise estimation can be broadly integrated and achieves comparable results to state-of-the-art performance on five different benchmark datasets for two challenging multimodal tasks: Video Question Answering and Text-To-Video Retrieval.
O. Dahary, M. Jacoby, A. M. Bronstein, Digital Gimbal: End-to-end deep image stabilization with learnable exposure times, Proc. CVPR, 2021
Digital Gimbal: End-to-end deep image stabilization with learnable exposure times
O. Dahary, M. Jacoby, A. M. Bronstein
Proc. CVPR, 2021
Mechanical image stabilization using actuated gimbals enables capturing long-exposure shots without suffering from blur due to camera motion. These devices, however, are often physically cumbersome and expensive, limiting their widespread use. In this work, we propose to digitally emulate a mechanically stabilized system from the input of a fast unstabilized camera. To exploit the trade-off between motion blur at long exposures and low SNR at short exposures, we train a CNN that estimates a sharp high-SNR image by aggregating a burst of noisy short-exposure frames, related by unknown motion. We further suggest learning the burst’s exposure times in an end-to-end manner, thus balancing the noise and blur across the frames. We demonstrate this method’s advantage over the traditional approach of deblurring a single image or denoising a fixed-exposure burst.
A. Boyarski, S. Vedula, A. M. Bronstein, Spectral geometric matrix completion, Proc. Mathematical and Scientific Machine Learning, 2021
Spectral geometric matrix completion
A. Boyarski, S. Vedula, A. M. Bronstein
Proc. Mathematical and Scientific Machine Learning, 2021
Deep Matrix Factorization (DMF) is an emerging approach to the problem of reconstructing a matrix from a subset of its entries. Recent works have established that gradient descent applied to a DMF model induces an implicit regularization on the rank of the recovered matrix. Despite these promising theoretical results, empirical evaluation of vanilla DMF on real benchmarks exhibits poor reconstructions, which we attribute to the extremely low number of samples available. We propose an explicit spectral regularization scheme that is able to make DMF models competitive on real benchmarks, while still maintaining the implicit regularization induced by gradient descent, thus enjoying the best of both worlds.
E. Rozenberg, A. Karnieli, O. Yesharim, S. Trajtenberg-Mills, D. Freedman, A. M. Bronstein, A. Arie, Inverse design of quantum holograms in three-dimensional nonlinear photonic crystals, CLEO, 2021
Inverse design of quantum holograms in three-dimensional nonlinear photonic crystals
E. Rozenberg, A. Karnieli, O. Yesharim, S. Trajtenberg-Mills, D. Freedman, A. M. Bronstein, A. Arie
CLEO, 2021
A. Karbachevsky, C. Baskin, E. Zheltonozhskii, Y. Yermolin, F. Gabbay, A. M. Bronstein, A. Mendelson, Early-stage neural network hardware performance analysis, Sustainability 13(2):717, 2021
Early-stage neural network hardware performance analysis
A. Karbachevsky, C. Baskin, E. Zheltonozhskii, Y. Yermolin, F. Gabbay, A. M. Bronstein, A. Mendelson
Sustainability 13(2):717, 2021
The demand for running NNs in embedded environments has increased significantly in recent years due to the significant success of convolutional neural network (CNN) approaches in various tasks, including image recognition and generation. The task of achieving high accuracy on resource-restricted devices, however, is still considered to be challenging, which is mainly due to the vast number of design parameters that need to be balanced. While the quantization of CNN parameters leads to a reduction of power and area, it can also generate unexpected changes in the balance between communication and computation. This change is hard to evaluate, and the lack of balance may lead to lower utilization of either memory bandwidth or computational resources, thereby reducing performance. This paper introduces a hardware performance analysis framework for identifying bottlenecks in the early stages of CNN hardware design. We demonstrate how the proposed method can help in evaluating different architecture alternatives of resource-restricted CNN accelerators (e.g., part of real-time embedded systems) early in design stages and, thus, prevent making design mistakes.
Keywords: neural networks; accelerators; quantization; CNN architecture
- J. Hermanns, A. Tsitsulin, M. Munkhoeva, A. M. Bronstein, D. Mottin, P. Karras, GRASP: Graph Alignment through Spectral Signatures, Proc. Asia-Pacific Web (APWeb) and Web-Age Information Management (WAIM) Joint International Conference on Web and Big Data, 2022
GRASP: Graph Alignment through Spectral Signatures
J. Hermanns, A. Tsitsulin, M. Munkhoeva, A. M. Bronstein, D. Mottin, P. Karras
Proc. Asia-Pacific Web (APWeb) and Web-Age Information Management (WAIM) Joint International Conference on Web and Big Data, 2022
What is the best way to match the nodes of two graphs? This graph alignment problem generalizes graph isomorphism and arises in applications from social network analysis to bioinformatics. Some solutions assume that auxiliary information on known matches or node or edge attributes is available, or utilize arbitrary graph features. Such methods fare poorly in the pure form of the problem, in which only graph structures are given. Other proposals translate the problem to one of aligning node embeddings, yet, by doing so, provide only a single-scale view of the graph. In this paper, we transfer the shape-analysis concept of functional maps from the continuous to the discrete case, and treat the graph alignment problem as a special case of the problem of finding a mapping between functions on graphs. We present GRASP, a method that first establishes a correspondence between functions derived from Laplacian matrix eigenvectors, which capture multiscale structural characteristics, and then exploits this correspondence to align nodes. Our experimental study, featuring noise levels higher than anything used in previous studies, shows that GRASP outperforms state-of-the-art methods for graph alignment across noise levels and graph types.
P. Kang, Z. Lin, Z. Yang, A. M. Bronstein, Q. Li, W. Liu, Deep fused two-step cross-modal hashing with multiple semantic supervision, Multimedia Tools and Applications, 2022
Deep fused two-step cross-modal hashing with multiple semantic supervision
P. Kang, Z. Lin, Z. Yang, A. M. Bronstein, Q. Li, W. Liu
Multimedia Tools and Applications, 2022
Existing cross-modal hashing methods ignore the informative multimodal joint information and cannot fully exploit the semantic labels. In this paper, we propose a deep fused two-step cross-modal hashing (DFTH) framework with multiple semantic supervision. In the first step, DFTH learns unified hash codes for instances by a fusion network. Semantic label and similarity reconstruction have been introduced to acquire binary codes that are informative, discriminative and semantic similarity preserving. In the second step, two modality-specific hash networks are learned under the supervision of common hash codes reconstruction, label reconstruction, and intra-modal and inter-modal semantic similarity reconstruction. The modality-specific hash networks can generate semantic preserving binary codes for out-of-sample queries. To deal with the vanishing gradients of binarization, continuous differentiable tanh is introduced to approximate the discrete sign function, making the networks able to back-propagate by automatic gradient computation. Extensive experiments on MIRFlickr25K and NUS-WIDE show the superiority of DFTH over state-of-the-art methods.
P. Kang, Z. Lin, Z. Yang, X. Fang, A. M. Bronstein, Q. Li, W. Liu, Intra-class low-rank regularization for supervised and semi-supervised cross-modal retrieval, Applied Intelligence, 52(1), pp. 33-54, 2022
Intra-class low-rank regularization for supervised and semi-supervised cross-modal retrieval
P. Kang, Z. Lin, Z. Yang, X. Fang, A. M. Bronstein, Q. Li, W. Liu
Applied Intelligence, 52(1), pp. 33-54, 2022
Cross-modal retrieval aims to retrieve related items across different modalities, for example, using an image query to retrieve related text. The existing deep methods ignore both the intra-modal and inter-modal intra-class low-rank structures when fusing various modalities, which decreases the retrieval performance. In this paper, two deep models (denoted as ILCMR and Semi-ILCMR) based on intra-class low-rank regularization are proposed for supervised and semi-supervised cross-modal retrieval, respectively. Specifically, ILCMR integrates the image network and text network into a unified framework to learn a common feature space by imposing three regularization terms to fuse the cross-modal data. First, to align them in the label space, we utilize semantic consistency regularization to convert the data representations to probability distributions over the classes. Second, we introduce an intra-modal low-rank regularization, which encourages the intra-class samples that originate from the same space to be more relevant in the common feature space. Third, an inter-modal low-rank regularization is applied to reduce the cross-modal discrepancy. To enable the low-rank regularization to be optimized using automatic gradients during network back-propagation, we propose the rank-r approximation and specify the explicit gradients for theoretical completeness. In addition to the three regularization terms that rely on label information incorporated by ILCMR, we propose Semi-ILCMR in the semi-supervised regime, which introduces a low-rank constraint before projecting the general representations into the common feature space. Extensive experiments on four public cross-modal datasets demonstrate the superiority of ILCMR and Semi-ILCMR over other state-of-the-art methods.
Y. Nemcovsky, M. Jacoby, A. M. Bronstein, C. Baskin, Physical passive patch adversarial attacks on visual odometry systems, Proc. ACCV, 2022
Physical passive patch adversarial attacks on visual odometry systems
Y. Nemcovsky, M. Jacoby, A. M. Bronstein, C. Baskin
Proc. ACCV, 2022
Deep neural networks are known to be susceptible to adversarial perturbations: small perturbations that alter the output of the network and exist under strict norm limitations. While such perturbations are usually discussed as tailored to a specific input, a universal perturbation can be constructed to alter the model’s output on a set of inputs. Universal perturbations present a more realistic case of adversarial attacks, as awareness of the model’s exact input is not required. In addition, the universal attack setting raises the subject of generalization to unseen data, where given a set of inputs, the universal perturbations aim to alter the model’s output on out-of-sample data. In this work, we study physical passive patch adversarial attacks on visual odometry-based autonomous navigation systems. A visual odometry system aims to infer the relative camera motion between two corresponding viewpoints, and is frequently used by vision-based autonomous navigation systems to estimate their state. For such navigation systems, a patch adversarial perturbation poses a severe security issue, as it can be used to mislead a system onto some collision course. To the best of our knowledge, we show for the first time that the error margin of a visual odometry model can be significantly increased by deploying patch adversarial attacks in the scene. We provide evaluation on synthetic closed-loop drone navigation data and demonstrate that a comparable vulnerability exists in real data.
L. Ackerman-Schraier, A. A. Rosenberg, A. Marx, A. M. Bronstein, Machine learning approaches demonstrate that protein structures carry information about their genetic coding, Nature Scientific Reports, 2022
Machine learning approaches demonstrate that protein structures carry information about their genetic coding
L. Ackerman-Schraier, A. A. Rosenberg, A. Marx, A. M. Bronstein
Nature Scientific Reports, 2022
Synonymous codons translate into the same amino acid. Although the identity of synonymous codons is often considered inconsequential to the final protein structure, there is mounting evidence for an association between the two. Our study examined this association using regression and classification models, finding that codon sequences predict protein backbone dihedral angles with a lower error than amino acid sequences, and that models trained with true dihedral angles have better classification of synonymous codons given structural information than models trained with random dihedral angles. Using this classification approach, we investigated local codon-codon dependencies and tested whether synonymous codon identity can be predicted more accurately from codon context than amino acid context alone, and most specifically which codon context position carries the most predictive power.
A. A. Rosenberg, N. Yehishalom, A. Marx, A. M. Bronstein, Defining amino acid pairs as structural units suggests mutation sensitivity to adjacent residues, bioRxiv/2022/513383, 2022
Defining amino acid pairs as structural units suggests mutation sensitivity to adjacent residues
A. A. Rosenberg, N. Yehishalom, A. Marx, A. M. Bronstein
bioRxiv/2022/513383, 2022
Proteins fold from chains of amino acids, forming secondary structures, α-helices and β-strands, that, at least for globular proteins, subsequently fold into a three-dimensional structure. A large-scale analysis of high-resolution protein structures suggests that amino acid pairs constitute another layer of ordered structure, more local than these conventionally defined secondary structures. We develop a cross-peptide-bond Ramachandran plot that captures the conformational preferences of the amino acid pairs and show that the effect of a particular mutation on the stability of a protein depends in a predictable manner on the adjacent amino acid context.
A. Rosenberg, A. Marx, A. M. Bronstein, Codon-specific Ramachandran plots show amino acid backbone conformation depends on identity of the translated codon, Nature Communications, 2022
Codon-specific Ramachandran plots show amino acid backbone conformation depends on identity of the translated codon
A. Rosenberg, A. Marx, A. M. Bronstein
Nature Communications, 2022
Synonymous codons translate into chemically identical amino acids. Once considered inconsequential to the formation of the protein product, there is now significant evidence to suggest that codon usage affects co-translational protein folding and the final structure of the expressed protein. Here we develop a method for computing and comparing codon-specific Ramachandran plots and demonstrate that the backbone dihedral angle distributions of some synonymous codons are distinguishable with statistical significance for some secondary structures. This shows that there exists a dependence between codon identity and backbone torsion of the translated amino acid. Although these findings cannot pinpoint the causal direction of this dependence, we discuss the vast biological implications should coding be shown to directly shape protein conformation and demonstrate the usefulness of this method as a tool for probing associations between codon usage and protein structure. Finally, we urge for the inclusion of exact genetic information into structural databases.
E. Rozenberg, A. Karnieli, O. Yesharim, J. Foley-Comer, S. Trajtenberg-Mills, D. Freedman, A. M. Bronstein, A. Arie, Inverse design of spontaneous parametric downconversion for generation of high-dimensional qudits, Optica 9, 602-615, 2022
Inverse design of spontaneous parametric downconversion for generation of high-dimensional qudits
E. Rozenberg, A. Karnieli, O. Yesharim, J. Foley-Comer, S. Trajtenberg-Mills, D. Freedman, A. M. Bronstein, A. Arie
Optica 9, 602-615, 2022
Spontaneous parametric down-conversion in quantum optics is an invaluable resource for the realization of high-dimensional qudits with spatial modes of light. One of the main open challenges is how to directly generate a desirable qudit state in the SPDC process. This problem can be addressed through advanced computational learning methods; however, due to difficulties in modeling the SPDC process by a fully differentiable algorithm that takes into account all interaction effects, progress has been limited. Here, we overcome these limitations and introduce a physically-constrained and differentiable model, validated against experimental results for shaped pump beams and structured crystals, capable of learning every interaction parameter in the process. We avoid any restrictions induced by the stochastic nature of our physical model and integrate the dynamic equations governing the evolution under the SPDC Hamiltonian. We solve the inverse problem of designing a nonlinear quantum optical system that achieves the desired quantum state of down-converted photon pairs. The desired states are defined using either the second-order correlations between different spatial modes or by specifying the required density matrix. By learning nonlinear volume holograms as well as different pump shapes, we successfully show how to generate maximally entangled states. Furthermore, we simulate all-optical coherent control over the generated quantum state by actively changing the profile of the pump beam. Our work can be useful for applications such as novel designs of high-dimensional quantum key distribution and quantum information processing protocols. In addition, our method can be readily applied for controlling other degrees of freedom of light in the SPDC process, such as the spectral and temporal properties, and may even be used in condensed-matter systems having a similar interaction Hamiltonian.
N. Talati, H. Ye, S. Vedula, K.-Y. Chen, Y. Chen, D. Liu, Y. Yuan, D. Blaauw, A. M. Bronstein, T. Mudge, R. Dreslinski, Mint: An Accelerator For Mining Temporal Motifs, Proc. MICRO, 2022
A variety of complex systems, including social and communication networks, financial markets, biology, and neuroscience, are modeled using temporal graphs that contain a set of nodes and directed timestamped edges. Temporal motifs in temporal graphs are generalized from subgraph patterns in static graphs in that they also account for edge ordering and time duration, in addition to the graph structure. Mining temporal motifs is a fundamental problem used in several application domains. However, existing software frameworks offer suboptimal performance due to high algorithmic complexity and irregular memory accesses of temporal motif mining. This paper presents Mint, a novel accelerator architecture and a programming model for mining temporal motifs efficiently. We first divide this workload into three fundamental tasks: search, book-keeping, and backtracking. Based on this, we propose a task-centric programming model that enables decoupled, asynchronous execution. This model unlocks massive opportunities for parallelism, and allows storing task context information on-chip. To best utilize the proposed programming model, we design a domain-specific hardware accelerator using its data path and memory subsystem design to cater to the unique workload characteristics of temporal motif mining. To further improve performance, we propose a novel optimization called search index memoization that significantly reduces memory traffic. We comprehensively compare the performance of Mint with state-of-the-art temporal motif mining software frameworks (both approximate and exact) running on both CPU and GPU, and show 9×–2576× benefit in performance.
E. Zheltonozhskii, C. Baskin, A. Mendelson, A. M. Bronstein, O. Litany, Contrast to divide: Self-supervised pre-training for learning with noisy labels, Proc. of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2022
The success of learning with noisy labels (LNL) methods relies heavily on the success of a warm-up stage where standard supervised training is performed using the full (noisy) training set. In this paper, we identify a “warm-up obstacle”: the inability of standard warm-up stages to train high quality feature extractors and avert memorization of noisy labels. We propose “Contrast to Divide” (C2D), a simple framework that solves this problem by pre-training the feature extractor in a self-supervised fashion. Using self-supervised pre-training boosts the performance of existing LNL approaches by drastically reducing the warm-up stage’s susceptibility to noise level, shortening its duration, and improving extracted feature quality. C2D works out of the box with existing methods and demonstrates markedly improved performance, especially in the high noise regime, where we get a boost of more than 27% for CIFAR-100 with 90% noise over the previous state of the art. In real-life noise settings, C2D trained on mini-WebVision outperforms previous works both in WebVision and ImageNet validation sets by 3% top-1 accuracy. We perform an in-depth analysis of the framework, including investigating the performance of different pre-training approaches and estimating the effective upper bound of the LNL performance with semi-supervised learning.
N. Diamant, N. Shandor, A. M. Bronstein, Delta-GAN-Encoder: Encoding semantic changes for explicit image editing, using few synthetic samples, arXiv:2111.08419, 2022
Understanding and controlling generative models’ latent space is a complex task. In this paper, we propose a novel method for learning to control any desired attribute in a pre-trained GAN’s latent space, for the purpose of editing synthesized and real-world data samples accordingly. We perform Sim2Real learning, relying on minimal samples to achieve an unlimited amount of continuous precise edits. We present an autoencoder-based model that learns to encode the semantics of changes between images as a basis for editing new samples later on, achieving precise desired results (example shown in Fig. 1). While previous editing methods rely on a known structure of latent spaces (e.g., linearity of some semantics in StyleGAN), our method inherently does not require any structural constraints. We demonstrate our method in the domain of facial imagery: editing different expressions, poses, and lighting attributes, achieving state-of-the-art results.
T. Blau, R. Ganz, B. Kawar, A. M. Bronstein, M. Elad, Threat model-agnostic adversarial defense using diffusion models, arXiv preprint arXiv:2207.08089, 2022
Deep Neural Networks (DNNs) are highly sensitive to imperceptible malicious perturbations, known as adversarial attacks. Following the discovery of this vulnerability in real-world imaging and vision applications, the associated safety concerns have attracted vast research attention, and many defense techniques have been developed. Most of these defense methods rely on adversarial training (AT), that is, training the classification network on images perturbed according to a specific threat model, which defines the magnitude of the allowed modification. Although AT leads to promising results, training on a specific threat model fails to generalize to other types of perturbations. A different approach utilizes a preprocessing step to remove the adversarial perturbation from the attacked image. In this work, we follow the latter path and aim to develop a technique that leads to robust classifiers across various realizations of threat models. To this end, we harness the recent advances in stochastic generative modeling, and means to leverage these for sampling from conditional distributions. Our defense relies on adding i.i.d. Gaussian noise to the attacked image, followed by a pretrained diffusion process, an architecture that performs a stochastic iterative process over a denoising network, yielding a high perceptual quality denoised outcome. The obtained robustness with this stochastic preprocessing step is validated through extensive experiments on the CIFAR-10 dataset, showing that our method outperforms the leading defense methods under various threat models.
D. E. Fordham, D. Rosentraub, A. L. Polsky, T. Aviram, Y. Wolf, O. Perl, A. Devir, S. Rosentraub, D. H. Silver, Y. Gold Zamir, A. M. Bronstein, M. Lara Lara, J. Ben Nagi, A. Alvarez, S. Munné, Embryologist agreement when assessing blastocyst implantation probability: is data-driven prediction the solution to embryo assessment subjectivity?, Human Reproduction, Volume 37, Issue 10, Pages 2275–2290, 2022
STUDY QUESTION
What is the accuracy and agreement of embryologists when assessing the implantation probability of blastocysts using time-lapse imaging (TLI), and can it be improved with a data-driven algorithm?
SUMMARY ANSWER
The overall interobserver agreement of a large panel of embryologists was moderate and prediction accuracy was modest, while the purpose-built artificial intelligence model generally resulted in higher performance metrics.
WHAT IS KNOWN ALREADY
Previous studies have demonstrated significant interobserver variability amongst embryologists when assessing embryo quality. However, data concerning embryologists’ ability to predict implantation probability using TLI is still lacking. Emerging technologies based on data-driven tools have shown great promise for improving embryo selection and predicting clinical outcomes.
STUDY DESIGN, SIZE, DURATION
TLI video files of 136 embryos with known implantation data were retrospectively collected from two clinical sites between 2018 and 2019 for the performance assessment of 36 embryologists and comparison with a deep neural network (DNN).
PARTICIPANTS/MATERIALS, SETTING, METHODS
We recruited 39 embryologists from 13 different countries. All participants were blinded to clinical outcomes. A total of 136 TLI videos of embryos that reached the blastocyst stage were used for this experiment. Each embryo’s likelihood of successfully implanting was assessed by 36 embryologists, providing implantation probability grades (IPGs) from 1 to 5, where 1 indicates a very low likelihood of implantation and 5 indicates a very high likelihood. Subsequently, three embryologists with over 5 years of experience provided Gardner scores. All 136 blastocysts were categorized into three quality groups based on their Gardner scores. Embryologist predictions were then converted into predictions of implantation (IPG ≥ 3) and no implantation (IPG ≤ 2). Embryologists’ performance and agreement were assessed using the Fleiss kappa coefficient. A 10-fold cross-validation DNN was developed to provide IPGs for TLI video files. The model’s performance was compared to that of the embryologists.
MAIN RESULTS AND THE ROLE OF CHANCE
Logistic regression was employed for the following confounding variables: country of residence, academic level, embryo scoring system, log years of experience and experience using TLI. None were found to have a statistically significant impact on embryologist performance at α = 0.05. The average implantation prediction accuracy for the embryologists was 51.9% for all embryos (N = 136). The average accuracy of the embryologists when assessing top quality and poor quality embryos (according to the Gardner score categorizations) was 57.5% and 57.4%, respectively, and 44.6% for fair quality embryos. Overall interobserver agreement was moderate (κ = 0.56, N = 136). The best agreement was achieved in the poor + top quality group (κ = 0.65, N = 77), while the agreement in the fair quality group was lower (κ = 0.25, N = 59). The DNN showed an overall accuracy rate of 62.5%, with accuracies of 62.2%, 61% and 65.6% for the poor, fair and top quality groups, respectively. The AUC for the DNN was higher than that of the embryologists overall (0.70 DNN vs 0.61 embryologists) as well as in all of the Gardner groups (DNN vs embryologists: Poor, 0.69 vs 0.62; Fair, 0.67 vs 0.53; Top, 0.77 vs 0.54).
LIMITATIONS, REASONS FOR CAUTION
Blastocyst assessment was performed using video files acquired from time-lapse incubators, where each video contained data from a single focal plane. Clinical data regarding the underlying cause of infertility and endometrial thickness before the transfer was not available, yet may explain implantation failure and lower accuracy of IPGs. Implantation was defined as the presence of a gestational sac, whereas the detection of fetal heartbeat is a more robust marker of embryo viability. The raw data were anonymized to the extent that it was not possible to quantify the number of unique patients and cycles included in the study, potentially masking the effect of bias from a limited patient pool. Furthermore, the lack of demographic data makes it difficult to draw conclusions on how representative the dataset was of the wider population. Finally, embryologists were required to assess the implantation potential, not embryo quality. Although this is not the traditional approach to embryo evaluation, morphology/morphokinetics as a means of assessing embryo quality is believed to be strongly correlated with viability and, for some methods, implantation potential.
WIDER IMPLICATIONS OF THE FINDINGS
Embryo selection is a key element in IVF success and continues to be a challenge. Improving the predictive ability could assist in optimizing implantation success rates and other clinical outcomes and could minimize the financial and emotional burden on the patient. This study demonstrates moderate agreement rates between embryologists, likely due to the subjective nature of embryo assessment. In particular, we found that average embryologist accuracy and agreement were significantly lower for fair quality embryos when compared with that for top and poor quality embryos. Using data-driven algorithms as an assistive tool may help IVF professionals increase success rates and promote much needed standardization in the IVF clinic. Our results indicate a need for further research regarding technological advancement in this field.
E. Amrani, A. M. Bronstein, Self-supervised classification network, Proc. ECCV, 2022
We present Self-Classifier, a novel self-supervised end-to-end classification neural network. Self-Classifier learns labels and representations simultaneously in a single-stage end-to-end manner by optimizing for same-class prediction of two augmented views of the same sample. To guarantee non-degenerate solutions (i.e., solutions where all labels are assigned to the same class), a uniform prior is asserted on the labels. We show mathematically that unlike the regular cross-entropy loss, our approach avoids such solutions. Self-Classifier is simple to implement and is scalable to practically unlimited amounts of data. Unlike other unsupervised classification approaches, it does not require any form of pre-training or the use of expectation maximization algorithms, pseudo-labelling or external clustering. Unlike other contrastive learning representation learning approaches, it does not require a memory bank or a second network. Despite its relative simplicity, our approach achieves comparable results to state-of-the-art performance with ImageNet, CIFAR10 and CIFAR100 for its two objectives: unsupervised classification and unsupervised representation learning. Furthermore, it is the first unsupervised end-to-end classification network to perform well on the large-scale ImageNet dataset. Code will be made available.
- A. A. Rosenberg, N. Yehishalom, A. Marx, A. M. Bronstein, An amino-domino model described by a cross-peptide-bond Ramachandran plot defines amino acid pairs as local structural units, Proc. US National Academy of Sciences (PNAS), 2023
Protein structure, both at the global and local level, dictates function. Proteins fold from chains of amino acids, forming secondary structures, α-helices and β-strands, that, at least for globular proteins, subsequently fold into a three-dimensional structure. Here, we show that a Ramachandran-type plot focusing on the two dihedral angles separated by the peptide bond, and entirely contained within an amino acid pair, defines a local structural unit. We further demonstrate the usefulness of this cross-peptide-bond Ramachandran plot by showing that it captures β-turn conformations in coil regions, that traditional Ramachandran plot outliers fall into occupied regions of our plot, and that thermophilic proteins prefer specific amino acid pair conformations. Further, we demonstrate experimentally that the effect of a point mutation on backbone conformation and protein stability depends on the amino acid pair context, i.e., the identity of the adjacent amino acid, in a manner predictable by our method.
T. Weiss, L. Cosmo, E. Mayo Yanes, S. Chakraborty, A. M. Bronstein, R. Gershoni-Poranne, Guided diffusion for inverse molecular design, Nature Computational Science 3(10), 873–882, 2023
The holy grail of materials science is de novo molecular design, i.e., the ability to engineer molecules with desired characteristics. Recently, this goal has become increasingly achievable thanks to developments such as equivariant graph neural networks that can better predict molecular properties, and to the improved performance of generation tasks, in particular of conditional generation, in text-to-image generators and large language models. Herein, we introduce GaUDI, a guided diffusion model for inverse molecular design, which combines these advances and can generate novel molecules with desired properties. GaUDI decouples the generator and the property-predicting models and can be guided using both point-wise targets and open-ended targets (e.g., minimum/maximum). We demonstrate GaUDI’s effectiveness using single- and multiple-objective tasks applied to newly-generated data sets of polycyclic aromatic systems, achieving nearly 100% validity of generated molecules. Further, for some tasks, GaUDI discovers better molecules than those present in our data set of 475k molecules.
E. Schwartz, A. M. Bronstein, R. Giryes, ISP distillation, IEEE Open Journal of Signal Processing 4, 12-20, 2023
Nowadays, many of the images captured are ‘observed’ by machines only and not by humans, e.g., in autonomous systems. High-level machine vision models, such as object recognition or semantic segmentation, assume images are transformed into some canonical image space by the camera Image Signal Processor (ISP). However, the camera ISP is optimized for producing visually pleasing images for human observers and not for machines. Therefore, one may spare the ISP compute time and apply vision models directly to RAW images. Yet, it has been shown that training such models directly on RAW images results in a performance drop. To mitigate this drop, we use a dataset of paired RAW and RGB images, which can be easily acquired with no human labeling. We then train a model that is applied directly to the RAW data by using knowledge distillation such that the model predictions for RAW images will be aligned with the predictions of an off-the-shelf pre-trained model for processed RGB images. Our experiments show that our performance on RAW images for object classification and semantic segmentation is significantly better than that of models trained on labeled RAW images. It also reasonably matches the predictions of a pre-trained model on processed RGB images, while saving the ISP compute overhead.
T. Blau, R. Ganz, C. Baskin, M. Elad, A. M. Bronstein, Classifier robustness enhancement via test-time transformation, arXiv preprint arXiv:2303.15409, 2023
It has been recently discovered that adversarially trained classifiers exhibit an intriguing property, referred to as perceptually aligned gradients (PAG). PAG implies that the gradients of such classifiers possess a meaningful structure, aligned with human perception. Adversarial training is currently the best-known way to achieve classification robustness under adversarial attacks. The PAG property, however, has yet to be leveraged for further improving classifier robustness. In this work, we introduce Classifier Robustness Enhancement Via Test-Time Transformation (TETRA), a novel defense method that utilizes PAG, enhancing the performance of trained robust classifiers. Our method operates in two phases. First, it modifies the input image via a designated targeted adversarial attack into each of the dataset’s classes. Then, it classifies the input image based on the distance to each of the modified instances, with the assumption that the shortest distance relates to the true class. We show that the proposed method achieves state-of-the-art results and validate our claim through extensive experiments on a variety of defense methods, classifier architectures, and datasets. We also empirically demonstrate that TETRA can boost the accuracy of any differentiable adversarial training classifier across a variety of attacks, including ones unseen at training. Specifically, applying TETRA leads to substantial improvement of up to +23%, +20%, and +26% on CIFAR10, CIFAR100, and ImageNet, respectively.
E. Rozenberg, A. Karnieli, O. Yesharim, J. Foley-Comer, S. Trajtenberg-Mills, S. Mishra, S. Prabhakar, R. P. Singh, D. Freedman, A. M. Bronstein, A. Arie, Designing nonlinear photonic crystals for high-dimensional quantum state engineering, ICLR Workshop on Machine Learning for Materials, 2023
We propose a novel, physically-constrained and differentiable approach for the generation of D-dimensional qudit states via spontaneous parametric downconversion (SPDC) in quantum optics. We circumvent any limitations imposed by the inherently stochastic nature of the physical process and incorporate a set of stochastic dynamical equations governing its evolution under the SPDC Hamiltonian. We demonstrate the effectiveness of our model through the design of structured nonlinear photonic crystals (NLPCs) and shaped pump beams, and show, theoretically and experimentally, how to generate maximally entangled states in the spatial degree of freedom. The learning of NLPC structures offers a promising new avenue for shaping and controlling arbitrary quantum states and enables all-optical coherent control of the generated states. We believe that this approach can readily be extended from bulky crystals to thin metasurfaces and potentially applied to other quantum systems sharing similar Hamiltonian structures, such as superfluids and superconductors.
E. Rozenberg, A. Karnieli, O. Yesharim, J. Foley-Comer, S. Trajtenberg-Mills, S. Mishra, S. Prabhakar, R. P. Singh, D. Freedman, A. M. Bronstein, A. Arie, A machine learning approach to generate quantum light, ICLR Workshop on Physics for Machine Learning, 2023
Spontaneous parametric down-conversion (SPDC) is a key technique in quantum optics used to generate entangled photon pairs. However, generating a desirable D-dimensional qudit state in the SPDC process remains a challenge. In this paper, we introduce a physically-constrained and differentiable model to overcome this challenge, and demonstrate its effectiveness through the design of shaped pump beams and structured nonlinear photonic crystals. We avoid any restrictions induced by the stochastic nature of our physical process and integrate a set of stochastic dynamical equations governing its evolution under the SPDC Hamiltonian. Our model is capable of learning the relevant interaction parameters and designing nonlinear quantum optical systems that achieve desired quantum states. We show, theoretically and experimentally, how to generate maximally entangled states in the spatial degree of freedom. Additionally, we demonstrate all-optical coherent control of the generated state by reshaping the pump beam. Our work has potential applications in high-dimensional quantum key distribution and quantum information processing.
H. Ye, S. Vedula, Y. Chen, Y. Yang, A. M. Bronstein, R. Dreslinski, T. Mudge, N. Talati, GRACE: A Scalable Graph-Based Approach to Accelerating Recommendation Model Inference, Proc. ACM Int'l Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2023
The high memory bandwidth demand of sparse embedding layers continues to be a critical challenge in scaling the performance of recommendation models. While prior works have exploited heterogeneous memory system designs and partial embedding sum memoization techniques, they offer limited benefits. This is because prior designs either target a very small subset of embeddings to simplify their analysis or incur a high processing cost to account for all embeddings, which does not scale with the large sizes of modern embedding tables. This paper proposes GRACE, a lightweight and scalable graph-based algorithm-system co-design framework to significantly improve the embedding layer performance of recommendation models. GRACE proposes a novel Item Co-occurrence Graph (ICG) that scalably records item co-occurrences. GRACE then presents a new system-aware ICG clustering algorithm to find frequently accessed item combinations of arbitrary lengths to compute and memoize their partial sums. High-frequency partial sums are stored in a software-managed cache space to reduce memory traffic and improve the throughput of computing sparse features. We further present a cache data layout and low-cost address computation logic to efficiently look up item embeddings and their partial sums. Our evaluation shows that GRACE significantly outperforms the state-of-the-art techniques SPACE and MERCI by 1.5× and 1.4×, respectively.
S. Vedula, I. Tallini, A. A. Rosenberg, M. Pegoraro, E. Rodolà, Y. Romano, A. M. Bronstein, Continuous vector quantile regression, Proc. ICML Workshop Frontiers4LCD, 2023
Vector quantile regression (VQR) estimates the conditional vector quantile function (CVQF), a fundamental quantity which fully represents the conditional distribution of Y|X. VQR is formulated as an optimal transport (OT) problem between a uniform U~μ and the target (X,Y)~ν, the solution of which is a unique transport map, co-monotonic with U. Recently NL-VQR has been proposed to estimate non-linear CVQFs, together with fast solvers which enabled the use of this tool in practical applications. Despite its utility, the scalability and estimation quality of NL-VQR are limited due to a discretization of the OT problem onto a grid of quantile levels. We propose a novel continuous formulation and parametrization of VQR using partial input-convex neural networks (PICNNs). Our approach allows for accurate, scalable, differentiable and invertible estimation of non-linear CVQFs. We further demonstrate, theoretically and experimentally, how continuous CVQFs can be used for general statistical inference tasks: estimation of likelihoods, CDFs, confidence sets, coverage, sampling, and more. This work is an important step towards unlocking the full potential of VQR.
M. Pegoraro, S. Vedula, A. A. Rosenberg, I. Tallini, E. Rodolà, A. M. Bronstein, Vector quantile regression on manifolds, ICML Workshop on New Frontiers in Learning, Control, and Dynamical Systems, 2023
Quantile regression (QR) is a statistical tool for distribution-free estimation of conditional quantiles of a target variable given explanatory features. QR is limited by the assumption that the target distribution is univariate and defined on a Euclidean domain. Although the notion of quantiles was recently extended to multi-variate distributions, QR for multi-variate distributions on manifolds remains underexplored, even though many important applications inherently involve data distributed on, e.g., spheres (climate measurements), tori (dihedral angles in proteins), or Lie groups (attitude in navigation). By leveraging optimal transport theory and the notion of c-concave functions, we meaningfully define conditional vector quantile functions of high-dimensional variables on manifolds (M-CVQFs). Our approach allows for quantile estimation, regression, and computation of conditional confidence sets. We demonstrate the approach’s efficacy and provide insights regarding the meaning of non-Euclidean quantiles through preliminary synthetic data experiments.
T. Weiss, A. Wahab, A. M. Bronstein, R. Gershoni-Poranne, Interpretable deep learning unveils structure-property relationships in polybenzenoid hydrocarbons, Journal of Organic Chemistry, 2023
In this work, interpretable deep learning was used to identify structure-property relationships governing the HOMO-LUMO gap and relative stability of polybenzenoid hydrocarbons (PBHs). To this end, a ring-based graph representation was used. In addition to affording reduced training times and excellent predictive ability, this representation could be combined with a subunit-based perception of PBHs, allowing chemical insights to be presented in terms of intuitive and simple structural motifs. The resulting insights agree with conventional organic chemistry knowledge and electronic structure-based analyses, and also reveal new behaviors and identify influential structural motifs. In particular, we evaluated and compared the effects of linear, angular, and branching motifs on these two molecular properties, as well as explored the role of dispersion in mitigating torsional strain inherent in non-planar PBHs. Hence, the observed regularities and the proposed analysis contribute to a deeper understanding of the behavior of PBHs and form the foundation for design strategies for new functional PBHs.
A. A. Rosenberg, S. Vedula, Y. Romano, A. M. Bronstein, Fast nonlinear vector quantile regression, Proc. ICML, 2023
Quantile regression (QR) is a powerful tool for estimating one or more conditional quantiles of a target variable Y given explanatory features X. A limitation of QR is that it is only defined for scalar target variables, due to the formulation of its objective function, and since the notion of quantiles has no standard definition for multivariate distributions. Recently, vector quantile regression (VQR) was proposed as an extension of QR for high-dimensional target variables, thanks to a meaningful generalization of the notion of quantiles to multivariate distributions. Despite its elegance, VQR is arguably not applicable in practice due to several limitations: (i) it assumes a linear model for the quantiles of the target Y given the features X; (ii) its exact formulation is intractable even for modestly-sized problems in terms of target dimensions, the number of regressed quantile levels, or the number of features, and its relaxed dual formulation may violate the monotonicity of the estimated quantiles; (iii) no fast or scalable solvers for VQR currently exist. In this work we fully address these limitations, namely: (i) We extend VQR to the non-linear case, showing substantial improvement over linear VQR; (ii) We propose vector monotone rearrangement, a method which ensures the estimates obtained by VQR relaxations are monotone functions; (iii) We provide fast, GPU-accelerated solvers for linear and nonlinear VQR which maintain a fixed memory footprint with the number of samples and quantile levels, and demonstrate that they scale to millions of samples and thousands of quantile levels; (iv) We release an optimized Python package of our solvers to encourage widespread use of VQR in real-world applications.
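For readers unfamiliar with the scalar objective that the abstract above refers to, the following is a minimal sketch of classical scalar quantile estimation via the pinball (check) loss. It is not the paper's VQR solver and does not use the released package; the function names `pinball_loss` and `scalar_quantile` are hypothetical, and the data are synthetic.

```python
import numpy as np

def pinball_loss(y, q, tau):
    """Average pinball (check) loss of a constant prediction q at level tau."""
    r = y - q
    # tau * r when y >= q, (1 - tau) * |r| when y < q
    return float(np.mean(np.maximum(tau * r, (tau - 1.0) * r)))

def scalar_quantile(y, tau):
    """Brute-force minimizer of the pinball loss over the observed samples.

    Illustrative only: a minimizer always exists among the samples, so an
    exhaustive search suffices for small data sets.
    """
    candidates = np.sort(y)
    losses = [pinball_loss(y, c, tau) for c in candidates]
    return float(candidates[int(np.argmin(losses))])

# Synthetic scalar targets (an assumption made for this example).
rng = np.random.default_rng(0)
y = rng.normal(size=501)

levels = [0.1, 0.5, 0.9]
estimates = [scalar_quantile(y, t) for t in levels]
```

In this scalar setting the estimates are non-decreasing in the quantile level, and the level-0.5 estimate coincides with the sample median. The abstract's point is that this ordering has no direct analogue for vector-valued targets, which is why relaxed VQR solutions may need a monotone rearrangement step.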
D. Zadok, O. Salzman, A. Wolf, A. M. Bronstein, Towards predicting fine finger motions from ultrasound images via kinematic representation, Proc. ICRA, 2023 details
Towards predicting fine finger motions from ultrasound images via kinematic representation
D. Zadok, O. Salzman, A. Wolf, A. M. Bronstein
Proc. ICRA, 2023
A central challenge in building robotic prostheses is the creation of a sensor-based system able to read physiological signals from the lower limb and instruct a robotic hand to perform various tasks. Existing systems typically perform discrete gestures such as pointing or grasping, by employing electromyography (EMG) or ultrasound (US) technologies to analyze the state of the muscles. In this work, we study the inference problem of identifying the activation of specific fingers from a sequence of US images when performing dexterous tasks such as keyboard typing or playing the piano. While estimating finger gestures has been done in the past by detecting prominent gestures, we are interested in classification done in the context of fine motions that evolve over time. We consider this task as an important step towards higher adoption rates of robotic prostheses among arm amputees, as it has the potential to dramatically increase functionality in performing daily tasks. Our key observation, motivating this work, is that modeling the hand as a robotic manipulator allows us to encode an intermediate representation wherein US images are mapped to said configurations. Given a sequence of such learned configurations, coupled with a neural-network architecture that exploits temporal coherence, we are able to infer fine finger motions. We evaluated our method by collecting data from a group of subjects and demonstrating how our framework can be used to replay music played or text typed. To the best of our knowledge, this is the first study demonstrating these downstream tasks within an end-to-end system.
A. M. Bronstein, A. Marx, Water stabilizes an alternate turn conformation in horse heart myoglobin, Nature Scientific Reports, 2023 details
Water stabilizes an alternate turn conformation in horse heart myoglobin
A. M. Bronstein, A. MarxNature Scientific Reports, 2023Comparison of myoglobin structures reveals that protein isolated from horse heart consistently adopts an alternate turn conformation in comparison to its homologues. Analysis of hundreds of high-resolution structures discounts crystallization conditions or the surrounding amino acid protein environment as explaining this difference, that is also not captured by the AlphaFold prediction. Rather, a water molecule is identified as stabilizing the conformation in the horse heart structure, which immediately reverts to the whale conformation in molecular dynamics simulations excluding that structural water.
B. Gahtan, R. Cohen, A. M. Bronstein, G. Kedar, Using deep reinforcement learning for mmWave real-time scheduling, Proc. Int'l Conf. Network of the Future (NoF), 2023 details
Using deep reinforcement learning for mmWave real-time scheduling
B. Gahtan, R. Cohen, A. M. Bronstein, G. Kedar
Proc. Int'l Conf. Network of the Future (NoF), 2023
We study the problem of real-time scheduling in a multi-hop millimeter-wave (mmWave) mesh. We develop a model-free deep reinforcement learning algorithm called Adaptive Activator RL (AARL), which determines the subset of mmWave links that should be activated during each time slot and the power level for each link. The most important property of AARL is its ability to make scheduling decisions within the strict time slot constraints of typical 5G mmWave networks. AARL can handle a variety of network topologies, network loads, and interference models, and it can also adapt to different workloads. We demonstrate the operation of AARL on several topologies: a small topology with 10 links, a moderately-sized mesh with 48 links, and a large topology with 96 links. For each topology, we compare the throughput obtained by AARL to that of a benchmark algorithm called RPMA (Residual Profit Maximizer Algorithm). The most important advantage of AARL compared to RPMA is that it is much faster and can make the necessary scheduling decisions very rapidly during every time slot, while RPMA cannot. In addition, the scheduling decisions made by AARL are of higher quality than those made by RPMA.
T. Shor, T. Weiss, D. Noti, A. M. Bronstein, Multi PILOT: Feasible learned multiple acquisition trajectories for dynamic MRI, Proc. Medical Imaging with Deep Learning (MIDL), 2023 details
Multi PILOT: Feasible learned multiple acquisition trajectories for dynamic MRI
T. Shor, T. Weiss, D. Noti, A. M. Bronstein
Proc. Medical Imaging with Deep Learning (MIDL), 2023
Dynamic Magnetic Resonance Imaging (MRI) is known to be a powerful and reliable technique for the dynamic imaging of internal organs and tissues, making it a leading diagnostic tool. A major difficulty in using MRI in this setting is the relatively long acquisition time (and, hence, increased cost) required for imaging in high spatio-temporal resolution, leading to the appearance of related motion artifacts and decrease in resolution. Compressed Sensing (CS) techniques have become a common tool to reduce MRI acquisition time by subsampling images in the k-space according to some acquisition trajectory. Several studies have particularly focused on applying deep learning techniques to learn these acquisition trajectories in order to attain better image reconstruction, rather than using some predefined set of trajectories. To the best of our knowledge, learning acquisition trajectories has only been explored in the context of static MRI. In this study, we consider acquisition trajectory learning in the dynamic imaging setting. We design an end-to-end pipeline for the joint optimization of multiple per-frame acquisition trajectories along with a reconstruction neural network, and demonstrate improved image reconstruction quality in shorter acquisition times.
A. B. Bainson, J. Hermanns, P. Petsinis, N. Aavad, C. Dam Larsen, T. Swayne, A. Boyarski, D. Mottin, A. M. Bronstein, P. Karras, Spectral subgraph localization, Proc. Learning on Graphs Conference, 2023 details
Spectral subgraph localization
A. B. Bainson, J. Hermanns, P. Petsinis, N. Aavad, C. Dam Larsen, T. Swayne, A. Boyarski, D. Mottin, A. M. Bronstein, P. KarrasProc. Learning on Graphs Conference, 2023Several graph analysis problems are based on some variant of subgraph isomorphism: Given two graphs, G and Q, does G contain a subgraph isomorphic to Q? As this problem is NP-complete, past work usually avoids addressing it explicitly. In this paper, we propose a method that localizes, i.e., finds the best-match position of, Q in G, by aligning their Laplacian spectra and enhance its stability via bagging strategies; we relegate the finding of an exact node correspondence from Q to G to a subsequent and separate graph alignment task. We demonstrate that our localization strategy outperforms a baseline based on the state-of-the-art method for graph alignment in terms of accuracy on real graphs and scales to hundreds of nodes as no other method does.
- B. Gahtan, R. J. Shahla, A. M. Bronstein, R. Cohen, Exploring QUIC dynamics: a large-scale dataset for encrypted traffic analysis, arXiv:2410.03728, 2024 details
Exploring QUIC dynamics: a large-scale dataset for encrypted traffic analysis
B. Gahtan, R. J. Shahla, A. M. Bronstein, R. Cohen
arXiv:2410.03728, 2024
QUIC, a new and increasingly used transport protocol, addresses and resolves the limitations of TCP by offering improved security, performance, and features such as stream multiplexing and connection migration. These features, however, also present challenges for network operators who need to monitor and analyze web traffic. In this paper, we introduce VisQUIC, a labeled dataset comprising over 100,000 QUIC traces from more than 44,000 websites (URLs), collected over a four-month period. These traces provide the foundation for generating more than seven million images, with configurable parameters of window length, pixel resolution, normalization, and labels. These images enable an observer looking at the interactions between a client and a server to analyze and gain insights about QUIC encrypted connections. To illustrate the dataset’s potential, we offer a use-case example of an observer estimating the number of HTTP/3 request/response pairs in a given QUIC connection, which can reveal server behavior, client–server interactions, and the load imposed by an observed connection. We formulate the problem as a discrete regression problem, train a machine learning (ML) model for it, and then evaluate it using the proposed dataset on an example use case.
B. Gahtan, S. Funk, E. Kodesh, I. Ketko, T. Kuflik, A. M. Bronstein, Automatic identification and visualization of group training activities using wearable data, arXiv:2410.05452, 2024 details
Automatic identification and visualization of group training activities using wearable data
B. Gahtan, S. Funk, E. Kodesh, I. Ketko, T. Kuflik, A. M. BronsteinarXiv:2410.05452, 2024Human Activity Recognition (HAR) identifies daily activities from time-series data collected by wearable devices like smartwatches. Recent advancements in Internet of Things (IoT), cloud computing, and low-cost sensors have broadened HAR applications across fields like healthcare, biometrics, sports, and personal fitness. However, challenges remain in efficiently processing the vast amounts of data generated by these devices and developing models that can accurately recognize a wide range of activities from continuous recordings, without relying on predefined activity training sessions. This paper presents a comprehensive framework for imputing, analyzing, and identifying activities from wearable data, specifically targeting group training scenarios without explicit activity sessions. Our approach is based on data collected from 135 soldiers wearing Garmin 55 smartwatches over six months. The framework integrates multiple data streams, handles missing data through cross-domain statistical methods, and identifies activities with high accuracy using machine learning (ML). Additionally, we utilized statistical analysis techniques to evaluate the performance of each individual within the group, providing valuable insights into their respective positions in the group in an easy-to-understand visualization. These visualizations facilitate easy understanding of performance metrics, enhancing group interactions and informing individualized training programs. We evaluate our framework through traditional train-test splits and out-of-sample scenarios, focusing on the model’s generalization capabilities. Additionally, we address sleep data imputation without relying on ML, improving recovery analysis. Our findings demonstrate the potential of wearable data for accurately identifying group activities, paving the way for intelligent, data-driven training solutions.
B. Gahtan, R. J. Shahla, R. Cohen, A. M. Bronstein, Estimating the number of HTTP/3 responses in QUIC using deep learning, arXiv:2410.06140, 2024 details
Estimating the number of HTTP/3 responses in QUIC using deep learning
B. Gahtan, R. J. Shahla, R. Cohen, A. M. BronsteinarXiv:2410.06140, 2024QUIC, a new and increasingly used transport protocol, enhances TCP by providing better security, performance, and features like stream multiplexing. These features, however, also impose challenges for network middle-boxes that need to monitor and analyze web traffic. This paper proposes a novel solution for estimating the number of HTTP/3 responses in a given QUIC connection by an observer. This estimation reveals server behavior, client-server interactions, and data transmission efficiency, which is crucial for various applications such as designing a load balancing solution and detecting HTTP/3 flood attacks. The proposed scheme transforms QUIC connection traces into a sequence of images and trains machine learning (ML) models to predict the number of responses. Then, by aggregating images of a QUIC connection, an observer can estimate the total number of responses. As the problem is formulated as a discrete regression problem, we introduce a dedicated loss function. The proposed scheme is evaluated on a dataset of over seven million images, generated from 100,000 traces collected from over 44,000 websites over a four-month period, from various vantage points. The scheme achieves up to 97% cumulative accuracy in both known and unknown web server settings and 92% accuracy in estimating the total number of responses in unseen QUIC traces.
H. Abraham, B. Gahtan, A. Kobovich, O. Leitersdorf, A. M. Bronstein, E. Yaakobi, Beyond the alphabet: deep signal embedding for enhanced DNA clustering, arXiv:2410.06188, 2024 details
Beyond the alphabet: deep signal embedding for enhanced DNA clustering
H. Abraham, B. Gahtan, A. Kobovich, O. Leitersdorf, A. M. Bronstein, E. YaakobiarXiv:2410.06188, 2024The emerging field of DNA storage employs strands of DNA bases (A/T/C/G) as a storage medium for digital information to enable massive density and durability. The DNA storage pipeline includes: (1) encoding the raw data into sequences of DNA bases; (2) synthesizing the sequences as DNA strands that are stored over time as an unordered set; (3) sequencing the DNA strands to generate DNA reads; and (4) deducing the original data. The DNA synthesis and sequencing stages each generate several independent error-prone duplicates of each strand which are then utilized in the final stage to reconstruct the best estimate for the original strand. Specifically, the reads are first clustered into groups likely originating from the same strand (based on their similarity to each other), and then each group approximates the strand that led to the reads of that group. This work improves the DNA clustering stage by embedding it as part of the DNA sequencing. Traditional DNA storage solutions begin after the DNA sequencing process generates discrete DNA reads (A/T/C/G), yet we identify that there is untapped potential in using the raw signals generated by the Nanopore DNA sequencing machine before they are discretized into bases, a process known as basecalling, which is done using a deep neural network. We propose a deep neural network that clusters these signals directly, demonstrating superior accuracy, and reduced computation times compared to current approaches that cluster after basecalling.
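Stage (1) of the pipeline described above, encoding raw data into sequences of DNA bases, can be sketched with a naive two-bits-per-base mapping. This is only an assumption made for illustration: the mapping, the base ordering, and the function names are hypothetical, practical systems typically add constraints and error-correcting codes, and the paper itself concerns the clustering stage rather than encoding.

```python
BASES = "ACGT"  # hypothetical mapping: 00 -> A, 01 -> C, 10 -> G, 11 -> T


def encode_bytes_to_bases(data: bytes) -> str:
    """Encode each byte as four bases, most significant bit pair first."""
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)


def decode_bases_to_bytes(strand: str) -> bytes:
    """Invert encode_bytes_to_bases for an error-free strand."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


strand = encode_bytes_to_bases(b"Hi")
print(strand, decode_bases_to_bytes(strand))
```

The decoder here assumes a noiseless read; the error-prone duplicates mentioned in the abstract are exactly what the clustering and reconstruction stages have to cope with.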
B. Gahtan, R. Cohen, A. M. Bronstein, E. Shapira, Data-driven cellular network selector for vehicle teleoperations, arXiv:2410.19791, 2024 details
Data-driven cellular network selector for vehicle teleoperations
B. Gahtan, R. Cohen, A. M. Bronstein, E. ShapiraarXiv:2410.19791, 2024Remote control of robotic systems, also known as teleoperation, is crucial for the development of autonomous vehicle (AV) technology. It allows a remote operator to view live video from AVs and, in some cases, to make real-time decisions. The effectiveness of video-based teleoperation systems is heavily influenced by the quality of the cellular network and, in particular, its packet loss rate and latency. To optimize these parameters, an AV can be connected to multiple cellular networks and determine in real time over which cellular network each video packet will be transmitted. We present an algorithm, called Active Network Selector (ANS), which uses a time series machine learning approach for solving this problem. We compare ANS to a baseline non-learning algorithm, which is used today in commercial systems, and show that ANS performs much better, with respect to both packet loss and packet latency.
T. Shor, C. Baskin, A. M. Bronstein, Leveraging latents for efficient thermography classification and segmentation, Proc. Medical Imaging with Deep Learning (MIDL), 2024 details
Leveraging latents for efficient thermography classification and segmentation
T. Shor, C. Baskin, A. M. Bronstein
Proc. Medical Imaging with Deep Learning (MIDL), 2024
Breast cancer is a prominent health concern worldwide, currently being the second most common and second-deadliest type of cancer in women. While current breast cancer diagnosis mainly relies on mammography imaging, in recent years the use of thermography for breast cancer imaging has been garnering growing popularity. Thermographic imaging relies on infrared cameras to capture body-emitted heat distributions. While these heat signatures have proven useful for computer-vision systems for accurate breast cancer segmentation and classification, prior work often relies on handcrafted feature engineering or complex architectures, potentially limiting the comparability and applicability of these methods. In this work, we present a novel algorithm for both breast cancer classification and segmentation. Rather than focusing efforts on manual feature and architecture engineering, our algorithm focuses on leveraging an informative, learned feature space, thus making our solution simpler to use and extend to other frameworks and downstream tasks, as well as more applicable to data-scarce settings. Our classification produces SOTA results, while we are the first work to produce segmentation regions studied in this paper.
A. A. Rosenberg, S. Vedula, A. M. Bronstein, A. Marx, Seeing Double: Molecular dynamics simulations reveal the stability of certain alternate protein conformations in crystal structures, bioRxiv 2024.08.31.610605, 2024 details
Seeing Double: Molecular dynamics simulations reveal the stability of certain alternate protein conformations in crystal structures
A. A. Rosenberg, S. Vedula, A. M. Bronstein, A. Marx
bioRxiv 2024.08.31.610605, 2024
Proteins jiggle around, adopting ensembles of interchanging conformations. Here we show, through a large-scale analysis of the Protein Data Bank and using molecular dynamics simulations, that segments of protein chains can also commonly adopt dual, transiently stable conformations that are not explained by direct interactions. Our analysis highlights how alternate conformations can be maintained as non-interchanging, separated states intrinsic to the protein chain, namely through steric barriers or the adoption of transient secondary structure elements. We further demonstrate that despite the commonality of the phenomenon, current structural ensemble prediction methods fail to capture these bimodal distributions of conformations.
S. Vedula, V. Maiorca, L. Basile, F. Locatello, A. M. Bronstein, Scalable unsupervised alignment of general metric and non-metric structures, arXiv preprint arXiv:2406.13507, 2024 (also in Proc. ICML Workshop on AI4Science) details
Scalable unsupervised alignment of general metric and non-metric structures
S. Vedula, V. Maiorca, L. Basile, F. Locatello, A. M. BronsteinarXiv preprint arXiv:2406.13507, 2024 (also in Proc. ICML Workshop on AI4Science)Aligning data from different domains is a fundamental problem in machine learning with broad applications across very different areas, most notably aligning experimental readouts in single-cell multiomics. Mathematically, this problem can be formulated as the minimization of disagreement of pair-wise quantities such as distances and is related to the Gromov-Hausdorff and Gromov-Wasserstein distances. Computationally, it is a quadratic assignment problem (QAP) that is known to be NP-hard. Prior works attempted to solve the QAP directly with entropic or low-rank regularization on the permutation, which is computationally tractable only for modestly-sized inputs, and encode only limited inductive bias related to the domains being aligned. We consider the alignment of metric structures formulated as a discrete Gromov-Wasserstein problem and instead of solving the QAP directly, we propose to learn a related well-scalable linear assignment problem (LAP) whose solution is also a minimizer of the QAP. We also show a flexible extension of the proposed framework to general non-metric dissimilarities through differentiable ranks. We extensively evaluate our approach on synthetic and real datasets from single-cell multiomics and neural latent spaces, achieving state-of-the-art performance while being conceptually and computationally simple.
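To make the "minimization of disagreement of pair-wise quantities" concrete, the sketch below evaluates a squared-difference alignment cost between two distance matrices and minimizes it by brute force over all permutations. This is a toy illustration under stated assumptions, not the learned LAP approach of the paper: the exhaustive search is feasible only for a handful of points, which is precisely why the QAP is hard at scale, and all names and data are hypothetical.

```python
from itertools import permutations


def pairwise_distances(points):
    """Distance matrix of 1D points under the absolute-difference metric."""
    return [[abs(a - b) for b in points] for a in points]


def alignment_cost(d_x, d_y, perm):
    """Sum of squared disagreements between d_x[i][j] and d_y[perm[i]][perm[j]]."""
    n = len(perm)
    return sum(
        (d_x[i][j] - d_y[perm[i]][perm[j]]) ** 2
        for i in range(n)
        for j in range(n)
    )


def brute_force_alignment(d_x, d_y):
    """Exhaustively search all n! permutations (tiny n only)."""
    n = len(d_x)
    return min(permutations(range(n)), key=lambda p: alignment_cost(d_x, d_y, p))


# Toy data (hypothetical): y is a reordering of x, so a zero-cost alignment exists.
x = [0.0, 1.0, 3.0]
y = [3.0, 0.0, 1.0]
d_x, d_y = pairwise_distances(x), pairwise_distances(y)
best = brute_force_alignment(d_x, d_y)
print(best, alignment_cost(d_x, d_y, best))
```

Only the distance matrices enter the cost, never the coordinates themselves, which is what allows structures from different domains to be compared.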
A. A. Rosenberg, A. Marx, A. M. Bronstein, A dataset of alternately located segments in protein crystal structures, Scientific Data, 11 (783), 2024 details
A dataset of alternately located segments in protein crystal structures
A. A. Rosenberg, A. Marx, A. M. Bronstein
Scientific Data, 11 (783), 2024
Protein Data Bank (PDB) files list the relative spatial location of atoms in a protein structure as the final output of the process of fitting and refining to experimentally determined electron density measurements. Where experimental evidence exists for multiple conformations, atoms are modelled in alternate locations. Programs reading PDB files commonly ignore these alternate conformations by default, leaving users oblivious to the presence of alternate conformations in the structures they analyze. This has led to underappreciation of their prevalence, under-characterisation of their features, and limited accessibility to this high-resolution data representing structural ensembles. We have trawled PDB files to extract structural features of residues with alternately located atoms. The output includes the distance between alternate conformations and identifies the location of these segments within the protein chain and in proximity of all other atoms within a defined radius. This dataset should be of use in efforts to predict multiple structures from a single sequence and support studies investigating protein flexibility and the association with protein function.
D. Freedman, E. Rozenberg, A. M. Bronstein, A theoretical framework for an efficient normalizing flow-based solution to the Schrödinger equation, arXiv preprint arXiv:2406.00047, 2024 details
A theoretical framework for an efficient normalizing flow-based solution to the Schrödinger equation
D. Freedman, E. Rozenberg, A. M. BronsteinarXiv preprint arXiv:2406.00047, 2024A central problem in quantum mechanics involves solving the Electronic Schrödinger Equation for a molecule or material. The Variational Monte Carlo approach to this problem approximates a particular variational objective via sampling, and then optimizes this approximated objective over a chosen parameterized family of wavefunctions, known as the ansatz. Recently neural networks have been used as the ansatz, with accompanying success. However, sampling from such wavefunctions has required the use of a Markov Chain Monte Carlo approach, which is inherently inefficient. In this work, we propose a solution to this problem via an ansatz which is cheap to sample from, yet satisfies the requisite quantum mechanical properties. We prove that a normalizing flow using the following two essential ingredients satisfies our requirements: (a) a base distribution which is constructed from Determinantal Point Processes; (b) flow layers which are equivariant to a particular subgroup of the permutation group. We then show how to construct both continuous and discrete normalizing flows which satisfy the requisite equivariance. We further demonstrate the manner in which the non-smooth nature (“cusps”) of the wavefunction may be captured, and how the framework may be generalized to provide induction across multiple molecules. The resulting theoretical framework entails an efficient approach to solving the Electronic Schrödinger Equation.
Y. Elul, E. Rozenberg, A. Boyarski, Y. Yaniv, A. Schuster, A. M. Bronstein, Data-driven modeling of interrelated dynamical systems, Nature Communications Physics (7), 144, 2024 details
Data-driven modeling of interrelated dynamical systems
Y. Elul, E. Rozenberg, A. Boyarski, Y. Yaniv, A. Schuster, A. M. BronsteinNature Communications Physics (7), 144, 2024Non-linear dynamical systems describe numerous real-world phenomena, ranging from the weather, to financial markets and disease progression. Individual systems may share substantial common information, for example patients’ anatomy. Lately, deep-learning has emerged as a leading method for data-driven modeling of non-linear dynamical systems. Yet, despite recent breakthroughs, prior works largely ignored the existence of shared information between different systems. However, such cases are quite common, for example, in medicine: we may wish to have a patient-specific model for some disease, but the data collected from a single patient is usually too small to train a deep-learning model. Hence, we must properly utilize data gathered from other patients. Here, we explicitly consider such cases by jointly modeling multiple systems. We show that the current single-system models consistently fail when trying to learn simultaneously from multiple systems. We suggest a framework for jointly approximating the Koopman operators of multiple systems, while intrinsically exploiting common information. We demonstrate how we can adapt to a new system using order-of-magnitude less new data and show the superiority of our model over competing methods, in terms of both forecasting ability and statistical fidelity, across chaotic, cardiac, and climate systems.
O. Wengrowicz, A. M. Bronstein, O. Cohen, Unsupervised physics-informed deep learning-based reconstruction for time-resolved imaging by multiplexed ptychography, Optics Express 32(6), pp. 8791-8803, 2024 details
Unsupervised physics-informed deep learning-based reconstruction for time-resolved imaging by multiplexed ptychography
O. Wengrowicz, A. M. Bronstein, O. CohenOptics Express 32(6), pp. 8791-8803, 2024We explore numerically an unsupervised, physics-informed, deep learning-based reconstruction technique for time-resolved imaging by multiplexed ptychography. In our method, the untrained deep learning model replaces the iterative algorithm’s update step, yielding superior reconstructions of multiple dynamic object frames compared to conventional methodologies. More precisely, we demonstrate improvements in image quality and resolution, while reducing sensitivity to the number of recorded frames, the mutual orthogonality of different probe modes, overlap between neighboring probe beams and the cutoff frequency of the ptychographic microscope – properties that are generally of paramount importance for ptychographic reconstruction algorithms.
G. Serussi, T. Shor, T. Hirshberg, C. Baskin, A. M. Bronstein, Active propulsion noise shaping for multi-rotor aircraft localization, Proc. Int'l Conf. on Intelligent Robots and Systems (IROS), 2024 details
Active propulsion noise shaping for multi-rotor aircraft localization
G. Serussi, T. Shor, T. Hirshberg, C. Baskin, A. M. Bronstein
Proc. Int'l Conf. on Intelligent Robots and Systems (IROS), 2024
Multi-rotor aerial autonomous vehicles (MAVs) primarily rely on vision for navigation purposes. However, visual localization and odometry techniques suffer from poor performance in low or direct sunlight, a limited field of view, and vulnerability to occlusions. Acoustic sensing can serve as a complementary or even alternative modality for vision in many situations, and it also has the added benefits of lower system cost and energy footprint, which is especially important for micro aircraft. This paper proposes actively controlling and shaping the aircraft propulsion noise generated by the rotors to benefit localization tasks, rather than considering it a harmful nuisance. We present a neural network architecture for self-noise-based localization in a known environment. We show that training it simultaneously with learning time-varying rotor phase modulation achieves accurate and robust localization. The proposed methods are evaluated using a computationally affordable simulation of MAV rotor noise in 2D acoustic environments that is fitted to real recordings of rotor pressure fields.
M. Pegoraro, S. Vedula, A. A. Rosenberg, I. Tallini, E. Rodolà, A. M. Bronstein, Vector quantile regression on manifolds, Proc. AIStats, 2024 details
Vector quantile regression on manifolds
M. Pegoraro, S. Vedula, A. A. Rosenberg, I. Tallini, E. Rodolà, A. M. Bronstein
Proc. AIStats, 2024
Quantile regression (QR) is a statistical tool for distribution-free estimation of conditional quantiles of a target variable given explanatory features. QR is limited by the assumption that the target distribution is univariate and defined on a Euclidean domain. Although the notion of quantiles was recently extended to multi-variate distributions, QR for multi-variate distributions on manifolds remains underexplored, even though many important applications inherently involve data distributed on, e.g., spheres (climate and geological phenomena), and tori (dihedral angles in proteins). By leveraging optimal transport theory and c-concave functions, we meaningfully define conditional vector quantile functions of high-dimensional variables on manifolds (M-CVQFs). Our approach allows for quantile estimation, regression, and computation of conditional confidence sets and likelihoods. We demonstrate the approach’s efficacy and provide insights regarding the meaning of non-Euclidean quantiles through synthetic and real data experiments.
Y. Chen, H. Ye, S. Vedula, A. M. Bronstein, R. Dreslinski, T. Mudge, N. Talati, Demystifying graph sparsification algorithms in graph properties preservation, Proc. Int'l Conf. on Very Large Databases (VLDB), 2024 details
Demystifying graph sparsification algorithms in graph properties preservation
Y. Chen, H. Ye, S. Vedula, A. M. Bronstein, R. Dreslinski, T. Mudge, N. Talati
Proc. Int'l Conf. on Very Large Databases (VLDB), 2024
Graph sparsification is a technique that approximates a given graph by a sparse graph with a subset of vertices and/or edges. The goal of an effective sparsification algorithm is to maintain specific graph properties relevant to the downstream task while minimizing the graph’s size. Graph algorithms often suffer from long execution time due to the irregularity and the large real-world graph size. Graph sparsification can be applied to greatly reduce the run time of graph algorithms by substituting the full graph with a much smaller sparsified graph, without significantly degrading the output quality. However, the interaction between numerous sparsifiers and graph properties is not widely explored, and the potential of graph sparsification is not fully understood.
In this work, we cover 16 widely-used graph metrics, 12 representative graph sparsification algorithms, and 14 real-world input graphs spanning various categories, exhibiting diverse characteristics, sizes, and densities. We developed a framework to extensively assess the performance of these sparsification algorithms against graph metrics, and provide insights to the results. Our study shows that there is no one sparsifier that performs the best in preserving all graph properties, e.g. sparsifiers that preserve distance-related graph properties (eccentricity) struggle to perform well on Graph Neural Networks (GNN). This paper presents a comprehensive experimental study evaluating the performance of sparsification algorithms in preserving essential graph metrics. The insights inform future research in incorporating matching graph sparsification to graph algorithms to maximize benefits while minimizing quality degradation. Furthermore, we provide a framework to facilitate the future evaluation of evolving sparsification algorithms, graph metrics, and ever-growing graph data. - B. Gahtan, R. J. Sahala, A. M. Bronstein, R. Cohen, Exploring QUIC dynamics: a large-scale dataset for encrypted traffic analysis, arXiv:2410.03728, 2024 details
Exploring QUIC dynamics: a large-scale dataset for encrypted traffic analysis
B. Gahtan, R. J. Sahala, A. M. Bronstein, R. CohenarXiv:2410.03728, 2024QUIC, a new and increasingly used transport protocol, addresses and resolves the limitations of TCP by offering improved security, performance, and features such as stream multiplexing and connection migration. These features, however, also present challenges for network operators who need to monitor and analyze web traffic. In this paper, we introduce VisQUIC, a labeled dataset comprising over 100,000 QUIC traces from more than 44,000 websites (URLs), collected over a four-month period. These traces provide the foundation for generating more than seven million images, with configurable parameters of window length, pixel resolution, normalization, and labels. These images enable an observer looking at the interactions between a client and a server to analyze and gain insights about QUIC encrypted connections. To illustrate the dataset’s potential, we offer a use-case example of an observer estimating the number of HTTP/3 responses/requests pairs in a given QUIC, which can reveal server behavior, client–server interactions, and the load imposed by an observed connection. We formulate the problem as a discrete regression problem, train a machine learning (ML) model for it, and then evaluate it using the proposed dataset on an example use case.
B. Gahtan, S. Funk, E. Kodesh, I. Ketko, T. Kuflik, A. M. Bronstein, Automatic identification and visualization of group training activities using wearable data, arXiv:2410.05452, 2024
Automatic identification and visualization of group training activities using wearable data
B. Gahtan, S. Funk, E. Kodesh, I. Ketko, T. Kuflik, A. M. Bronstein
arXiv:2410.05452, 2024
Human Activity Recognition (HAR) identifies daily activities from time-series data collected by wearable devices like smartwatches. Recent advancements in Internet of Things (IoT), cloud computing, and low-cost sensors have broadened HAR applications across fields like healthcare, biometrics, sports, and personal fitness. However, challenges remain in efficiently processing the vast amounts of data generated by these devices and developing models that can accurately recognize a wide range of activities from continuous recordings, without relying on predefined activity training sessions. This paper presents a comprehensive framework for imputing, analyzing, and identifying activities from wearable data, specifically targeting group training scenarios without explicit activity sessions. Our approach is based on data collected from 135 soldiers wearing Garmin 55 smartwatches over six months. The framework integrates multiple data streams, handles missing data through cross-domain statistical methods, and identifies activities with high accuracy using machine learning (ML). Additionally, we utilized statistical analysis techniques to evaluate the performance of each individual within the group, providing valuable insights into their respective positions in the group in an easy-to-understand visualization. These visualizations facilitate easy understanding of performance metrics, enhancing group interactions and informing individualized training programs. We evaluate our framework through traditional train-test splits and out-of-sample scenarios, focusing on the model’s generalization capabilities. Additionally, we address sleep data imputation without relying on ML, improving recovery analysis. Our findings demonstrate the potential of wearable data for accurately identifying group activities, paving the way for intelligent, data-driven training solutions.
B. Gahtan, R. J. Shahla, R. Cohen, A. M. Bronstein, Estimating the number of HTTP/3 responses in QUIC using deep learning, arXiv:2410.06140, 2024
Estimating the number of HTTP/3 responses in QUIC using deep learning
B. Gahtan, R. J. Shahla, R. Cohen, A. M. Bronstein
arXiv:2410.06140, 2024
QUIC, a new and increasingly used transport protocol, enhances TCP by providing better security, performance, and features like stream multiplexing. These features, however, also impose challenges for network middle-boxes that need to monitor and analyze web traffic. This paper proposes a novel solution for estimating the number of HTTP/3 responses in a given QUIC connection by an observer. This estimation reveals server behavior, client-server interactions, and data transmission efficiency, which is crucial for various applications such as designing a load balancing solution and detecting HTTP/3 flood attacks. The proposed scheme transforms QUIC connection traces into a sequence of images and trains machine learning (ML) models to predict the number of responses. Then, by aggregating images of a QUIC connection, an observer can estimate the total number of responses. As the problem is formulated as a discrete regression problem, we introduce a dedicated loss function. The proposed scheme is evaluated on a dataset of over seven million images, generated from 100,000 traces collected from over 44,000 websites over a four-month period, from various vantage points. The scheme achieves up to 97% cumulative accuracy in both known and unknown web server settings and 92% accuracy in estimating the total number of responses in unseen QUIC traces.
H. Abraham, B. Gahtan, A. Kobovich, O. Leitersdorf, A. M. Bronstein, E. Yaakobi, Beyond the alphabet: deep signal embedding for enhanced DNA clustering, arXiv:2410.06188, 2024
Beyond the alphabet: deep signal embedding for enhanced DNA clustering
H. Abraham, B. Gahtan, A. Kobovich, O. Leitersdorf, A. M. Bronstein, E. Yaakobi
arXiv:2410.06188, 2024
The emerging field of DNA storage employs strands of DNA bases (A/T/C/G) as a storage medium for digital information to enable massive density and durability. The DNA storage pipeline includes: (1) encoding the raw data into sequences of DNA bases; (2) synthesizing the sequences as DNA strands that are stored over time as an unordered set; (3) sequencing the DNA strands to generate DNA reads; and (4) deducing the original data. The DNA synthesis and sequencing stages each generate several independent error-prone duplicates of each strand which are then utilized in the final stage to reconstruct the best estimate for the original strand. Specifically, the reads are first clustered into groups likely originating from the same strand (based on their similarity to each other), and then each group approximates the strand that led to the reads of that group. This work improves the DNA clustering stage by embedding it as part of the DNA sequencing. Traditional DNA storage solutions begin after the DNA sequencing process generates discrete DNA reads (A/T/C/G), yet we identify that there is untapped potential in using the raw signals generated by the Nanopore DNA sequencing machine before they are discretized into bases, a process known as basecalling, which is done using a deep neural network. We propose a deep neural network that clusters these signals directly, demonstrating superior accuracy and reduced computation times compared to current approaches that cluster after basecalling.
B. Gahtan, R. Cohen, A. M. Bronstein, E. Shapira, Data-driven cellular network selector for vehicle teleoperations, arXiv:2410.19791, 2024
Data-driven cellular network selector for vehicle teleoperations
B. Gahtan, R. Cohen, A. M. Bronstein, E. Shapira
arXiv:2410.19791, 2024
Remote control of robotic systems, also known as teleoperation, is crucial for the development of autonomous vehicle (AV) technology. It allows a remote operator to view live video from AVs and, in some cases, to make real-time decisions. The effectiveness of video-based teleoperation systems is heavily influenced by the quality of the cellular network and, in particular, its packet loss rate and latency. To optimize these parameters, an AV can be connected to multiple cellular networks and determine in real time over which cellular network each video packet will be transmitted. We present an algorithm, called Active Network Selector (ANS), which uses a time series machine learning approach for solving this problem. We compare ANS to a baseline non-learning algorithm, which is used today in commercial systems, and show that ANS performs much better, with respect to both packet loss and packet latency.
T. Shor, C. Baskin, A. M. Bronstein, Leveraging latents for efficient thermography classification and segmentation, Proc. Medical Imaging with Deep Learning (MIDL), 2024
Leveraging latents for efficient thermography classification and segmentation
T. Shor, C. Baskin, A. M. Bronstein
Proc. Medical Imaging with Deep Learning (MIDL), 2024
Breast cancer is a prominent health concern worldwide, currently being the second most common and second-deadliest type of cancer in women. While current breast cancer diagnosis mainly relies on mammography imaging, in recent years the use of thermography for breast cancer imaging has been gaining popularity. Thermographic imaging relies on infrared cameras to capture body-emitted heat distributions. While these heat signatures have proven useful for computer-vision systems for accurate breast cancer segmentation and classification, prior work often relies on handcrafted feature engineering or complex architectures, potentially limiting the comparability and applicability of these methods. In this work, we present a novel algorithm for both breast cancer classification and segmentation. Rather than focusing efforts on manual feature and architecture engineering, our algorithm focuses on leveraging an informative, learned feature space, thus making our solution simpler to use and extend to other frameworks and downstream tasks, as well as more applicable to data-scarce settings. Our classification produces SOTA results, and ours is the first work to produce the segmentation regions studied in this paper.
A. A. Rosenberg, S. Vedula, A. M. Bronstein, A. Marx, Seeing Double: Molecular dynamics simulations reveal the stability of certain alternate protein conformations in crystal structures, bioRxiv 2024.08.31.610605, 2024
Seeing Double: Molecular dynamics simulations reveal the stability of certain alternate protein conformations in crystal structures
A. A. Rosenberg, S. Vedula, A. M. Bronstein, A. Marx
bioRxiv 2024.08.31.610605, 2024
Proteins jiggle around, adopting ensembles of interchanging conformations. Here we show, through a large-scale analysis of the Protein Data Bank and using molecular dynamics simulations, that segments of protein chains can also commonly adopt dual, transiently stable conformations that are not explained by direct interactions. Our analysis highlights how alternate conformations can be maintained as non-interchanging, separated states intrinsic to the protein chain, namely through steric barriers or the adoption of transient secondary structure elements. We further demonstrate that despite the commonality of the phenomenon, current structural ensemble prediction methods fail to capture these bimodal distributions of conformations.
S. Vedula, V. Maiorca, L. Basile, F. Locatello, A. M. Bronstein, Scalable unsupervised alignment of general metric and non-metric structures, arXiv preprint arXiv:2406.13507, 2024 (also in Proc. ICML Workshop on AI4Science)
Scalable unsupervised alignment of general metric and non-metric structures
S. Vedula, V. Maiorca, L. Basile, F. Locatello, A. M. Bronstein
arXiv preprint arXiv:2406.13507, 2024 (also in Proc. ICML Workshop on AI4Science)
Aligning data from different domains is a fundamental problem in machine learning with broad applications across very different areas, most notably aligning experimental readouts in single-cell multiomics. Mathematically, this problem can be formulated as the minimization of disagreement of pair-wise quantities such as distances and is related to the Gromov-Hausdorff and Gromov-Wasserstein distances. Computationally, it is a quadratic assignment problem (QAP) that is known to be NP-hard. Prior works attempted to solve the QAP directly with entropic or low-rank regularization on the permutation, which is computationally tractable only for modestly-sized inputs, and encode only limited inductive bias related to the domains being aligned. We consider the alignment of metric structures formulated as a discrete Gromov-Wasserstein problem and instead of solving the QAP directly, we propose to learn a related well-scalable linear assignment problem (LAP) whose solution is also a minimizer of the QAP. We also show a flexible extension of the proposed framework to general non-metric dissimilarities through differentiable ranks. We extensively evaluate our approach on synthetic and real datasets from single-cell multiomics and neural latent spaces, achieving state-of-the-art performance while being conceptually and computationally simple.
A. A. Rosenberg, A. Marx, A. M. Bronstein, A dataset of alternately located segments in protein crystal structures, Scientific Data, 11 (783), 2024
A dataset of alternately located segments in protein crystal structures
A. A. Rosenberg, A. Marx, A. M. Bronstein
Scientific Data, 11 (783), 2024
Protein Data Bank (PDB) files list the relative spatial location of atoms in a protein structure as the final output of the process of fitting and refining to experimentally determined electron density measurements. Where experimental evidence exists for multiple conformations, atoms are modelled in alternate locations. Programs reading PDB files commonly ignore these alternate conformations by default, leaving users oblivious to the presence of alternate conformations in the structures they analyze. This has led to underappreciation of their prevalence and undercharacterisation of their features, and has limited the accessibility of this high-resolution data representing structural ensembles. We have trawled PDB files to extract structural features of residues with alternately located atoms. The output includes the distance between alternate conformations and identifies the location of these segments within the protein chain and in proximity of all other atoms within a defined radius. This dataset should be of use in efforts to predict multiple structures from a single sequence and support studies investigating protein flexibility and the association with protein function.
D. Freedman, E. Rozenberg, A. M. Bronstein, A theoretical framework for an efficient normalizing flow-based solution to the Schrödinger equation, arXiv preprint arXiv:2406.00047, 2024
A theoretical framework for an efficient normalizing flow-based solution to the Schrödinger equation
D. Freedman, E. Rozenberg, A. M. Bronstein
arXiv preprint arXiv:2406.00047, 2024
A central problem in quantum mechanics involves solving the Electronic Schrödinger Equation for a molecule or material. The Variational Monte Carlo approach to this problem approximates a particular variational objective via sampling, and then optimizes this approximated objective over a chosen parameterized family of wavefunctions, known as the ansatz. Recently neural networks have been used as the ansatz, with accompanying success. However, sampling from such wavefunctions has required the use of a Markov Chain Monte Carlo approach, which is inherently inefficient. In this work, we propose a solution to this problem via an ansatz which is cheap to sample from, yet satisfies the requisite quantum mechanical properties. We prove that a normalizing flow using the following two essential ingredients satisfies our requirements: (a) a base distribution which is constructed from Determinantal Point Processes; (b) flow layers which are equivariant to a particular subgroup of the permutation group. We then show how to construct both continuous and discrete normalizing flows which satisfy the requisite equivariance. We further demonstrate the manner in which the non-smooth nature (“cusps”) of the wavefunction may be captured, and how the framework may be generalized to provide induction across multiple molecules. The resulting theoretical framework entails an efficient approach to solving the Electronic Schrödinger Equation.
Y. Elul, E. Rozenberg, A. Boyarski, Y. Yaniv, A. Schuster, A. M. Bronstein, Data-driven modeling of interrelated dynamical systems, Nature Communications Physics (7), 144, 2024
Data-driven modeling of interrelated dynamical systems
Y. Elul, E. Rozenberg, A. Boyarski, Y. Yaniv, A. Schuster, A. M. Bronstein
Nature Communications Physics (7), 144, 2024
Non-linear dynamical systems describe numerous real-world phenomena, ranging from the weather, to financial markets and disease progression. Individual systems may share substantial common information, for example patients’ anatomy. Lately, deep-learning has emerged as a leading method for data-driven modeling of non-linear dynamical systems. Yet, despite recent breakthroughs, prior works largely ignored the existence of shared information between different systems. However, such cases are quite common, for example, in medicine: we may wish to have a patient-specific model for some disease, but the data collected from a single patient is usually too small to train a deep-learning model. Hence, we must properly utilize data gathered from other patients. Here, we explicitly consider such cases by jointly modeling multiple systems. We show that the current single-system models consistently fail when trying to learn simultaneously from multiple systems. We suggest a framework for jointly approximating the Koopman operators of multiple systems, while intrinsically exploiting common information. We demonstrate how we can adapt to a new system using order-of-magnitude less new data and show the superiority of our model over competing methods, in terms of both forecasting ability and statistical fidelity, across chaotic, cardiac, and climate systems.
O. Wengrowicz, A. M. Bronstein, O. Cohen, Unsupervised physics-informed deep learning-based reconstruction for time-resolved imaging by multiplexed ptychography, Optics Express 32(6), pp. 8791-8803, 2024
Unsupervised physics-informed deep learning-based reconstruction for time-resolved imaging by multiplexed ptychography
O. Wengrowicz, A. M. Bronstein, O. Cohen
Optics Express 32(6), pp. 8791-8803, 2024
We explore numerically an unsupervised, physics-informed, deep learning-based reconstruction technique for time-resolved imaging by multiplexed ptychography. In our method, the untrained deep learning model replaces the iterative algorithm’s update step, yielding superior reconstructions of multiple dynamic object frames compared to conventional methodologies. More precisely, we demonstrate improvements in image quality and resolution, while reducing sensitivity to the number of recorded frames, the mutual orthogonality of different probe modes, overlap between neighboring probe beams and the cutoff frequency of the ptychographic microscope, properties that are generally of paramount importance for ptychographic reconstruction algorithms.
G. Serussi, T. Shor, T. Hirshberg, C. Baskin, A. M. Bronstein, Active propulsion noise shaping for multi-rotor aircraft localization, Proc. Int'l Conf. on Intelligent Robots and Systems (IROS), 2024
Active propulsion noise shaping for multi-rotor aircraft localization
G. Serussi, T. Shor, T. Hirshberg, C. Baskin, A. M. Bronstein
Proc. Int'l Conf. on Intelligent Robots and Systems (IROS), 2024
Multi-rotor aerial autonomous vehicles (MAVs) primarily rely on vision for navigation purposes. However, visual localization and odometry techniques suffer from poor performance in low or direct sunlight, a limited field of view, and vulnerability to occlusions. Acoustic sensing can serve as a complementary or even alternative modality for vision in many situations, and it also has the added benefits of lower system cost and energy footprint, which is especially important for micro aircraft. This paper proposes actively controlling and shaping the aircraft propulsion noise generated by the rotors to benefit localization tasks, rather than considering it a harmful nuisance. We present a neural network architecture for self-noise-based localization in a known environment. We show that training it simultaneously with learning time-varying rotor phase modulation achieves accurate and robust localization. The proposed methods are evaluated using a computationally affordable simulation of MAV rotor noise in 2D acoustic environments that is fitted to real recordings of rotor pressure fields.
M. Pegoraro, S. Vedula, A. A. Rosenberg, I. Tallini, E. Rodolà, A. M. Bronstein, Vector quantile regression on manifolds, Proc. AIStats, 2024
Vector quantile regression on manifolds
M. Pegoraro, S. Vedula, A. A. Rosenberg, I. Tallini, E. Rodolà, A. M. Bronstein
Proc. AIStats, 2024
Quantile regression (QR) is a statistical tool for distribution-free estimation of conditional quantiles of a target variable given explanatory features. QR is limited by the assumption that the target distribution is univariate and defined on a Euclidean domain. Although the notion of quantiles was recently extended to multi-variate distributions, QR for multi-variate distributions on manifolds remains underexplored, even though many important applications inherently involve data distributed on, e.g., spheres (climate and geological phenomena), and tori (dihedral angles in proteins). By leveraging optimal transport theory and c-concave functions, we meaningfully define conditional vector quantile functions of high-dimensional variables on manifolds (M-CVQFs). Our approach allows for quantile estimation, regression, and computation of conditional confidence sets and likelihoods. We demonstrate the approach’s efficacy and provide insights regarding the meaning of non-Euclidean quantiles through synthetic and real data experiments.
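As background for the abstract above, the following is a minimal sketch of the classical scalar, Euclidean case that the paper generalizes: estimating a quantile by minimizing the pinball (check) loss. It is not the manifold method of the paper; the function names `pinball_loss` and `fit_constant_quantile` are hypothetical and chosen only for illustration.

```python
import numpy as np

def pinball_loss(y, q, tau):
    """Mean pinball (check) loss of predicting the constant q for samples y at level tau."""
    diff = np.asarray(y, dtype=float) - q
    # Under-predictions are weighted by tau, over-predictions by (1 - tau).
    return float(np.mean(np.maximum(tau * diff, (tau - 1.0) * diff)))

def fit_constant_quantile(y, tau):
    """Return the sample value minimizing the pinball loss (illustrative helper)."""
    candidates = np.sort(np.asarray(y, dtype=float))
    losses = [pinball_loss(candidates, q, tau) for q in candidates]
    return float(candidates[int(np.argmin(losses))])

if __name__ == "__main__":
    samples = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    print(fit_constant_quantile(samples, 0.5))
```

Restricting the search to the observed samples keeps the sketch short; a conditional model would replace the constant `q` with a function of the explanatory features.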
Y. Chen, H. Ye, S. Vedula, A. M. Bronstein, R. Dreslinski, T. Mudge, N. Talati, Demystifying graph sparsification algorithms in graph properties preservation, Proc. Int'l Conf. on Very Large Databases (VLDB), 2024
Demystifying graph sparsification algorithms in graph properties preservation
Y. Chen, H. Ye, S. Vedula, A. M. Bronstein, R. Dreslinski, T. Mudge, N. Talati
Proc. Int'l Conf. on Very Large Databases (VLDB), 2024
Graph sparsification is a technique that approximates a given graph by a sparse graph with a subset of vertices and/or edges. The goal of an effective sparsification algorithm is to maintain specific graph properties relevant to the downstream task while minimizing the graph’s size. Graph algorithms often suffer from long execution time due to the irregularity and the large real-world graph size. Graph sparsification can be applied to greatly reduce the run time of graph algorithms by substituting the full graph with a much smaller sparsified graph, without significantly degrading the output quality. However, the interaction between numerous sparsifiers and graph properties is not widely explored, and the potential of graph sparsification is not fully understood.
In this work, we cover 16 widely-used graph metrics, 12 representative graph sparsification algorithms, and 14 real-world input graphs spanning various categories, exhibiting diverse characteristics, sizes, and densities. We developed a framework to extensively assess the performance of these sparsification algorithms against graph metrics, and provide insights to the results. Our study shows that there is no one sparsifier that performs the best in preserving all graph properties, e.g. sparsifiers that preserve distance-related graph properties (eccentricity) struggle to perform well on Graph Neural Networks (GNN). This paper presents a comprehensive experimental study evaluating the performance of sparsification algorithms in preserving essential graph metrics. The insights inform future research in incorporating matching graph sparsification to graph algorithms to maximize benefits while minimizing quality degradation. Furthermore, we provide a framework to facilitate the future evaluation of evolving sparsification algorithms, graph metrics, and ever-growing graph data.
A. A. Rosenberg, N. Yehishalom, A. Marx, A. M. Bronstein, An amino-domino model described by a cross-peptide-bond Ramachandran plot defines amino acid pairs as local structural units, Proc. US National Academy of Sciences (PNAS), 2023
An amino-domino model described by a cross-peptide-bond Ramachandran plot defines amino acid pairs as local structural units
A. A. Rosenberg, N. Yehishalom, A. Marx, A. M. Bronstein
Proc. US National Academy of Sciences (PNAS), 2023
Protein structure, both at the global and local level, dictates function. Proteins fold from chains of amino acids, forming secondary structures, α-helices and β-strands, that, at least for globular proteins, subsequently fold into a three-dimensional structure. Here, we show that a Ramachandran-type plot focusing on the two dihedral angles separated by the peptide bond, and entirely contained within an amino acid pair, defines a local structural unit. We further demonstrate the usefulness of this cross-peptide-bond Ramachandran plot by showing that it captures β-turn conformations in coil regions, that traditional Ramachandran plot outliers fall into occupied regions of our plot, and that thermophilic proteins prefer specific amino acid pair conformations. Further, we demonstrate experimentally that the effect of a point mutation on backbone conformation and protein stability depends on the amino acid pair context, i.e., the identity of the adjacent amino acid, in a manner predictable by our method.
T. Weiss, L. Cosmo, E. Mayo Yanes, S. Chakraborty, A. M. Bronstein, R. Gershoni-Poranne, Guided diffusion for inverse molecular design, Nature Computational Science 3(10), 873–882, 2023
Guided diffusion for inverse molecular design
T. Weiss, L. Cosmo, E. Mayo Yanes, S. Chakraborty, A. M. Bronstein, R. Gershoni-Poranne
Nature Computational Science 3(10), 873–882, 2023
The holy grail of materials science is de novo molecular design, i.e., the ability to engineer molecules with desired characteristics. Recently, this goal has become increasingly achievable thanks to developments such as equivariant graph neural networks that can better predict molecular properties, and to the improved performance of generation tasks, in particular of conditional generation, in text-to-image generators and large language models. Herein, we introduce GaUDI, a guided diffusion model for inverse molecular design, which combines these advances and can generate novel molecules with desired properties. GaUDI decouples the generator and the property-predicting models and can be guided using both point-wise targets and open-ended targets (e.g., minimum/maximum). We demonstrate GaUDI’s effectiveness using single- and multiple-objective tasks applied to newly-generated data sets of polycyclic aromatic systems, achieving nearly 100% validity of generated molecules. Further, for some tasks, GaUDI discovers better molecules than those present in our data set of 475k molecules.
E. Schwartz, A. M. Bronstein, R. Giryes, ISP distillation, IEEE Open Journal of Signal Processing 4, 12-20, 2023
ISP distillation
E. Schwartz, A. M. Bronstein, R. Giryes
IEEE Open Journal of Signal Processing 4, 12-20, 2023
Nowadays, many of the images captured are ‘observed’ by machines only and not by humans, e.g., in autonomous systems. High-level machine vision models, such as object recognition or semantic segmentation, assume images are transformed into some canonical image space by the camera Image Signal Processor (ISP). However, the camera ISP is optimized for producing visually pleasing images for human observers and not for machines. Therefore, one may spare the ISP compute time and apply vision models directly to RAW images. Yet, it has been shown that training such models directly on RAW images results in a performance drop. To mitigate this drop, we use a dataset of RAW and RGB image pairs, which can be easily acquired with no human labeling. We then train a model that is applied directly to the RAW data by using knowledge distillation such that the model predictions for RAW images will be aligned with the predictions of an off-the-shelf pre-trained model for processed RGB images. Our experiments show that our performance on RAW images for object classification and semantic segmentation is significantly better than that of models trained on labeled RAW images. It also reasonably matches the predictions of a pre-trained model on processed RGB images, while saving the ISP compute overhead.
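The alignment of RAW-image predictions with the predictions of a pre-trained RGB model, as described in the abstract above, can be illustrated with a generic knowledge-distillation loss. The sketch below is an assumption-laden illustration, not the authors' implementation: the abstract does not specify the loss, so a temperature-scaled KL divergence is assumed here, and the names `distillation_loss`, `student_logits_raw` and `teacher_logits_rgb` are hypothetical.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax along the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits_raw, teacher_logits_rgb, temperature=1.0):
    """Batch-averaged KL(teacher || student) between class distributions.

    Assumption: the teacher sees the processed RGB image and the student
    sees the paired RAW image; no human labels are involved.
    """
    p = softmax(np.asarray(teacher_logits_rgb, dtype=float) / temperature)
    q = softmax(np.asarray(student_logits_raw, dtype=float) / temperature)
    eps = 1e-12  # guards against log(0)
    return float(np.mean(np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)))

if __name__ == "__main__":
    teacher = np.array([[2.0, 0.5, -1.0]])
    student = np.array([[1.0, 1.0, 0.0]])
    print(distillation_loss(student, teacher))
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge, which is the behavior the training objective relies on.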
T. Blau, R. Ganz, C. Baskin, M. Elad, A. M. Bronstein, Classifier robustness enhancement via test-time transformation, arXiv preprint arXiv:2303.15409, 2023
Classifier robustness enhancement via test-time transformation
T. Blau, R. Ganz, C. Baskin, M. Elad, A. M. Bronstein
arXiv preprint arXiv:2303.15409, 2023
It has been recently discovered that adversarially trained classifiers exhibit an intriguing property, referred to as perceptually aligned gradients (PAG). PAG implies that the gradients of such classifiers possess a meaningful structure, aligned with human perception. Adversarial training is currently the best-known way to achieve classification robustness under adversarial attacks. The PAG property, however, has yet to be leveraged for further improving classifier robustness. In this work, we introduce Classifier Robustness Enhancement Via Test-Time Transformation (TETRA), a novel defense method that utilizes PAG, enhancing the performance of trained robust classifiers. Our method operates in two phases. First, it modifies the input image via a designated targeted adversarial attack into each of the dataset’s classes. Then, it classifies the input image based on the distance to each of the modified instances, with the assumption that the shortest distance relates to the true class. We show that the proposed method achieves state-of-the-art results and validate our claim through extensive experiments on a variety of defense methods, classifier architectures, and datasets. We also empirically demonstrate that TETRA can boost the accuracy of any differentiable adversarial training classifier across a variety of attacks, including ones unseen at training. Specifically, applying TETRA leads to substantial improvement of up to +23%, +20%, and +26% on CIFAR10, CIFAR100, and ImageNet, respectively.
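The two phases described in the abstract above (attack the input toward every class, then pick the class reached at the shortest distance) can be sketched on a toy model. Everything below is an illustrative assumption rather than the authors' code: a linear softmax classifier with weights `W` and bias `b` stands in for the adversarially trained network, plain gradient descent on the cross-entropy stands in for the targeted attack, and the function names are hypothetical.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax of a 1-D logit vector."""
    e = np.exp(z - z.max())
    return e / e.sum()

def targeted_attack(x, target, W, b, step=0.1, n_steps=20):
    """Phase 1: move x toward class `target` by descending the cross-entropy."""
    x_adv = np.array(x, dtype=float)
    onehot = np.zeros(W.shape[0])
    onehot[target] = 1.0
    for _ in range(n_steps):
        probs = softmax(W @ x_adv + b)
        # Gradient of the cross-entropy w.r.t. the input of a linear softmax model.
        grad = W.T @ (probs - onehot)
        x_adv -= step * grad
    return x_adv

def classify_by_distance(x, W, b, step=0.1, n_steps=20):
    """Phase 2: predict the class whose attacked instance lies closest to x."""
    x = np.asarray(x, dtype=float)
    distances = np.array([
        np.linalg.norm(targeted_attack(x, c, W, b, step, n_steps) - x)
        for c in range(W.shape[0])
    ])
    return int(np.argmin(distances)), distances

if __name__ == "__main__":
    W, b = np.eye(3), np.zeros(3)
    label, dists = classify_by_distance(np.array([2.0, 0.0, 0.0]), W, b)
    print(label, dists)
```

The intuition carried over from the abstract is that an input already belonging to a class needs little modification to be pushed into it, so its attacked copy stays close to the original.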
E. Rozenberg, A. Karnieli, O. Yesharim, J. Foley-Comer, S. Trajtenberg-Mills, S. Mishra, S. Prabhakar, R. P. Singh, D. Freedman, A. M. Bronstein, A. Arie, Designing nonlinear photonic crystals for high-dimensional quantum state engineering, ICLR Workshop on Machine Learning for Materials, 2023
Designing nonlinear photonic crystals for high-dimensional quantum state engineering
E. Rozenberg, A. Karnieli, O. Yesharim, J. Foley-Comer, S. Trajtenberg-Mills, S. Mishra, S. Prabhakar, R. P. Singh, D. Freedman, A. M. Bronstein, A. Arie
ICLR Workshop on Machine Learning for Materials, 2023
We propose a novel, physically-constrained and differentiable approach for the generation of D-dimensional qudit states via spontaneous parametric downconversion (SPDC) in quantum optics. We circumvent any limitations imposed by the inherently stochastic nature of the physical process and incorporate a set of stochastic dynamical equations governing its evolution under the SPDC Hamiltonian. We demonstrate the effectiveness of our model through the design of structured nonlinear photonic crystals (NLPCs) and shaped pump beams; and show, theoretically and experimentally, how to generate maximally entangled states in the spatial degree of freedom. The learning of NLPC structures offers a promising new avenue for shaping and controlling arbitrary quantum states and enables all-optical coherent control of the generated states. We believe that this approach can readily be extended from bulky crystals to thin metasurfaces and potentially applied to other quantum systems sharing similar Hamiltonian structures, such as superfluids and superconductors.
E. Rozenberg, A. Karnieli, O. Yesharim, J. Foley-Comer, S. Trajtenberg-Mills, S. Mishra, S. Prabhakar, R. P. Singh, D. Freedman, A. M. Bronstein, A. Arie, A machine learning approach to generate quantum light, ICLR Workshop on Physics for Machine Learning, 2023
A machine learning approach to generate quantum light
E. Rozenberg, A. Karnieli, O. Yesharim, J. Foley-Comer, S. Trajtenberg-Mills, S. Mishra, S. Prabhakar, R. P. Singh, D. Freedman, A. M. Bronstein, A. Arie
ICLR Workshop on Physics for Machine Learning, 2023
Spontaneous parametric down-conversion (SPDC) is a key technique in quantum optics used to generate entangled photon pairs. However, generating a desirable D-dimensional qudit state in the SPDC process remains a challenge. In this paper, we introduce a physically-constrained and differentiable model to overcome this challenge, and demonstrate its effectiveness through the design of shaped pump beams and structured nonlinear photonic crystals. We avoid any restrictions induced by the stochastic nature of our physical process and integrate a set of stochastic dynamical equations governing its evolution under the SPDC Hamiltonian. Our model is capable of learning the relevant interaction parameters and designing nonlinear quantum optical systems that achieve desired quantum states. We show, theoretically and experimentally, how to generate maximally entangled states in the spatial degree of freedom. Additionally, we demonstrate all-optical coherent control of the generated state by reshaping the pump beam. Our work has potential applications in high-dimensional quantum key distribution and quantum information processing.
H. Ye, S. Vedula, Y. Chen, Y. Yang, A. M. Bronstein, R. Dreslinski, T. Mudge, N. Talati, GRACE: A Scalable Graph-Based Approach to Accelerating Recommendation Model Inference, Proc. ACM Int'l Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2023
The high memory bandwidth demand of sparse embedding layers continues to be a critical challenge in scaling the performance of recommendation models. While prior works have exploited heterogeneous memory system designs and partial embedding sum memoization techniques, they offer limited benefits. This is because prior designs either target a very small subset of embeddings to simplify their analysis or incur a high processing cost to account for all embeddings, which does not scale with the large sizes of modern embedding tables. This paper proposes GRACE, a lightweight and scalable graph-based algorithm-system co-design framework to significantly improve the embedding layer performance of recommendation models. GRACE proposes a novel Item Co-occurrence Graph (ICG) that scalably records item co-occurrences. GRACE then presents a new system-aware ICG clustering algorithm to find frequently accessed item combinations of arbitrary lengths to compute and memoize their partial sums. High-frequency partial sums are stored in a software-managed cache space to reduce memory traffic and improve the throughput of computing sparse features. We further present a cache data layout and low-cost address computation logic to efficiently look up item embeddings and their partial sums. Our evaluation shows that GRACE significantly outperforms the state-of-the-art techniques SPACE and MERCI by 1.5× and 1.4×, respectively.
S. Vedula, I. Tallini, A. A. Rosenberg, M. Pegoraro, E. Rodolà, Y. Romano, A. M. Bronstein, Continuous vector quantile regression, Proc. ICML Workshop Frontiers4LCD, 2023
Vector quantile regression (VQR) estimates the conditional vector quantile function (CVQF), a fundamental quantity which fully represents the conditional distribution of Y|X. VQR is formulated as an optimal transport (OT) problem between a uniform U~μ and the target (X,Y)~ν, the solution of which is a unique transport map, co-monotonic with U. Recently, NL-VQR has been proposed to support estimation of non-linear CVQFs, together with fast solvers which enabled the use of this tool in practical applications. Despite its utility, the scalability and estimation quality of NL-VQR are limited due to a discretization of the OT problem onto a grid of quantile levels. We propose a novel continuous formulation and parametrization of VQR using partial input-convex neural networks (PICNNs). Our approach allows for accurate, scalable, differentiable and invertible estimation of non-linear CVQFs. We further demonstrate, theoretically and experimentally, how continuous CVQFs can be used for general statistical inference tasks: estimation of likelihoods, CDFs, confidence sets, coverage, sampling, and more. This work is an important step towards unlocking the full potential of VQR.
M. Pegoraro, S. Vedula, A. A. Rosenberg, I. Tallini, E. Rodolà, A. M. Bronstein, Vector quantile regression on manifolds, ICML Workshop on New Frontiers in Learning, Control, and Dynamical Systems, 2023
Quantile regression (QR) is a statistical tool for distribution-free estimation of conditional quantiles of a target variable given explanatory features. QR is limited by the assumption that the target distribution is univariate and defined on a Euclidean domain. Although the notion of quantiles was recently extended to multi-variate distributions, QR for multi-variate distributions on manifolds remains underexplored, even though many important applications inherently involve data distributed on, e.g., spheres (climate measurements), tori (dihedral angles in proteins), or Lie groups (attitude in navigation). By leveraging optimal transport theory and the notion of c-concave functions, we meaningfully define conditional vector quantile functions of high-dimensional variables on manifolds (M-CVQFs). Our approach allows for quantile estimation, regression, and computation of conditional confidence sets. We demonstrate the approach’s efficacy and provide insights regarding the meaning of non-Euclidean quantiles through preliminary synthetic data experiments.
T. Weiss, A. Wahab, A. M. Bronstein, R. Gershoni-Poranne, Interpretable deep learning unveils structure-property relationships in polybenzenoid hydrocarbons, Journal of Organic Chemistry, 2023
In this work, interpretable deep learning was used to identify structure-property relationships governing the HOMO-LUMO gap and relative stability of polybenzenoid hydrocarbons (PBHs). To this end, a ring-based graph representation was used. In addition to affording reduced training times and excellent predictive ability, this representation could be combined with a subunit-based perception of PBHs, allowing chemical insights to be presented in terms of intuitive and simple structural motifs. The resulting insights agree with conventional organic chemistry knowledge and electronic structure-based analyses, and also reveal new behaviors and identify influential structural motifs. In particular, we evaluated and compared the effects of linear, angular, and branching motifs on these two molecular properties, as well as explored the role of dispersion in mitigating torsional strain inherent in non-planar PBHs. Hence, the observed regularities and the proposed analysis contribute to a deeper understanding of the behavior of PBHs and form the foundation for design strategies for new functional PBHs.
A. A. Rosenberg, S. Vedula, Y. Romano, A. M. Bronstein, Fast nonlinear vector quantile regression, Proc. ICML, 2023
Quantile regression (QR) is a powerful tool for estimating one or more conditional quantiles of a target variable Y given explanatory features X. A limitation of QR is that it is only defined for scalar target variables, due to the formulation of its objective function, and since the notion of quantiles has no standard definition for multivariate distributions. Recently, vector quantile regression (VQR) was proposed as an extension of QR for high-dimensional target variables, thanks to a meaningful generalization of the notion of quantiles to multivariate distributions. Despite its elegance, VQR is arguably not applicable in practice due to several limitations: (i) it assumes a linear model for the quantiles of the target Y given the features X; (ii) its exact formulation is intractable even for modestly-sized problems in terms of target dimensions, the number of regressed quantile levels, or the number of features, and its relaxed dual formulation may violate the monotonicity of the estimated quantiles; (iii) no fast or scalable solvers for VQR currently exist. In this work we fully address these limitations, namely: (i) We extend VQR to the non-linear case, showing substantial improvement over linear VQR; (ii) We propose vector monotone rearrangement, a method which ensures the estimates obtained by VQR relaxations are monotone functions; (iii) We provide fast, GPU-accelerated solvers for linear and nonlinear VQR which maintain a fixed memory footprint with the number of samples and quantile levels, and demonstrate that they scale to millions of samples and thousands of quantile levels; (iv) We release an optimized Python package of our solvers to encourage widespread use of VQR in real-world applications.
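As an illustration of why the scalar QR objective does not carry over directly to vector targets, the sketch below uses the standard textbook pinball (check) loss, which depends on the sign of the scalar residual y − q. It is not taken from the paper or its released package; the function names `pinball_loss` and `empirical_quantile` are hypothetical, and the brute-force search is only meant to show that minimizing the loss recovers a sample quantile.

```python
def pinball_loss(y, q, tau):
    """Mean pinball (check) loss of a constant prediction q at quantile level tau."""
    total = 0.0
    for yi in y:
        r = yi - q  # scalar residual; its sign selects which slope applies
        total += max(tau * r, (tau - 1.0) * r)
    return total / len(y)


def empirical_quantile(y, tau):
    """Return the sample value that minimizes the pinball loss (brute-force search)."""
    return min(sorted(y), key=lambda q: pinball_loss(y, q, tau))


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0, 100.0]
    # At tau = 0.5 the minimizer is the sample median, unaffected by the outlier.
    print(empirical_quantile(data, 0.5))  # prints 3.0
```

Because the loss branches on whether the residual is positive or negative, it presupposes an ordering of the target values, which is what a multivariate notion of quantiles has to replace.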
D. Zadok, O. Salzman, A. Wolf, A. M. Bronstein, Towards predicting fine finger motions from ultrasound images via kinematic representation, Proc. ICRA, 2023
A central challenge in building robotic prostheses is the creation of a sensor-based system able to read physiological signals from the lower limb and instruct a robotic hand to perform various tasks. Existing systems typically perform discrete gestures such as pointing or grasping, by employing electromyography (EMG) or ultrasound (US) technologies to analyze the state of the muscles. In this work, we study the inference problem of identifying the activation of specific fingers from a sequence of US images when performing dexterous tasks such as keyboard typing or playing the piano. While estimating finger gestures has been done in the past by detecting prominent gestures, we are interested in classification done in the context of fine motions that evolve over time. We consider this task as an important step towards higher adoption rates of robotic prostheses among arm amputees, as it has the potential to dramatically increase functionality in performing daily tasks. Our key observation, motivating this work, is that modeling the hand as a robotic manipulator allows us to encode an intermediate representation wherein US images are mapped to said configurations. Given a sequence of such learned configurations, coupled with a neural-network architecture that exploits temporal coherence, we are able to infer fine finger motions. We evaluated our method by collecting data from a group of subjects and demonstrating how our framework can be used to replay music played or text typed. To the best of our knowledge, this is the first study demonstrating these downstream tasks within an end-to-end system.
A. M. Bronstein, A. Marx, Water stabilizes an alternate turn conformation in horse heart myoglobin, Nature Scientific Reports, 2023
Comparison of myoglobin structures reveals that protein isolated from horse heart consistently adopts an alternate turn conformation in comparison to its homologues. Analysis of hundreds of high-resolution structures discounts crystallization conditions or the surrounding amino acid protein environment as explaining this difference, which is also not captured by the AlphaFold prediction. Rather, a water molecule is identified as stabilizing the conformation in the horse heart structure, which immediately reverts to the whale conformation in molecular dynamics simulations excluding that structural water.
B. Gahtan, R. Cohen, A. M. Bronstein, G. Kedar, Using deep reinforcement learning for mmWave real-time scheduling, Proc. Int'l Conf. Network of the Future (NoF), 2023
We study the problem of real-time scheduling in a multi-hop millimeter-wave (mmWave) mesh. We develop a model-free deep reinforcement learning algorithm called Adaptive Activator RL (AARL), which determines the subset of mmWave links that should be activated during each time slot and the power level for each link. The most important property of AARL is its ability to make scheduling decisions within the strict time slot constraints of typical 5G mmWave networks. AARL can handle a variety of network topologies, network loads, and interference models, and it can also adapt to different workloads. We demonstrate the operation of AARL on several topologies: a small topology with 10 links, a moderately-sized mesh with 48 links, and a large topology with 96 links. For each topology, we compare the throughput obtained by AARL to that of a benchmark algorithm called RPMA (Residual Profit Maximizer Algorithm). The most important advantage of AARL compared to RPMA is that it is much faster and can make the necessary scheduling decisions very rapidly during every time slot, while RPMA cannot. In addition, the scheduling decisions made by AARL are of higher quality than those made by RPMA.
T. Shor, T. Weiss, D. Noti, A. M. Bronstein, Multi PILOT: Feasible learned multiple acquisition trajectories for dynamic MRI, Proc. Medical Imaging with Deep Learning (MIDL), 2023
Dynamic Magnetic Resonance Imaging (MRI) is known to be a powerful and reliable technique for the dynamic imaging of internal organs and tissues, making it a leading diagnostic tool. A major difficulty in using MRI in this setting is the relatively long acquisition time (and, hence, increased cost) required for imaging in high spatio-temporal resolution, leading to the appearance of related motion artifacts and a decrease in resolution. Compressed Sensing (CS) techniques have become a common tool to reduce MRI acquisition time by subsampling images in the k-space according to some acquisition trajectory. Several studies have particularly focused on applying deep learning techniques to learn these acquisition trajectories in order to attain better image reconstruction, rather than using some predefined set of trajectories. To the best of our knowledge, learning acquisition trajectories has only been explored in the context of static MRI. In this study, we consider acquisition trajectory learning in the dynamic imaging setting. We design an end-to-end pipeline for the joint optimization of multiple per-frame acquisition trajectories along with a reconstruction neural network, and demonstrate improved image reconstruction quality in shorter acquisition times.
A. B. Bainson, J. Hermanns, P. Petsinis, N. Aavad, C. Dam Larsen, T. Swayne, A. Boyarski, D. Mottin, A. M. Bronstein, P. Karras, Spectral subgraph localization, Proc. Learning on Graphs Conference, 2023
Several graph analysis problems are based on some variant of subgraph isomorphism: Given two graphs, G and Q, does G contain a subgraph isomorphic to Q? As this problem is NP-complete, past work usually avoids addressing it explicitly. In this paper, we propose a method that localizes, i.e., finds the best-match position of, Q in G, by aligning their Laplacian spectra and enhance its stability via bagging strategies; we relegate the finding of an exact node correspondence from Q to G to a subsequent and separate graph alignment task. We demonstrate that our localization strategy outperforms a baseline based on the state-of-the-art method for graph alignment in terms of accuracy on real graphs and scales to hundreds of nodes as no other method does.
J. Hermanns, A. Tsitsulin, M. Munkhoeva, A. M. Bronstein, D. Mottin, P. Karras, GRASP: Graph Alignment through Spectral Signatures, Proc. Asia-Pacific Web (APWeb) and Web-Age Information Management (WAIM) Joint International Conference on Web and Big Data, 2022
What is the best way to match the nodes of two graphs? This graph alignment problem generalizes graph isomorphism and arises in applications from social network analysis to bioinformatics. Some solutions assume that auxiliary information on known matches or node or edge attributes is available, or utilize arbitrary graph features. Such methods fare poorly in the pure form of the problem, in which only graph structures are given. Other proposals translate the problem to one of aligning node embeddings, yet, by doing so, provide only a single-scale view of the graph. In this paper, we transfer the shape-analysis concept of functional maps from the continuous to the discrete case, and treat the graph alignment problem as a special case of the problem of finding a mapping between functions on graphs. We present GRASP, a method that first establishes a correspondence between functions derived from Laplacian matrix eigenvectors, which capture multiscale structural characteristics, and then exploits this correspondence to align nodes. Our experimental study, featuring noise levels higher than anything used in previous studies, shows that GRASP outperforms state-of-the-art methods for graph alignment across noise levels and graph types.
P. Kang, Z. Lin, Z. Yang, A. M. Bronstein, Q. Li, W. Liu, Deep fused two-step cross-modal hashing with multiple semantic supervision, Multimedia Tools and Applications, 2022
Existing cross-modal hashing methods ignore the informative multimodal joint information and cannot fully exploit the semantic labels. In this paper, we propose a deep fused two-step cross-modal hashing (DFTH) framework with multiple semantic supervision. In the first step, DFTH learns unified hash codes for instances by a fusion network. Semantic label and similarity reconstruction have been introduced to acquire binary codes that are informative, discriminative and semantic similarity preserving. In the second step, two modality-specific hash networks are learned under the supervision of common hash codes reconstruction, label reconstruction, and intra-modal and inter-modal semantic similarity reconstruction. The modality-specific hash networks can generate semantic preserving binary codes for out-of-sample queries. To deal with the vanishing gradients of binarization, continuous differentiable tanh is introduced to approximate the discrete sign function, making the networks able to back-propagate by automatic gradient computation. Extensive experiments on MIRFlickr25K and NUS-WIDE show the superiority of DFTH over state-of-the-art methods.
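The tanh relaxation mentioned in this abstract can be sketched in a few lines. This is a generic illustration rather than the DFTH implementation: the sharpness parameter `beta` and the function names are assumptions introduced here, and the abstract does not say whether any such scaling is used. The point is only that sign has zero derivative wherever it is differentiable, while tanh stays close to sign away from zero and still has a usable gradient.

```python
import math


def sign(x: float) -> float:
    """Discrete binarization to -1, 0 or +1; its derivative is 0 wherever defined."""
    return float((x > 0) - (x < 0))


def soft_sign(x: float, beta: float = 1.0) -> float:
    """Smooth surrogate tanh(beta * x); beta is a hypothetical sharpness parameter."""
    return math.tanh(beta * x)


def soft_sign_grad(x: float, beta: float = 1.0) -> float:
    """Derivative of tanh(beta * x) with respect to x: beta * (1 - tanh(beta * x)^2)."""
    t = math.tanh(beta * x)
    return beta * (1.0 - t * t)


if __name__ == "__main__":
    points = (-2.0, -0.5, 0.5, 2.0)
    for beta in (1.0, 5.0, 20.0):
        # Largest deviation from the hard sign on a few sample points.
        gap = max(abs(soft_sign(x, beta) - sign(x)) for x in points)
        print(f"beta={beta:5.1f}  max|tanh-sign|={gap:.4f}  grad(0.5)={soft_sign_grad(0.5, beta):.4f}")
```

Larger `beta` values track the sign function more closely but shrink the gradient away from zero, which is the usual trade-off when choosing such a surrogate.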
P. Kang, Z. Lin, Z. Yang, X. Fang, A. M. Bronstein, Q. Li, W. Liu, Intra-class low-rank regularization for supervised and semi-supervised cross-modal retrieval, Applied Intelligence, 52(1), pp. 33-54, 2022
Cross-modal retrieval aims to retrieve related items across different modalities, for example, using an image query to retrieve related text. The existing deep methods ignore both the intra-modal and inter-modal intra-class low-rank structures when fusing various modalities, which decreases the retrieval performance. In this paper, two deep models (denoted as ILCMR and Semi-ILCMR) based on intra-class low-rank regularization are proposed for supervised and semi-supervised cross-modal retrieval, respectively. Specifically, ILCMR integrates the image network and text network into a unified framework to learn a common feature space by imposing three regularization terms to fuse the cross-modal data. First, to align them in the label space, we utilize semantic consistency regularization to convert the data representations to probability distributions over the classes. Second, we introduce an intra-modal low-rank regularization, which encourages the intra-class samples that originate from the same space to be more relevant in the common feature space. Third, an inter-modal low-rank regularization is applied to reduce the cross-modal discrepancy. To enable the low-rank regularization to be optimized using automatic gradients during network back-propagation, we propose the rank-r approximation and specify the explicit gradients for theoretical completeness. In addition to the three regularization terms that rely on label information incorporated by ILCMR, we propose Semi-ILCMR in the semi-supervised regime, which introduces a low-rank constraint before projecting the general representations into the common feature space. Extensive experiments on four public cross-modal datasets demonstrate the superiority of ILCMR and Semi-ILCMR over other state-of-the-art methods.
Y. Nemcovsky, M. Jacoby, A. M. Bronstein, C. Baskin, Physical passive patch adversarial attacks on visual odometry systems, Proc. ACCV, 2022
Deep neural networks are known to be susceptible to adversarial perturbations: small perturbations that alter the output of the network and exist under strict norm limitations. While such perturbations are usually discussed as tailored to a specific input, a universal perturbation can be constructed to alter the model’s output on a set of inputs. Universal perturbations present a more realistic case of adversarial attacks, as awareness of the model’s exact input is not required. In addition, the universal attack setting raises the subject of generalization to unseen data, where given a set of inputs, the universal perturbations aim to alter the model’s output on out-of-sample data. In this work, we study physical passive patch adversarial attacks on visual odometry-based autonomous navigation systems. A visual odometry system aims to infer the relative camera motion between two corresponding viewpoints, and is frequently used by vision-based autonomous navigation systems to estimate their state. For such navigation systems, a patch adversarial perturbation poses a severe security issue, as it can be used to mislead a system onto some collision course. To the best of our knowledge, we show for the first time that the error margin of a visual odometry model can be significantly increased by deploying patch adversarial attacks in the scene. We provide evaluation on synthetic closed-loop drone navigation data and demonstrate that a comparable vulnerability exists in real data.
L. Ackerman-Schraier, A. A. Rosenberg, A. Marx, A. M. Bronstein, Machine learning approaches demonstrate that protein structures carry information about their genetic coding, Nature Scientific Reports, 2022
Synonymous codons translate into the same amino acid. Although the identity of synonymous codons is often considered inconsequential to the final protein structure, there is mounting evidence for an association between the two. Our study examined this association using regression and classification models, finding that codon sequences predict protein backbone dihedral angles with a lower error than amino acid sequences, and that models trained with true dihedral angles have better classification of synonymous codons given structural information than models trained with random dihedral angles. Using this classification approach, we investigated local codon-codon dependencies and tested whether synonymous codon identity can be predicted more accurately from codon context than amino acid context alone, and most specifically which codon context position carries the most predictive power.
A. A. Rosenberg, N. Yehishalom, A. Marx, A. M. Bronstein, Defining amino acid pairs as structural units suggests mutation sensitivity to adjacent residues, bioRxiv/2022/513383, 2022
Proteins fold from chains of amino acids, forming secondary structures, α-helices and β-strands, that, at least for globular proteins, subsequently fold into a three-dimensional structure. A large-scale analysis of high-resolution protein structures suggests that amino acid pairs constitute another layer of ordered structure, more local than these conventionally defined secondary structures. We develop a cross-peptide-bond Ramachandran plot that captures the conformational preferences of the amino acid pairs and show that the effect of a particular mutation on the stability of a protein depends in a predictable manner on the adjacent amino acid context.
A. Rosenberg, A. Marx, A. M. Bronstein, Codon-specific Ramachandran plots show amino acid backbone conformation depends on identity of the translated codon, Nature Communications, 2022
Synonymous codons translate into chemically identical amino acids. Once considered inconsequential to the formation of the protein product, there is now significant evidence to suggest that codon usage affects co-translational protein folding and the final structure of the expressed protein. Here we develop a method for computing and comparing codon-specific Ramachandran plots and demonstrate that the backbone dihedral angle distributions of some synonymous codons are distinguishable with statistical significance for some secondary structures. This shows that there exists a dependence between codon identity and backbone torsion of the translated amino acid. Although these findings cannot pinpoint the causal direction of this dependence, we discuss the vast biological implications should coding be shown to directly shape protein conformation and demonstrate the usefulness of this method as a tool for probing associations between codon usage and protein structure. Finally, we urge the inclusion of exact genetic information in structural databases.
E. Rozenberg, A. Karnieli, O. Yesharim, J. Foley-Comer, S. Trajtenberg-Mills, D. Freedman, A. M. Bronstein, A. Arie, Inverse design of spontaneous parametric downconversion for generation of high-dimensional qudits, Optica 9, 602-615, 2022
Spontaneous parametric down-conversion in quantum optics is an invaluable resource for the realization of high-dimensional qudits with spatial modes of light. One of the main open challenges is how to directly generate a desirable qudit state in the SPDC process. This problem can be addressed through advanced computational learning methods; however, due to difficulties in modeling the SPDC process by a fully differentiable algorithm that takes into account all interaction effects, progress has been limited. Here, we overcome these limitations and introduce a physically-constrained and differentiable model, validated against experimental results for shaped pump beams and structured crystals, capable of learning every interaction parameter in the process. We avoid any restrictions induced by the stochastic nature of our physical model and integrate the dynamic equations governing the evolution under the SPDC Hamiltonian. We solve the inverse problem of designing a nonlinear quantum optical system that achieves the desired quantum state of down-converted photon pairs. The desired states are defined using either the second-order correlations between different spatial modes or by specifying the required density matrix. By learning nonlinear volume holograms as well as different pump shapes, we successfully show how to generate maximally entangled states. Furthermore, we simulate all-optical coherent control over the generated quantum state by actively changing the profile of the pump beam. Our work can be useful for applications such as novel designs of high-dimensional quantum key distribution and quantum information processing protocols. In addition, our method can be readily applied for controlling other degrees of freedom of light in the SPDC process, such as the spectral and temporal properties, and may even be used in condensed-matter systems having a similar interaction Hamiltonian.
N. Talati, H. Ye, S. Vedula, K.-Y. Chen, Y. Chen, D. Liu, Y. Yuan, D. Blaauw, A. M. Bronstein, T. Mudge, R. Dreslinski, Mint: An Accelerator For Mining Temporal Motifs, Proc. MICRO, 2022
A variety of complex systems, including social and communication networks, financial markets, biology, and neuroscience are modeled using temporal graphs that contain a set of nodes and directed timestamped edges. Temporal motifs in temporal graphs are generalized from subgraph patterns in static graphs in that they also account for edge ordering and time duration, in addition to the graph structure. Mining temporal motifs is a fundamental problem used in several application domains. However, existing software frameworks offer suboptimal performance due to high algorithmic complexity and irregular memory accesses of temporal motif mining. This paper presents Mint, a novel accelerator architecture and a programming model for mining temporal motifs efficiently. We first divide this workload into three fundamental tasks: search, book-keeping, and backtracking. Based on this, we propose a task-centric programming model that enables decoupled, asynchronous execution. This model unlocks massive opportunities for parallelism, and allows storing task context information on-chip. To best utilize the proposed programming model, we design a domain-specific hardware accelerator using its data path and memory subsystem design to cater to the unique workload characteristics of temporal motif mining. To further improve performance, we propose a novel optimization called search index memoization that significantly reduces memory traffic. We comprehensively compare the performance of Mint with state-of-the-art temporal motif mining software frameworks (both approximate and exact) running on both CPU and GPU, and show 9×–2576× benefit in performance.
E. Zheltonozhskii, C. Baskin, A. Mendelson, A. M. Bronstein, O. Litany, Contrast to divide: Self-supervised pre-training for learning with noisy labels, Proc. of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2022
The success of learning with noisy labels (LNL) methods relies heavily on the success of a warm-up stage where standard supervised training is performed using the full (noisy) training set. In this paper, we identify a "warm-up obstacle": the inability of standard warm-up stages to train high quality feature extractors and avert memorization of noisy labels. We propose "Contrast to Divide" (C2D), a simple framework that solves this problem by pre-training the feature extractor in a self-supervised fashion. Using self-supervised pre-training boosts the performance of existing LNL approaches by drastically reducing the warm-up stage’s susceptibility to noise level, shortening its duration, and improving extracted feature quality. C2D works out of the box with existing methods and demonstrates markedly improved performance, especially in the high noise regime, where we get a boost of more than 27% for CIFAR-100 with 90% noise over the previous state of the art. In real-life noise settings, C2D trained on mini-WebVision outperforms previous works both in WebVision and ImageNet validation sets by 3% top-1 accuracy. We perform an in-depth analysis of the framework, including investigating the performance of different pre-training approaches and estimating the effective upper bound of the LNL performance with semi-supervised learning.
N. Diamant, N. Shandor, A. M. Bronstein, Delta-GAN-Encoder: Encoding semantic changes for explicit image editing, using few synthetic samples, arXiv:2111.08419, 2022
Delta-GAN-Encoder: Encoding semantic changes for explicit image editing, using few synthetic samples
N. Diamant, N. Shandor, A. M. Bronstein
arXiv:2111.08419, 2022
Understanding and controlling generative models’ latent space is a complex task. In this paper, we propose a novel method for learning to control any desired attribute in a pre-trained GAN’s latent space, for the purpose of editing synthesized and real-world data samples accordingly. We perform Sim2Real learning, relying on minimal samples to achieve an unlimited amount of continuous precise edits. We present an Autoencoder-based model that learns to encode the semantics of changes between images as a basis for editing new samples later on, achieving precise desired results (an example is shown in Fig. 1). While previous editing methods rely on a known structure of latent spaces (e.g., linearity of some semantics in StyleGAN), our method inherently does not require any structural constraints. We demonstrate our method in the domain of facial imagery: editing different expressions, poses, and lighting attributes, achieving state-of-the-art results.
T. Blau, R. Ganz, B. Kawar, A. M. Bronstein, M. Elad, Threat model-agnostic adversarial defense using diffusion models, arXiv preprint arXiv:2207.08089, 2022
Threat model-agnostic adversarial defense using diffusion models
T. Blau, R. Ganz, B. Kawar, A. M. Bronstein, M. Elad
arXiv preprint arXiv:2207.08089, 2022
Deep Neural Networks (DNNs) are highly sensitive to imperceptible malicious perturbations, known as adversarial attacks. Following the discovery of this vulnerability in real-world imaging and vision applications, the associated safety concerns have attracted vast research attention, and many defense techniques have been developed. Most of these defense methods rely on adversarial training (AT): training the classification network on images perturbed according to a specific threat model, which defines the magnitude of the allowed modification. Although AT leads to promising results, training on a specific threat model fails to generalize to other types of perturbations. A different approach utilizes a preprocessing step to remove the adversarial perturbation from the attacked image. In this work, we follow the latter path and aim to develop a technique that leads to robust classifiers across various realizations of threat models. To this end, we harness the recent advances in stochastic generative modeling, and means to leverage these for sampling from conditional distributions. Our defense relies on an addition of i.i.d. Gaussian noise to the attacked image, followed by a pretrained diffusion process, an architecture that performs a stochastic iterative process over a denoising network, yielding a high perceptual quality denoised outcome. The obtained robustness with this stochastic preprocessing step is validated through extensive experiments on the CIFAR-10 dataset, showing that our method outperforms the leading defense methods under various threat models.
A. Arbelle, S. Doveh, A. Alfassy, J. Shtok, G. Lev, E. Schwartz, H. Kuehne, H. Barak Levi, P. Sattigeri, R. Panda, C.-F. Chen, A. M. Bronstein, K. Saenko, S. Ullman, R. Giryes, R. Feris, L. Karlinsky, Detector-free weakly supervised grounding by separation, Proc. CVPR, 2022
Detector-free weakly supervised grounding by separation
A. Arbelle, S. Doveh, A. Alfassy, J. Shtok, G. Lev, E. Schwartz, H. Kuehne, H. Barak Levi, P. Sattigeri, R. Panda, C.-F. Chen, A. M. Bronstein, K. Saenko, S. Ullman, R. Giryes, R. Feris, L. Karlinsky
Proc. CVPR, 2022
Nowadays, there is an abundance of data involving images and surrounding free-form text weakly corresponding to those images. Weakly Supervised phrase-Grounding (WSG) deals with the task of using this data to learn to localize (or to ground) arbitrary text phrases in images without any additional annotations. However, most recent SotA methods for WSG assume the existence of a pre-trained object detector, relying on it to produce the ROIs for localization. In this work, we focus on the task of Detector-Free WSG (DF-WSG) to solve WSG without relying on a pre-trained detector. We directly learn everything from the images and associated free-form text pairs, thus potentially gaining an advantage on the categories unsupported by the detector. The key idea behind our proposed Grounding by Separation (GbS) method is synthesizing ‘text to image-regions’ associations by random alpha-blending of arbitrary image pairs and using the corresponding texts of the pair as conditions to recover the alpha map from the blended image via a segmentation network. At test time, this allows using the query phrase as a condition for a non-blended query image, thus interpreting the test image as a composition of a region corresponding to the phrase and the complement region. Using this approach we demonstrate a significant accuracy improvement, of up to 8.5% over previous DF-WSG SotA, for a range of benchmarks including Flickr30K, Visual Genome, and ReferIt, as well as a significant complementary improvement (above 7%) over the detector-based approaches for WSG.
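The alpha-blending step described in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: images are small grayscale grids stored as nested lists, the alpha map is drawn independently per pixel from a uniform distribution (the paper may generate it differently), and all function names are hypothetical. The segmentation network that recovers the alpha map is not shown.

```python
import random
from typing import List, Tuple

Image = List[List[float]]


def random_alpha_map(height: int, width: int, rng: random.Random) -> Image:
    """Per-pixel blending weights in [0, 1) (an assumed, very simple choice)."""
    return [[rng.random() for _ in range(width)] for _ in range(height)]


def alpha_blend(image_a: Image, image_b: Image, alpha: Image) -> Image:
    """Blend two equally sized images: alpha * A + (1 - alpha) * B."""
    return [
        [a * pa + (1.0 - a) * pb for pa, pb, a in zip(row_a, row_b, row_alpha)]
        for row_a, row_b, row_alpha in zip(image_a, image_b, alpha)
    ]


def make_training_example(image_a: Image, image_b: Image,
                          rng: random.Random) -> Tuple[Image, Image, Image]:
    """Return the blended image and the two per-text target maps.

    Conditioned on the text of image A the target is `alpha`;
    conditioned on the text of image B it is the complement `1 - alpha`.
    """
    alpha = random_alpha_map(len(image_a), len(image_a[0]), rng)
    blended = alpha_blend(image_a, image_b, alpha)
    complement = [[1.0 - a for a in row] for row in alpha]
    return blended, alpha, complement


# Hypothetical 2x2 example: an all-white image blended with an all-black one.
rng = random.Random(0)
white = [[1.0, 1.0], [1.0, 1.0]]
black = [[0.0, 0.0], [0.0, 0.0]]
blended, target_a, target_b = make_training_example(white, black, rng)
```

With a white and a black image the blended result equals the alpha map itself, which makes it easy to see what a network conditioned on the first text would be asked to recover.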
D. E. Fordham, D. Rosentraub, A. L. Polsky, T. Aviram, Y. Wolf, O. Perl, A. Devir, S. Rosentraub, D. H. Silver, Y. Gold Zamir, A. M. Bronstein, M. Lara Lara, J. Ben Nagi, A. Alvarez, S. Munné, Embryologist agreement when assessing blastocyst implantation probability: is data-driven prediction the solution to embryo assessment subjectivity?, Human Reproduction, Volume 37, Issue 10, Pages 2275–2290, 2022
Embryologist agreement when assessing blastocyst implantation probability: is data-driven prediction the solution to embryo assessment subjectivity?
D. E. Fordham, D. Rosentraub, A. L. Polsky, T. Aviram, Y. Wolf, O. Perl, A. Devir, S. Rosentraub, D. H. Silver, Y. Gold Zamir, A. M. Bronstein, M. Lara Lara, J. Ben Nagi, A. Alvarez, S. Munné
Human Reproduction, Volume 37, Issue 10, Pages 2275–2290, 2022
STUDY QUESTION
What is the accuracy and agreement of embryologists when assessing the implantation probability of blastocysts using time-lapse imaging (TLI), and can it be improved with a data-driven algorithm?
SUMMARY ANSWER
The overall interobserver agreement of a large panel of embryologists was moderate and prediction accuracy was modest, while the purpose-built artificial intelligence model generally resulted in higher performance metrics.
WHAT IS KNOWN ALREADY
Previous studies have demonstrated significant interobserver variability amongst embryologists when assessing embryo quality. However, data concerning embryologists’ ability to predict implantation probability using TLI is still lacking. Emerging technologies based on data-driven tools have shown great promise for improving embryo selection and predicting clinical outcomes.
STUDY DESIGN, SIZE, DURATION
TLI video files of 136 embryos with known implantation data were retrospectively collected from two clinical sites between 2018 and 2019 for the performance assessment of 36 embryologists and comparison with a deep neural network (DNN).
PARTICIPANTS/MATERIALS, SETTING, METHODS
We recruited 39 embryologists from 13 different countries. All participants were blinded to clinical outcomes. A total of 136 TLI videos of embryos that reached the blastocyst stage were used for this experiment. Each embryo’s likelihood of successfully implanting was assessed by 36 embryologists, providing implantation probability grades (IPGs) from 1 to 5, where 1 indicates a very low likelihood of implantation and 5 indicates a very high likelihood. Subsequently, three embryologists with over 5 years of experience provided Gardner scores. All 136 blastocysts were categorized into three quality groups based on their Gardner scores. Embryologist predictions were then converted into predictions of implantation (IPG ≥ 3) and no implantation (IPG ≤ 2). Embryologists’ performance and agreement were assessed using Fleiss kappa coefficient. A 10-fold cross-validation DNN was developed to provide IPGs for TLI video files. The model’s performance was compared to that of the embryologists.
MAIN RESULTS AND THE ROLE OF CHANCE
Logistic regression was employed for the following confounding variables: country of residence, academic level, embryo scoring system, log years of experience and experience using TLI. None were found to have a statistically significant impact on embryologist performance at α = 0.05. The average implantation prediction accuracy for the embryologists was 51.9% for all embryos (N = 136). The average accuracy of the embryologists when assessing top quality and poor quality embryos (according to the Gardner score categorizations) was 57.5% and 57.4%, respectively, and 44.6% for fair quality embryos. Overall interobserver agreement was moderate (κ = 0.56, N = 136). The best agreement was achieved in the poor + top quality group (κ = 0.65, N = 77), while the agreement in the fair quality group was lower (κ = 0.25, N = 59). The DNN showed an overall accuracy rate of 62.5%, with accuracies of 62.2%, 61% and 65.6% for the poor, fair and top quality groups, respectively. The AUC for the DNN was higher than that of the embryologists overall (0.70 DNN vs 0.61 embryologists) as well as in all of the Gardner groups (DNN vs embryologists: Poor, 0.69 vs 0.62; Fair, 0.67 vs 0.53; Top, 0.77 vs 0.54).
LIMITATIONS, REASONS FOR CAUTION
Blastocyst assessment was performed using video files acquired from time-lapse incubators, where each video contained data from a single focal plane. Clinical data regarding the underlying cause of infertility and endometrial thickness before the transfer was not available, yet may explain implantation failure and lower accuracy of IPGs. Implantation was defined as the presence of a gestational sac, whereas the detection of fetal heartbeat is a more robust marker of embryo viability. The raw data were anonymized to the extent that it was not possible to quantify the number of unique patients and cycles included in the study, potentially masking the effect of bias from a limited patient pool. Furthermore, the lack of demographic data makes it difficult to draw conclusions on how representative the dataset was of the wider population. Finally, embryologists were required to assess the implantation potential, not embryo quality. Although this is not the traditional approach to embryo evaluation, morphology/morphokinetics as a means of assessing embryo quality is believed to be strongly correlated with viability and, for some methods, implantation potential.
WIDER IMPLICATIONS OF THE FINDINGS
Embryo selection is a key element in IVF success and continues to be a challenge. Improving the predictive ability could assist in optimizing implantation success rates and other clinical outcomes and could minimize the financial and emotional burden on the patient. This study demonstrates moderate agreement rates between embryologists, likely due to the subjective nature of embryo assessment. In particular, we found that average embryologist accuracy and agreement were significantly lower for fair quality embryos when compared with that for top and poor quality embryos. Using data-driven algorithms as an assistive tool may help IVF professionals increase success rates and promote much needed standardization in the IVF clinic. Our results indicate a need for further research regarding technological advancement in this field.
S. Doveh, E. Schwartz, C. Xue, R. Feris, A. M. Bronstein, R. Giryes, L. Karlinsky, MetAdapt: Meta-learned task-adaptive architecture for few-shot classification, Pattern Recognition Letters, 2021
MetAdapt: Meta-learned task-adaptive architecture for few-shot classification
S. Doveh, E. Schwartz, C. Xue, R. Feris, A. M. Bronstein, R. Giryes, L. Karlinsky
Pattern Recognition Letters, 2021
Few-Shot Learning (FSL) is a topic of rapidly growing interest. Typically, in FSL a model is trained on a dataset consisting of many small tasks (meta-tasks) and learns to adapt to novel tasks that it will encounter during test time. This is also referred to as meta-learning. Another topic closely related to meta-learning with a lot of interest in the community is Neural Architecture Search (NAS), automatically finding optimal architecture instead of engineering it manually. In this work we combine these two aspects of meta-learning. So far, meta-learning FSL methods have focused on optimizing parameters of pre-defined network architectures, in order to make them easily adaptable to novel tasks. Moreover, it was observed that, in general, larger architectures perform better than smaller ones up to a certain saturation point (where they start to degrade due to over-fitting). However, little attention has been given to explicitly optimizing the architectures for FSL, nor to an adaptation of the architecture at test time to particular novel tasks. In this work, we propose to employ tools inspired by the Differentiable Neural Architecture Search (D-NAS) literature in order to optimize the architecture for FSL without over-fitting. Additionally, to make the architecture task adaptive, we propose the concept of ‘MetAdapt Controller’ modules. These modules are added to the model and are meta-trained to predict the optimal network connections for a given novel task. Using the proposed approach we observe state-of-the-art resu
T. Weiss, N. Peretz, S. Vedula, A. Feuer, A. M. Bronstein, Joint optimization of system design and reconstruction in MIMO radar imaging, Proc. IEEE Int'l Workshop on Machine Learning for Signal Processing, 2021
Joint optimization of system design and reconstruction in MIMO radar imaging
T. Weiss, N. Peretz, S. Vedula, A. Feuer, A. M. Bronstein
Proc. IEEE Int'l Workshop on Machine Learning for Signal Processing, 2021
Multiple-input multiple-output (MIMO) radar is one of the leading depth sensing modalities. However, the usage of multiple receive channels leads to relatively high costs and prevents the penetration of MIMO radars in many areas such as the automotive industry. Over the last years, a few studies have concentrated on designing reduced measurement schemes and image reconstruction schemes for MIMO radars; however, these problems have so far been addressed separately. On the other hand, recent works in optical computational imaging have demonstrated growing success of simultaneous learning-based design of the acquisition and reconstruction schemes, manifesting significant improvement in the reconstruction quality. Inspired by these successes, in this work, we propose to learn MIMO acquisition parameters in the form of receive (Rx) antenna element locations jointly with a neural-network-based image reconstruction. To this end, we propose an algorithm for training the combined acquisition-reconstruction pipeline end-to-end in a differentiable way. We demonstrate the significance of using our learned acquisition parameters with and without the neural-network reconstruction.
Y. Nahshan, B. Chmiel, C. Baskin, E. Zheltonozhskii, R. Banner, A. M. Bronstein, A. Mendelson, Loss aware post-training quantization, Machine Learning, 2021
Loss aware post-training quantization
Y. Nahshan, B. Chmiel, C. Baskin, E. Zheltonozhskii, R. Banner, A. M. Bronstein, A. Mendelson
Machine Learning, 2021
Neural network quantization enables the deployment of large models on resource-constrained devices. Current post-training quantization methods fall short in terms of accuracy for INT4 (or lower) but provide reasonable accuracy for INT8 (or above). In this work, we study the effect of quantization on the structure of the loss landscape. We show that the structure is flat and separable for mild quantization, enabling straightforward post-training quantization methods to achieve good results. We show that with more aggressive quantization, the loss landscape becomes highly non-separable with steep curvature, making the selection of quantization parameters more challenging. Armed with this understanding, we design a method that quantizes the layer parameters jointly, enabling significant accuracy improvement over current post-training quantization methods.
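The gap between mild and aggressive quantization mentioned in the abstract above can be illustrated with a plain round-to-nearest baseline. This is a minimal sketch and not the loss-aware method of the paper: it assumes symmetric uniform quantization with a max-abs scale, and the function names and the weight values are made up for the example.

```python
from typing import List


def quantize_uniform(values: List[float], bits: int) -> List[float]:
    """Symmetric round-to-nearest quantization with a max-abs scale.

    Returns the de-quantized values so they can be compared to the originals.
    """
    qmax = 2 ** (bits - 1) - 1          # 127 for INT8, 7 for INT4
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return [0.0 for _ in values]
    scale = max_abs / qmax              # width of one quantization step
    return [max(-qmax, min(qmax, round(v / scale))) * scale for v in values]


def mean_squared_error(a: List[float], b: List[float]) -> float:
    """Average squared difference between two equally long lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Hypothetical layer weights, chosen only for illustration.
weights = [0.1, -0.25, 0.33, 0.5, -0.07, 1.0]
mse_int8 = mean_squared_error(weights, quantize_uniform(weights, bits=8))
mse_int4 = mean_squared_error(weights, quantize_uniform(weights, bits=4))
```

Here the INT4 step is roughly 18 times wider than the INT8 step (1/7 versus 1/127 of the largest magnitude), so the rounding error grows accordingly; the paper's contribution concerns how to choose the quantization parameters jointly in that harder regime.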
C. Baskin, B. Chmiel, E. Zheltonozhskii, R. Banner, A. M. Bronstein, A. Mendelson, CAT: Compression-aware training for bandwidth reduction, JMLR, 2021
CAT: Compression-aware training for bandwidth reduction
C. Baskin, B. Chmiel, E. Zheltonozhskii, R. Banner, A. M. Bronstein, A. Mendelson
JMLR, 2021
Convolutional neural networks (CNNs) have become the dominant neural network architecture for solving visual processing tasks. One of the major obstacles hindering the ubiquitous use of CNNs for inference is their relatively high memory bandwidth requirements, which can be a main energy consumer and throughput bottleneck in hardware accelerators. Accordingly, an efficient feature map compression method can result in substantial performance gains. Inspired by quantization-aware training approaches, we propose a compression-aware training (CAT) method that involves training the model in a way that allows better compression of feature maps during inference. Our method trains the model to achieve low-entropy feature maps, which enables efficient compression at inference time using classical transform coding methods. CAT significantly improves the state-of-the-art results reported for quantization. For example, on ResNet-34 we achieve 73.1% accuracy (0.2% degradation from the baseline) with an average representation of only 1.79 bits per value.
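The link between low-entropy feature maps and a small number of bits per value can be made concrete with the empirical Shannon entropy, which lower-bounds the average code length of a lossless code for independently drawn symbols. The sketch below is only an illustration of that quantity, not the CAT training procedure; the function name and the two toy feature maps are assumptions.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def entropy_bits_per_value(symbols: Sequence[Hashable]) -> float:
    """Empirical Shannon entropy of a sequence, in bits per symbol."""
    n = len(symbols)
    if n == 0:
        return 0.0
    counts = Counter(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Hypothetical quantized feature maps (flattened), for illustration only.
dense_map = [0, 1, 2, 3, 0, 1, 2, 3]   # four levels, equally frequent
sparse_map = [0, 0, 0, 0, 0, 0, 0, 3]  # mostly zeros, e.g. after ReLU

dense_bits = entropy_bits_per_value(dense_map)
sparse_bits = entropy_bits_per_value(sparse_map)
```

The mostly-zero map has a much lower entropy than the evenly spread one, so an entropy coder can store it in fewer bits per value on average, which is the property that training for low-entropy activations aims to exploit.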
E. Amrani, A. M. Bronstein, Self-supervised classification network, Proc. ECCV, 2022
Self-supervised classification network
E. Amrani, A. M. Bronstein
Proc. ECCV, 2022
We present Self-Classifier, a novel self-supervised end-to-end classification neural network. Self-Classifier learns labels and representations simultaneously in a single-stage end-to-end manner by optimizing for same-class prediction of two augmented views of the same sample. To guarantee non-degenerate solutions (i.e., to avoid solutions where all labels are assigned to the same class), a uniform prior is asserted on the labels. We show mathematically that unlike the regular cross-entropy loss, our approach avoids such solutions. Self-Classifier is simple to implement and is scalable to practically unlimited amounts of data. Unlike other unsupervised classification approaches, it does not require any form of pre-training or the use of expectation maximization algorithms, pseudo-labelling or external clustering. Unlike other contrastive representation learning approaches, it does not require a memory bank or a second network. Despite its relative simplicity, our approach achieves results comparable to the state of the art on ImageNet, CIFAR10 and CIFAR100 for its two objectives: unsupervised classification and unsupervised representation learning. Furthermore, it is the first unsupervised end-to-end classification network to perform well on the large-scale ImageNet dataset. Code will be made available.
E. Rozenberg, D. Freedman, A. M. Bronstein, Learning to localize objects using limited annotation with applications to thoracic diseases, IEEE Access Vol. 9, 2021
Learning to localize objects using limited annotation with applications to thoracic diseases
E. Rozenberg, D. Freedman, A. M. Bronstein
IEEE Access Vol. 9, 2021
Motivation: The localization of objects in images is a longstanding objective within the field of image processing. Most current techniques are based on machine learning approaches, which typically require careful annotation of training samples in the form of expensive bounding box labels. The need for such large-scale annotation has only been exacerbated by the widespread adoption of deep learning techniques within the image processing community: deep learning is notoriously data-hungry. Method: In this work, we attack this problem directly by providing a new method for learning to localize objects with limited annotation: most training images can simply be annotated with their whole image labels (and no bounding box), with only a small fraction marked with bounding boxes. The training is driven by a novel loss function, which is a continuous relaxation of a well-defined discrete formulation of weakly supervised learning. Care is taken to ensure that the loss is numerically well-posed. Additionally, we propose a neural network architecture which accounts for both patch dependence, through the use of Conditional Random Field layers, and shift-invariance, through the inclusion of anti-aliasing filters. Results: We demonstrate our method on the task of localizing thoracic diseases in chest X-ray images, achieving state-of-the-art performance on the ChestX-ray14 dataset. We further show that with a modicum of additional effort our technique can be extended from object localization to object detection, attaining high quality results on the Kaggle RSNA Pneumonia Detection Challenge. Conclusion: The technique presented in this paper has the potential to enable high accuracy localization in regimes in which annotated data is either scarce or expensive to acquire. Future work will focus on applying the ideas presented in this paper to the realm of semantic segmentation.
T. Weiss, O. Senouf, S. Vedula, O. Michailovich, M. Zibulevsky, A. M. Bronstein, PILOT: Physics-Informed Learned Optimal Trajectories for accelerated MRI, Journal of Machine Learning for Biomedical Imaging (MELBA), 2021
PILOT: Physics-Informed Learned Optimal Trajectories for accelerated MRI
T. Weiss, O. Senouf, S. Vedula, O. Michailovich, M. Zibulevsky, A. M. Bronstein
Journal of Machine Learning for Biomedical Imaging (MELBA), 2021
Magnetic Resonance Imaging (MRI) has long been considered to be among “the gold standards” of diagnostic medical imaging. The long acquisition times, however, render MRI prone to motion artifacts, let alone their adverse contribution to the relatively high costs of MRI examination. Over the last few decades, multiple studies have focused on the development of both physical and post-processing methods for accelerated acquisition of MRI scans. These two approaches, however, have so far been addressed separately. On the other hand, recent works in optical computational imaging have demonstrated growing success of the concurrent learning-based design of data acquisition and image reconstruction schemes. Such schemes have already demonstrated substantial effectiveness, leading to considerably shorter acquisition times and improved quality of image reconstruction. Inspired by this initial success, in this work, we propose a novel approach to the learning of optimal schemes for conjoint acquisition and reconstruction of MRI scans, with the optimization carried out simultaneously with respect to the time-efficiency of data acquisition and the quality of resulting reconstructions. To be of practical value, the schemes are encoded in the form of general k-space trajectories, whose associated magnetic gradients are constrained to obey a set of predefined hardware requirements (as defined in terms of, e.g., peak currents and maximum slew rates of magnetic gradients). With this proviso in mind, we propose a novel algorithm for the end-to-end training of a combined acquisition-reconstruction pipeline using a deep neural network with differentiable forward- and backpropagation operators. We also demonstrate the effectiveness of the proposed solution in application to both image reconstruction and image segmentation, reporting substantial improvements in terms of acceleration factors as well as the quality of these end tasks.
Y. Elul, A. Rosenberg, A. Schuster, A. M. Bronstein, Y. Yaniv, Meeting the unmet needs of clinicians from AI systems showcased for cardiology with deep-learning-based ECG analysis, Proc. US National Academy of Sciences (PNAS), 2021
Meeting the unmet needs of clinicians from AI systems showcased for cardiology with deep-learning-based ECG analysis
Y. Elul, A. Rosenberg, A. Schuster, A. M. Bronstein, Y. Yaniv
Proc. US National Academy of Sciences (PNAS), 2021
Despite their great promise, artificial intelligence (AI) systems have yet to become ubiquitous in the daily practice of medicine largely due to several crucial unmet needs of healthcare practitioners. These include lack of explanations in clinically meaningful terms, handling the presence of unknown medical conditions, and transparency regarding the system’s limitations, both in terms of statistical performance as well as recognizing situations for which the system’s predictions are irrelevant. We articulate these unmet clinical needs as machine-learning (ML) problems and systematically address them with cutting-edge ML techniques. We focus on electrocardiogram (ECG) analysis as an example domain in which AI has great potential and tackle two challenging tasks: the detection of a heterogeneous mix of known and unknown arrhythmias from ECG and the identification of underlying cardio-pathology from segments annotated as normal sinus rhythm recorded in patients with an intermittent arrhythmia. We validate our methods by simulating a screening for arrhythmias in a large-scale population while adhering to statistical significance requirements. Specifically, our system 1) visualizes the relative importance of each part of an ECG segment for the final model decision; 2) upholds specified statistical constraints on its out-of-sample performance and provides uncertainty estimation for its predictions; 3) handles inputs containing unknown rhythm types; and 4) handles data from unseen patients while also flagging cases in which the model’s outputs are not usable for a specific patient. This work represents a significant step toward overcoming the limitations currently impeding the integration of AI into clinical practice in cardiology and medicine in general.
L. Karlinsky, J. Shtok, A. Alfassy, M. Lichtenstein, S. Harary, E. Schwartz, S. Doveh, P. Sattigeri, R. Feris, A. M. Bronstein, R. Giryes, StarNet: towards weakly supervised few-shot detection and explainable few-shot classification, Proc. AAAI, 2021
StarNet: towards weakly supervised few-shot detection and explainable few-shot classification
L. Karlinsky, J. Shtok, A. Alfassy, M. Lichtenstein, S. Harary, E. Schwartz, S. Doveh, P. Sattigeri, R. Feris, A. M. Bronstein, R. Giryes
Proc. AAAI, 2021
In this paper, we propose a new few-shot learning method called StarNet, which is an end-to-end trainable non-parametric star-model few-shot classifier. While being meta-trained using only image-level class labels, StarNet learns not only to predict the class labels for each query image of a few-shot task, but also to localize (via a heatmap) what it believes to be the key image regions supporting its prediction, thus effectively detecting the instances of the novel categories. The localization is enabled by StarNet’s ability to find large, arbitrarily shaped, semantically matching regions between all pairs of support and query images of a few-shot task. We evaluate StarNet on multiple few-shot classification benchmarks, attaining significant state-of-the-art improvement on CUB and ImageNetLOC-FS, and smaller improvements on other benchmarks. At the same time, in many cases, StarNet provides plausible explanations for its class label predictions, by highlighting the correctly paired novel category instances on the query and on its best matching support (for the predicted class). In addition, we test the proposed approach on the previously unexplored and challenging task of Weakly Supervised Few-Shot Object Detection (WS-FSOD), obtaining significant improvements over the baselines.
E. Amrani, R. Ben-Ari, D. Rotman, A. M. Bronstein, Noise estimation using density estimation for self-supervised multimodal learning, Proc. AAAI, 2021
Noise estimation using density estimation for self-supervised multimodal learning