Publications
A. Maddipatla, N. Bojan Sellam, M. Bojan, S. Vedula, P. Schanda, A. Marx, A. M. Bronstein, Inverse problems with experiment-guided AlphaFold, arXiv:2502.09372, 2025
Proteins exist as a dynamic ensemble of multiple conformations, and these motions are often crucial for their functions. However, current structure prediction methods predominantly yield a single conformation, overlooking the conformational heterogeneity revealed by diverse experimental modalities. Here, we present a framework for building experiment-grounded protein structure generative models that infer conformational ensembles consistent with measured experimental data. The key idea is to treat state-of-the-art protein structure predictors (e.g., AlphaFold3) as sequence-conditioned structural priors, and cast ensemble modeling as posterior inference of protein structures given experimental measurements. Through extensive real-data experiments, we demonstrate the generality of our method to incorporate a variety of experimental measurements. In particular, our framework uncovers previously unmodeled conformational heterogeneity from crystallographic densities, and generates high-accuracy NMR ensembles orders of magnitude faster than the status quo. Notably, we demonstrate that our ensembles outperform AlphaFold3 and sometimes fit experimental data better than structures publicly deposited in the Protein Data Bank (PDB). We believe that this approach will unlock building predictive models that fully embrace experimentally observed conformational diversity.
Y. Davidson, A. Philipp, S. Chakraborty, A. M. Bronstein, R. Gershoni-Poranne, How local is "local"? Deep learning reveals locality of the induced magnetic field of polycyclic aromatic hydrocarbons, ChemRxiv 10.26434/chemrxiv-2025-pqmcc, 2025
We investigate the locality of magnetic response in polycyclic aromatic molecules using a novel deep-learning approach. Our method employs graph neural networks (GNNs) with a graph-of-rings representation to predict Nucleus-Independent Chemical Shifts in the space around the molecule. We train a series of models, each time reducing the size of the largest molecules used in training. The accuracy of prediction remains high (MAE < 0.5 ppm), even when training the model only on molecules with up to 4 rings, thus providing strong evidence for the locality of magnetic response. To overcome the known problem of generalization of GNNs, we implement a k-hop expansion strategy and succeed in achieving accurate predictions for molecules with up to 15 rings (almost 4 times the size of the largest training example). Our findings have implications for understanding the magnetic response in complex molecules and demonstrate a promising approach to overcoming GNN scalability limitations. Furthermore, the trained models enable rapid characterization, without the need for more expensive DFT calculations.
S. A. Maddipatla, N. Bojan Sellam, S. Vedula, A. Marx, A. M. Bronstein, Generative modeling of protein ensembles guided by crystallographic electron densities, NeurIPS Workshop on Machine Learning in Structural Biology (MLSB), 2024
Proteins are dynamic, adopting ensembles of conformations. The nature of this conformational heterogeneity is imprinted in the raw electron density measurements obtained from X-ray crystallography experiments. Fitting an ensemble of protein structures to these measurements is a challenging, ill-posed inverse problem. We propose a non-i.i.d. ensemble guidance approach to solve this problem using existing protein structure generative models and demonstrate that it accurately recovers complicated multi-modal alternate protein backbone conformations observed in certain single crystal measurements.
E. Amrani, L. Karlinsky, A. M. Bronstein, Sample- and parameter-efficient auto-regressive image models, arXiv:2411.15648, 2024
We introduce XTRA, a vision model pre-trained with a novel auto-regressive objective that significantly enhances both sample and parameter efficiency compared to previous auto-regressive image models. Unlike contrastive or masked image modeling methods, which have not been demonstrated as having consistent scaling behavior on unbalanced internet data, auto-regressive vision models exhibit scalable and promising performance as model and dataset size increase. In contrast to standard auto-regressive models, XTRA employs a Block Causal Mask, where each Block represents k × k tokens rather than relying on a standard causal mask. By reconstructing pixel values block by block, XTRA captures higher-level structural patterns over larger image regions. Predicting on blocks allows the model to learn relationships across broader areas of pixels, enabling more abstract and semantically meaningful representations than traditional next-token prediction. This simple modification yields two key results. First, XTRA is sample-efficient. Despite being trained on 152× fewer samples (13.1M vs. 2B), XTRA ViT-H/14 surpasses the top-1 average accuracy of the previous state-of-the-art auto-regressive model across 15 diverse image recognition benchmarks. Second, XTRA is parameter-efficient. Compared to auto-regressive models trained on ImageNet-1k, XTRA ViT-B/16 outperforms in linear and attentive probing tasks, using 7-16× fewer parameters (85M vs. 1.36B/0.63B).
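As a rough illustration of the block causal mask mentioned in the abstract above, the sketch below builds an attention mask in which each token may attend to every token in its own block and in earlier blocks. The raster ordering of blocks, the row-major token layout, and the function name `block_causal_mask` are assumptions made for this example; they are not taken from the paper.

```python
def block_causal_mask(grid_h, grid_w, k):
    """Return mask[i][j] == True when token i may attend to token j.

    Tokens lie on a grid_h x grid_w grid (row-major index) and are grouped
    into k x k blocks. Blocks are ordered in raster order (an assumption).
    """
    if grid_h % k or grid_w % k:
        raise ValueError("grid size must be divisible by k")
    blocks_per_row = grid_w // k

    def block_id(token):
        row, col = divmod(token, grid_w)
        return (row // k) * blocks_per_row + (col // k)

    n = grid_h * grid_w
    ids = [block_id(t) for t in range(n)]
    # Token i sees token j when j's block is not later than i's block.
    return [[ids[j] <= ids[i] for j in range(n)] for i in range(n)]


# 4 x 4 tokens grouped into four 2 x 2 blocks.
mask = block_causal_mask(4, 4, 2)
```

With `k = 1` every block holds a single token and the mask reduces to the standard causal (lower-triangular) mask.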
T. Blau, M. Kimhi, Y. Belinkov, A. M. Bronstein, C. Baskin, Context-aware prompt tuning: advancing in-context learning with adversarial methods, arXiv:2410.17222, 2024
Fine-tuning Large Language Models (LLMs) typically involves updating at least a few billion parameters. A more parameter-efficient approach is Prompt Tuning (PT), which updates only a few learnable tokens. In-Context Learning (ICL), by contrast, adapts the model to a new task by simply including examples in the input, without any training. When applying optimization-based methods, such as fine-tuning and PT for few-shot learning, the model is specifically adapted to the small set of training examples, whereas ICL leaves the model unchanged. This distinction makes traditional learning methods more prone to overfitting; in contrast, ICL is less sensitive to the few-shot scenario. While ICL is not prone to overfitting, it does not fully extract the information that exists in the training examples. This work introduces Context-aware Prompt Tuning (CPT), a method inspired by ICL, PT, and adversarial attacks. We build on the ICL strategy of concatenating examples before the input, but we extend this by PT-like learning, refining the context embedding through iterative optimization to extract deeper insights from the training examples. We carefully modify specific context tokens, considering the unique structure of input and output formats. Inspired by adversarial attacks, we adjust the input based on the labels present in the context, focusing on minimizing, rather than maximizing, the loss. Moreover, we apply a projected gradient descent algorithm to keep token embeddings close to their original values, under the assumption that the user-provided data is inherently valuable. Our method achieves superior accuracy across multiple classification tasks using various LLMs.
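The projected gradient descent step mentioned in the abstract above can be sketched generically as follows. This is a minimal toy version, not the CPT implementation: the choice of an L-infinity ball of radius `eps`, the quadratic toy loss, and the names `project_linf` and `pgd_minimize` are all assumptions made for illustration.

```python
def project_linf(x, x0, eps):
    """Clip each coordinate of x into [x0 - eps, x0 + eps]."""
    return [min(max(xi, x0i - eps), x0i + eps) for xi, x0i in zip(x, x0)]


def pgd_minimize(grad_fn, x0, eps, lr=0.1, steps=100):
    """Minimize a loss by gradient steps, projecting back after each step."""
    x = list(x0)
    for _ in range(steps):
        g = grad_fn(x)
        x = [xi - lr * gi for xi, gi in zip(x, g)]  # descend: minimize the loss
        x = project_linf(x, x0, eps)                # stay close to the original
    return x


# Toy loss ||x - target||^2, whose gradient is 2 * (x - target).
target = [1.0, -2.0]
grad = lambda x: [2.0 * (xi - ti) for xi, ti in zip(x, target)]
result = pgd_minimize(grad, x0=[0.0, 0.0], eps=0.5)
```

Because the target lies outside the allowed ball, the iterate stops at the boundary closest to it rather than reaching the unconstrained minimum.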
B. Gahtan, R. J. Shahla, A. M. Bronstein, R. Cohen, Exploring QUIC dynamics: a large-scale dataset for encrypted traffic analysis, arXiv:2410.03728, 2024
QUIC, a new and increasingly used transport protocol, addresses and resolves the limitations of TCP by offering improved security, performance, and features such as stream multiplexing and connection migration. These features, however, also present challenges for network operators who need to monitor and analyze web traffic. In this paper, we introduce VisQUIC, a labeled dataset comprising over 100,000 QUIC traces from more than 44,000 websites (URLs), collected over a four-month period. These traces provide the foundation for generating more than seven million images, with configurable parameters of window length, pixel resolution, normalization, and labels. These images enable an observer looking at the interactions between a client and a server to analyze and gain insights about QUIC encrypted connections. To illustrate the dataset’s potential, we offer a use-case example of an observer estimating the number of HTTP/3 request/response pairs in a given QUIC connection, which can reveal server behavior, client–server interactions, and the load imposed by an observed connection. We formulate the problem as a discrete regression problem, train a machine learning (ML) model for it, and then evaluate it using the proposed dataset on an example use case.
B. Gahtan, S. Funk, E. Kodesh, I. Ketko, T. Kuflik, A. M. Bronstein, Automatic identification and visualization of group training activities using wearable data, arXiv:2410.05452, 2024
Human Activity Recognition (HAR) identifies daily activities from time-series data collected by wearable devices like smartwatches. Recent advancements in Internet of Things (IoT), cloud computing, and low-cost sensors have broadened HAR applications across fields like healthcare, biometrics, sports, and personal fitness. However, challenges remain in efficiently processing the vast amounts of data generated by these devices and developing models that can accurately recognize a wide range of activities from continuous recordings, without relying on predefined activity training sessions. This paper presents a comprehensive framework for imputing, analyzing, and identifying activities from wearable data, specifically targeting group training scenarios without explicit activity sessions. Our approach is based on data collected from 135 soldiers wearing Garmin 55 smartwatches over six months. The framework integrates multiple data streams, handles missing data through cross-domain statistical methods, and identifies activities with high accuracy using machine learning (ML). Additionally, we utilized statistical analysis techniques to evaluate the performance of each individual within the group, providing valuable insights into their respective positions in the group in an easy-to-understand visualization. These visualizations facilitate easy understanding of performance metrics, enhancing group interactions and informing individualized training programs. We evaluate our framework through traditional train-test splits and out-of-sample scenarios, focusing on the model’s generalization capabilities. Additionally, we address sleep data imputation without relying on ML, improving recovery analysis. Our findings demonstrate the potential of wearable data for accurately identifying group activities, paving the way for intelligent, data-driven training solutions.
B. Gahtan, R. J. Shahla, R. Cohen, A. M. Bronstein, Estimating the number of HTTP/3 responses in QUIC using deep learning, arXiv:2410.06140, 2024
QUIC, a new and increasingly used transport protocol, enhances TCP by providing better security, performance, and features like stream multiplexing. These features, however, also impose challenges for network middle-boxes that need to monitor and analyze web traffic. This paper proposes a novel solution for estimating the number of HTTP/3 responses in a given QUIC connection by an observer. This estimation reveals server behavior, client-server interactions, and data transmission efficiency, which is crucial for various applications such as designing a load balancing solution and detecting HTTP/3 flood attacks. The proposed scheme transforms QUIC connection traces into a sequence of images and trains machine learning (ML) models to predict the number of responses. Then, by aggregating images of a QUIC connection, an observer can estimate the total number of responses. As the problem is formulated as a discrete regression problem, we introduce a dedicated loss function. The proposed scheme is evaluated on a dataset of over seven million images, generated from 100,000 traces collected from over 44,000 websites over a four-month period, from various vantage points. The scheme achieves up to 97% cumulative accuracy in both known and unknown web server settings and 92% accuracy in estimating the total number of responses in unseen QUIC traces.
H. Abraham, B. Gahtan, A. Kobovich, O. Leitersdorf, A. M. Bronstein, E. Yaakobi, Beyond the alphabet: deep signal embedding for enhanced DNA clustering, arXiv:2410.06188, 2024
The emerging field of DNA storage employs strands of DNA bases (A/T/C/G) as a storage medium for digital information to enable massive density and durability. The DNA storage pipeline includes: (1) encoding the raw data into sequences of DNA bases; (2) synthesizing the sequences as DNA strands that are stored over time as an unordered set; (3) sequencing the DNA strands to generate DNA reads; and (4) deducing the original data. The DNA synthesis and sequencing stages each generate several independent error-prone duplicates of each strand, which are then utilized in the final stage to reconstruct the best estimate for the original strand. Specifically, the reads are first clustered into groups likely originating from the same strand (based on their similarity to each other), and then each group approximates the strand that led to the reads of that group. This work improves the DNA clustering stage by embedding it as part of the DNA sequencing. Traditional DNA storage solutions begin after the DNA sequencing process generates discrete DNA reads (A/T/C/G), yet we identify that there is untapped potential in using the raw signals generated by the Nanopore DNA sequencing machine before they are discretized into bases, a process known as basecalling, which is done using a deep neural network. We propose a deep neural network that clusters these signals directly, demonstrating superior accuracy and reduced computation times compared to current approaches that cluster after basecalling.
B. Gahtan, R. Cohen, A. M. Bronstein, E. Shapira, Data-driven cellular network selector for vehicle teleoperations, arXiv:2410.19791, 2024
Remote control of robotic systems, also known as teleoperation, is crucial for the development of autonomous vehicle (AV) technology. It allows a remote operator to view live video from AVs and, in some cases, to make real-time decisions. The effectiveness of video-based teleoperation systems is heavily influenced by the quality of the cellular network and, in particular, its packet loss rate and latency. To optimize these parameters, an AV can be connected to multiple cellular networks and determine in real time over which cellular network each video packet will be transmitted. We present an algorithm, called Active Network Selector (ANS), which uses a time series machine learning approach for solving this problem. We compare ANS to a baseline non-learning algorithm, which is used today in commercial systems, and show that ANS performs much better, with respect to both packet loss and packet latency.
T. Shor, C. Baskin, A. M. Bronstein, Leveraging latents for efficient thermography classification and segmentation, Proc. Medical Imaging with Deep Learning (MIDL), 2024
Breast cancer is a prominent health concern worldwide, currently being the second most common and second-deadliest type of cancer in women. While current breast cancer diagnosis mainly relies on mammography imaging, in recent years the use of thermography for breast cancer imaging has been gaining popularity. Thermographic imaging relies on infrared cameras to capture body-emitted heat distributions. While these heat signatures have proven useful for computer-vision systems for accurate breast cancer segmentation and classification, prior work often relies on handcrafted feature engineering or complex architectures, potentially limiting the comparability and applicability of these methods. In this work, we present a novel algorithm for both breast cancer classification and segmentation. Rather than focusing efforts on manual feature and architecture engineering, our algorithm focuses on leveraging an informative, learned feature space, thus making our solution simpler to use and extend to other frameworks and downstream tasks, as well as more applicable to data-scarce settings. Our classification produces SOTA results, and ours is the first work to produce the segmentation regions studied in this paper.
A. A. Rosenberg, S. Vedula, A. M. Bronstein, A. Marx, Seeing Double: Molecular dynamics simulations reveal the stability of certain alternate protein conformations in crystal structures, bioRxiv 2024.08.31.610605, 2024
Proteins jiggle around, adopting ensembles of interchanging conformations. Here we show, through a large-scale analysis of the Protein Data Bank and using molecular dynamics simulations, that segments of protein chains can also commonly adopt dual, transiently stable conformations that are not explained by direct interactions. Our analysis highlights how alternate conformations can be maintained as non-interchanging, separated states intrinsic to the protein chain, namely through steric barriers or the adoption of transient secondary structure elements. We further demonstrate that despite the commonality of the phenomenon, current structural ensemble prediction methods fail to capture these bimodal distributions of conformations.
S. Vedula, V. Maiorca, L. Basile, F. Locatello, A. M. Bronstein, Scalable unsupervised alignment of general metric and non-metric structures, arXiv preprint arXiv:2406.13507, 2024 (also in Proc. ICML Workshop on AI4Science)
Aligning data from different domains is a fundamental problem in machine learning with broad applications across very different areas, most notably aligning experimental readouts in single-cell multiomics. Mathematically, this problem can be formulated as the minimization of disagreement of pair-wise quantities such as distances and is related to the Gromov-Hausdorff and Gromov-Wasserstein distances. Computationally, it is a quadratic assignment problem (QAP) that is known to be NP-hard. Prior works attempted to solve the QAP directly with entropic or low-rank regularization on the permutation, which is computationally tractable only for modestly-sized inputs, and encode only limited inductive bias related to the domains being aligned. We consider the alignment of metric structures formulated as a discrete Gromov-Wasserstein problem and instead of solving the QAP directly, we propose to learn a related well-scalable linear assignment problem (LAP) whose solution is also a minimizer of the QAP. We also show a flexible extension of the proposed framework to general non-metric dissimilarities through differentiable ranks. We extensively evaluate our approach on synthetic and real datasets from single-cell multiomics and neural latent spaces, achieving state-of-the-art performance while being conceptually and computationally simple.
A. A. Rosenberg, A. Marx, A. M. Bronstein, A dataset of alternately located segments in protein crystal structures, Scientific Data, 11 (783), 2024
Protein Data Bank (PDB) files list the relative spatial location of atoms in a protein structure as the final output of the process of fitting and refining to experimentally determined electron density measurements. Where experimental evidence exists for multiple conformations, atoms are modelled in alternate locations. Programs reading PDB files commonly ignore these alternate conformations by default, leaving users oblivious to the presence of alternate conformations in the structures they analyze. This has led to underappreciation of their prevalence and under-characterisation of their features, and has limited the accessibility of this high-resolution data representing structural ensembles. We have trawled PDB files to extract structural features of residues with alternately located atoms. The output includes the distance between alternate conformations and identifies the location of these segments within the protein chain and in proximity of all other atoms within a defined radius. This dataset should be of use in efforts to predict multiple structures from a single sequence and support studies investigating protein flexibility and the association with protein function.
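To make the notion of alternate locations concrete, the sketch below reads the altLoc character (column 17) from fixed-column ATOM/HETATM records and reports the distance between the "A" and "B" copies of each atom. It is only an illustration of the idea described in the abstract above, not the authors' extraction pipeline; the function name `altloc_distances` and the helper `atom_line`, which writes a minimal synthetic record, are hypothetical.

```python
import math
from collections import defaultdict


def altloc_distances(pdb_lines):
    """Map (chain, resSeq, atom name) -> distance between altLocs 'A' and 'B'.

    Uses the fixed-column ATOM/HETATM layout of the PDB format; atoms
    without both an 'A' and a 'B' alternate location are skipped.
    """
    coords = defaultdict(dict)
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        altloc = line[16]  # column 17: alternate location indicator
        if altloc == " ":
            continue
        key = (line[21], int(line[22:26]), line[12:16].strip())
        xyz = tuple(float(line[i:i + 8]) for i in (30, 38, 46))
        coords[key][altloc] = xyz
    return {
        key: math.dist(locs["A"], locs["B"])
        for key, locs in coords.items()
        if "A" in locs and "B" in locs
    }


def atom_line(serial, name, altloc, x, y, z, res_seq=10):
    # Hypothetical helper: writes a minimal fixed-column ATOM record
    # (residue SER, chain A) for demonstration purposes only.
    return (f"ATOM  {serial:5d} {name:<4}{altloc}SER A{res_seq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


lines = [
    atom_line(1, " CA ", "A", 0.0, 0.0, 0.0),
    atom_line(2, " CA ", "B", 3.0, 4.0, 0.0),
    atom_line(3, " CB ", " ", 1.0, 1.0, 1.0),  # no alternate location
]
distances = altloc_distances(lines)
```

Here only the CA atom has two alternate locations, so `distances` holds a single entry keyed by `("A", 10, "CA")`.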
D. Freedman, E. Rozenberg, A. M. Bronstein, A theoretical framework for an efficient normalizing flow-based solution to the Schrödinger equation, arXiv preprint arXiv:2406.00047, 2024
A central problem in quantum mechanics involves solving the Electronic Schrödinger Equation for a molecule or material. The Variational Monte Carlo approach to this problem approximates a particular variational objective via sampling, and then optimizes this approximated objective over a chosen parameterized family of wavefunctions, known as the ansatz. Recently neural networks have been used as the ansatz, with accompanying success. However, sampling from such wavefunctions has required the use of a Markov Chain Monte Carlo approach, which is inherently inefficient. In this work, we propose a solution to this problem via an ansatz which is cheap to sample from, yet satisfies the requisite quantum mechanical properties. We prove that a normalizing flow using the following two essential ingredients satisfies our requirements: (a) a base distribution which is constructed from Determinantal Point Processes; (b) flow layers which are equivariant to a particular subgroup of the permutation group. We then show how to construct both continuous and discrete normalizing flows which satisfy the requisite equivariance. We further demonstrate the manner in which the non-smooth nature (“cusps”) of the wavefunction may be captured, and how the framework may be generalized to provide induction across multiple molecules. The resulting theoretical framework entails an efficient approach to solving the Electronic Schrödinger Equation.
Y. Elul, E. Rozenberg, A. Boyarski, Y. Yaniv, A. Schuster, A. M. Bronstein, Data-driven modeling of interrelated dynamical systems, Nature Communications Physics (7), 144, 2024
Non-linear dynamical systems describe numerous real-world phenomena, ranging from the weather to financial markets and disease progression. Individual systems may share substantial common information, for example patients’ anatomy. Lately, deep learning has emerged as a leading method for data-driven modeling of non-linear dynamical systems. Yet, despite recent breakthroughs, prior works largely ignored the existence of shared information between different systems. However, such cases are quite common, for example, in medicine: we may wish to have a patient-specific model for some disease, but the data collected from a single patient is usually too small to train a deep-learning model. Hence, we must properly utilize data gathered from other patients. Here, we explicitly consider such cases by jointly modeling multiple systems. We show that the current single-system models consistently fail when trying to learn simultaneously from multiple systems. We suggest a framework for jointly approximating the Koopman operators of multiple systems, while intrinsically exploiting common information. We demonstrate how we can adapt to a new system using an order of magnitude less new data and show the superiority of our model over competing methods, in terms of both forecasting ability and statistical fidelity, across chaotic, cardiac, and climate systems.
O. Wengrowicz, A. M. Bronstein, O. Cohen, Unsupervised physics-informed deep learning-based reconstruction for time-resolved imaging by multiplexed ptychography, Optics Express 32(6), pp. 8791-8803, 2024
We explore numerically an unsupervised, physics-informed, deep learning-based reconstruction technique for time-resolved imaging by multiplexed ptychography. In our method, the untrained deep learning model replaces the iterative algorithm’s update step, yielding superior reconstructions of multiple dynamic object frames compared to conventional methodologies. More precisely, we demonstrate improvements in image quality and resolution, while reducing sensitivity to the number of recorded frames, the mutual orthogonality of different probe modes, overlap between neighboring probe beams and the cutoff frequency of the ptychographic microscope, properties that are generally of paramount importance for ptychographic reconstruction algorithms.
G. Serussi, T. Shor, T. Hirshberg, C. Baskin, A. M. Bronstein, Active propulsion noise shaping for multi-rotor aircraft localization, Proc. Int'l Conf. on Intelligent Robots and Systems (IROS), 2024
Multi-rotor aerial autonomous vehicles (MAVs) primarily rely on vision for navigation purposes. However, visual localization and odometry techniques suffer from poor performance in low or direct sunlight, a limited field of view, and vulnerability to occlusions. Acoustic sensing can serve as a complementary or even alternative modality for vision in many situations, and it also has the added benefits of lower system cost and energy footprint, which is especially important for micro aircraft. This paper proposes actively controlling and shaping the aircraft propulsion noise generated by the rotors to benefit localization tasks, rather than considering it a harmful nuisance. We present a neural network architecture for self-noise-based localization in a known environment. We show that training it simultaneously with learning time-varying rotor phase modulation achieves accurate and robust localization. The proposed methods are evaluated using a computationally affordable simulation of MAV rotor noise in 2D acoustic environments that is fitted to real recordings of rotor pressure fields.
M. Pegoraro, S. Vedula, A. A. Rosenberg, I. Tallini, E. Rodolà, A. M. Bronstein, Vector quantile regression on manifolds, Proc. AIStats, 2024
Quantile regression (QR) is a statistical tool for distribution-free estimation of conditional quantiles of a target variable given explanatory features. QR is limited by the assumption that the target distribution is univariate and defined on a Euclidean domain. Although the notion of quantiles was recently extended to multi-variate distributions, QR for multi-variate distributions on manifolds remains underexplored, even though many important applications inherently involve data distributed on, e.g., spheres (climate and geological phenomena), and tori (dihedral angles in proteins). By leveraging optimal transport theory and c-concave functions, we meaningfully define conditional vector quantile functions of high-dimensional variables on manifolds (M-CVQFs). Our approach allows for quantile estimation, regression, and computation of conditional confidence sets and likelihoods. We demonstrate the approach’s efficacy and provide insights regarding the meaning of non-Euclidean quantiles through synthetic and real data experiments.
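For context, the classical scalar, Euclidean setting that the abstract above describes as a limitation can be illustrated with the pinball (check) loss, whose minimizer is the desired quantile. The sketch below covers only that univariate baseline, not the manifold method of the paper; the function names `pinball_loss` and `empirical_quantile` are chosen for this example.

```python
def pinball_loss(y, q, tau):
    """Pinball (check) loss of predicting q for the tau-quantile of y."""
    diff = y - q
    return tau * diff if diff >= 0 else (tau - 1.0) * diff


def empirical_quantile(samples, tau):
    """Return the sample value minimizing the total pinball loss."""
    return min(
        samples,
        key=lambda q: sum(pinball_loss(y, q, tau) for y in samples),
    )


samples = [1.0, 2.0, 3.0, 4.0, 100.0]
median = empirical_quantile(samples, 0.5)   # robust to the outlier 100.0
upper = empirical_quantile(samples, 0.9)    # pulled toward the upper tail
```

Under-predictions are weighted by `tau` and over-predictions by `1 - tau`, which is what makes the minimizer shift from the median toward the tails as `tau` changes.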
Y. Chen, H. Ye, S. Vedula, A. M. Bronstein, R. Dreslinski, T. Mudge, N. Talati, Demystifying graph sparsification algorithms in graph properties preservation, Proc. Int'l Conf. on Very Large Databases (VLDB), 2024
Graph sparsification is a technique that approximates a given graph by a sparse graph with a subset of vertices and/or edges. The goal of an effective sparsification algorithm is to maintain specific graph properties relevant to the downstream task while minimizing the graph’s size. Graph algorithms often suffer from long execution time due to the irregularity and the large real-world graph size. Graph sparsification can be applied to greatly reduce the run time of graph algorithms by substituting the full graph with a much smaller sparsified graph, without significantly degrading the output quality. However, the interaction between numerous sparsifiers and graph properties is not widely explored, and the potential of graph sparsification is not fully understood.
In this work, we cover 16 widely-used graph metrics, 12 representative graph sparsification algorithms, and 14 real-world input graphs spanning various categories, exhibiting diverse characteristics, sizes, and densities. We developed a framework to extensively assess the performance of these sparsification algorithms against graph metrics, and provide insights to the results. Our study shows that there is no one sparsifier that performs the best in preserving all graph properties, e.g. sparsifiers that preserve distance-related graph properties (eccentricity) struggle to perform well on Graph Neural Networks (GNN). This paper presents a comprehensive experimental study evaluating the performance of sparsification algorithms in preserving essential graph metrics. The insights inform future research in incorporating matching graph sparsification to graph algorithms to maximize benefits while minimizing quality degradation. Furthermore, we provide a framework to facilitate the future evaluation of evolving sparsification algorithms, graph metrics, and ever-growing graph data.A. A. Rosenberg, N. Yehishalom, A. Marx, A. M. Bronstein, An amino-domino model described by a cross-peptide-bond Ramachandran plot defines amino acid pairs as local structural units, Proc. US National Academy of Sciences (PNAS), 2023 detailsAn amino-domino model described by a cross-peptide-bond Ramachandran plot defines amino acid pairs as local structural units
A. A. Rosenberg, N. Yehishalom, A. Marx, A. M. Bronstein
Proc. US National Academy of Sciences (PNAS), 2023
Protein structure, both at the global and local level, dictates function. Proteins fold from chains of amino acids, forming secondary structures, α-helices and β-strands, that, at least for globular proteins, subsequently fold into a three-dimensional structure. Here, we show that a Ramachandran-type plot focusing on the two dihedral angles separated by the peptide bond, and entirely contained within an amino acid pair, defines a local structural unit. We further demonstrate the usefulness of this cross-peptide-bond Ramachandran plot by showing that it captures β-turn conformations in coil regions, that traditional Ramachandran plot outliers fall into occupied regions of our plot, and that thermophilic proteins prefer specific amino acid pair conformations. Further, we demonstrate experimentally that the effect of a point mutation on backbone conformation and protein stability depends on the amino acid pair context, i.e., the identity of the adjacent amino acid, in a manner predictable by our method.
T. Weiss, L. Cosmo, E. Mayo Yanes, S. Chakraborty, A. M. Bronstein, R. Gershoni-Poranne, Guided diffusion for inverse molecular design, Nature Computational Science 3(10), 873–882, 2023
Guided diffusion for inverse molecular design
T. Weiss, L. Cosmo, E. Mayo Yanes, S. Chakraborty, A. M. Bronstein, R. Gershoni-Poranne
Nature Computational Science 3(10), 873–882, 2023
The holy grail of materials science is de novo molecular design, i.e., the ability to engineer molecules with desired characteristics. Recently, this goal has become increasingly achievable thanks to developments such as equivariant graph neural networks that can better predict molecular properties, and to the improved performance of generation tasks, in particular of conditional generation, in text-to-image generators and large language models. Herein, we introduce GaUDI, a guided diffusion model for inverse molecular design, which combines these advances and can generate novel molecules with desired properties. GaUDI decouples the generator and the property-predicting models and can be guided using both point-wise targets and open-ended targets (e.g., minimum/maximum). We demonstrate GaUDI’s effectiveness using single- and multiple-objective tasks applied to newly-generated data sets of polycyclic aromatic systems, achieving nearly 100% validity of generated molecules. Further, for some tasks, GaUDI discovers better molecules than those present in our data set of 475k molecules.
E. Schwartz, A. M. Bronstein, R. Giryes, ISP distillation, IEEE Open Journal of Signal Processing 4, 12-20, 2023
ISP distillation
E. Schwartz, A. M. Bronstein, R. Giryes
IEEE Open Journal of Signal Processing 4, 12-20, 2023
Nowadays, many of the images captured are ‘observed’ by machines only and not by humans, e.g., in autonomous systems. High-level machine vision models, such as object recognition or semantic segmentation, assume images are transformed into some canonical image space by the camera Image Signal Processor (ISP). However, the camera ISP is optimized for producing visually pleasing images for human observers and not for machines. Therefore, one may spare the ISP compute time and apply vision models directly to RAW images. Yet, it has been shown that training such models directly on RAW images results in a performance drop. To mitigate this drop, we use a dataset of paired RAW and RGB images, which can be easily acquired with no human labeling. We then train a model that is applied directly to the RAW data by using knowledge distillation such that the model predictions for RAW images will be aligned with the predictions of an off-the-shelf pre-trained model for processed RGB images. Our experiments show that our performance on RAW images for object classification and semantic segmentation is significantly better than that of models trained on labeled RAW images. It also reasonably matches the predictions of a pre-trained model on processed RGB images, while saving the ISP compute overhead.
T. Blau, R. Ganz, C. Baskin, M. Elad, A. M. Bronstein, Classifier robustness enhancement via test-time transformation, arXiv preprint arXiv:2303.15409, 2023
Classifier robustness enhancement via test-time transformation
T. Blau, R. Ganz, C. Baskin, M. Elad, A. M. Bronstein
arXiv preprint arXiv:2303.15409, 2023
It has been recently discovered that adversarially trained classifiers exhibit an intriguing property, referred to as perceptually aligned gradients (PAG). PAG implies that the gradients of such classifiers possess a meaningful structure, aligned with human perception. Adversarial training is currently the best-known way to achieve classification robustness under adversarial attacks. The PAG property, however, has yet to be leveraged for further improving classifier robustness. In this work, we introduce Classifier Robustness Enhancement Via Test-Time Transformation (TETRA), a novel defense method that utilizes PAG, enhancing the performance of trained robust classifiers. Our method operates in two phases. First, it modifies the input image via a designated targeted adversarial attack into each of the dataset’s classes. Then, it classifies the input image based on the distance to each of the modified instances, with the assumption that the shortest distance relates to the true class. We show that the proposed method achieves state-of-the-art results and validate our claim through extensive experiments on a variety of defense methods, classifier architectures, and datasets. We also empirically demonstrate that TETRA can boost the accuracy of any differentiable adversarial training classifier across a variety of attacks, including ones unseen at training. Specifically, applying TETRA leads to substantial improvement of up to +23%, +20%, and +26% on CIFAR10, CIFAR100, and ImageNet, respectively.
E. Rozenberg, A. Karnieli, O. Yesharim, J. Foley-Comer, S. Trajtenberg-Mills, S. Mishra, S. Prabhakar, R. P. Singh, D. Freedman, A. M. Bronstein, A. Arie, Designing nonlinear photonic crystals for high-dimensional quantum state engineering, ICLR Workshop on Machine Learning for Materials, 2023
Designing nonlinear photonic crystals for high-dimensional quantum state engineering
E. Rozenberg, A. Karnieli, O. Yesharim, J. Foley-Comer, S. Trajtenberg-Mills, S. Mishra, S. Prabhakar, R. P. Singh, D. Freedman, A. M. Bronstein, A. Arie
ICLR Workshop on Machine Learning for Materials, 2023
We propose a novel, physically-constrained and differentiable approach for the generation of D-dimensional qudit states via spontaneous parametric downconversion (SPDC) in quantum optics. We circumvent any limitations imposed by the inherently stochastic nature of the physical process and incorporate a set of stochastic dynamical equations governing its evolution under the SPDC Hamiltonian. We demonstrate the effectiveness of our model through the design of structured nonlinear photonic crystals (NLPCs) and shaped pump beams, and show, theoretically and experimentally, how to generate maximally entangled states in the spatial degree of freedom. The learning of NLPC structures offers a promising new avenue for shaping and controlling arbitrary quantum states and enables all-optical coherent control of the generated states. We believe that this approach can readily be extended from bulky crystals to thin metasurfaces and potentially applied to other quantum systems sharing similar Hamiltonian structures, such as superfluids and superconductors.
E. Rozenberg, A. Karnieli, O. Yesharim, J. Foley-Comer, S. Trajtenberg-Mills, S. Mishra, S. Prabhakar, R. P. Singh, D. Freedman, A. M. Bronstein, A. Arie, A machine learning approach to generate quantum light, ICLR Workshop on Physics for Machine Learning, 2023
A machine learning approach to generate quantum light
E. Rozenberg, A. Karnieli, O. Yesharim, J. Foley-Comer, S. Trajtenberg-Mills, S. Mishra, S. Prabhakar, R. P. Singh, D. Freedman, A. M. Bronstein, A. Arie
ICLR Workshop on Physics for Machine Learning, 2023
Spontaneous parametric down-conversion (SPDC) is a key technique in quantum optics used to generate entangled photon pairs. However, generating a desirable D-dimensional qudit state in the SPDC process remains a challenge. In this paper, we introduce a physically-constrained and differentiable model to overcome this challenge, and demonstrate its effectiveness through the design of shaped pump beams and structured nonlinear photonic crystals. We avoid any restrictions induced by the stochastic nature of our physical process and integrate a set of stochastic dynamical equations governing its evolution under the SPDC Hamiltonian. Our model is capable of learning the relevant interaction parameters and designing nonlinear quantum optical systems that achieve desired quantum states. We show, theoretically and experimentally, how to generate maximally entangled states in the spatial degree of freedom. Additionally, we demonstrate all-optical coherent control of the generated state by reshaping the pump beam. Our work has potential applications in high-dimensional quantum key distribution and quantum information processing.
H. Ye, S. Vedula, Y. Chen, Y. Yang, A. M. Bronstein, R. Dreslinski, T. Mudge, N. Talati, GRACE: A Scalable Graph-Based Approach to Accelerating Recommendation Model Inference, Proc. ACM Int'l Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2023
GRACE: A Scalable Graph-Based Approach to Accelerating Recommendation Model Inference
H. Ye, S. Vedula, Y. Chen, Y. Yang, A. M. Bronstein, R. Dreslinski, T. Mudge, N. Talati
Proc. ACM Int'l Conf. on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2023
The high memory bandwidth demand of sparse embedding layers continues to be a critical challenge in scaling the performance of recommendation models. While prior works have exploited heterogeneous memory system designs and partial embedding sum memoization techniques, they offer limited benefits. This is because prior designs either target a very small subset of embeddings to simplify their analysis or incur a high processing cost to account for all embeddings, which does not scale with the large sizes of modern embedding tables. This paper proposes GRACE, a lightweight and scalable graph-based algorithm-system co-design framework to significantly improve the embedding layer performance of recommendation models. GRACE proposes a novel Item Co-occurrence Graph (ICG) that scalably records item co-occurrences. GRACE then presents a new system-aware ICG clustering algorithm to find frequently accessed item combinations of arbitrary lengths to compute and memoize their partial sums. High-frequency partial sums are stored in a software-managed cache space to reduce memory traffic and improve the throughput of computing sparse features. We further present a cache data layout and low-cost address computation logic to efficiently look up item embeddings and their partial sums. Our evaluation shows that GRACE significantly outperforms the state-of-the-art techniques SPACE and MERCI by 1.5× and 1.4×, respectively.
S. Vedula, I. Tallini, A. A. Rosenberg, M. Pegoraro, E. Rodolà, Y. Romano, A. M. Bronstein, Continuous vector quantile regression, Proc. ICML Workshop Frontiers4LCD, 2023
Continuous vector quantile regression
S. Vedula, I. Tallini, A. A. Rosenberg, M. Pegoraro, E. Rodolà, Y. Romano, A. M. Bronstein
Proc. ICML Workshop Frontiers4LCD, 2023
Vector quantile regression (VQR) estimates the conditional vector quantile function (CVQF), a fundamental quantity which fully represents the conditional distribution of Y|X. VQR is formulated as an optimal transport (OT) problem between a uniform U~μ and the target (X,Y)~ν, the solution of which is a unique transport map, co-monotonic with U. Recently, NL-VQR has been proposed to estimate non-linear CVQFs, together with fast solvers which enabled the use of this tool in practical applications. Despite its utility, the scalability and estimation quality of NL-VQR are limited due to a discretization of the OT problem onto a grid of quantile levels. We propose a novel continuous formulation and parametrization of VQR using partial input-convex neural networks (PICNNs). Our approach allows for accurate, scalable, differentiable and invertible estimation of non-linear CVQFs. We further demonstrate, theoretically and experimentally, how continuous CVQFs can be used for general statistical inference tasks: estimation of likelihoods, CDFs, confidence sets, coverage, sampling, and more. This work is an important step towards unlocking the full potential of VQR.
M. Pegoraro, S. Vedula, A. A. Rosenberg, I. Tallini, E. Rodolà, A. M. Bronstein, Vector quantile regression on manifolds, ICML Workshop on New Frontiers in Learning, Control, and Dynamical Systems, 2023
Vector quantile regression on manifolds
M. Pegoraro, S. Vedula, A. A. Rosenberg, I. Tallini, E. Rodolà, A. M. Bronstein
ICML Workshop on New Frontiers in Learning, Control, and Dynamical Systems, 2023
Quantile regression (QR) is a statistical tool for distribution-free estimation of conditional quantiles of a target variable given explanatory features. QR is limited by the assumption that the target distribution is univariate and defined on a Euclidean domain. Although the notion of quantiles was recently extended to multi-variate distributions, QR for multi-variate distributions on manifolds remains underexplored, even though many important applications inherently involve data distributed on, e.g., spheres (climate measurements), tori (dihedral angles in proteins), or Lie groups (attitude in navigation). By leveraging optimal transport theory and the notion of c-concave functions, we meaningfully define conditional vector quantile functions of high-dimensional variables on manifolds (M-CVQFs). Our approach allows for quantile estimation, regression, and computation of conditional confidence sets. We demonstrate the approach’s efficacy and provide insights regarding the meaning of non-Euclidean quantiles through preliminary synthetic data experiments.
T. Weiss, A. Wahab, A. M. Bronstein, R. Gershoni-Poranne, Interpretable deep learning unveils structure-property relationships in polybenzenoid hydrocarbons, Journal of Organic Chemistry, 2023
Interpretable deep learning unveils structure-property relationships in polybenzenoid hydrocarbons
T. Weiss, A. Wahab, A. M. Bronstein, R. Gershoni-Poranne
Journal of Organic Chemistry, 2023
In this work, interpretable deep learning was used to identify structure-property relationships governing the HOMO-LUMO gap and relative stability of polybenzenoid hydrocarbons (PBHs). To this end, a ring-based graph representation was used. In addition to affording reduced training times and excellent predictive ability, this representation could be combined with a subunit-based perception of PBHs, allowing chemical insights to be presented in terms of intuitive and simple structural motifs. The resulting insights agree with conventional organic chemistry knowledge and electronic structure-based analyses, and also reveal new behaviors and identify influential structural motifs. In particular, we evaluated and compared the effects of linear, angular, and branching motifs on these two molecular properties, as well as explored the role of dispersion in mitigating torsional strain inherent in non-planar PBHs. Hence, the observed regularities and the proposed analysis contribute to a deeper understanding of the behavior of PBHs and form the foundation for design strategies for new functional PBHs.
A. A. Rosenberg, S. Vedula, Y. Romano, A. M. Bronstein, Fast nonlinear vector quantile regression, Proc. ICML, 2023
Fast nonlinear vector quantile regression
A. A. Rosenberg, S. Vedula, Y. Romano, A. M. Bronstein
Proc. ICML, 2023
Quantile regression (QR) is a powerful tool for estimating one or more conditional quantiles of a target variable Y given explanatory features X. A limitation of QR is that it is only defined for scalar target variables, due to the formulation of its objective function, and since the notion of quantiles has no standard definition for multivariate distributions. Recently, vector quantile regression (VQR) was proposed as an extension of QR for high-dimensional target variables, thanks to a meaningful generalization of the notion of quantiles to multivariate distributions. Despite its elegance, VQR is arguably not applicable in practice due to several limitations: (i) it assumes a linear model for the quantiles of the target Y given the features X; (ii) its exact formulation is intractable even for modestly-sized problems in terms of target dimensions, the number of regressed quantile levels, or the number of features, and its relaxed dual formulation may violate the monotonicity of the estimated quantiles; (iii) no fast or scalable solvers for VQR currently exist. In this work we fully address these limitations, namely: (i) We extend VQR to the non-linear case, showing substantial improvement over linear VQR; (ii) We propose vector monotone rearrangement, a method which ensures the estimates obtained by VQR relaxations are monotone functions; (iii) We provide fast, GPU-accelerated solvers for linear and nonlinear VQR which maintain a fixed memory footprint with the number of samples and quantile levels, and demonstrate that they scale to millions of samples and thousands of quantile levels; (iv) We release an optimized Python package of our solvers to encourage the widespread use of VQR in real-world applications.
D. Zadok, O. Salzman, A. Wolf, A. M. Bronstein, Towards predicting fine finger motions from ultrasound images via kinematic representation, Proc. ICRA, 2023
Towards predicting fine finger motions from ultrasound images via kinematic representation
D. Zadok, O. Salzman, A. Wolf, A. M. Bronstein
Proc. ICRA, 2023
A central challenge in building robotic prostheses is the creation of a sensor-based system able to read physiological signals from the lower limb and instruct a robotic hand to perform various tasks. Existing systems typically perform discrete gestures such as pointing or grasping, by employing electromyography (EMG) or ultrasound (US) technologies to analyze the state of the muscles. In this work, we study the inference problem of identifying the activation of specific fingers from a sequence of US images when performing dexterous tasks such as keyboard typing or playing the piano. While estimating finger gestures has been done in the past by detecting prominent gestures, we are interested in classification done in the context of fine motions that evolve over time. We consider this task an important step towards higher adoption rates of robotic prostheses among arm amputees, as it has the potential to dramatically increase functionality in performing daily tasks. Our key observation, motivating this work, is that modeling the hand as a robotic manipulator allows encoding an intermediate representation wherein US images are mapped to said configurations. Given a sequence of such learned configurations, coupled with a neural-network architecture that exploits temporal coherence, we are able to infer fine finger motions. We evaluated our method by collecting data from a group of subjects and demonstrating how our framework can be used to replay music played or text typed. To the best of our knowledge, this is the first study demonstrating these downstream tasks within an end-to-end system.
A. M. Bronstein, A. Marx, Water stabilizes an alternate turn conformation in horse heart myoglobin, Nature Scientific Reports, 2023
Water stabilizes an alternate turn conformation in horse heart myoglobin
A. M. Bronstein, A. Marx
Nature Scientific Reports, 2023
Comparison of myoglobin structures reveals that protein isolated from horse heart consistently adopts an alternate turn conformation in comparison to its homologues. Analysis of hundreds of high-resolution structures discounts crystallization conditions or the surrounding amino acid protein environment as explaining this difference, which is also not captured by the AlphaFold prediction. Rather, a water molecule is identified as stabilizing the conformation in the horse heart structure, which immediately reverts to the whale conformation in molecular dynamics simulations excluding that structural water.
B. Gahtan, R. Cohen, A. M. Bronstein, G. Kedar, Using deep reinforcement learning for mmWave real-time scheduling, Proc. Int'l Conf. Network of the Future (NoF), 2023
Using deep reinforcement learning for mmWave real-time scheduling
B. Gahtan, R. Cohen, A. M. Bronstein, G. Kedar
Proc. Int'l Conf. Network of the Future (NoF), 2023
We study the problem of real-time scheduling in a multi-hop millimeter-wave (mmWave) mesh. We develop a model-free deep reinforcement learning algorithm called Adaptive Activator RL (AARL), which determines the subset of mmWave links that should be activated during each time slot and the power level for each link. The most important property of AARL is its ability to make scheduling decisions within the strict time slot constraints of typical 5G mmWave networks. AARL can handle a variety of network topologies, network loads, and interference models; it can also adapt to different workloads. We demonstrate the operation of AARL on several topologies: a small topology with 10 links, a moderately-sized mesh with 48 links, and a large topology with 96 links. For each topology, we compare the throughput obtained by AARL to that of a benchmark algorithm called RPMA (Residual Profit Maximizer Algorithm). The most important advantage of AARL compared to RPMA is that it is much faster and can make the necessary scheduling decisions very rapidly during every time slot, while RPMA cannot. In addition, the scheduling decisions made by AARL are of higher quality than those made by RPMA.
T. Shor, T. Weiss, D. Noti, A. M. Bronstein, Multi PILOT: Feasible learned multiple acquisition trajectories for dynamic MRI, Proc. Medical Imaging with Deep Learning (MIDL), 2023
Multi PILOT: Feasible learned multiple acquisition trajectories for dynamic MRI
T. Shor, T. Weiss, D. Noti, A. M. Bronstein
Proc. Medical Imaging with Deep Learning (MIDL), 2023
Dynamic Magnetic Resonance Imaging (MRI) is known to be a powerful and reliable technique for the dynamic imaging of internal organs and tissues, making it a leading diagnostic tool. A major difficulty in using MRI in this setting is the relatively long acquisition time (and, hence, increased cost) required for imaging at high spatio-temporal resolution, leading to the appearance of related motion artifacts and a decrease in resolution. Compressed Sensing (CS) techniques have become a common tool to reduce MRI acquisition time by subsampling images in the k-space according to some acquisition trajectory. Several studies have particularly focused on applying deep learning techniques to learn these acquisition trajectories in order to attain better image reconstruction, rather than using some predefined set of trajectories. To the best of our knowledge, learning acquisition trajectories has only been explored in the context of static MRI. In this study, we consider acquisition trajectory learning in the dynamic imaging setting. We design an end-to-end pipeline for the joint optimization of multiple per-frame acquisition trajectories along with a reconstruction neural network, and demonstrate improved image reconstruction quality in shorter acquisition times.
A. B. Bainson, J. Hermanns, P. Petsinis, N. Aavad, C. Dam Larsen, T. Swayne, A. Boyarski, D. Mottin, A. M. Bronstein, P. Karras, Spectral subgraph localization, Proc. Learning on Graphs Conference, 2023
Spectral subgraph localization
A. B. Bainson, J. Hermanns, P. Petsinis, N. Aavad, C. Dam Larsen, T. Swayne, A. Boyarski, D. Mottin, A. M. Bronstein, P. Karras
Proc. Learning on Graphs Conference, 2023
Several graph analysis problems are based on some variant of subgraph isomorphism: Given two graphs, G and Q, does G contain a subgraph isomorphic to Q? As this problem is NP-complete, past work usually avoids addressing it explicitly. In this paper, we propose a method that localizes, i.e., finds the best-match position of, Q in G by aligning their Laplacian spectra, and we enhance its stability via bagging strategies; we relegate the finding of an exact node correspondence from Q to G to a subsequent and separate graph alignment task. We demonstrate that our localization strategy outperforms a baseline based on the state-of-the-art method for graph alignment in terms of accuracy on real graphs and scales to hundreds of nodes as no other method does.
J. Hermanns, A. Tsitsulin, M. Munkhoeva, A. M. Bronstein, D. Mottin, P. Karras, GRASP: Graph Alignment through Spectral Signatures, Proc. Asia-Pacific Web (APWeb) and Web-Age Information Management (WAIM) Joint International Conference on Web and Big Data, 2022
GRASP: Graph Alignment through Spectral Signatures
J. Hermanns, A. Tsitsulin, M. Munkhoeva, A. M. Bronstein, D. Mottin, P. Karras
Proc. Asia-Pacific Web (APWeb) and Web-Age Information Management (WAIM) Joint International Conference on Web and Big Data, 2022
What is the best way to match the nodes of two graphs? This graph alignment problem generalizes graph isomorphism and arises in applications from social network analysis to bioinformatics. Some solutions assume that auxiliary information on known matches or node or edge attributes is available, or utilize arbitrary graph features. Such methods fare poorly in the pure form of the problem, in which only graph structures are given. Other proposals translate the problem to one of aligning node embeddings, yet, by doing so, provide only a single-scale view of the graph. In this paper, we transfer the shape-analysis concept of functional maps from the continuous to the discrete case, and treat the graph alignment problem as a special case of the problem of finding a mapping between functions on graphs. We present GRASP, a method that first establishes a correspondence between functions derived from Laplacian matrix eigenvectors, which capture multiscale structural characteristics, and then exploits this correspondence to align nodes. Our experimental study, featuring noise levels higher than anything used in previous studies, shows that GRASP outperforms state-of-the-art methods for graph alignment across noise levels and graph types.
P. Kang, Z. Lin, Z. Yang, A. M. Bronstein, Q. Li, W. Liu, Deep fused two-step cross-modal hashing with multiple semantic supervision, Multimedia Tools and Applications, 2022
Deep fused two-step cross-modal hashing with multiple semantic supervision
P. Kang, Z. Lin, Z. Yang, A. M. Bronstein, Q. Li, W. Liu
Multimedia Tools and Applications, 2022
Existing cross-modal hashing methods ignore the informative multimodal joint information and cannot fully exploit the semantic labels. In this paper, we propose a deep fused two-step cross-modal hashing (DFTH) framework with multiple semantic supervision. In the first step, DFTH learns unified hash codes for instances by a fusion network. Semantic label and similarity reconstruction have been introduced to acquire binary codes that are informative, discriminative and semantic similarity preserving. In the second step, two modality-specific hash networks are learned under the supervision of common hash codes reconstruction, label reconstruction, and intra-modal and inter-modal semantic similarity reconstruction. The modality-specific hash networks can generate semantic preserving binary codes for out-of-sample queries. To deal with the vanishing gradients of binarization, continuous differentiable tanh is introduced to approximate the discrete sign function, making the networks able to back-propagate by automatic gradient computation. Extensive experiments on MIRFlickr25K and NUS-WIDE show the superiority of DFTH over state-of-the-art methods.
P. Kang, Z. Lin, Z. Yang, X. Fang, A. M. Bronstein, Q. Li, W. Liu, Intra-class low-rank regularization for supervised and semi-supervised cross-modal retrieval, Applied Intelligence, 52(1), pp. 33-54, 2022
Intra-class low-rank regularization for supervised and semi-supervised cross-modal retrieval
P. Kang, Z. Lin, Z. Yang, X. Fang, A. M. Bronstein, Q. Li, W. Liu
Applied Intelligence, 52(1), pp. 33-54, 2022
Cross-modal retrieval aims to retrieve related items across different modalities, for example, using an image query to retrieve related text. The existing deep methods ignore both the intra-modal and inter-modal intra-class low-rank structures when fusing various modalities, which decreases the retrieval performance. In this paper, two deep models (denoted as ILCMR and Semi-ILCMR) based on intra-class low-rank regularization are proposed for supervised and semi-supervised cross-modal retrieval, respectively. Specifically, ILCMR integrates the image network and text network into a unified framework to learn a common feature space by imposing three regularization terms to fuse the cross-modal data. First, to align them in the label space, we utilize semantic consistency regularization to convert the data representations to probability distributions over the classes. Second, we introduce an intra-modal low-rank regularization, which encourages the intra-class samples that originate from the same space to be more relevant in the common feature space. Third, an inter-modal low-rank regularization is applied to reduce the cross-modal discrepancy. To enable the low-rank regularization to be optimized using automatic gradients during network back-propagation, we propose the rank-r approximation and specify the explicit gradients for theoretical completeness. In addition to the three regularization terms that rely on label information incorporated by ILCMR, we propose Semi-ILCMR in the semi-supervised regime, which introduces a low-rank constraint before projecting the general representations into the common feature space. Extensive experiments on four public cross-modal datasets demonstrate the superiority of ILCMR and Semi-ILCMR over other state-of-the-art methods.
Y. Nemcovsky, M. Jacoby, A. M. Bronstein, C. Baskin, Physical passive patch adversarial attacks on visual odometry systems, Proc. ACCV, 2022
Physical passive patch adversarial attacks on visual odometry systems
Y. Nemcovsky, M. Jacoby, A. M. Bronstein, C. Baskin
Proc. ACCV, 2022
Deep neural networks are known to be susceptible to adversarial perturbations: small perturbations that alter the output of the network and exist under strict norm limitations. While such perturbations are usually discussed as tailored to a specific input, a universal perturbation can be constructed to alter the model’s output on a set of inputs. Universal perturbations present a more realistic case of adversarial attacks, as awareness of the model’s exact input is not required. In addition, the universal attack setting raises the subject of generalization to unseen data, where given a set of inputs, the universal perturbations aim to alter the model’s output on out-of-sample data. In this work, we study physical passive patch adversarial attacks on visual odometry-based autonomous navigation systems. A visual odometry system aims to infer the relative camera motion between two corresponding viewpoints, and is frequently used by vision-based autonomous navigation systems to estimate their state. For such navigation systems, a patch adversarial perturbation poses a severe security issue, as it can be used to mislead a system onto some collision course. To the best of our knowledge, we show for the first time that the error margin of a visual odometry model can be significantly increased by deploying patch adversarial attacks in the scene. We provide evaluation on synthetic closed-loop drone navigation data and demonstrate that a comparable vulnerability exists in real data.
L. Ackerman-Schraier, A. A. Rosenberg, A. Marx, A. M. Bronstein, Machine learning approaches demonstrate that protein structures carry information about their genetic coding, Nature Scientific Reports, 2022 detailsMachine learning approaches demonstrate that protein structures carry information about their genetic coding
L. Ackerman-Schraier, A. A. Rosenberg, A. Marx, A. M. BronsteinNature Scientific Reports, 2022Synonymous codons translate into the same amino acid. Although the identity of synonymous codons is often considered inconsequential to the final protein structure, there is mounting evidence for an association between the two. Our study examined this association using regression and classification models, finding that codon sequences predict protein backbone dihedral angles with a lower error than amino acid sequences, and that models trained with true dihedral angles have better classification of synonymous codons given structural information than models trained with random dihedral angles. Using this classification approach, we investigated local codon-codon dependencies and tested whether synonymous codon identity can be predicted more accurately from codon context than amino acid context alone, and most specifically which codon context position carries the most predictive power.
A. A. Rosenberg, N. Yehishalom, A. Marx, A. M. Bronstein, Defining amino acid pairs as structural units suggests mutation sensitivity to adjacent residues, bioRxiv/2022/513383, 2022 detailsDefining amino acid pairs as structural units suggests mutation sensitivity to adjacent residues
A. A. Rosenberg, N. Yehishalom, A. Marx, A. M. BronsteinbioRxiv/2022/513383, 2022Proteins fold from chains of amino acids, forming secondary structures, α-helices and β-strands, that, at least for globular proteins, subsequently fold into a three-dimensional structure. A large-scale analysis of high-resolution protein structures suggests that amino acid pairs constitute another layer of ordered structure, more local than these conventionally defined secondary structures. We develop a cross-peptide-bond Ramachandran plot that captures the conformational preferences of the amino acid pairs and show that the effect of a particular mutation on the stability of a protein depends in a predictable manner on the adjacent amino acid context.
A. Rosenberg, A. Marx, A. M. Bronstein, Codon-specific Ramachandran plots show amino acid backbone conformation depends on identity of the translated codon, Nature Communications, 2022 detailsCodon-specific Ramachandran plots show amino acid backbone conformation depends on identity of the translated codon
A. Rosenberg, A. Marx, A. M. BronsteinNature Communications, 2022Synonymous codons translate into chemically identical amino acids. Once considered inconsequential to the formation of the protein product, there is now significant evidence to suggest that codon usage affects co-translational protein folding and the final structure of the expressed protein. Here we develop a method for computing and comparing codon-specific Ramachandran plots and demonstrate that the backbone dihedral angle distributions of some synonymous codons are distinguishable with statistical significance for some secondary structures. This shows that there exists a dependence between codon identity and backbone torsion of the translated amino acid. Although these findings cannot pinpoint the causal direction of this dependence, we discuss the vast biological implications should coding be shown to directly shape protein conformation and demonstrate the usefulness of this method as a tool for probing associations between codon usage and protein structure. Finally, we urge for the inclusion of exact genetic information into structural databases.
E. Rozenberg, A. Karnieli, O. Yesharim, J. Foley-Comer, S. Trajtenberg-Mills, D. Freedman, A. M. Bronstein, A. Arie, Inverse design of spontaneous parametric downconversion for generation of high-dimensional qudits, Optica 9, 602-615, 2022 detailsInverse design of spontaneous parametric downconversion for generation of high-dimensional qudits
E. Rozenberg, A. Karnieli, O. Yesharim, J. Foley-Comer, S. Trajtenberg-Mills, D. Freedman, A. M. Bronstein, A. ArieOptica 9, 602-615, 2022Spontaneous parametric down-conversion in quantum optics is an invaluable resource for the realization of high-dimensional qudits with spatial modes of light. One of the main open challenges is how to directly generate a desirable qudit state in the SPDC process. This problem can be addressed through advanced computational learning methods; however, due to difficulties in modeling the SPDC process by a fully differentiable algorithm that takes into account all interaction effects, progress has been limited. Here, we overcome these limitations and introduce a physically-constrained and differentiable model, validated against experimental results for shaped pump beams and structured crystals, capable of learning every interaction parameter in the process. We avoid any restrictions induced by the stochastic nature of our physical model and integrate the dynamic equations governing the evolution under the SPDC Hamiltonian. We solve the inverse problem of designing a nonlinear quantum optical system that achieves the desired quantum state of down-converted photon pairs. The desired states are defined using either the second-order correlations between different spatial modes or by specifying the required density matrix. By learning nonlinear volume holograms as well as different pump shapes, we successfully show how to generate maximally entangled states. Furthermore, we simulate all-optical coherent control over the generated quantum state by actively changing the profile of the pump beam. Our work can be useful for applications such as novel designs of high-dimensional quantum key distribution and quantum information processing protocols. In addition, our method can be readily applied for controlling other degrees of freedom of light in the SPDC process, such as the spectral and temporal properties, and may even be used in condensed-matter systems having a similar interaction Hamiltonian.
N. Talati, H. Ye, S. Vedula, K.-Y. Chen, Y. Chen, D. Liu, Y. Yuan, D. Blaauw, A. M. Bronstein, T. Mudge, R. Dreslinski, Mint: An Accelerator For Mining Temporal Motifs, Proc. MICRO, 2022 detailsMint: An Accelerator For Mining Temporal Motifs
N. Talati, H. Ye, S. Vedula, K.-Y. Chen, Y. Chen, D. Liu, Y. Yuan, D. Blaauw, A. M. Bronstein, T. Mudge, R. DreslinskiProc. MICRO, 2022A variety of complex systems, including social and communication networks, financial markets, biology, and neuroscience are modeled using temporal graphs that contain a set of nodes and directed timestamped edges. Temporal motifs in temporal graphs are generalized from subgraph patterns in static graphs in that they also account for edge ordering and time duration, in addition to the graph structure. Mining temporal motifs is a fundamental problem used in several application domains. However, existing software frameworks offer suboptimal performance due to high algorithmic complexity and irregular memory accesses of temporal motif mining. This paper presents Mint—a novel accelerator architecture and a programming model for mining temporal motifs efficiently. We first divide this workload into three fundamental tasks: search, book-keeping, and backtracking. Based on this, we propose a task–centric programming model that enables decoupled, asynchronous execution. This model unlocks massive opportunities for parallelism, and allows storing task context information on-chip. To best utilize the proposed programming model, we design a domain-specific hardware accelerator using its data path and memory subsystem design to cater to the unique workload characteristics of temporal motif mining. To further improve performance, we propose a novel optimization called search index memoization that significantly reduces memory traffic. We comprehensively compare the performance of Mint with state-of-the-art temporal motif mining software frameworks (both approximate and exact) running on both CPU and GPU, and show 9×–2576× benefit in performance.
E. Zheltonozhskii, C. Baskin, A. Mendelson, A. M. Bronstein, O. Litany, Contrast to divide: Self-supervised pre-training for learning with noisy labels, Proc. of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2022 detailsContrast to divide: Self-supervised pre-training for learning with noisy labels
E. Zheltonozhskii, C. Baskin, A. Mendelson, A. M. Bronstein, O. LitanyProc. of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2022The success of learning with noisy labels (LNL) methods relies heavily on the success of a warm-up stage where standard supervised training is performed using the full (noisy) training set. In this paper, we identify a “warm-up obstacle”: the inability of standard warm-up stages to train high quality feature extractors and avert memorization of noisy labels. We propose “Contrast to Divide” (C2D), a simple framework that solves this problem by pre-training the feature extractor in a self-supervised fashion. Using self-supervised pre-training boosts the performance of existing LNL approaches by drastically reducing the warm-up stage’s susceptibility to noise level, shortening its duration, and improving extracted feature quality. C2D works out of the box with existing methods and demonstrates markedly improved performance, especially in the high noise regime, where we get a boost of more than 27% for CIFAR-100 with 90% noise over the previous state of the art. In real-life noise settings, C2D trained on mini-WebVision outperforms previous works both in WebVision and ImageNet validation sets by 3% top-1 accuracy. We perform an in-depth analysis of the framework, including investigating the performance of different pre-training approaches and estimating the effective upper bound of the LNL performance with semi-supervised learning.
N. Diamant, N. Shandor, A. M. Bronstein, Delta-GAN-Encoder: Encoding semantic changes for explicit image editing, using few synthetic samples, arXiv:2111.08419, 2022 detailsDelta-GAN-Encoder: Encoding semantic changes for explicit image editing, using few synthetic samples
N. Diamant, N. Shandor, A. M. BronsteinarXiv:2111.08419, 2022Understanding and controlling generative models’ latent space is a complex task. In this paper, we propose a novel method for learning to control any desired attribute in a pre-trained GAN’s latent space, for the purpose of editing synthesized and real-world data samples accordingly. We perform Sim2Real learning, relying on minimal samples to achieve an unlimited amount of continuous precise edits. We present an Autoencoder-based model that learns to encode the semantics of changes between images as a basis for editing new samples later on, achieving precise desired results (example shown in Fig. 1). While previous editing methods rely on a known structure of latent spaces (e.g., linearity of some semantics in StyleGAN), our method inherently does not require any structural constraints. We demonstrate our method in the domain of facial imagery: editing different expressions, poses, and lighting attributes, achieving state-of-the-art results.
T. Blau, R. Ganz, B. Kawar, A. M. Bronstein, M. Elad, Threat model-agnostic adversarial defense using diffusion models, arXiv preprint arXiv:2207.08089, 2022 detailsThreat model-agnostic adversarial defense using diffusion models
T. Blau, R. Ganz, B. Kawar, A. M. Bronstein, M. EladarXiv preprint arXiv:2207.08089, 2022Deep Neural Networks (DNNs) are highly sensitive to imperceptible malicious perturbations, known as adversarial attacks. Following the discovery of this vulnerability in real-world imaging and vision applications, the associated safety concerns have attracted vast research attention, and many defense techniques have been developed. Most of these defense methods rely on adversarial training (AT) — training the classification network on images perturbed according to a specific threat model, which defines the magnitude of the allowed modification. Although AT leads to promising results, training on a specific threat model fails to generalize to other types of perturbations. A different approach utilizes a preprocessing step to remove the adversarial perturbation from the attacked image. In this work, we follow the latter path and aim to develop a technique that leads to robust classifiers across various realizations of threat models. To this end, we harness the recent advances in stochastic generative modeling, and means to leverage these for sampling from conditional distributions. Our defense relies on an addition of Gaussian i.i.d noise to the attacked image, followed by a pretrained diffusion process — an architecture that performs a stochastic iterative process over a denoising network, yielding a high perceptual quality denoised outcome. The obtained robustness with this stochastic preprocessing step is validated through extensive experiments on the CIFAR-10 dataset, showing that our method outperforms the leading defense methods under various threat models.
A. Arbelle, S. Doveh, A. Alfassy, J. Shtok, G. Lev, E. Schwartz, H. Kuehne, H. Barak Levi, P. Sattigeri, R. Panda, C.-F. Chen, A. M. Bronstein, K. Saenko, S. Ullman, R. Giryes, R. Feris, L. Karlinsky, Detector-free weakly supervised grounding by separation, Proc. CVPR, 2022 detailsDetector-free weakly supervised grounding by separation
A. Arbelle, S. Doveh, A. Alfassy, J. Shtok, G. Lev, E. Schwartz, H. Kuehne, H. Barak Levi, P. Sattigeri, R. Panda, C.-F. Chen, A. M. Bronstein, K. Saenko, S. Ullman, R. Giryes, R. Feris, L. KarlinskyProc. CVPR, 2022Nowadays, there is an abundance of data involving images and surrounding free-form text weakly corresponding to those images. Weakly Supervised phrase-Grounding (WSG) deals with the task of using this data to learn to localize (or to ground) arbitrary text phrases in images without any additional annotations. However, most recent SotA methods for WSG assume the existence of a pre-trained object detector, relying on it to produce the ROIs for localization. In this work, we focus on the task of Detector-Free WSG (DF-WSG) to solve WSG without relying on a pre-trained detector. We directly learn everything from the images and associated free-form text pairs, thus potentially gaining an advantage on the categories unsupported by the detector. The key idea behind our proposed Grounding by Separation (GbS) method is synthesizing ‘text to image-regions’ associations by random alpha-blending of arbitrary image pairs and using the corresponding texts of the pair as conditions to recover the alpha map from the blended image via a segmentation network. At test time, this allows using the query phrase as a condition for a non-blended query image, thus interpreting the test image as a composition of a region corresponding to the phrase and the complement region. Using this approach we demonstrate a significant accuracy improvement, of up to 8.5% over previous DF-WSG SotA, for a range of benchmarks including Flickr30K, Visual Genome, and ReferIt, as well as a significant complementary improvement (above 7%) over the detector-based approaches for WSG.
D. E. Fordham, D. Rosentraub, A. L. Polsky, T. Aviram, Y. Wolf, O. Perl, A. Devir, S. Rosentraub, D. H. Silver, Y. Gold Zamir, A. M. Bronstein, M. Lara Lara, J. Ben Nagi, A. Alvarez, S. Munné, Embryologist agreement when assessing blastocyst implantation probability: is data-driven prediction the solution to embryo assessment subjectivity?, Human Reproduction, Volume 37, Issue 10, Pages 2275–2290, 2022 detailsEmbryologist agreement when assessing blastocyst implantation probability: is data-driven prediction the solution to embryo assessment subjectivity?
D. E. Fordham, D. Rosentraub, A. L. Polsky, T. Aviram, Y. Wolf, O. Perl, A. Devir, S. Rosentraub, D. H. Silver, Y. Gold Zamir, A. M. Bronstein, M. Lara Lara, J. Ben Nagi, A. Alvarez, S. MunnéHuman Reproduction, Volume 37, Issue 10, Pages 2275–2290, 2022
STUDY QUESTION
What is the accuracy and agreement of embryologists when assessing the implantation probability of blastocysts using time-lapse imaging (TLI), and can it be improved with a data-driven algorithm?
SUMMARY ANSWER
The overall interobserver agreement of a large panel of embryologists was moderate and prediction accuracy was modest, while the purpose-built artificial intelligence model generally resulted in higher performance metrics.
WHAT IS KNOWN ALREADY
Previous studies have demonstrated significant interobserver variability amongst embryologists when assessing embryo quality. However, data concerning embryologists’ ability to predict implantation probability using TLI is still lacking. Emerging technologies based on data-driven tools have shown great promise for improving embryo selection and predicting clinical outcomes.
STUDY DESIGN, SIZE, DURATION
TLI video files of 136 embryos with known implantation data were retrospectively collected from two clinical sites between 2018 and 2019 for the performance assessment of 36 embryologists and comparison with a deep neural network (DNN).
PARTICIPANTS/MATERIALS, SETTING, METHODS
We recruited 39 embryologists from 13 different countries. All participants were blinded to clinical outcomes. A total of 136 TLI videos of embryos that reached the blastocyst stage were used for this experiment. Each embryo’s likelihood of successfully implanting was assessed by 36 embryologists, providing implantation probability grades (IPGs) from 1 to 5, where 1 indicates a very low likelihood of implantation and 5 indicates a very high likelihood. Subsequently, three embryologists with over 5 years of experience provided Gardner scores. All 136 blastocysts were categorized into three quality groups based on their Gardner scores. Embryologist predictions were then converted into predictions of implantation (IPG ≥ 3) and no implantation (IPG ≤ 2). Embryologists’ performance and agreement were assessed using Fleiss kappa coefficient. A 10-fold cross-validation DNN was developed to provide IPGs for TLI video files. The model’s performance was compared to that of the embryologists.
MAIN RESULTS AND THE ROLE OF CHANCE
Logistic regression was employed for the following confounding variables: country of residence, academic level, embryo scoring system, log years of experience and experience using TLI. None were found to have a statistically significant impact on embryologist performance at α = 0.05. The average implantation prediction accuracy for the embryologists was 51.9% for all embryos (N = 136). The average accuracy of the embryologists when assessing top quality and poor quality embryos (according to the Gardner score categorizations) was 57.5% and 57.4%, respectively, and 44.6% for fair quality embryos. Overall interobserver agreement was moderate (κ = 0.56, N = 136). The best agreement was achieved in the poor + top quality group (κ = 0.65, N = 77), while the agreement in the fair quality group was lower (κ = 0.25, N = 59). The DNN showed an overall accuracy rate of 62.5%, with accuracies of 62.2%, 61% and 65.6% for the poor, fair and top quality groups, respectively. The AUC for the DNN was higher than that of the embryologists overall (0.70 DNN vs 0.61 embryologists) as well as in all of the Gardner groups (DNN vs embryologists: Poor, 0.69 vs 0.62; Fair, 0.67 vs 0.53; Top, 0.77 vs 0.54).
LIMITATIONS, REASONS FOR CAUTION
Blastocyst assessment was performed using video files acquired from time-lapse incubators, where each video contained data from a single focal plane. Clinical data regarding the underlying cause of infertility and endometrial thickness before the transfer was not available, yet may explain implantation failure and lower accuracy of IPGs. Implantation was defined as the presence of a gestational sac, whereas the detection of fetal heartbeat is a more robust marker of embryo viability. The raw data were anonymized to the extent that it was not possible to quantify the number of unique patients and cycles included in the study, potentially masking the effect of bias from a limited patient pool. Furthermore, the lack of demographic data makes it difficult to draw conclusions on how representative the dataset was of the wider population. Finally, embryologists were required to assess the implantation potential, not embryo quality. Although this is not the traditional approach to embryo evaluation, morphology/morphokinetics as a means of assessing embryo quality is believed to be strongly correlated with viability and, for some methods, implantation potential.
WIDER IMPLICATIONS OF THE FINDINGS
Embryo selection is a key element in IVF success and continues to be a challenge. Improving the predictive ability could assist in optimizing implantation success rates and other clinical outcomes and could minimize the financial and emotional burden on the patient. This study demonstrates moderate agreement rates between embryologists, likely due to the subjective nature of embryo assessment. In particular, we foun