arXiv

arXiv is a free distribution service and open-access archive for scholarly articles, operated by Cornell University. It's where researchers share preprints of their work before peer review.


Revisiting Generalization Across Difficulty Levels: It's Not So Easy

We investigate how well large language models (LLMs) generalize across different task difficulties, a key question for effective data curation and evaluation. Existing research is mixed regarding whether training on easier or harder data leads to bet...


Canvas-to-Image: Compositional Image Generation with Multimodal Controls

While modern diffusion models excel at generating high-quality and diverse images, they still struggle with high-fidelity compositional and multimodal control, particularly when users simultaneously specify text prompts, subject references, spatial a...


TraceGen: World Modeling in 3D Trace Space Enables Learning from Cross-Embodiment Videos

Learning new robot tasks on new platforms and in new scenes from only a handful of demonstrations remains challenging. While videos of other embodiments - humans and different robots - are abundant, differences in embodiment, camera, and environment...


ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration

Large language models are powerful generalists, yet solving deep and complex problems such as those of the Humanity's Last Exam (HLE) remains both conceptually challenging and computationally expensive. We show that small orchestrators managing other...


G$^2$VLM: Geometry Grounded Vision Language Model with Unified 3D Reconstruction and Spatial Reasoning

Vision-Language Models (VLMs) still lack robustness in spatial intelligence, demonstrating poor performance on spatial understanding and reasoning tasks. We attribute this gap to the absence of a visual geometry learning process capable of reconstruc...


Matrix: Peer-to-Peer Multi-Agent Synthetic Data Generation Framework

Synthetic data has become increasingly important for training large language models, especially when real data is scarce, expensive, or privacy-sensitive. Many such generation tasks require coordinated multi-agent workflows, where specialized agents...


Holographically Emergent Gauge Theory in Symmetric Quantum Circuits

We develop a novel holographic framework for mixed-state phases in random quantum circuits, both unitary and non-unitary, with a global symmetry $G$. Viewing the circuit as a tensor network, we decompose it into two parts: a symmetric layer, which de...


Seeing without Pixels: Perception from Camera Trajectories

Can one perceive a video's content without seeing its pixels, just from the camera trajectory, the path it carves through space? This paper is the first to systematically investigate this seemingly implausible question. Towards this end, we propose a...


Agentic Learner with Grow-and-Refine Multimodal Semantic Memory

MLLMs exhibit strong reasoning on isolated queries, yet they operate de novo -- solving each problem independently and often repeating the same mistakes. Existing memory-augmented agents mainly store past trajectories for reuse. However, trajectory-b...


Lesion-Independent Thalamic Degeneration Identifies Intrinsically Vulnerable Nuclei Associated with Cognitive Impairment in Multiple Sclerosis

Cognitive impairment in multiple sclerosis (MS) is driven by both focal inflammation and compartmentalized neurodegeneration, yet the relative effect of lesion-independent thalamic atrophy on information processing speed (IPS) remains unclear. This r...


On Evolution-Based Models for Experimentation Under Interference

Causal effect estimation in networked systems is central to data-driven decision making. In such settings, interventions on one unit can spill over to others, and in complex physical or social systems, the interaction pathways driving these interfere...


Event-driven eligibility propagation in large sparse networks: efficiency shaped by biological realism

Despite remarkable technological advances, AI systems may still benefit from biological principles, such as recurrent connectivity and energy-efficient mechanisms. Drawing inspiration from the brain, we present a biologically plausible extension of t...


Revolutionizing Glioma Segmentation & Grading Using 3D MRI - Guided Hybrid Deep Learning Models

Gliomas are brain tumors with a high mortality rate, so early and accurate diagnosis is important for therapeutic intervention. To address this difficulty, the proposed research will develop a hybrid deep learning mod...


DSD: A Distributed Speculative Decoding Solution for Edge-Cloud Agile Large Model Serving

Large language model (LLM) inference often suffers from high decoding latency and limited scalability across heterogeneous edge-cloud environments. Existing speculative decoding (SD) techniques accelerate token generation but remain confined to singl...


Through the telecom lens: Are all training samples important?

The rise of AI in telecommunications, from optimizing Radio Access Networks to managing user experience, has sharply increased data volumes and training demands. Telecom data is often noisy, high-dimensional, costly to store, process, and label. Desp...


Escaping the Verifier: Learning to Reason via Demonstrations

Training Large Language Models (LLMs) to reason often relies on Reinforcement Learning (RL) with task-specific verifiers. However, many real-world reasoning-intensive tasks lack verifiers, despite offering abundant expert demonstrations that remain u...


Uncertainty Quantification for Visual Object Pose Estimation

Quantifying the uncertainty of an object's pose estimate is essential for robust control and planning. Although pose estimation is a well-studied robotics problem, attaching statistically rigorous uncertainty is not well understood without strict dis...


Finite Size Analysis of Decoy-State BB84 with Advantage Distillation

Advantage Distillation (AD) is a classical post-processing technique that enhances Quantum Key Distribution (QKD) protocols by increasing the maximum acceptable Quantum Bit Error Rate (QBER) and thus extending the distance at which QKD links can be s...


Attention-Guided Patch-Wise Sparse Adversarial Attacks on Vision-Language-Action Models

In recent years, Vision-Language-Action (VLA) models in embodied intelligence have developed rapidly. However, existing adversarial attack methods require costly end-to-end training and often generate noticeable perturbation patches. To address these...


Multi-Crit: Benchmarking Multimodal Judges on Pluralistic Criteria-Following

Large multimodal models (LMMs) are increasingly adopted as judges in multimodal evaluation systems due to their strong instruction following and consistency with human preferences. However, their ability to follow diverse, fine-grained evaluation cri...


FPGA-tailored algorithms for real-time decoding of quantum LDPC codes

Real-time decoding is crucial for fault-tolerant quantum computing but likely requires specialized hardware such as field-programmable gate arrays (FPGAs), whose parallelism can alter relative algorithmic performance. We analyze FPGA-tailored version...


EvilGenie: A Reward Hacking Benchmark

We introduce EvilGenie, a benchmark for reward hacking in programming settings. We source problems from LiveCodeBench and create an environment in which agents can easily reward hack, such as by hardcoding test cases or editing the testing files. We...


CaFlow: Enhancing Long-Term Action Quality Assessment with Causal Counterfactual Flow

Action Quality Assessment (AQA) predicts fine-grained execution scores from action videos and is widely applied in sports, rehabilitation, and skill evaluation. Long-term AQA, as in figure skating or rhythmic gymnastics, is especially challenging sin...


Continual Error Correction on Low-Resource Devices

The proliferation of AI models in everyday devices has highlighted a critical challenge: prediction errors that degrade user experience. While existing solutions focus on error detection, they rarely provide efficient correction mechanisms, especiall...


Rapid ground state energy estimation with a Sparse Pauli Dynamics-enabled Variational Double Bracket Flow

Ground state energy estimation for strongly correlated quantum systems remains a central challenge in computational physics and chemistry. While tensor network methods like DMRG provide efficient solutions for one-dimensional systems, higher-dimensio...


Factorisation conditions and causality for local measurements in QFT

Quantum operations that are perfectly admissible in non-relativistic quantum theory can enable signalling between spacelike separated regions when naively imported into quantum field theory (QFT). Prominent examples of such "impossible measurements",...


Aligning LLMs Toward Multi-Turn Conversational Outcomes Using Iterative PPO

Optimizing large language models (LLMs) for multi-turn conversational outcomes remains a significant challenge, especially in goal-oriented settings like AI marketing or sales agents who facilitate transactions via messaging platforms. The difficulty...


Bridging the Unavoidable A Priori: A Framework for Comparative Causal Modeling

AI/ML models have rapidly gained prominence as innovations for solving previously unsolved problems and their unintended consequences from amplifying human biases. Advocates for responsible AI/ML have sought ways to draw on the richer causal models o...


Mechanisms of Non-Monotonic Scaling in Vision Transformers

Deeper Vision Transformers often perform worse than shallower ones, which challenges common scaling assumptions. Through a systematic empirical analysis of ViT-S, ViT-B, and ViT-L on ImageNet, we identify a consistent three-phase Cliff-Plateau-Climb...


Qwen3-VL Technical Report

We introduce Qwen3-VL, the most capable vision-language model in the Qwen series to date, achieving superior performance across a broad range of multimodal benchmarks. It natively supports interleaved contexts of up to 256K tokens, seamlessly integra...


Tunable WS$_2$ Micro-Dome Open Cavity Single Photon Source

Versatile, tunable, and potentially scalable single-photon sources are a key asset in emergent photonic quantum technologies. In this work, a single-photon source based on WS$_2$ micro-domes, created via hydrogen ion irradiation, is realized and inte...


The author is dead, but what if they never lived? A reception experiment on Czech AI- and human-authored poetry

Large language models are increasingly capable of producing creative texts, yet most studies on AI-generated poetry focus on English -- a language that dominates training data. In this paper, we examine the perception of AI- and human-written Czech p...


Evidence For A Correlation Between Astrophysical Neutrinos and Radio Flares

We use data from the first two epochs of the Very Large Array Sky Survey (VLASS) and the IceCube Neutrino Observatory to search for evidence of a correlation between radio variability and the detection of astrophysical neutrinos. We find an excess nu...


Scale-Agnostic Kolmogorov-Arnold Geometry in Neural Networks

Recent work by Freedman and Mulligan demonstrated that shallow multilayer perceptrons spontaneously develop Kolmogorov-Arnold geometric (KAG) structure during training on synthetic three-dimensional tasks. However, it remained unclear whether this ph...


Active Learning for GCN-based Action Recognition

Despite the notable success of graph convolutional networks (GCNs) in skeleton-based action recognition, their performance often depends on large volumes of labeled data, which are frequently scarce in practical settings. To address this limitation,...


TAGFN: A Text-Attributed Graph Dataset for Fake News Detection in the Age of LLMs

Large Language Models (LLMs) have recently revolutionized machine learning on text-attributed graphs, but the application of LLMs to graph outlier detection, particularly in the context of fake news detection, remains significantly underexplored. One...


On the Origin of Algorithmic Progress in AI

Algorithms have been estimated to increase AI training FLOP efficiency by a factor of 22,000 between 2012 and 2023 [Ho et al., 2024]. Running small-scale ablation experiments on key innovations from this time period, we are able to account for less t...


Beyond URLs: Metadata Diversity and Position for Efficient LLM Pretraining

Incorporating metadata in Large Language Model (LLM) pretraining has recently emerged as a promising approach to accelerate training. However, prior work highlighted only one useful signal, URLs, leaving open the question of whether other forms of me...


Auxiliary Metrics Help Decoding Skill Neurons in the Wild

Large language models (LLMs) exhibit remarkable capabilities across a wide range of tasks, yet their internal mechanisms remain largely opaque. In this paper, we introduce a simple, lightweight, and broadly applicable method with a focus on isolating...


Lazy Quantum Walks with Native Multiqubit Gates

Quantum walks, the quantum analogue to the classical random walk, have been shown to model fluid dynamics. Neutral atom hardware is a promising choice of platform for implementing quantum walks due to its ability to implement native multiqubit ($\geq...


Beyond Accuracy: An Empirical Study of Uncertainty Estimation in Imputation

Handling missing data is a central challenge in data-driven analysis. Modern imputation methods not only aim for accurate reconstruction but also differ in how they represent and quantify uncertainty. Yet, the reliability and calibration of these unc...


ReSAM: Refine, Requery, and Reinforce: Self-Prompting Point-Supervised Segmentation for Remote Sensing Images

Interactive segmentation models such as the Segment Anything Model (SAM) have demonstrated remarkable generalization on natural images, but perform suboptimally on remote sensing imagery (RSI) due to severe domain shift and the scarcity of dense anno...


Detecting absence: A dedicated prediction-error signal emerging in the auditory thalamus

How does the brain know what is out there and what is not? Living organisms cannot rely solely on sensory signals for perception because they are noisy and ambiguous. To transform sensory signals into stable percepts, the brain uses its prior knowled...


The derivation of the Liouville equation from the Schrodinger equation and its implications

We present a new way of deriving classical mechanics from quantum mechanics. A key feature of the method is its compatibility with the standard approach used to derive transition rates between quantum states due to interactions. We apply the develope...


TAB-DRW: A DFT-based Robust Watermark for Generative Tabular Data

The rise of generative AI has enabled the production of high-fidelity synthetic tabular data across fields such as healthcare, finance, and public policy, raising growing concerns about data provenance and misuse. Watermarking offers a promising solu...


Dichroism from Thermoelectric Chiral Drives: Generalized Sum Rules for Orbital and Heat Magnetizations

We introduce a unified framework that relates orbital and heat magnetizations to experimentally accessible excitation spectra, through thermoelectric probes and generalized sum rules. By analyzing zero-temperature transport coefficients and applying...


Unconventional orders in the maple-leaf ferro-antiferromagnetic Heisenberg model

Motivated by the search for unconventional orders in frustrated quantum magnets, we present a multi-method investigation into the nature of the quantum phase diagram of the spin-$1/2$ Heisenberg model on the maple-leaf lattice with three symmetry-ine...


Visualizing LLM Latent Space Geometry Through Dimensionality Reduction

Large language models (LLMs) achieve state-of-the-art results across many natural language tasks, but their internal mechanisms remain difficult to interpret. In this work, we extract, process, and visualize latent state geometries in Transformer-bas...


MoGAN: Improving Motion Quality in Video Diffusion via Few-Step Motion Adversarial Post-Training

Video diffusion models achieve strong frame-level fidelity but still struggle with motion coherence, dynamics and realism, often producing jitter, ghosting, or implausible dynamics. A key limitation is that the standard denoising MSE objective provid...


On the Limits of Innate Planning in Large Language Models

Large language models (LLMs) achieve impressive results on many benchmarks, yet their capacity for planning and stateful reasoning remains unclear. We study these abilities directly, without code execution or other tools, using the 8-puzzle: a classi...


An AI-Enabled Hybrid Cyber-Physical Framework for Adaptive Control in Smart Grids

Smart grids are a fusion of classical power infrastructure and advanced communication networks and smart control, to create a cyber-physical environment that is more efficient and flexible than ever before. This integration causes vulnerabilities tha...


Model-Based Policy Adaptation for Closed-Loop End-to-End Autonomous Driving

End-to-end (E2E) autonomous driving models have demonstrated strong performance in open-loop evaluations but often suffer from cascading errors and poor generalization in closed-loop settings. To address this gap, we propose Model-based Policy Adapta...


Deep Learning-Based Multiclass Classification of Oral Lesions with Stratified Augmentation

Oral cancer is highly common across the globe and is mostly diagnosed during the later stages due to the close visual similarity to benign, precancerous, and malignant lesions in the oral cavity. Implementing computer aided diagnosis systems early on...


Learning When to Stop: Adaptive Latent Reasoning via Reinforcement Learning

Latent reasoning represents a new development in Transformer language models that has shown potential in compressing reasoning lengths compared to chain-of-thought reasoning. By directly passing the information-rich previous final latent state into t...


Harmony: Harmonizing Audio and Video Generation through Cross-Task Synergy

The synthesis of synchronized audio-visual content is a key challenge in generative AI, with open-source models facing challenges in robust audio-video alignment. Our analysis reveals that this issue is rooted in three fundamental challenges of the j...


HarmonicAttack: An Adaptive Cross-Domain Audio Watermark Removal

The availability of high-quality, AI-generated audio raises security challenges such as misinformation campaigns and voice-cloning fraud. A key defense against the misuse of AI-generated audio is by watermarking it, so that it can be easily distingui...


Quantum Latent Gauge and Coherence Selective Forces

We propose a hidden U(1) gauge interaction that couples exclusively to quantum coherence in massive systems. The central innovation is a conserved coherence current operator constructed from the Noether mass current via operator-level coarse-graining...


Enhanced Landmark Detection Model in Pelvic Fluoroscopy using 2D/3D Registration Loss

Automated landmark detection offers an efficient approach for medical professionals to understand patient anatomic structure and positioning using intra-operative imaging. While current detection methods for pelvic fluoroscopy demonstrate promising a...


Multimodal Robust Prompt Distillation for 3D Point Cloud Models

Adversarial attacks pose a significant threat to learning-based 3D point cloud models, critically undermining their reliability in security-sensitive applications. Existing defense methods often suffer from (1) high computational overhead and (2) poo...


BAMAS: Structuring Budget-Aware Multi-Agent Systems

Large language model (LLM)-based multi-agent systems have emerged as a powerful paradigm for enabling autonomous agents to solve complex tasks. As these systems scale in complexity, cost becomes an important consideration for practical deployment. Ho...


From Prediction to Foresight: The Role of AI in Designing Responsible Futures

In an era marked by rapid technological advancements and complex global challenges, responsible foresight has emerged as an essential framework for policymakers aiming to navigate future uncertainties and shape the future. Responsible foresight entai...


Self-Transparency Failures in Expert-Persona LLMs: A Large-Scale Behavioral Audit

If a language model cannot reliably disclose its AI identity in expert contexts, users cannot trust its competence boundaries. This study examines self-transparency in models assigned professional personas within high-stakes domains where false exper...


RoParQ: Paraphrase-Aware Alignment of Large Language Models Towards Robustness to Paraphrased Questions

Large Language Models (LLMs) often exhibit inconsistent behavior when answering paraphrased questions, suggesting a reliance on surface-level patterns rather than true semantic understanding. To address this limitation, we introduce RoParQ, a benchma...


Enhanced antineutrino emission from $\beta$ decay in core-collapse supernovae with self-consistent weak decay rates

Nuclear weak-interaction rates are known to exert a prominent effect in the late-stages of stellar collapse. Despite their importance, most studies to date on core-collapse supernovae (CCSNe) have focused primarily on the effects of electron captures...


A decoupled alignment kernel for peptide membrane permeability predictions

Cyclic peptides are promising modalities for targeting intracellular sites; however, cell-membrane permeability remains a key bottleneck, exacerbated by limited public data and the need for well-calibrated uncertainty. Instead of relying on data-eage...


UAVLight: A Benchmark for Illumination-Robust 3D Reconstruction in Unmanned Aerial Vehicle (UAV) Scenes

Illumination inconsistency is a fundamental challenge in multi-view 3D reconstruction. Variations in sunlight direction, cloud cover, and shadows break the constant-lighting assumption underlying both classical multi-view stereo (MVS) and structure f...


Some aspects of robustness in modern Markov Chain Monte Carlo

Markov Chain Monte Carlo (MCMC) is a flexible approach to approximate sampling from intractable probability distributions, with a rich theoretical foundation and comprising a wealth of exemplar algorithms. While the qualitative correctness of MCMC al...


Machine Learning Approaches to Clinical Risk Prediction: Multi-Scale Temporal Alignment in Electronic Health Records

This study proposes a risk prediction method based on a Multi-Scale Temporal Alignment Network (MSTAN) to address the challenges of temporal irregularity, sampling interval differences, and multi-scale dynamic dependencies in Electronic Health Record...


Computing Strategic Responses to Non-Linear Classifiers

We consider the problem of strategic classification, where the act of deploying a classifier leads to strategic behaviour that induces a distribution shift on subsequent observations. Current approaches to learning classifiers in strategic settings a...


VacuumVLA: Boosting VLA Capabilities via a Unified Suction and Gripping Tool for Complex Robotic Manipulation

Vision Language Action models have significantly advanced general purpose robotic manipulation by harnessing large scale pretrained vision and language representations. Among existing approaches, a majority of current VLA systems employ parallel two...


MAD-DAG: Protecting Blockchain Consensus from MEV

Blockchain security is threatened by selfish mining, where a miner (operator) deviates from the protocol to increase their revenue. Selfish mining is exacerbated by adverse conditions: rushing (network propagation advantage for the selfish miner), va...


MMA: A Momentum Mamba Architecture for Human Activity Recognition with Inertial Sensors

Human activity recognition (HAR) from inertial sensors is essential for ubiquitous computing, mobile health, and ambient intelligence. Conventional deep models such as Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), and transf...


Seeing Twice: How Side-by-Side T2I Comparison Changes Auditing Strategies

While generative AI systems have gained popularity in diverse applications, their potential to produce harmful outputs limits their trustworthiness and utility. A small but growing line of research has explored tools and processes to better engage no...


$\mathcal{E}_0$: Enhancing Generalization and Fine-Grained Control in VLA Models via Continuized Discrete Diffusion

Vision-Language-Action (VLA) models offer a unified framework for robotic manipulation by integrating visual perception, language understanding, and control generation. Yet existing VLA models still struggle to generalize across diverse tasks, scenes...


Video Generation Models Are Good Latent Reward Models

Reward feedback learning (ReFL) has proven effective for aligning image generation with human preferences. However, its extension to video generation faces significant challenges. Existing video reward models rely on vision-language models designed f...


Context-Specific Causal Graph Discovery with Unobserved Contexts: Non-Stationarity, Regimes and Spatio-Temporal Patterns

Real-world data, for example in climate applications, often consists of spatially gridded time series data or data with comparable structure. While the underlying system is often believed to behave similarly at different points in space and time, those...


Bangla Sign Language Translation: Dataset Creation Challenges, Benchmarking and Prospects

Bangla Sign Language Translation (BdSLT) has been severely constrained so far as the language itself is very low resource. Standard sentence level dataset creation for BdSLT is of immense importance for developing AI based assistive tools for deaf an...


Simulations of high-energy neutrino emissions from blazars with the LeHa-Paris code

The identification of astrophysical sources responsible for high-energy cosmic neutrinos has long been a challenge. A significant milestone was achieved with the blazar TXS 0506+056, which was found to be in a flaring state of high gamma-ray emission...


Predictive Safety Shield for Dyna-Q Reinforcement Learning

Obtaining safety guarantees for reinforcement learning is a major challenge to achieve applicability for real-world tasks. Safety shields extend standard reinforcement learning and achieve hard safety guarantees. However, existing safety shields comm...

The Age-specific Alzheimer's Disease Prediction with Characteristic Constraints in Nonuniform Time Span

Alzheimer's disease is a debilitating disorder marked by a decline in cognitive function. Timely identification of the disease is essential for the development of personalized treatment strategies that aim to mitigate its progression. The application...

Phase Transition for Stochastic Block Model with more than $\sqrt{n}$ Communities (II)

A fundamental theoretical question in network analysis is to determine under which conditions community recovery is possible in polynomial time in the Stochastic Block Model (SBM). When the number $K$ of communities remains smaller than $\sqrt{n}$ --...

EoS-FM: Can an Ensemble of Specialist Models act as a Generalist Feature Extractor?

Recent advances in foundation models have shown great promise in domains such as natural language processing and computer vision, and similar efforts are now emerging in the Earth Observation community. These models aim to generalize across tasks wit...

Pessimistic Verification for Open Ended Math Questions

The key limitation of verification performance lies in the ability to detect errors. With this intuition, we designed several variants of pessimistic verification, which are simple workflows that could significantly improve the verification of o...

Self-Paced Learning for Images of Antinuclear Antibodies

Antinuclear antibody (ANA) testing is a crucial method for diagnosing autoimmune disorders, including lupus, Sjögren's syndrome, and scleroderma. Despite its importance, manual ANA detection is slow, labor-intensive, and demands years of training. AN...

Voice, Bias, and Coreference: An Interpretability Study of Gender in Speech Translation

Unlike text, speech conveys information about the speaker, such as gender, through acoustic cues like pitch. This gives rise to modality-specific bias concerns. For example, in speech translation (ST), when translating from languages with notional ge...

Mechanistic Interpretability for Transformer-based Time Series Classification

Transformer-based models have become state-of-the-art tools in various machine learning tasks, including time series classification, yet their complexity makes understanding their internal decision-making challenging. Existing explainability methods...

IntAttention: A Fully Integer Attention Pipeline for Efficient Edge Inference

Deploying Transformer models on edge devices is limited by latency and energy budgets. While INT8 quantization effectively accelerates the primary matrix multiplications, it exposes the softmax as the dominant bottleneck. This stage incurs a costly d...

Tool-RoCo: An Agent-as-Tool Self-organization Large Language Model Benchmark in Multi-robot Cooperation

This study proposes Tool-RoCo, a novel benchmark for evaluating large language models (LLMs) in long-term multi-agent cooperation based on RoCo, a multi-robot cooperative benchmark. Recent research on LLM-based multi-agent systems has relied on prede...

Metastability in the Dissipative Quantum Rabi Model

The dissipative quantum Rabi model exhibits rich non-equilibrium physics, including a dissipative phase transition from the normal phase to the superradiant phase. In this work, we investigate the stability of the superradiant phase in the presence o...

Generalized Design Choices for Deepfake Detectors

The effectiveness of deepfake detection methods often depends less on their core design and more on implementation details such as data preprocessing, augmentation strategies, and optimization techniques. These factors make it difficult to fairly com...

CanKD: Cross-Attention-based Non-local operation for Feature-based Knowledge Distillation

We propose Cross-Attention-based Non-local Knowledge Distillation (CanKD), a novel feature-based knowledge distillation framework that leverages cross-attention mechanisms to enhance the knowledge transfer process. Unlike traditional self-attention-b...

Modeling dissipation in quantum active matter

Active matter denotes a system of particles immersed in an external environment, from which the particles extract energy continuously in order to perform motion. Extending the paradigm of active matter to a quantum framework requires an open quantum...

The Feasibility of Using Fe XXIII Metastable Transitions as a Density Diagnostic for LMXB Disk Winds

Low mass X-ray binaries (LMXBs) occasionally show signs of outflowing material from the accretion disk. Studying these outflows can inform the understanding of the geometry of the systems, as well as the dynamics and energetics of accretion. One key...

Lost in Time? A Meta-Learning Framework for Time-Shift-Tolerant Physiological Signal Transformation

Translating non-invasive signals such as photoplethysmography (PPG) and ballistocardiography (BCG) into clinically meaningful signals like arterial blood pressure (ABP) is vital for continuous, low-cost healthcare monitoring. However, temporal misali...

The relativistic tidal tensor: general solutions for stationary axisymmetric spacetimes and the Hills mass of naked singularities

The tidal forces experienced on an orbit contain, in principle, information about the underlying spacetime an object is moving through. Astronomical observations often probe the properties of tidal forces in the relativistic regime, and could thus in...

Quantum theory of electrically levitated nanoparticle-ion systems: Motional dynamics and sympathetic cooling

We develop the theory describing the quantum coupled dynamics of the center-of-mass motion of a nanoparticle and an ensemble of ions co-trapped in a dual-frequency linear Paul trap. We first derive analytical expressions for the motional frequencies...

Merge and Bound: Direct Manipulations on Weights for Class Incremental Learning

We present a novel training approach, named Merge-and-Bound (M&B) for Class Incremental Learning (CIL), which directly manipulates model weights in the parameter space for optimization. Our algorithm involves two types of weight merging: inter-task w...

Bayesian Analysis of the Complex Singlet Model with Phase Transition Gravitational Waves

We explore the prospects of probing the Complex Singlet Extension of the Standard Model (CxSM) with gravitational waves from the Electroweak phase transition. The study establishes a connection of the scalar potential parameters, the thermodynamic pr...

Magic spreading under unitary Clifford dynamics

Nonstabilizerness, or quantum magic, presents a valuable resource in quantum error correction and computation. We study the dynamics of locally injected magic in unitary Clifford circuits, where the total magic is conserved. However, the absence of p...

Frequency-Aware Token Reduction for Efficient Vision Transformer

Vision Transformers have demonstrated exceptional performance across various computer vision tasks, yet their quadratic computational complexity concerning token length remains a significant challenge. To address this, token reduction methods have be...
